Email Encoding: Setting Content-Type and HTML Special Characters

Is the process of encoding emails with special characters and symbols giving you a struggle?

It’s a problem many email developers deal with, especially if some text for the body of the email gets pasted in from a word processor. Symbols and characters may also cause display issues in emails if you’re coding emails for different languages. 

Setting the content-type is the most important factor connected to how email clients display text that includes special characters, including non-Latin-based languages such as those spoken in Asia as well as Hebrew, Arabic, and Greek. But email developers can also take matters into their own hands when it makes sense.

If you’re seeing red X’s, question marks in boxes, or random funky characters that shouldn’t be in your emails, we’re going to take a closer look at what’s happening and ways to fix it. Let’s start by answering some key questions. 

What is character encoding?

Character encoding is the process of writing or coding specific symbols in HTML so they appear as intended regardless of the device, web browser, or email client. This is done by assigning a number or code to each character. When you don’t have proper email encoding, that’s when you see unexpected symbols and empty boxes.

A specific group of characters is contained in a character set (charset="" in HTML code). Each of those characters is represented by a piece of code that is used as a key to reproduce the character on a screen.

The Unicode encoding known as UTF-8 is the most popular and reliable way to define special characters and symbols on the web and in emails as well as other forms of electronic communication. You can set your entire email to use UTF-8 character encoding, which we’ll look at later.

But first, we should mention there is one way to make sure all special characters and symbols in your emails display as intended.

Bulletproof HTML email encoding

If you want to be 100% certain that all text renders correctly in the subject line and body of your emails, the safest solution is to convert special characters and symbols to their HTML entities.

As you likely already know, every symbol and special character has an entity number, and many also have an entity name. When placed in the code, the entity produces the symbol or character.

These entities always begin with an ampersand and end with a semicolon. For example, the &nbsp; HTML entity is used for non-breaking spaces. Using HTML entities for the less-than and greater-than symbols (&lt; for < and &gt; for >) can be helpful because the raw symbols could be confused with tags in the code.

Here are some common special characters and their HTML entities:

| Character | Entity name | Entity number |
| --- | --- | --- |
| & ampersand | &amp; | &#38; |
| ® registered trademark | &reg; | &#174; |
| £ British pound | &pound; | &#163; |
| ¡ inverted exclamation mark | &iexcl; | &#161; |
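As a rough illustration of the conversion in the table above, here is a minimal Python sketch that swaps characters for their named entities using the standard library's `html.entities.codepoint2name` mapping. The function name `to_named_entities` is hypothetical, and the sketch assumes it is given plain text rather than finished markup (it would also escape the `<`, `>`, and quotes of real tags).

```python
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace each character that has an HTML entity name with &name;.

    Hypothetical helper for illustration; meant for text content only,
    not for strings that already contain HTML tags.
    """
    out = []
    for ch in text:
        # codepoint2name maps a code point to its entity name,
        # e.g. 38 -> "amp", 174 -> "reg", 163 -> "pound".
        name = codepoint2name.get(ord(ch))
        out.append(f"&{name};" if name else ch)
    return "".join(out)


print(to_named_entities("Tom & Jerry® cost £5"))
# prints: Tom &amp; Jerry&reg; cost &pound;5
```

Characters without a named entity pass through unchanged here, so a real workflow would likely fall back to numeric entities for those.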

There are also HTML entities for characters from non-Latin scripts such as Chinese or Greek. The entity number is simply the character’s Unicode code point written in decimal, while the U+ notation shows the same code point in hexadecimal. Here are a few examples:

| Character | Entity number | Unicode |
| --- | --- | --- |
| 光 Chinese symbol for light | &#20809; | U+5149 |
| א Hebrew letter aleph | &#1488; | U+05D0 |
| Δ Greek letter delta | &#916; | U+0394 |
| ॐ Sanskrit symbol for Om | &#2384; | U+0950 |
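The relationship between a character, its entity number, and its Unicode notation can be checked with a few lines of Python. This is only a sketch; the helper names `to_numeric_entity` and `to_unicode_notation` are made up for this example.

```python
import html


def to_numeric_entity(ch: str) -> str:
    """Return the decimal HTML entity for a single character (hypothetical helper)."""
    return f"&#{ord(ch)};"


def to_unicode_notation(ch: str) -> str:
    """Return the U+XXXX form: the code point in hex, padded to four digits."""
    return f"U+{ord(ch):04X}"


for ch in ["光", "א", "Δ", "ॐ"]:
    print(ch, to_numeric_entity(ch), to_unicode_notation(ch))

# Going the other way: html.unescape turns an entity back into its character.
assert html.unescape("&#916;") == "Δ"
```

Because the entity itself is written using only ASCII characters, it survives whichever character set the email ends up being sent with.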

Even if you don’t have a global email marketing strategy, you never know when you may need to include a special character or symbol in an email.

Of course, if you plan to convert special characters in an HTML email manually you’ll need to figure out the HTML entity. Thankfully, we’ve got a free tool for that!

Email on Acid by Sinch built an HTML character converter to make your job a little easier. Just paste in your special characters and symbols, hit a button, and get the HTML entities you need to convert them in the email body.

Try the Character Converter

What is content-type?

When taking on the task of email encoding, you can also tell browsers and email clients how you expect them to interpret different types of characters by setting the content-type. Even if you are converting individual special characters to HTML entities, it’s still smart to define the content-type of your email.

The content-type declares the kind of content (such as text/html) and, through its charset value, the character set. The two most common character sets are UTF-8 and ISO-8859-1. However, in almost all cases, you should be using UTF-8 as an email’s character set.

Why is UTF-8 best for character encoding in email?

While there are many ways to encode characters, UTF-8 has become an international standard thanks to how comprehensive it is. UTF-8 is capable of encoding more than 1,112,000 different characters. That covers virtually every written language, math symbols, musical notations, and even the emojis used in email marketing.

The problem with using the ISO-8859-1 character set is that it only accounts for Latin characters and symbols, which excludes many Eastern symbols and glyphs. The image below shows how such characters fail to display correctly using ISO-8859-1 compared to UTF-8.

UTF-8 vs ISO character encoding

The result on the right includes a bunch of jumbled text, which is the ISO-8859-1 interpretation of the symbols. Notice, however, when the HTML entity is used, special characters display correctly for both UTF-8 and ISO-8859-1. That’s what makes that method foolproof.
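The jumbled text comes from bytes written in one character set being read in another. The short Python sketch below reproduces the effect with an accented Latin letter (chosen only because the garbled result is easy to see in print) and then shows why the entity version is unaffected. The variable names are illustrative.

```python
text = "café"

# Bytes as a UTF-8 sender would write them: the "é" becomes two bytes.
utf8_bytes = text.encode("utf-8")

# A client that assumes ISO-8859-1 reads every byte as one character.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # prints: cafÃ©

# The entity form uses only ASCII characters, so both character sets
# produce exactly the same bytes and nothing can be misread.
entity_text = "caf&eacute;"
assert entity_text.encode("utf-8") == entity_text.encode("iso-8859-1")
```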

Where to designate content-type in emails

If you’re developing a website, you can designate content-type or specify the character set in a meta tag using code like this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

But as you may be aware, there are some major differences between web and email development. And this is yet another one. Email clients will ignore the content-type in the meta tag. Instead, email clients refer to the content-type that is set in the email header.

The email header provides a bunch of technical information about the message, including the date, sender name (from:) and recipient (to:) as well as things like email authentication details. When an email client sees the character set UTF-8 defined in the header, it knows how to interpret characters throughout the message.

How to set content-type in email headers

Below you’ll see a code snippet of a particular email header that includes the content-type set to UTF-8 at the bottom:

Date: Wed, 15 Dec 2021 12:45:55 -0700
To: test@test.com
From: testfrom@test.com
Subject: UTF-8
Message-ID: <e06deabc923f3378e4b237a20be324cc@www.test.com>
X-Priority: 3
X-Mailer: EOAMailer 5.0.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/html; charset="UTF-8"
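If you send mail from your own code rather than through an ESP, the header is usually set by the mail library. As a hedged sketch, this is how it could look with Python's standard `email` package; the addresses are the placeholder ones from the header above and the body text is invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "testfrom@test.com"
msg["To"] = "test@test.com"
msg["Subject"] = "UTF-8"

# Example body containing characters outside plain ASCII.
html_body = "<p>Price: £5, symbol: 光</p>"

# subtype="html" and charset="utf-8" are what end up in the
# Content-Type header of the message.
msg.set_content(html_body, subtype="html", charset="utf-8")

print(msg["Content-Type"])
print(msg.get_content_type(), msg.get_content_charset())
```

The library also picks a Content-Transfer-Encoding for you, which may differ from the 8bit value shown in the sample header.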

Setting content-type and defining a character set is important for the readability and accessibility of your emails. It ensures nothing breaks the reading pattern for a subscriber, whether the subscriber is reading the email herself or using a screen reader with emails.

That’s why Email on Acid by Sinch includes the ability to set or change the content-type in the accessibility settings of the Campaign Precheck automated workflow. With the click of a button, the platform makes sure the right code is added to the email header.

Setting content-type to UTF-8 in the Email on Acid platform.

Because email service providers (ESPs) set content-type in the header, we have another layer of complexity added to our email development. If necessary, contact your ESP and ask them what content-type they set in the header when sending the emails. Once you know the content-type, use that value in your HTML meta tag when designing the email.
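One way to see what an ESP actually sets is to send yourself a test, view the raw source of the received message, and read the header. The sketch below does that with Python's standard `email` package, using a trimmed copy of the sample header from earlier as stand-in input; in practice `raw` would be the raw source you copied from your inbox.

```python
from email import message_from_string
from email.policy import default

# Stand-in for the raw source of a received test email.
raw = (
    "From: testfrom@test.com\n"
    "To: test@test.com\n"
    "Subject: UTF-8\n"
    "MIME-Version: 1.0\n"
    "Content-Transfer-Encoding: 8bit\n"
    'Content-Type: text/html; charset="UTF-8"\n'
    "\n"
    "<p>Hello</p>\n"
)

msg = message_from_string(raw, policy=default)
print(msg.get_content_type())     # prints: text/html
print(msg.get_content_charset())  # prints: utf-8
```

For a multipart message, each part carries its own Content-Type, so the same two calls would need to be made on every part (for example by looping over `msg.walk()`).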

Otherwise, you can always rely on the bulletproof method of converting special characters and symbols to their corresponding HTML entity.

Email client support for content-type

Another reason to exclusively use UTF-8 for email encoding is that it is widely supported by major email clients. Nearly every page on the internet uses UTF-8 for encoding as do email clients.

We ran tests to see what clients did when the content-type was specified in the header and found that Gmail was the only outlier. No matter what was designated in the email header, Gmail is the only client that automatically converts your text to UTF-8. That includes in iOS when people use Gmail on iPhones or iPads.

One interesting action we noticed: The web-based clients convert your text to the character set of their own pages before displaying it in a web browser. We were able to check this by viewing what content-type they were setting in their meta tags. As it turns out, most of them are using UTF-8.

Is your email encoding working?

If your email marketing campaigns tend to include text for a variety of languages, special characters, or symbols, how can you be confident they’re displaying correctly?

The answer? You’ve got to run tests and check out email previews to know for sure.

While you could send manual tests to different types of inboxes, devices, and browsers, Email on Acid provides a streamlined way to test your email encoding and a wide variety of other factors. That includes deliverability, inbox display, Outlook rendering issues, dark mode emails, and more.

Email on Acid’s mission is to simplify the complexities of email marketing. That’s why we create tools like the handy HTML character converter. And, it’s why we’re trusted by email geeks around the world.

This post was updated in February 2022. It was also updated in January 2019 and February 2017 and first published in 2011.

Author: The Email on Acid Team

The Email on Acid content team is made up of digital marketers, content creators, and straight-up email geeks.

Connect with us on LinkedIn, follow us on Facebook, and tweet at @EmailonAcid on Twitter for more sweet stuff and great convos on email marketing.

28 thoughts on “Email Encoding: Setting Content-Type and HTML Special Characters”

  1. This is the kind of help we dream about. Thanks for this helpful and informative explanation of the real deal.

  2. Hi, I have been trying to sent a question via the Contact Us section but I get an error stating I don’t have authorisation. Please investigate.

    Many thanks,
    Keiron

  3. Thank you!

    I’ve got to send some emails in 32 languages and was worrying about charset hassles.

    🙂

  4. Great piece of info! It’s quite impressive post. Love this explanation. 🙂 Thanks for this allocation.

  5. After reading the article, I sent Chinese characters emails I’ve received with my aol webmail to a gmail account. Those characters that are wrongly display as ? in the aol webmail account still ends up as ? in the gmail account. Seems gmail is not the solution.

  6. Hi, Everyone..
    I am designing Pop3 client . I want character set of Body part. But in some cases for particular mail client I am getting nothing as a Charset value.
    I am worried what character set to be consider.
    from a mail client I am getting nothing as a character set value for any language body part.
    Help me to decide character set.!!

  7. Ironically, this page’s encoding is broken. Chinese characters are corrupted but accented latin ones are not, suggesting it’s being delivered as ISO-8859-1 even though it’s declaring UTF-8!

  8. I’ve just run into this with a blast vendor requesting we encode all emails using Western(ISO Latin 1) which I discovered is ISO-8859-1. I had always used UTF-8 and honestly never thought that it could matter.

    It’s strange because I cannot use HTML entities with #’s for them like this: &#169; I can only use the entity name &copy; Otherwise when they send tests my symbols come up as ? or something else strange.

    Thanks for this explanation.

  9. Hi,
    I am using REST in my application which is using Content-Type:application/x-www-form-urlencoded;
    When I am calling a post method in chrome its working fine , but when I am trying this in firefox.
    An extra content type added like
    Content-Type:application/x-www-form-urlencoded; charset=UTF-8;
    And then POST service is not working as expected. So could you tell me what should I need to do.

    Thanks,

  10. Good post. I study something more challenging on completely different blogs everyday. It is going to all the time be stimulating to learn content material from different writers and practice somewhat one thing from their store. I ggddckfdkdgd

  11. Thanks for article and comments, we understood that we were using a wrong encoding when sending out our emails.

  12. Thanks for this article. 6 years ago we did that wrong and receive emails with unreadable content. Do you have an idea how we can still read it?

  13. EoA Character Encoder is not the best tool. Years ago I discovered that I actually need three tools in one: 1) HTML character encoder; 2) invisible character cleaner/remover; 3) English language style improvement tool. Ideally, in a shape of NPM package which I can use in Gulp, with optional front end, driven by browserified version of it. That’s how detergent.io came to life. As a critique point, EoA Character Encoder encodes emoji incorrectly, doesn’t decode HTML and can’t adjust between HTML or XHTML closing slashes on BR’s. Detergent.io does all this. Another important point is, encoded entities should be named, not numeric, so you could read them. For example, &pound; is better than &#163; when you are checking text. That’s another difference between Detergent.io and EoA Character Converter.

  14. I didn’t see Apple Mail/Mac Mail referenced in the list. One recipient gets mail at sbcglobal.net and reads it on his desktop through Apple/Mac mail. Do you know how that would handle content type?

  15. I use Salesforce Email Studio (previously Exact Target). It does character conversion for you (inserts entities). The content type is set within HTML template (in the meta like for a web page; I believe this is for a web version of the email) and when building an email (this I believe is for the email header as you must set it regardless the meta in the template).
    I tested that HTML entities do depend on the character set settings (aka Language settings / email header meta) so entities from outside a character standard/set won’t show properly unless the character set is properly set up.
    This still depends on the tool you use – there is no point in setting anything else but the UTF-8.

  16. If email clients only pay attention to the content type set in the header then how do they deal with multipart messages and in particular the different statements text/html and text/plain?
    How can one statement be a solution for all in multiparts?

    1. Hi Jon –

      Great question! There really is no *one* solution, but UTF-8 provides a solution for many. Per W3.org: “The more widely a character encoding is used, the better the chance that a browser [or in this case, email] will understand it.”

      UTF-8 is a good choice because it can support several languages, which means it can accommodate pages and forms that may have a mixture of those languages. It also reduces complexity when dealing with a multilingual site or application, because it eliminates the need for server-side logic to individually determine the encoding for each page or form submission.

      More info is available here: https://www.w3.org/International/questions/qa-choosing-encodings

      I hope that helps! Thanks for reading.

Comments are closed.