Email Encoding: Setting Content-Type and HTML Special Characters
Are you struggling to encode emails that contain special characters and symbols?
It’s a problem many email developers deal with, especially if some text for the body of the email gets pasted in from a word processor. Symbols and characters may also cause display issues in emails if you’re coding emails for different languages.
Setting the content-type is the most important factor connected to how email clients display text that includes special characters, including non-Latin-based languages such as those spoken in Asia as well as Hebrew, Arabic, and Greek. But email developers can also take matters into their own hands when it makes sense.
If you’re seeing red X’s, question marks in boxes, or random funky characters that shouldn’t be in your emails, we’re going to take a closer look at what’s happening and ways to fix it. Let’s start by answering some key questions.
What is character encoding?
Character encoding is the process of writing or coding specific symbols in HTML so they appear as intended regardless of the device, web browser, or email client. This is done by assigning a number or code to each character. When you don’t have proper email encoding, that’s when you see unexpected symbols and empty boxes.
A specific group of characters is contained in a character set (charset="" in HTML code). Each of those characters is represented by a piece of code that is used as a key to reproduce the character on a screen.
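To make the idea of a "key" concrete, here is a minimal Python sketch (Python is our choice for illustration, and the `describe` helper is a hypothetical name, not part of any email tool). It shows the number assigned to a character and the bytes two different character sets use to store it.

```python
def describe(char: str) -> dict:
    """Return the code point and byte encodings for a single character."""
    code_point = ord(char)  # the number assigned to the character
    return {
        "code_point": code_point,
        "unicode": f"U+{code_point:04X}",
        "utf-8": char.encode("utf-8"),
        # ISO-8859-1 cannot store every character; unsupported ones become b"?"
        "iso-8859-1": char.encode("iso-8859-1", errors="replace"),
    }


if __name__ == "__main__":
    for symbol in ["®", "光"]:
        print(symbol, describe(symbol))
```

The registered trademark symbol exists in both character sets, although it is stored as different bytes in each. The Chinese character has no slot in ISO-8859-1 at all, which is why the sketch falls back to a question mark.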
The Unicode encoding known as UTF-8 is the most popular and reliable way to define special characters and symbols on the web and in emails as well as other forms of electronic communication. You can set your entire email to use UTF-8 character encoding, which we’ll look at later.
But first, we should mention there is one way to make sure all special characters and symbols in your emails display as intended.
Bulletproof HTML email encoding
If you want to be 100% certain that all text renders correctly in the subject line and body of your emails, the safest solution is to convert special characters and symbols to their HTML entities.
As you likely already know, nearly every symbol and special character has its own entity name and number. When placed in the code, the entity produces the symbol or character.
These entities always begin with an ampersand and end with a semicolon. For example, the HTML entity &nbsp; is used for non-breaking spaces. Using the HTML entities &lt; and &gt; for the less-than and greater-than symbols (< and >) can be helpful because the raw symbols could be confused with tags in the code.
Here are some common special characters and their HTML entities:
|Character||Entity name||Entity number|
|® registered trademark||&reg;||&#174;|
|£ British pound||&pound;||&#163;|
|¡ inverted exclamation mark||&iexcl;||&#161;|
There are also HTML entities for symbols from non-Latin scripts such as Chinese or Greek. These characters can also be identified by their Unicode code points. Here are a few examples:
|Character||Entity number||Unicode|
|光 Chinese symbol for light||&#20809;||U+5149|
|א Hebrew letter aleph||&#1488;||U+05D0|
|Δ Greek letter delta||&#916;||U+0394|
|ॐ Sanskrit symbol for Om||&#2384;||U+0950|
Even if you don’t have a global email marketing strategy, you never know when you may need to include a symbol through email encoding of special characters.
Of course, if you plan to convert special characters in an HTML email manually you’ll need to figure out the HTML entity. Thankfully, we’ve got a free tool for that!
Email on Acid by Sinch built an HTML character converter to make your job a little easier. Just paste in your special characters and symbols, hit a button, and get the HTML entities you need to convert them in the email body.
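If you'd rather script the conversion, the same idea can be sketched in a few lines of Python. This is not the converter tool itself, just an illustration using the standard library; the `to_entities` function and its `prefer_names` parameter are hypothetical names. It assumes you pass in plain text content rather than finished markup, because it also escapes `<`, `>`, `&`, and `"`.

```python
import html
from html.entities import codepoint2name


def to_entities(text: str, prefer_names: bool = True) -> str:
    """Replace markup-sensitive and non-ASCII characters with HTML entities."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in '<>&"':
            # Characters that could be mistaken for markup
            out.append(html.escape(ch, quote=True))
        elif cp < 128:
            # Plain ASCII is safe in every charset discussed here
            out.append(ch)
        elif prefer_names and cp in codepoint2name:
            # Use the entity name when HTML defines one (e.g. &reg;)
            out.append(f"&{codepoint2name[cp]};")
        else:
            # Fall back to the entity number (e.g. &#20809;)
            out.append(f"&#{cp};")
    return "".join(out)


if __name__ == "__main__":
    print(to_entities("® £ ¡"))
    print(to_entities("光 א Δ ॐ", prefer_names=False))
```

The output is pure ASCII, so it should survive whichever character set ends up in the email header. Passing the result through `html.unescape` gives back the original text, which is a handy sanity check.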
What is content-type?
When taking on the task of email encoding, you can also tell browsers and email clients how you expect them to interpret different types of characters by setting the content-type. Even if you are converting individual special characters to HTML entities, it’s still smart to define the content-type of your email.
You set the content-type by choosing the appropriate character set. The two most common character sets are UTF-8 and ISO-8859-1. However, in almost all cases, you should be using UTF-8 as an email’s content-type.
Why is UTF-8 best for character encoding in email?
While there are many ways to encode characters, UTF-8 has become an international standard thanks to how comprehensive it is. UTF-8 is capable of encoding more than 1,112,000 different characters. That includes every written language, math symbols, musical notations, and even the emojis used in email marketing.
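For the curious, that figure can be reproduced with a little arithmetic: Unicode defines 1,114,112 code points, and 2,048 of them are reserved "surrogates" that UTF-8 does not encode. The short Python sketch below (variable and function names are ours, for illustration only) works that out and also shows that UTF-8 uses between one and four bytes per character.

```python
# Unicode code points run from U+0000 to U+10FFFF
TOTAL_CODE_POINTS = 0x110000
# U+D800..U+DFFF are surrogates, which UTF-8 does not encode
SURROGATES = 0xDFFF - 0xD800 + 1
ENCODABLE = TOTAL_CODE_POINTS - SURROGATES


def utf8_length(char: str) -> int:
    """Number of bytes UTF-8 needs for one character."""
    return len(char.encode("utf-8"))


if __name__ == "__main__":
    print(f"{ENCODABLE:,} encodable code points")
    for sample in ["A", "£", "光", "😀"]:
        print(sample, utf8_length(sample), "byte(s)")
```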
The problem with using the ISO-8859-1 character set is that it only accounts for Latin characters and symbols, which excludes many Eastern symbols and glyphs. The image below shows how such characters fail to display correctly using ISO-8859-1 compared to UTF-8.
The result on the right includes a bunch of jumbled text, which is the ISO-8859-1 interpretation of the symbols. Notice, however, when the HTML entity is used, special characters display correctly for both UTF-8 and ISO-8859-1. That’s what makes that method foolproof.
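You can reproduce this effect without an email client. The Python sketch below is a simplified model, not how any particular client works: the hypothetical `render` helper decodes the raw bytes with whatever charset it is told to trust, then resolves HTML entities.

```python
import html


def render(raw: bytes, charset: str) -> str:
    """Decode bytes as a client trusting `charset` would, then resolve entities."""
    return html.unescape(raw.decode(charset, errors="replace"))


# A literal pound sign, saved as UTF-8 bytes
literal = "£10".encode("utf-8")
# The same price written with an HTML entity (plain ASCII bytes)
entity = b"&pound;10"

if __name__ == "__main__":
    print(render(literal, "utf-8"))       # read with the matching charset
    print(render(literal, "iso-8859-1"))  # read with the wrong charset
    print(render(entity, "utf-8"))
    print(render(entity, "iso-8859-1"))
```

The literal character only survives when the charset matches the bytes. Read as ISO-8859-1, its two UTF-8 bytes turn into two separate characters. The entity version is made of ASCII bytes, which both charsets interpret identically, so it comes out the same either way.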
Where to designate content-type in emails
If you’re developing a website, you can designate content-type or specify the character set in a meta tag using code like this:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
But as you may be aware, there are some major differences between web and email development. And this is yet another one. Email clients will ignore the content-type in the meta-tag. Instead, email clients always refer to what content-type is set in the email header.
The email header provides a bunch of technical information about the message, including the date, sender name (from:) and recipient (to:) as well as things like email authentication details. When an email client sees the character set UTF-8 defined in the header, it knows how to interpret characters throughout the message.
How to set content-type in email headers
Below you’ll see a code snippet of a particular email header that includes the content-type set to UTF-8 at the bottom:
Date: Wed, 15 Dec 2021 12:45:55 -0700
To: email@example.com
From: firstname.lastname@example.org
Subject: UTF-8
Message-ID: <email@example.com>
X-Priority: 3
X-Mailer: EOAMailer 5.0.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/html; charset="UTF-8"
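If you send mail from your own code rather than through an ESP, the sending library usually writes these headers for you. As a rough sketch, here is how it might look with Python's standard `email` package. The `build_html_email` function and the addresses are placeholders we made up for illustration; your own sending setup may work differently.

```python
from email.message import EmailMessage


def build_html_email(subject: str, sender: str, recipient: str, html_body: str) -> EmailMessage:
    """Build an HTML email whose header declares UTF-8 as the charset."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # set_content adds the MIME-Version, Content-Type, and
    # Content-Transfer-Encoding headers based on these arguments
    msg.set_content(html_body, subtype="html", charset="utf-8")
    return msg


if __name__ == "__main__":
    message = build_html_email(
        subject="UTF-8",
        sender="sender@example.com",        # placeholder address
        recipient="recipient@example.com",  # placeholder address
        html_body="<p>光 £</p>",
    )
    print(message["Content-Type"])
```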
Setting content-type and defining a character set is important for the readability and accessibility of your emails. It ensures nothing breaks the reading pattern for a subscriber, whether the subscriber is reading the email herself or using a screen reader with emails.
That’s why Email on Acid by Sinch includes the ability to set or change the content-type in the accessibility settings of the Campaign Precheck automated workflow. With the click of a button, the platform makes sure the right code is added to the email header.
Because email service providers (ESPs) set content-type in the header, we have another layer of complexity added to our email development. If necessary, contact your ESP and ask them what content-type they set in the header when sending the emails. Once you know the content-type, use that value in your HTML meta tag when designing the email.
Otherwise, you can always rely on the bulletproof method of converting special characters and symbols to their corresponding HTML entity.
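Another way to find out what your ESP sets is to send yourself a test, open the raw message source, and read the header directly. A small Python sketch using the standard `email` package can pull the value out for you. The `declared_charset` helper is a hypothetical name, and the sample message is a trimmed-down stand-in for a real one.

```python
from email import message_from_string
from email.policy import default

# A shortened raw message; paste your own message source here instead
RAW_MESSAGE = """\
Date: Wed, 15 Dec 2021 12:45:55 -0700
To: email@example.com
From: firstname.lastname@example.org
Subject: UTF-8
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/html; charset="UTF-8"

<p>Hello</p>
"""


def declared_charset(raw_message: str) -> tuple:
    """Return the (content type, charset) declared in a raw message's header."""
    msg = message_from_string(raw_message, policy=default)
    # get_content_charset() returns the charset in lowercase, or None if absent
    return msg.get_content_type(), msg.get_content_charset()


if __name__ == "__main__":
    print(declared_charset(RAW_MESSAGE))
```

Note that this only reads the top-level header. Multipart messages declare a content-type for each part, so a real campaign may need each part checked separately.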
Email client support for content-type
Another reason to exclusively use UTF-8 for email encoding is that it is widely supported by major email clients. Nearly every page on the internet uses UTF-8 for encoding as do email clients.
We ran tests to see what clients did when the content-type was specified in the header and found that Gmail was the only outlier. No matter what was designated in the email header, Gmail is the only client that automatically converts your text to UTF-8. That includes in iOS when people use Gmail on iPhones or iPads.
One interesting action we noticed: The web-based clients convert your text to the content-type character set before displaying it in a web browser. We were able to check this by viewing what content-type they were setting in their meta tags. As it turns out, most of them are using UTF-8.
Is your email encoding working?
If your email marketing campaigns tend to include text for a variety of languages, special characters, or symbols, how can you be confident they’re displaying correctly?
The answer? You’ve got to run tests and check out email previews to know for sure.
While you could send manual tests to different types of inboxes, devices, and browsers, Email on Acid provides a streamlined way to test your email encoding and a wide variety of other factors. That includes deliverability, inbox display, Outlook rendering issues, dark mode emails, and more.
Email on Acid’s mission is to simplify the complexities of email marketing. That’s why we create tools like the handy HTML character converter. And, it’s why we’re trusted by email geeks around the world.
This post was updated in February 2022. It was also updated in January 2019 and February 2017 and first published in 2011.
Author: The Email on Acid Team
The Email on Acid content team is made up of digital marketers, content creators, and straight-up email geeks. Connect with us on LinkedIn, follow us on Facebook, and tweet at @EmailonAcid on Twitter for more sweet stuff and great convos on email marketing.