Explain the different kinds of character sets available in HTML

Last Updated : 18 May, 2022

Before looking at the different character sets available in HTML, let us first see what a character set actually is.

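HTML Character Sets: Have you ever wondered how the browser displays numbers, letters, and other symbols correctly? An HTML file is stored as a sequence of bytes, and the character set (more precisely, the character encoding) tells the browser which character each byte or byte sequence represents.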

The character set is declared with the charset attribute of the <meta> tag, placed inside the <head> element:

<meta charset="UTF-8">
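To see why this declaration matters, here is a minimal Python sketch (Python is used only for illustration, and the sample string is made up). It encodes some text as UTF-8 and then decodes the same bytes with two different character sets. It illustrates the general idea rather than how any particular browser is implemented.

```python
# Text as an author might type it; the sample string is an assumption.
text = "café ©"

# Saving the file as UTF-8 turns the text into these bytes.
raw = text.encode("utf-8")

# Decoding with the matching character set recovers the original text.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as Windows-1252 produces garbled text,
# because each two-byte UTF-8 sequence is read as two separate characters.
as_cp1252 = raw.decode("cp1252")

print(list(raw))
print(as_utf8)
print(as_cp1252)
```

The bytes never change; only their interpretation does. That is why the declared character set should match the encoding the file was actually saved in.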

Different kinds of character sets available in HTML

There have been different character sets available over time for the web. Let's understand the different kinds of character sets available in HTML.

ASCII: ASCII (American Standard Code for Information Interchange) was the first widely used character encoding. It defines 128 characters: the digits (0-9), the lowercase (a-z) and uppercase (A-Z) English letters, punctuation and special characters such as + - $ ( ) @, and 33 non-printing control characters. It is limited to 128 characters because each one is stored in only 7 bits. The main disadvantage of ASCII is that it excludes non-English letters.

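Syntax:

<meta charset="ASCII">

Note: Browsers that follow the WHATWG Encoding Standard treat the label "ASCII" (and "US-ASCII") as an alias for Windows-1252, so a page declared this way is actually decoded as Windows-1252.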

The table below shows some of the 128 ASCII characters and their equivalent numbers:

Char       Number    Description
(space)    32        Space
!          33        Exclamation mark
"          34        Quotation mark
#          35        Hash sign
$          36        Dollar sign
%          37        Percent sign
&          38        Ampersand
'          39        Apostrophe
(          40        Left parenthesis
)          41        Right parenthesis
*          42        Asterisk
2          50        Number 2
3          51        Number 3
4          52        Number 4
A          65        Uppercase A
B          66        Uppercase B
K          75        Uppercase K
Y          89        Uppercase Y
Z          90        Uppercase Z
a          97        Lowercase a
b          98        Lowercase b
k          107       Lowercase k
y          121       Lowercase y
z          122       Lowercase z
~          126       Tilde
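The mapping between characters and numbers can be checked with a short Python sketch (Python is used only for illustration; the helper name is_ascii is hypothetical). It looks up the numbers for a few characters from the table and tests whether a string stays within the 128 ASCII codes.

```python
def is_ascii(text):
    """Return True if every character in text has an ASCII code (0-127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# A few characters from the table, mapped to their ASCII numbers with ord().
table_sample = {ch: ord(ch) for ch in [" ", "!", "2", "A", "K", "z", "~"]}

for ch, number in table_sample.items():
    # chr() is the inverse of ord(): it turns the number back into the character.
    print(repr(ch), number, chr(number) == ch)

# English text fits in ASCII; the Icelandic letter thorn does not.
print(is_ascii("GeeksforGeeks"))
print(is_ascii("þ"))
```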

Example: This example declares the ASCII character set and displays a few ASCII characters.

HTML
<!DOCTYPE html>
<html>

<head>
    <!-- Declare the character encoding before any other content in the head -->
    <meta charset="ASCII">
    <title>ASCII character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>ASCII character set</p>
        <p>! , [ , A</p>
    </div>
</body>

</html>

ISO-8859-1: ISO-8859-1 was the default character set in HTML 4. It supports 256 different character codes. It belongs to a family of standard character sets that ISO (the International Organization for Standardization) defines for different languages and alphabets, and it extends ASCII with additional characters for Western European languages. For values 0 to 127, ISO-8859-1 is identical to ASCII, and for values 160 to 255 each code equals the Unicode code point of the same character. UTF-8 stores those characters as two bytes each, so the byte values themselves differ.

Note: The values 128 to 159 are not assigned to any printable character in ISO-8859-1.
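The relationship between ISO-8859-1 codes, Unicode code points, and UTF-8 bytes can be sketched in Python (used only for illustration; the helper name describe is hypothetical). Python's "iso-8859-1" codec maps every byte value directly to the code point of the same number, which may differ from how a browser handles the same label.

```python
def describe(ch):
    """Return (code point, ISO-8859-1 byte value, UTF-8 byte values) for one character."""
    latin1_bytes = ch.encode("iso-8859-1")
    utf8_bytes = ch.encode("utf-8")
    return ord(ch), latin1_bytes[0], list(utf8_bytes)


for ch in "¢©Ëþ":
    print(ch, describe(ch))

# For every value from 160 to 255, the ISO-8859-1 byte equals the code point.
same_as_unicode = all(
    chr(value).encode("iso-8859-1")[0] == value for value in range(160, 256)
)
print(same_as_unicode)
```

Each of these characters is a single byte in ISO-8859-1 but two bytes in UTF-8. A file saved in one encoding and declared as the other therefore displays incorrectly.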

Syntax: 

<meta charset="ISO-8859-1">

The table below shows some of the ISO-8859-1 characters with their entity names and entity numbers:

Character    Entity Name    Entity Number    Description
¢            &cent;         &#162;           cent
¦            &brvbar;       &#166;           broken vertical bar
©            &copy;         &#169;           copyright
®            &reg;          &#174;           registered trademark
¼            &frac14;       &#188;           fraction 1/4
Ë            &Euml;         &#203;           capital E, umlaut mark
à            &agrave;       &#224;           small a, grave accent
þ            &thorn;        &#254;           small thorn, Icelandic
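An entity name and its entity number are two ways of writing the same character. The following Python sketch (used only for illustration) resolves both forms with html.unescape from the standard library and confirms that they agree. The list of pairs is a sample chosen for this example.

```python
import html

# (entity name, entity number) pairs for a few ISO-8859-1 characters.
entity_pairs = [
    ("&cent;", "&#162;"),
    ("&copy;", "&#169;"),
    ("&frac14;", "&#188;"),
    ("&Euml;", "&#203;"),
    ("&thorn;", "&#254;"),
]

resolved = []
for name, number in entity_pairs:
    by_name = html.unescape(name)      # resolve the named reference
    by_number = html.unescape(number)  # resolve the numeric reference
    resolved.append((by_name, by_number, ord(by_name)))
    print(name, number, by_name, by_name == by_number, ord(by_name))
```

Entities are written with ASCII characters only, so they display correctly regardless of which character set the page declares.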

Example: This example declares the ISO-8859-1 character set and displays a few ISO-8859-1 characters.

HTML
<!DOCTYPE html>
<html>

<head>
    <!-- Declare the character encoding before any other content in the head -->
    <meta charset="ISO-8859-1">
    <title>ISO-8859-1 character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>ISO-8859-1 character set</p>
        <p>Ë , ¦ , þ</p>
    </div>
</body>

</html>

ANSI (Windows-1252): Windows-1252, commonly called ANSI, was the default character set in Windows up to Windows 95, and the most popular character set on Windows from about 1985 to 1990. It is an extension of ASCII and is almost identical to ISO-8859-1. The difference is that Windows-1252 assigns printable characters, such as the euro sign and curly quotation marks, to most of the values from 128 to 159. It uses 8 bits per character, which allows 256 different codes. This character set is supported by almost all browsers.

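Syntax:

<meta charset="windows-1252">

Note: "ANSI" is an informal name and is not an encoding label that browsers recognize, so the declaration uses the name windows-1252.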

The table below shows some of the ANSI (Windows-1252) characters and their equivalent numbers (a dash means the original table lists no entity name):

Character    Number    Entity Name    Description
!            33        -              Exclamation mark
&            38        &amp;          Ampersand
0            48        -              Digit zero
G            71        -              Latin uppercase letter G
¼            188       &frac14;       Vulgar fraction one quarter
©            169       &copy;         Copyright sign
þ            254       &thorn;        Latin small letter thorn
ø            248       &oslash;       Latin small letter o with stroke
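Windows-1252 and ISO-8859-1 differ only in the values 128 to 159. The following Python sketch (used only for illustration; the sample bytes are chosen for this example) decodes the same bytes with both character sets. Python's codec names "cp1252" and "iso-8859-1" are used here, which are not necessarily the labels a browser accepts.

```python
# Byte 0x80 lies in the range where the two character sets differ.
euro_byte = b"\x80"
as_cp1252 = euro_byte.decode("cp1252")      # a printable character
as_latin1 = euro_byte.decode("iso-8859-1")  # an invisible control character

# Bytes 0x93 and 0x94 are curly quotation marks in Windows-1252.
quoted = b"\x93Hi\x94".decode("cp1252")

# From 160 to 255 the two character sets decode every byte identically.
upper = bytes(range(160, 256))
same_upper_range = upper.decode("cp1252") == upper.decode("iso-8859-1")

print(as_cp1252, repr(as_latin1))
print(quoted)
print(same_upper_range)
```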

Example: This example declares the ANSI (Windows-1252) character set and displays a few of its characters.

HTML
<!DOCTYPE html>
<html>

<head>
    <!-- "ANSI" is not a recognized encoding label, so windows-1252 is declared -->
    <meta charset="windows-1252">
    <title>ANSI(Windows-1252) character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>ANSI(Windows-1252) character set</p>
        <p>ø , &frac14; , þ</p>
    </div>
</body>

</html>

UTF-8: The Unicode Standard is developed by the Unicode Consortium, and its main encodings are UTF-8 and UTF-16. The issue with the other character sets is that they are limited in size and are not compatible in a multilingual environment. Unicode covers almost all the characters, punctuation, and symbols in use worldwide, and UTF-8 encodes each of them in one to four bytes while staying identical to ASCII for values 0 to 127. The HTML5 specification encourages developers to use the UTF-8 character set.

Syntax: 

<meta charset="UTF-8">

The table below lists some of the Unicode blocks, with their code point ranges, that can be used in an HTML5 page encoded as UTF-8:

Unicode Block       Hexadecimal    Decimal
Latin Extended-A    0100-017F      256-383
Greek and Coptic    0370-03FF      880-1023
Arrows              2190-21FF      8592-8703
Block Elements      2580-259F      9600-9631
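The hexadecimal and decimal columns are two notations for the same code points, and UTF-8 uses more bytes as the code point grows. The following Python sketch (used only for illustration; the helper name utf8_info is hypothetical) takes one character from each block in the table, plus the ASCII letter A for comparison.

```python
def utf8_info(ch):
    """Return (hex code point, decimal code point, UTF-8 byte count) for one character."""
    code_point = ord(ch)
    return f"U+{code_point:04X}", code_point, len(ch.encode("utf-8"))


# One sample per block, written as escapes so the code points are explicit.
samples = [
    "A",       # Basic Latin (ASCII), for comparison
    "\u0100",  # Latin Extended-A
    "\u0376",  # Greek and Coptic
    "\u2190",  # Arrows
    "\u2588",  # Block Elements
]

for ch in samples:
    print(ch, *utf8_info(ch))
```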

Example: This example declares the UTF-8 character set and displays characters from several Unicode blocks.

HTML
<!DOCTYPE html>
<html>

<head>
    <!-- Declare the character encoding before any other content in the head -->
    <meta charset="UTF-8">
    <title>UTF-8 character set</title>
    <link rel="stylesheet" href="style.css">
</head>

<body>
    <div>
        <p>GeeksforGeeks</p>
        <p>UTF-8 character set</p>
        <p>Ͷ , ← , Ā</p>
    </div>
</body>

</html>
