
Unicode Character System in Java

Unicode is a computing industry standard designed to consistently and uniquely encode the characters used in written languages throughout the world. The standard identifies each character by a number, conventionally written in hexadecimal. For example, the value 0x0041 (written U+0041 in Unicode notation) represents the Latin character A.
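As a small illustration of the mapping between a hexadecimal value and a character, the sketch below assigns 0x0041 to a char and prints it. The class name CodePointDemo and the helper toUPlus are made up for this example and are not part of any library.

```java
class CodePointDemo {
    // Hypothetical helper: formats a number in the conventional U+XXXX notation.
    static String toUPlus(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        char fromHex = 0x0041;       // the numeric value assigned directly to a char
        char fromLiteral = 'A';      // the same character written as a literal
        System.out.println(fromHex);                 // A
        System.out.println(fromHex == fromLiteral);  // true
        System.out.println((int) fromLiteral);       // 65
        System.out.println(toUPlus(fromLiteral));    // U+0041
    }
}
```

Both variables hold the same 16-bit value; the hexadecimal form is only a different way of writing the number 65.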

The ASCII character set contains only 128 characters. It has no Japanese characters, nor the many other characters needed by the various languages of the world.

The idea behind Unicode was to create a single character set that includes every reasonable character in all the writing systems of the world. Unicode also thinks about characters differently: each character is an abstract number called a code point (such as U+0041), and how that number is stored as bytes is decided separately by an encoding such as UTF-8 or UTF-16. Without that distinction, little else about Unicode makes sense.

The Unicode standard was initially designed using 16 bits to encode characters because the primary machines were 16-bit PCs. When the specification for the Java language was created, the Unicode standard was accepted and the char primitive was defined as a 16-bit data type, with characters in the hexadecimal range from 0x0000 to 0xFFFF.
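The 16-bit range of char can be checked with constants from java.lang.Character. This is a minimal sketch; the class name CharRangeDemo and the helper range are chosen only for illustration.

```java
class CharRangeDemo {
    // Hypothetical helper: describes the char range in hexadecimal.
    static String range() {
        return String.format("0x%04X to 0x%04X",
                (int) Character.MIN_VALUE, (int) Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);             // 16 (bits in a char)
        System.out.println((int) Character.MAX_VALUE);  // 65535
        System.out.println(range());                    // 0x0000 to 0xFFFF
    }
}
```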

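Unicode has since outgrown 16 bits. Version 9.0, for example, contains a repertoire of more than 128,000 characters covering 135 modern and historic scripts, as well as multiple symbol sets, and later versions have added more. Characters above 0xFFFF do not fit in a single char, so Java represents each of them as a pair of char values called a surrogate pair.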


Why Java Uses Unicode

Java was designed at a time when the Unicode standard defined a much smaller character set, one that fit in 16 bits. The char data type in Java was therefore originally intended to represent one 16-bit Unicode character, and Java strings use the 16-bit Unicode Transformation Format (UTF-16). This is why a char in Java occupies 2 bytes, whereas a char in C occupies 1 byte.
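The 2-byte size of char, and what happens to a character that does not fit in 16 bits, can be observed directly. The sketch below is illustrative only: the class name CharSizeDemo and the helper utf16Bytes are made up, and U+1F600 is just one arbitrary code point above 0xFFFF.

```java
import java.nio.charset.StandardCharsets;

class CharSizeDemo {
    // Hypothetical helper: bytes needed to encode a string as UTF-16
    // (big-endian, without a byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);   // 2 (available since Java 8)
        System.out.println(utf16Bytes("A"));   // 2

        // A code point above 0xFFFF needs two char values (a surrogate pair).
        String beyond = new String(Character.toChars(0x1F600));
        System.out.println(beyond.length());                            // 2
        System.out.println(beyond.codePointCount(0, beyond.length()));  // 1
        System.out.println(utf16Bytes(beyond));                         // 4
    }
}
```

Note that length() counts char values, not characters, so code that must handle text outside the 16-bit range usually works with code points instead.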

Java Example to Display Unicode Characters


class UnicodeDemo
{
    public static void main(String[] args)
    {
        // Non-ASCII characters display correctly only if the console encoding supports them.
        String pi = "\u03C0";
        System.out.println(pi);        // pi sign
        System.out.println("\u0021");  // !
        System.out.println("\u002B");  // +
        System.out.println("\u0030");  // 0 (U+0030 to U+0039 are the digits 0 to 9)
        System.out.println("\u0041");  // A (U+0041 to U+005A are the letters A to Z)
        System.out.println("\u0061");  // a (U+0061 to U+007A are the letters a to z)
        System.out.println("\u007B");  // {
        System.out.println("\u007E");  // ~
        System.out.println("\u0040");  // @
        System.out.println("\u00A5");  // yen sign
        System.out.println("\u00B5");  // micro sign
    }
}
