The ASCII (American Standard Code for Information Interchange), pronounced ‘ask-ee’, is strictly a seven-bit code based on the English alphabet. ASCII codes are used to represent alphanumeric data in computers, communications equipment and other related devices. The code was first published as a standard in 1967.

It was subsequently updated and published as ANSI X3.4-1968, then as ANSI X3.4-1977 and finally as ANSI X3.4-1986. Since it is a seven-bit code, it can at the most represent 128 characters.

It currently defines 95 printable characters including 26 upper-case letters (A to Z), 26 lower-case letters (a to z), 10 numerals (0 to 9) and 33 special characters including mathematical symbols, punctuation marks and space character.

In addition, it defines codes for 33 nonprinting, mostly obsolete control characters that affect how text is processed. With the exception of ‘carriage return’ and/or ‘line feed’, all other characters have been rendered obsolete by modern mark-up languages and communication protocols, the shift from text-based devices to graphical devices and the elimination of teleprinters, punch cards and paper tapes.

An eight-bit version of the ASCII code, known as US ASCII-8 or ASCII-8, has also been developed. The eight-bit version can represent a maximum of 256 characters.

Table 2.6 lists the ASCII codes for all 128 characters. When the ASCII code was introduced, many computers dealt with eight-bit groups (or bytes) as the smallest unit of information.

The eighth bit was commonly used as a parity bit for error detection on communication lines and other device-specific functions. Machines that did not use the parity bit typically set the eighth bit to ‘0’.


Looking at the structural features of the code as reflected in Table 2.6, we can see that the digits 0 to 9 are represented with their binary values prefixed with 0011. That is, numerals 0 to 9 are represented by binary sequences from 0011 0000 to 0011 1001 respectively.

Also, lower-case and upper-case letters differ in bit pattern by a single bit. While upper-case letters ‘A’ to ‘O’ are represented by 0100 0001 to 0100 1111, lower-case letters ‘a’ to ‘o’ are represented by 0110 0001 to 0110 1111.

Similarly, while upper-case letters ‘P’ to ‘Z’ are represented by 0101 0000 to 0101 1010, lower-case letters ‘p’ to ‘z’ are represented by 0111 0000 to 0111 1010. With widespread use of computer technology, many variants of the ASCII code have evolved over the years to facilitate the expression of non-English languages that use a Roman-based alphabet.

In some of these variants, all ASCII printable characters are identical to their seven-bit ASCII code representations. For example, the eight-bit standard ISO/IEC 8859 was developed as a true extension of ASCII, leaving the original character mapping intact in the process of inclusion of additional values.

This made possible representation of a broader range of languages. In spite of the standard suffering from incompatibilities and limitations, ISO-8859-1, its variant Windows-1252 and the original seven-bit ASCII continue to be the most common character encodings in use today.

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...