JavaScript charCodeAt(index) function
The `charCodeAt(index)` method in JavaScript returns the Unicode (UTF-16) code of the character at a specified position (index) in a string. The method gives you the numeric representation of the character, which is useful when you need to work with character codes, for example, when processing or comparing characters based on their Unicode values.
Syntax:
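The general form of the call is:

```javascript
str.charCodeAt(index)
```

Here `str` is any string value and `index` is the position to inspect.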
- `index`: The position of the character you want to retrieve the Unicode value for. This is a zero-based index, meaning that the first character is at index `0`, the second at `1`, and so on.
Return Value:
- It returns an integer representing the Unicode code unit of the character at the given index.
- If the `index` is out of range (e.g., negative, or greater than or equal to the string's length), it returns `NaN`.
Example 1: Basic Usage
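A minimal example matching the values described below:

```javascript
const str = 'JavaScript';

// Index 0 holds 'J', index 4 holds 'S'
console.log(str.charCodeAt(0)); // 74
console.log(str.charCodeAt(4)); // 83
```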
In this example:
- `'J'` has the Unicode value 74.
- `'S'` has the Unicode value 83.
Example 2: Handling Out-of-Range Index
If the provided `index` is outside the bounds of the string, `charCodeAt()` returns `NaN`.
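A short sketch of the out-of-range behavior:

```javascript
const str = 'JavaScript';

// Both a too-large and a negative index produce NaN
console.log(str.charCodeAt(50)); // NaN
console.log(str.charCodeAt(-1)); // NaN
```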
Unicode Explanation:
- Unicode is a global character encoding standard that assigns a unique number (code point) to every character, symbol, and punctuation mark across all languages. JavaScript represents these using UTF-16 encoding.
- For example, the character `'A'` has a Unicode code point of 65, and `'a'` has a code point of 97.
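The two code points mentioned above can be checked directly:

```javascript
// Uppercase and lowercase letters occupy different Unicode ranges
console.log('A'.charCodeAt(0)); // 65
console.log('a'.charCodeAt(0)); // 97
```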
Example 3: Using charCodeAt() to Get Character Codes
In this example, `charCodeAt()` is used in a loop to print each character along with its Unicode code value.
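The loop can be sketched as follows (the string `'Hello'` is an illustrative choice):

```javascript
const str = 'Hello';

// Visit each index and print the character with its code unit
for (let i = 0; i < str.length; i++) {
  console.log(`Character: ${str[i]}, Code: ${str.charCodeAt(i)}`);
}
```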
Example 4: Comparing Characters Using charCodeAt()
You can use `charCodeAt()` to compare the Unicode values of characters:
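A minimal comparison sketch:

```javascript
const codeA = 'a'.charCodeAt(0); // 97
const codeB = 'b'.charCodeAt(0); // 98

// Numeric comparison of the code units reflects Unicode order
if (codeA < codeB) {
  console.log("'a' comes before 'b' in Unicode order");
}
```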
Supplement: UTF-16 Surrogate Pairs
JavaScript strings are stored as sequences of 16-bit units (UTF-16). For characters outside the Basic Multilingual Plane (BMP), i.e., code points above 0xFFFF, such as emojis or rare symbols, two 16-bit code units (a surrogate pair) are used. `charCodeAt()` will only return one 16-bit unit in these cases.
For example:
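The emoji below is an illustrative choice; it sits outside the BMP, so the string holds two code units:

```javascript
const emoji = '😀'; // U+1F600, stored as the surrogate pair \uD83D\uDE00

console.log(emoji.length);        // 2 (two UTF-16 code units)
console.log(emoji.charCodeAt(0)); // 55357 (0xD83D, high surrogate)
console.log(emoji.charCodeAt(1)); // 56832 (0xDE00, low surrogate)
```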
If you need the full Unicode code point for characters that use surrogate pairs, you should use the `codePointAt()` method introduced in ES6, which handles such characters correctly.
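A side-by-side sketch of the two methods on the same non-BMP character:

```javascript
const emoji = '😀';

// codePointAt() combines the surrogate pair into the full code point
console.log(emoji.codePointAt(0)); // 128512 (0x1F600)

// charCodeAt() sees only the first 16-bit unit
console.log(emoji.charCodeAt(0));  // 55357 (0xD83D)
```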
Summary:
- `charCodeAt()` returns the Unicode value of the character at the specified index.
- It returns `NaN` for invalid indices.
- It provides the UTF-16 value of the character, which is essential for understanding and working with different character encodings in JavaScript.
- Use `codePointAt()` for handling characters outside the BMP (e.g., emojis).