In RFC 3492, the IETF defines Punycode as one application of a general encoding algorithm called Bootstring. Bootstring allows strings drawn from a large character set to be represented using only a limited set of characters. The design of the encoding is based on six principles:
- Completeness : With Bootstring, every original string can be represented by a string made up only of the basic characters.
- Uniqueness : The mapping between an original string and its Bootstring encoding is unambiguous. Each original string has exactly one encoded ASCII form, and each encoded form decodes to exactly one original string.
- Reversibility : Bootstring encoding can be undone without losing information.
- Efficiency : The encoded string is only minimally longer than the original string, and in some cases not longer at all.
- Simplicity : Bootstring uses simple encoding and decoding algorithms.
- Readability : Only characters that cannot be represented in the target character set are encoded. All other characters remain unchanged.
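A few of these properties can be checked with a short sketch. It assumes Python's built-in `punycode` codec as a stand-in for the Punycode instance of Bootstring; the sample string `münchen` and the helper name `check_properties` are illustrative choices, not part of RFC 3492.

```python
def check_properties(original):
    """Encode with Python's built-in punycode codec and report a few properties."""
    encoded = original.encode("punycode").decode("ascii")
    # ASCII characters of the original, in their original order.
    basic = "".join(ch for ch in original if ord(ch) < 128)
    return {
        "encoded": encoded,
        # Reversibility: decoding restores the original exactly.
        "reversible": encoded.encode("ascii").decode("punycode") == original,
        # Readability: the ASCII characters appear unchanged at the start.
        "readable": encoded.startswith(basic),
        # Efficiency: number of extra characters the encoding needs.
        "overhead": len(encoded) - len(original),
    }


report = check_properties("münchen")  # arbitrary sample string
print(report)
```

For this sample the encoded form keeps `mnchen` readable at the start and adds only a hyphen and a short suffix.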
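Punycode is the Bootstring configuration specified for the requirements of internationalized domain names (IDNs). It allows Unicode characters to be represented using only the basic characters permitted in domain names so far (the ASCII letters a to z, the digits 0 to 9, and the hyphen).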
The following example illustrates the encoding:
IDN : azulejos-coruña
The IDN azulejos-coruña contains the letter "ñ", which is not among the characters previously allowed in domain names and must therefore be encoded with Punycode to guarantee compatibility.
In the first step, the input string is normalized (for example, all uppercase letters are replaced by lowercase ones).
In the second step, all non-ASCII characters are removed from the string. Their encoded form, which records each removed character and its position, is appended to the remaining ASCII characters, separated from them by a hyphen.
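The two steps can be sketched as follows. This is a simplified illustration, not the full procedure: lowercasing stands in for the normalization that real IDN processing performs, the helper name `split_label` is hypothetical, and the final encoding of the removed characters is delegated to Python's built-in `punycode` codec.

```python
def split_label(label):
    """Return the normalized label, its ASCII part, and the characters to encode."""
    # Step 1 (simplified assumption): normalize by lowercasing.
    normalized = label.lower()
    # Step 2a: keep the ASCII characters in their original order.
    basic = "".join(ch for ch in normalized if ord(ch) < 128)
    # Step 2b: record position and code point of every removed character.
    removed = [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(normalized)
        if ord(ch) >= 128
    ]
    return normalized, basic, removed


normalized, basic, removed = split_label("Azulejos-Coruña")
print(basic)    # azulejos-corua
print(removed)  # [(13, 'U+00F1')]

# Step 2c: the codec appends a hyphen and the encoded form of the removed characters.
encoded = normalized.encode("punycode").decode("ascii")
print(encoded.rpartition("-"))  # ('azulejos-corua', '-', '2nb')
```

The last hyphen in the encoded string is the delimiter: everything before it is the untouched ASCII part, everything after it encodes the removed characters.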
When Internet addresses are encoded with Punycode, each resulting string is preceded by the ACE prefix (ACE is short for ASCII Compatible Encoding):
ACE prefix : xn--
The ACE prefix ensures that ordinary domain names that contain hyphens are not misinterpreted as encoded internationalized domain names.
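A minimal sketch of how the prefix separates the two cases, assuming a hypothetical helper `is_ace_label`. It only tests for the prefix and does not validate that the rest of the label is well-formed Punycode.

```python
ACE_PREFIX = "xn--"


def is_ace_label(label):
    """Return True if the label starts with the ACE prefix (case-insensitive)."""
    return label.lower().startswith(ACE_PREFIX)


# An ordinary ASCII name with hyphens is not treated as an encoded IDN.
print(is_ace_label("azulejos-corua"))          # False
# A label carrying the prefix is recognized as an ACE label.
print(is_ace_label("xn--azulejos-corua-2nb"))  # True
```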
The encoded result for azulejos-coruña is therefore:
ACE : xn--azulejos-corua-2nb
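This result can be reproduced with a short sketch. It assumes Python's built-in `punycode` and `idna` codecs; the helper name `to_ace` is hypothetical and handles a single label only, skipping the normalization and validity checks of full IDN processing.

```python
ACE_PREFIX = "xn--"


def to_ace(label):
    """Encode one label; pure-ASCII labels are returned unchanged."""
    if all(ord(ch) < 128 for ch in label):
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


ace = to_ace("azulejos-coruña")
print(ace)  # xn--azulejos-corua-2nb

# For this label, the built-in "idna" codec yields the same ACE form
# and converts it back to the original name.
assert "azulejos-coruña".encode("idna").decode("ascii") == ace
assert ace.encode("ascii").decode("idna") == "azulejos-coruña"
```

Returning pure-ASCII labels unchanged is a design choice of the sketch: only labels that actually contain non-ASCII characters receive the prefix.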