Playlist "OpenChaos"



To many, Unicode is best known as a huge repository of characters.
This talk mostly deals with everything else: encoding characters,
comparing them, and telling them apart.
After a short history of Unicode, the talk presents different ways
to encode Unicode characters and discusses their benefits and drawbacks.
It goes on to discuss Unicode Normalization Forms, which are used to compare
strings that consist of different code points but are logically equivalent.
Lastly, the talk discusses how to deal with visually confusable strings, in terms
of both the Unicode Confusable Mappings and certain special characters.
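
As a rough illustration of the three topics above (not taken from the talk itself), the following Python sketch uses only the standard library. The helper names `encoded_lengths` and `equivalent` are made up for this example. It shows how the same text takes a different number of bytes in UTF-8, UTF-16 and UTF-32, how normalization makes two logically equivalent strings compare equal, and that normalization alone does not unify look-alike characters from different scripts.

```python
import unicodedata

# "é" as one precomposed code point vs. "e" followed by a combining acute accent.
composed = "\u00e9"
decomposed = "e\u0301"

# Latin "a" and Cyrillic "а" usually render identically but are distinct characters.
latin_a = "a"
cyrillic_a = "\u0430"


def encoded_lengths(text):
    """Return the byte length of text in three Unicode encoding forms (hypothetical helper)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


def equivalent(a, b, form="NFC"):
    """Compare two strings after normalizing both to the same form (hypothetical helper)."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(encoded_lengths(composed))         # byte counts per encoding
print(composed == decomposed)            # raw comparison of code points
print(equivalent(composed, decomposed))  # comparison after normalization
print(unicodedata.name(cyrillic_a))      # the look-alike has its own name
print(equivalent(latin_a, cyrillic_a, "NFKC"))  # normalization does not merge scripts
```

Detecting look-alikes such as the last pair needs the separate confusables data published by the Unicode Consortium, which the Python standard library does not ship.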