Synopsis
IRIs (RFC 3987) are essentially the Unicode superset of ASCII-only URIs; an IRI is mapped to a URI by percent-encoding the UTF-8 bytes of its non-ASCII characters.
The least you should know:
- RFC 3987 §3 (IRI/URI conversion)
- HTML4 B.2.1: Non-ASCII characters in URI attribute values
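As a rough illustration of the conversion both references describe, the sketch below percent-encodes the UTF-8 bytes of every non-ASCII character and leaves ASCII untouched. The function name `iri_to_uri` is hypothetical, and the sketch assumes the input is already a Unicode string in the desired normalization form; it does not apply IDNA to the host, which RFC 3987 allows as an alternative for the authority component.

```python
def iri_to_uri(iri: str) -> str:
    """Map an IRI to a URI by percent-encoding non-ASCII characters.

    Simplified sketch: ASCII characters (including existing '%' escapes)
    are passed through unchanged; every other character is encoded as
    UTF-8 and each resulting byte is written as %HH.
    """
    parts = []
    for ch in iri:
        if ord(ch) < 128:
            parts.append(ch)  # ASCII stays as-is
        else:
            # One %HH triplet per UTF-8 byte of the character
            parts.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(parts)


print(iri_to_uri("http://example.org/r\u00e9sum\u00e9"))
```

Because ASCII is passed through verbatim, running the function on an existing URI returns it unchanged, so applying it twice has no further effect.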
Browser support
IRIs and IDNs: Testing, Implementations, and Specification Evolvement (2007)
Controlling options:
- Firefox: Bug 124042 (2002), network.standard-url.{encode-utf8,escape-utf8}
- Internet Explorer: Tools → Internet Options → Advanced → Browsing → Always send URLs as UTF-8
- Opera: Tools → Preferences → Advanced → Network → Encode international Web addresses with UTF-8
Subtleties
- Normalization, comparison, bidirectional text
- Internationalized Domain Names (RFC 3490)
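Two of these subtleties can be sketched with the Python standard library alone. The helper names `normalize_for_comparison` and `host_to_ascii` are hypothetical; the choice of NFC before comparison is an assumption about application policy, and the built-in `idna` codec is used because it implements the RFC 3490 flavour of IDNA referenced above. Bidirectional text is not covered here.

```python
import unicodedata


def normalize_for_comparison(iri: str) -> str:
    # Assumption: the application compares IRIs in NFC form.
    return unicodedata.normalize("NFC", iri)


def host_to_ascii(host: str) -> str:
    # Python's built-in "idna" codec applies ToASCII label by label.
    return host.encode("idna").decode("ascii")


# The same visible path, once precomposed and once with a combining accent.
composed = "http://example.org/caf\u00e9"
decomposed = "http://example.org/cafe\u0301"

print(composed == decomposed)
print(normalize_for_comparison(composed) == normalize_for_comparison(decomposed))
print(host_to_ascii("b\u00fccher.example"))
```

Without normalization the two strings compare unequal and would also percent-encode to different URIs, which is why the normalization step matters before any IRI-to-URI conversion.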
Implementation
See #2409.
