While working on a minor improvement for @swagspiration today, I found a few interesting facts about the implementation of strings in JS, which allows for emoji, and also a tricky behavior in Chrome (probably a bug).
The regex for a two-character pair where the first character is a head surrogate (and we’ll assume that the next character is a tail surrogate) looks like this, then:
/[\uD800-\uDFFF]./ // This matches emoji
and does not look like any of these:
/[\u1F300-\u1F6FF]/ // can only have four hex characters after \u
/\x01[\xF3-\xF6][\x00-\xFF]/ // \x only works for ASCII-compatible characters, not bytes of a string
/[\xD8-\xDF][\x00-\xFF]./ // Ditto.
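To make the surrogate-pair idea concrete, here is a minimal sketch using U+1F600 (a sample emoji I picked for illustration). It shows the two 16-bit code units behind one emoji and checks the pattern above against it. The stricter pattern at the end is my own variation, not something from the original experiment: it narrows the first class to the head range (D800-DBFF) and requires a tail (DC00-DFFF) after it, since D800-DFFF as written covers both halves.

```javascript
// U+1F600 written as its two UTF-16 code units (head, then tail surrogate).
var grin = "\uD83D\uDE00";

// JS strings count 16-bit code units, so a single emoji has length 2.
var unitCount = grin.length;

// Inspect the two code units as hex.
var head = grin.charCodeAt(0).toString(16); // head (high) surrogate
var tail = grin.charCodeAt(1).toString(16); // tail (low) surrogate

// The pattern from the text: any surrogate followed by any non-newline unit.
var loose = /[\uD800-\uDFFF]./;
var looseMatch = loose.exec("hi " + grin + "!")[0] === grin;

// Assumption (not from the text): a stricter head-then-tail pattern.
var strict = /[\uD800-\uDBFF][\uDC00-\uDFFF]/;
var strictMatch = strict.exec("hi " + grin + "!")[0] === grin;

// The loose pattern also accepts a bare tail surrogate plus one more unit;
// the strict one does not.
var looseOnBareTail = loose.test("\uDE00a");
var strictOnBareTail = strict.test("\uDE00a");
```

For well-formed text both patterns find the same pairs; they only differ once a string already contains a stray surrogate.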
Magical Shrinking Strings (in Chrome)
What happens, then, if you remove only the head surrogate from your string and leave the tail surrogate? Let’s find out. Here’s doing exactly that in the Firefox console (remember that without the ‘g’ modifier to the regex, replace() only replaces the first match, which is the first non-newline character in the string for
That seems reasonable. The tail surrogate is non-printable, so it gets a default Unicode glyph showing its codepoint. Let’s also try this in Chrome.
WHOA! The whole string seems to have disappeared!
In fact, it hasn’t disappeared: all the remaining text is still there, the length property is correct, charCodeAt() returns correct values, and removing the tail surrogate from the beginning causes the characters after it to print correctly again. In Chrome, text in a string stops rendering as soon as a bare tail surrogate is encountered. This is true of the console and also of the DOM. If you set the string above as the text content of a text node, it would show nothing at all.
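The data-level part of this can be checked without depending on how any browser draws the result. This is a minimal sketch, assuming a sample string of my own that starts with U+1F600 (it is not the string from the console session above). It strips the head surrogate, confirms the rest of the string is intact, then strips the leftover tail surrogate as well.

```javascript
// Sample string (an assumption for illustration): one emoji, then plain text.
var original = "\uD83D\uDE00 hello";

// Without the 'g' flag, replace() removes only the first match, and /./
// matches a single code unit, so only the head surrogate is removed.
var broken = original.replace(/./, "");

// The string is one code unit shorter and now starts with a bare tail surrogate.
var brokenLength = broken.length;
var firstUnit = broken.charCodeAt(0).toString(16);

// Everything after the bare surrogate is still present in the data.
var rest = broken.slice(1);

// Removing the bare tail surrogate leaves an ordinary string again.
var repaired = broken.replace(/./, "");
```

Whatever the renderer does with `broken`, the values of `brokenLength`, `firstUnit`, and `rest` show that only the display is affected, not the string itself.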
I’m unsure if this is a bug, but it is an interesting way of possibly hiding content in plain sight.
If you want emoji-safe string functions, I’ve been informed by Beau Gunderson that Lo-Dash has emoji-aware string handling. Beau has also written an emoji-aware module for Node if you don’t need all of Lo-Dash’s functionality.
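For a sense of what "emoji-safe" means in practice, here is a rough hand-rolled sketch. It is not how Lo-Dash or Beau's module work internally; `splitChars` and `reverseChars` are hypothetical helpers I made up for illustration. The idea is just to keep each head/tail pair together when splitting a string. It does not handle longer sequences such as combining marks or joined emoji.

```javascript
// Hypothetical helper: split a string into characters, keeping each
// surrogate pair together as one element.
function splitChars(str) {
  // Try a head+tail pair first; otherwise take any single code unit.
  return str.match(/[\uD800-\uDBFF][\uDC00-\uDFFF]|[\s\S]/g) || [];
}

// Hypothetical helper: reverse a string without tearing pairs apart.
function reverseChars(str) {
  return splitChars(str).reverse().join("");
}

var sample = "a\uD83D\uDE00b";

// Three visible characters, although sample.length reports four code units.
var parts = splitChars(sample);

// Pair-aware reverse keeps the emoji intact.
var safeReversed = reverseChars(sample);

// The naive approach swaps the two surrogates and breaks the emoji.
var naiveReversed = sample.split("").reverse().join("");
```

The naive reverse ends up with the tail surrogate ahead of the head surrogate, which is exactly the kind of stray surrogate discussed above.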