Took me a while to figure it out, but it appears every instance of the letters '...

djtriptych · on May 28, 2019

Almost certainly a botched attempt to fix Unicode code points showing up in text.

For instance, the Euro symbol (U+20AC) is encoded in UTF-8 as the three bytes E2 82 AC. In Python (and other languages?) this can sometimes be misencoded as the 12-char string `\xe2\x82\xac`. The leading '\x' is an escape sequence in Python that introduces a two-digit hex value.
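A minimal Python sketch of how that can happen. It assumes the UTF-8 bytes get read as Latin-1 and then escaped down to ASCII somewhere along the way, which is one plausible route and not necessarily what happened to GQ's text:

```python
# The Euro sign is a single code point but three bytes in UTF-8.
euro = "\u20ac"
utf8_bytes = euro.encode("utf-8")
print(utf8_bytes.hex(" "))  # e2 82 ac

# Assumed failure mode: the bytes are decoded with the wrong codec
# (Latin-1) and the result is then escaped to plain ASCII text.
mangled = utf8_bytes.decode("latin-1").encode("unicode_escape").decode("ascii")
print(mangled)       # \xe2\x82\xac
print(len(mangled))  # 12
```

At that point the backslashes, the letter x, and the hex digits are ordinary characters in the article text, so any later search-and-replace sees them as literal letters.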

Someone who didn't know what they were looking at might try to do a couple of heavy-handed replacements on article text to undo this. '\xe' is commonly seen in bad UTF-8 -> ASCII translations because every three-byte UTF-8 sequence, which covers most typographic punctuation such as curly quotes and dashes, starts with a byte from E0 to EF.

See the specific Euro example here: https://en.wikipedia.org/wiki/UTF-8#Examples

And a popular answer on StackOverflow specifically on '\xe2' in Python output: https://stackoverflow.com/questions/21639275/python-syntaxer...

sleepytimetea · on May 29, 2019

Your observation was so unexpected and unusual that it is actually more interesting than the long-winded article itself. I had noticed the weird half-eaten words but thought it was just a poorly edited website with typos.

The Unicode elimination explanation by another person replying to your comment was also quite interesting to read.

rdiddly · on May 28, 2019

Missed the edit deadline, but here's an update:

I've got the day off today, so I'm on this like a poorly disciplined bloodhound. After searching GQ.com for all 26 two-letter combinations of the form x_, it seems 'xb' and 'xe' have been removed, site-wide (edit: not site-wide), while none of the other combos are affected. My test words are listed below.

So, is this starting to ring a bell for anybody? Do xb and xe have anything in common? Defunct formatting codes? Emacs function keys, I only half-jokingly joked?

examination

Oxbridge

excuse

Disney XD

boxes

exfoliate

foxglove

exhale

exit

Jaguar XJ

Jaguar XK

axle

axman

Oxnard

exoskeleton

expose

exquisite

iPhone XR

coxswain

Final Fantasy XV

Maxwell

XXX

sexy

Olympus XZ-1
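
A quick way to double-check which x_ pairs a word list like this covers, sketched in Python. The words are copied from the list as posted; the variable names are made up for illustration:

```python
import re
from string import ascii_lowercase

# Test words copied from the list above.
test_words = [
    "examination", "Oxbridge", "excuse", "Disney XD", "boxes", "exfoliate",
    "foxglove", "exhale", "exit", "Jaguar XJ", "Jaguar XK", "axle", "axman",
    "Oxnard", "exoskeleton", "expose", "exquisite", "iPhone XR", "coxswain",
    "Final Fantasy XV", "Maxwell", "XXX", "sexy", "Olympus XZ-1",
]

# Collect every letter that directly follows an 'x' (case-insensitive).
covered = set()
for word in test_words:
    covered.update(re.findall(r"x(?=([a-z]))", word.lower()))

missing = sorted(set(ascii_lowercase) - covered)
print("covered:", "".join(sorted(covered)))
print("missing:", "".join(missing))
```

Run against the list as it appears here, this reports no test word for 'xt' or 'xu'.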

jetrink · on May 28, 2019

Wow, great work so far. This is really interesting.

My guess is that they were moving content out of some proprietary early-2000s CMS around 2015. Instead of carefully parsing the storage format and extracting the text, they dumped it and the output was peppered with garbage. To sanitize the output, they simply elided certain character sequences.

Further speculation: 'xb' and 'xe' (for 'beginning' and 'end') were control sequences marking the extent of something in the old CMS format.

Edit: These people would be the ones to ask:

> The Software Engineering team at Condé Nast International (CNI) knew it needed an automated way to migrate the vast quantities of content, and it developed a tool to do just that, recognizing that no off-the-shelf tool could cope with the disparate set of content it was facing, spanning multiple territories, languages and content types. But to meet its hard three-month deadline of migrating the first territory, Germany, CNI also saw the need for additional resources who were experienced in key technologies, including Node.js and React, so it selected NearForm.

https://www.nearform.com/blog/case-study/accelerating-transf...

rdiddly · on May 29, 2019

Ha - I suspect you're looking in the right direction. Especially when they talk about that "hard three-month deadline." What is it with the arbitrary deadlines, people? The deadline happens once, but the mistake hangs around forever.

skykooler · on May 28, 2019

"xe" has occasionally been used as a gender-neutral pronoun. Possibly at some point GQ changed their style guide with regard to it, and some (presumably well-meaning) editor ran a malformed search and replace on it.

zerocrates · on May 28, 2019

This makes some sort of sense but then wouldn't we be seeing "etheyrtion" instead of "ertion"? Even assuming a find/replace properly targeted the word "xe" we don't have any "replace" happening, which doesn't really track.

kbutler · on May 28, 2019

Not every instance:

next, context, exterior, exhaustion, exploded, FedEx

mankyd · on May 28, 2019

ex vs xe.

kbutler · on May 28, 2019

Whoops, yes, transposed when I was trying to determine when it may have been introduced.

archive.org has only "red on" (archive since 2015, though the article is dated September 2003).

Google Books finds a 2016 book with this article containing "red oxen".

And gq.com elsewhere has the word "oxen" in September 2015: https://www.gq.com/gallery/andrew-moore-photography-book-dir....

Earliest instance I found was February 2003 https://www.gq.com/story/michael-paterniti-surfing-2003 "ercise". There were many others, through at least May 2015 https://www.gq.com/story/old-people-are-robbing-more-banks "safe deposit bos"

Failed to post another edit: Missing-xe disease affects more Condé Nast publications than just GQ in that time frame. For example, Teen Vogue talks about "ercise" a lot, including https://www.teenvogue.com/story/simple-ways-to-excercise (note the "exCercise")
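
For what it's worth, the damaged spellings above are what a plain, case-sensitive deletion of the letters 'xe' would produce. A rough Python sketch; the function name and the idea that it was a simple string replacement are assumptions, not anything confirmed about the actual tooling:

```python
def strip_xe(text: str) -> str:
    """Hypothetical cleanup step: delete every literal 'xe'."""
    return text.replace("xe", "")


# Original spellings of the damaged phrases quoted in this comment.
for original in ["exercise", "safe deposit boxes", "red oxen"]:
    print(f"{original!r} -> {strip_xe(original)!r}")

# Words containing 'ex' rather than 'xe' pass through untouched.
print(strip_xe("next context"))
```

This reproduces "ercise", "safe deposit bos", and "red on", and leaves "next" and "context" alone.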

rdiddly · on May 28, 2019

Holy cow. So it has to have been done recently then, I guess, and probably globally? Which might lend some credence to the "pronoun" theory.

I searched for "bos" and turned up quite a few:

https://duckduckgo.com/?q=%22bos%22+site%3Awww.gq.com

Edit: found a new one - remixes/remis:

https://duckduckgo.com/?q=remis+site%3Awww.gq.com

dec0dedab0de · on May 28, 2019

So does "fos" instead of foxes. This is kind of silly.

https://duckduckgo.com/?q=%22fos%22+site%3Awww.gq.com&kp=1&i...