For quite some time (at least months, maybe a year or so), I've been getting the occasional email where what should clearly be a "." in a link has become "..": http://www.dreamwidth..org and the like. I'd assumed it was just a reasonably common typo for a reason I couldn't quite fathom, and never wondered much about it.
Until today, when I had a brief exchange of emails with the people who'd sent one such duff link, and they reported that nobody else had had that problem. So I figured it might be something on my end, and dug a little deeper.
Have a look at this chunk of raw email content (edited to preserve style while removing private data):
2xEuthkrNGNbo&sa=3DD&usg=uehcAIUEQhRCSmhnsT_uAKXIUEhbhtahtdTH" styl=
e=3D"color:inherit;text-decoration:inherit"><br>http://www.indelicates.com/=
NotHere/IAmNotTheFileEither.txt</a></span></p><p style=3D"padding:0px;margi=
n:0px;font-size:11pt;font-family:Arial;height:11pt;direction:ltr"></p><p st=
yle=3D"padding:0px;margin:0px;font-size:11pt;font-family:Arial;direction:lt=
r">Both are zipped folders that contain the tracks, digital artwork and cre=
dits. If you are unsure which version to download then go for the Mp3s.<br>=
<br>This album has a story, which you can read here:<br><span style=3D"colo=
r:rgb(17,85,204);text-decoration:underline"><a href=3D"https://www.google.c=
om/url?q=3Dhttp://www.google.com/url?q%3Dhttp%253A%252F%252Fwww.indelicates=
..com%252Felevator%252F1.html%26amp;sa%3DD%26amp;sntz%3D1%26amp;usg%3DAFQjCN=
H19duvjmV80u8vgfp_dWO16zdz6w&sa=3DD&usg=3DAFQjCNGMpiSpcpd7bOcZlcorQ=
26jkoROUA" style=3D"color:inherit;text-decoration:inherit">http://www.indel=
icates.com/elevator/1.html</a></span></p><p style=3D"padding:0px;margin:0px=
;font-size:11pt;font-family:Arial;direction:ltr"><br>As ever, our PR budget=
 is paid in kind with the kindness of strangers. We believe that these thin=
gs are possible without gatekeepers and central authorities - but we are su=
stained in that belief only by your willingness to go along with them. As s=
uch, anything you can do to help spread the word about this crowdfunding ef=
fort will really, really help us out. Please share the link to:</p><p style=
=3D"padding:0px;margin:0px;font-size:11pt;font-family:Arial;height:11pt;dir=
ection:ltr"></p><p style=3D"padding:0px;margin:0px;font-size:11pt;font-fami=
This is HTML in a single dense block with line continuations marked with "=". As there's no reason not to, the lines are all the same length, a reasonably standard 76 characters. All the lines, that is, except one, which noticeably sticks out. And what's that at the start of the line? A double dot where there ought to be a single one. There are other examples in this email, while all the other dots are untouched.
It looks like somewhere in the various hops emails take on their way to my inbox, something is switching any "." in the first column of a line to "..".
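For anyone wanting to check their own mail for the same symptom, here is a minimal sketch of a scan over a saved raw message. The function name and the sample lines are made up for illustration (the sample borrows two shortened lines from the excerpt above), and a hit is only a candidate: a line can legitimately begin with "..", for instance as part of an ellipsis.

```python
def find_leading_double_dots(raw_message: str) -> list[tuple[int, str]]:
    """Return (1-based line number, line) for every line that starts with '..'."""
    hits = []
    for number, line in enumerate(raw_message.splitlines(), start=1):
        if line.startswith(".."):
            hits.append((number, line))
    return hits


# Illustrative sample only: shortened lines modelled on the excerpt above.
sample = "\r\n".join([
    "q%3Dhttp%253A%252F%252Fwww.indelicates=",
    "..com%252Felevator%252F1.html%26amp;sa%3DD=",
    "icates.com/elevator/1.html</a></span></p>",
])

for number, line in find_leading_double_dots(sample):
    print(f"line {number}: {line}")
```

Running this over the sample flags only the second line, the one where the wrap happened to land just before a dot.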
Now begins the task of working out who runs that server and what the frell they're doing with my email (although I have suspicions as to the likely culprit). But. Wow. Good bug.
no subject
Date: 4 Sep 2015 01:50 pm (UTC)
And why!?! would be pretty prominent in my mind! I can understand processing the header, but the body? Even google-style data-stripmining shouldn't need you to rewrite the contents, you're just making work for yourself and inviting the bugs in.
no subject
Date: 4 Sep 2015 02:18 pm (UTC)
no subject
Date: 4 Sep 2015 07:22 pm (UTC)
no subject
Date: 23 Sep 2015 01:11 pm (UTC)
no subject
Date: 4 Sep 2015 03:00 pm (UTC)
The phrase "dot stuffing" rang a vague bell in my head when I read this, and indeed, something relevant-looking is specified in RFC 5321 §4.5.2. To summarise: when mail is transferred by SMTP, sending a single "." on a line by itself is taken to terminate the message. Therefore, to make it possible to send emails one of whose actual intended lines of content consists of a "." by itself, the message is transformed during transmission to prefix an extra "." to any line starting with ".".
The idea, of course, is that the receiving SMTP agent undoes the transformation and the mail content is restored to its original state. But here, it looks very much as if some receiving SMTP agent along the chain has forgotten to undo the dot-stuffing.
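As a rough illustration of the round trip, here is a minimal sketch of the two halves of the transformation as I read RFC 5321 §4.5.2. The function names and the sample lines are mine, not from any real mail library, and the sketch assumes the message has already been split into lines with the final lone-"." terminator handled elsewhere.

```python
def dot_stuff(lines: list[str]) -> list[str]:
    """Sender side: prefix an extra '.' to every line that starts with '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines: list[str]) -> list[str]:
    """Receiver side: drop the first '.' from every line that starts with '.'."""
    return [line[1:] if line.startswith(".") else line for line in lines]


# Illustrative content: one line starting with a dot, one consisting of a dot.
original = ["www.indelicates=", ".com/elevator/1.html", "."]

on_the_wire = dot_stuff(original)
print(on_the_wire)               # what travels between the two SMTP agents
print(dot_unstuff(on_the_wire))  # what a well-behaved receiver stores
```

A receiver that stores `on_the_wire` directly, skipping `dot_unstuff`, would leave "..com/elevator/1.html" in the delivered message, which matches the symptom in the post.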
no subject
Date: 4 Sep 2015 03:41 pm (UTC)
Ooh, good thought! I had wondered if it might be something like this from my hazy memories of interacting with SMTP servers using a Telnet client, but I'd assumed if there were anything special going on, it would only apply to lines that contained only ".", not any that started with a "." whether or not there were further characters.
no subject
Date: 4 Sep 2015 03:51 pm (UTC)
Certainly if you convert "." lines into "..", then you have to convert ".." lines in turn to something else, and that in turn and so on, otherwise you end up with some line contents which can either represent themselves or be an escaped version of a different line, meaning the unstuffer doesn't know which one to turn them back into.
But yes, they could perfectly well have ruled that dot-stuffing applied to any line consisting of one or more dots and nothing else, rather than to any line starting with a dot even if a non-dot follows it. As long as the policy is consistent between all implementations, either (or many points in between) would work fine. I suppose they must have decided that mandating the very easiest of the options would give the best chance of nobody messing it up, and "dot-stuff any line starting with a dot" is plausibly that.
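To make the ambiguity argument concrete, here is a small sketch comparing three candidate stuffing rules over a handful of sample lines. Everything here is hypothetical illustration: the rule names and the sample set are mine, and passing the check on five sample lines is only a sanity check, not a proof that a rule is reversible in general.

```python
def stuff_lone_dot_only(line: str) -> str:
    """Naive rule: escape only a line that is exactly '.'."""
    return ".." if line == "." else line


def stuff_all_dot_lines(line: str) -> str:
    """Alternative rule: escape any line made of one or more dots and nothing else."""
    return "." + line if line and set(line) == {"."} else line


def stuff_any_leading_dot(line: str) -> str:
    """The RFC 5321 rule: escape any line that starts with a dot."""
    return "." + line if line.startswith(".") else line


SAMPLES = [".", "..", "...", ".com", "text"]


def is_reversible(stuff) -> bool:
    """True if no sample encodes to the terminator '.' and no two samples collide."""
    encoded = [stuff(line) for line in SAMPLES]
    return "." not in encoded and len(set(encoded)) == len(SAMPLES)


for rule in (stuff_lone_dot_only, stuff_all_dot_lines, stuff_any_leading_dot):
    print(rule.__name__, is_reversible(rule))
```

On these samples the lone-dot rule fails, because "." and ".." both come out as "..", while the other two rules keep every line distinct.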
no subject
Date: 6 Sep 2015 06:14 pm (UTC)
This is the email abyss, swallower of sysadmin souls.
no subject
Date: 23 Sep 2015 01:13 pm (UTC)
no subject
Date: 4 Sep 2015 05:20 pm (UTC)
no subject
Date: 5 Sep 2015 07:37 pm (UTC)