A Tale of Two File Names: deconstructing the checksum in Windows 8.3 file names

WalterGR · on June 10, 2015

    There are so many layers of abstraction in the Windows API
    that it’s a miracle that anyone could maintain it - which probably
    explains the increasing level of bloat in new versions of Windows.

Are there fewer layers of abstraction in other popular OSEs?

acdha · on June 10, 2015

Yes – it's a product of the way Windows was actually two operating systems, the Windows 3/95/98/ME lineage and the newer OS/2 / NT lineage.

The NT side had the advantage of being designed at all rather than growing organically by overworked developers hacking in whatever they needed right now, and had a number of assumptions (e.g. not starting with the 16-bit API real-mode model) which avoided some gnarly hacks.

The problem was compatibility: most of the apps had been developed on 16-bit Windows 3 or, later, Win95 and at the time Microsoft's dominance was far from a given so they were pathologically afraid of breaking compatibility, which meant that the Windows “platform” included a lot of weird semi-or-undocumented corners designed to avoid breaking specific apps and the NT side had to reimplement bugwards-compatible versions of most of them to avoid breaking shipped apps.

(This might seem excessive – and I would generally agree – but it's important to remember that much of the damage was done in the pre-internet era when shipping updates to software meant putting a box in the main with a pile of floppies or, for the really rich people, CDs. Getting someone to upgrade to a version of an app which didn't rely on an implementation quirk could take many years.)

Raymond Chen has written about this extensively at http://blogs.msdn.com/b/oldnewthing/ and one of my favorite examples is the Shell Folders registry key:

http://blogs.msdn.com/b/oldnewthing/archive/2003/11/03/55532...

The closest you come to this on another mainstream OS is OS X, where they maintained Carbon (i.e. the supported subset of the classic Mac OS APIs) on top of the modern core but that was both more limited and was rapidly deprecated because Apple is far less concerned with breaking backwards compatibility than Microsoft used to be.

coldpie · on June 10, 2015

> growing organically by overworked developers hacking in whatever they needed right now

One of my personal favorite examples of this is the GetRandomRgn function in Windows's windowing API.

https://msdn.microsoft.com/en-us/library/windows/desktop/dd1...

http://blogs.msdn.com/b/oldnewthing/archive/2012/06/18/10321...

pjc50 · on June 10, 2015

Given that Microsoft are going to "continuous deployment" with Win10, I think they ought to have taken the opportunity to do one last breaking change, possibly behind an install-time option or compatibility layer: ditch the CP/M legacy.

That is, change the path separator to '/', ditch the drive letters, and the magic file names like CON, and the 8.3 names.

It sounds like a huge breaking change but code that is already using the shell path functions https://msdn.microsoft.com/en-us/library/windows/desktop/bb7... would work transparently.

WalterGR · on June 10, 2015

    change the path separator to '/'

I believe that the Windows API supports '/' as a path separator and that it's user mode apps that tend not to. I can't find a good citation for that, though.

pjc50 · on June 10, 2015

The DOS command line uses '/' to deliniate arguments. So I've just realised that that would have to change, a more serious problem.

WalterGR · on June 10, 2015

Support for / as a path separator exists for some built-in commands, but it's very spotty. For example:

    E:>cd ./work/depot/code
    E:\Work\depot\code>cd /
    The syntax of the command is incorrect.
    E:\Work\depot\code>cd e:/
    E:>dir /
    Invalid switch - "".
    E:>dir ./
    Invalid switch - "".
    E:>dir e:/
    Invalid switch - "".

ynik · on June 10, 2015

Most built-in commands will accept / if you put the path in quotes:

    c:\>dir "d:/"
    Volume in drive D is HDD
    ...

This is quite useful when writing cross-platform makefiles :)

WalterGR · on June 10, 2015

I can't believe I didn't know that. Thanks.

Real bummer that you have to specify the drive letter. (At least on Win7, where I just tried it.)

The behavior is kind of interesting for 'dir' at least:

    E:>dir "/"

     Directory of \\

    File Not Found

    E:>dir  \\

    The filename, directory name, or volume label syntax is incorrect.

Same result for «dir "//"».

(For those not experienced with the Windows command prompt, seeing "Directory of \\" is unexpected, and seeing it means that something interesting must be going on.)

stephengillie · on June 10, 2015

I vaguely remember something about MS coders already using / to delineate arguments and \ for file paths; when they realized UNIX used \ for that function and they'd have to interoperate over a network, it was already too late.

WalterGR · on June 10, 2015

I don't think it was about networking, necessarily. Here are a couple references:

http://superuser.com/questions/176388/why-does-windows-use-b...

    MS-DOS 1.0 (which did not support directories at all) was already
    using / to introduce command-line options.  It took this usage
    of / from CP/M, which took it from VMS.

http://blogs.msdn.com/b/larryosterman/archive/2005/06/24/432... goes into more detail.

acdha · on June 10, 2015

This stuff is a thicket of special cases:

“File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix as detailed in the following sections.”

https://msdn.microsoft.com/en-us/library/windows/desktop/aa3...

WalterGR · on June 10, 2015

    This stuff is a thicket of special cases:

Is it? That's the only special case I find on that page. Are there specific API functions that have other special cases?

acdha · on June 10, 2015

I was thinking of the special device names, unicode and length handling depending on whether you use \\?\ – that's the best way to write safe code except when it isn't:

“Many but not all file I/O APIs support "\\?\"; you should look at the reference topic for each API to be sure.”

acdha · on June 10, 2015

I'd expect that this will happen for Metro apps first – those are all modern and since most of them are supposed to run on e.g. ARM anyway legacy code compatibility matters considerably less.

I'd be surprised if they changed the path separator but the device filenames and FAT-compatible short names seem worth doing for security alone.

WalterGR · on June 10, 2015

Thanks. I think that some backwards compatibility can be handled by more code and not necessarily more layers of abstraction, but I get your point.

It seems like there are lots of layers in Linux distributions because of the Unix philosophy. For example, to put a window on a screen, aren't the layers something like this:

App -> window manager -> desktop environment -> X -> graphics driver (-> hardware)

Is that a good approximation? What does that look like on e.g. Windows and Mac OS?

Is the situation on Linux similar for audio? From my armchair, my impression is that audio on Linux is/was... complicated.

KMag · on June 10, 2015

> App -> window manager -> desktop environment -> X -> graphics driver (-> hardware)

No, the window manager isn't in that chain. You can kill the window manager and your applications can keep running and drawing to the screen, but you can't move or resize windows. Also, instead of "Desktop environment" you mean GUI widget toolkit.

WalterGR · on June 10, 2015

    Also, instead of "Desktop environment" you mean GUI widget toolkit.

By desktop environment I meant e.g. Gnome or KDE. Aren't they separate from the widget toolkits?

Dylan16807 · on June 10, 2015

Gnome or KDE is just a package including a window manager and a bunch of mostly-standalone programs that give you a toolbar, file browser, settings panels, etc. It's not a layer that anything goes through. You don't even have to use the window manager it comes with.

KMag · on June 10, 2015

> By desktop environment I meant e.g. Gnome or KDE. Aren't they separate from the widget toolkits?

I was being charitable. The desktop environment doesn't belong in the graph you drew. The GUI widgets belong in the place where you wrote "desktop environment". The charitable assumption is that you reasonably conflated the GUI widget toolkits for desktop environments, since the very popular KDE and Gnome are strongly tied to Qt and gtk, respectively.

wldcordeiro · on June 10, 2015

I believe so. I think Gnome has GTK and KDE uses Qt for their toolkits.

nitrogen · on June 10, 2015

Audio with ALSA was very simple, but didn't support hot-plugging. So rather than improve ALSA, PulseAudio was created. When you add legacy OSS to the mix, which was deprecated probably 10+years ago, yeah, it's complicated. It didn't have to be, though, if the good parts of PulseAudio were implemented as ALSA modules instead.

Someone · on June 10, 2015

I think applications can make X calls themselves

There definitely is a network somewhere in between (possibly almost optimized away in many cases)

voidlogic · on June 10, 2015

>Are there fewer layers of abstraction in other popular OSEs?

This is application servers, not OSs, and its old, but is a pretty famous example:

https://ma.ttias.be/system-calls-in-apache-linux-vs-iis-wind...

WalterGR · on June 10, 2015

Oh, that's right. I remember that. Good example - thanks!

jstanley · on June 10, 2015

Really interesting article, sounds like a fun adventure. Couple of points:

1.) "0x12b9b0a5. This equals 314159269 in decimal. Yep, that’s the first 8 digits of pi right there" -- actually, it's not. The first 8 digits of pi are 3.14159265, it's not clear why the Microsoft developers ended with a 9 instead of a 5, perhaps a mistake? perhaps to help with coprime-ness?

2.) The "ANSI C" version uses mbstowcs which isn't any kind of ANSI C function I've ever heard of.

Fantastic article, really enjoyed reading, many thanks.

Quackmatic · on June 10, 2015

> The first 8 digits of pi are 3.14159265

Those are the first 9 digits if you count 3 - the 9 at the end is the 9th digit, not the 8th, so the first 8 digits are correct. You're correct about the ANSI C part though, it's actually C99 - I've corrected that part. Thanks.

jstanley · on June 10, 2015

pi begins: 3.141592653589 - I miscounted 8 instead of 9, but the number is still wrong.

EDIT: Oh, I see what you mean. He's saying the first 8 digits of that are the first 8 digits of pi, although the 9th is incorrect. Got it.

to3m · on June 10, 2015

It claims to be "standard C", which usually these days means ISO C rather than ANSI C. For details about mbstowcs, see the C99 standard, section 7.20.8.1.

poizan42 · on June 10, 2015

> I’d need to write a device driver to call it, which is both something I’ve never touched at all before, and something that’s nigh impossible without access to the DDK or NDK.

Well the DDK is called WDK now, but anyways how does one not have access to it? It's right here: https://msdn.microsoft.com/en-us/windows/hardware/hh852365

rplnt · on June 10, 2015

> and I certainly wouldn’t be able to get access to the NT source code.

I thought that Windows 2000 source leaked. Would seem like an easier path. Though certainly not better (for either the author or a reader).

userbinator · on June 10, 2015

What was leaked was most of NT4, and only part of 2k. The extra checksum was introduced in 2k, but I don't think the leaked source has that part.

ape4 · on June 10, 2015

Can this checksum routine be contributed to ReactOS. Or is it not allowed since he disassembled Windows.

RandomBK · on June 10, 2015

The routine itself would be copyrighted, but someone should be able to make a clean room reimplementation of the original algorithm, assuming its not protected by patents. (IANAL)

WalterGR · on June 10, 2015

    make a clean room reimplementation

Which, as far as I understand it, could be someone posting a human-language description of the algorithm here (based on the blog post,) then ReactOS writing code based on that description. (IANAL.)

ape4 · on June 10, 2015

Including the crazy constants.

tryp · on June 10, 2015

A few years ago I implemented the storage system for a special-purpose diagnostic camera. The specification defined (very long) filenames for the saved images using a timestamp and some other data. I used a mostly off-the-shelf microcontroller/NAND/USB mass storage reference implementation, hooked it up a side-channel to the FPGA, and had everything working pretty nicely. Until the test harness that just continuously commanded pictures to be taken reached 105 iterations. After that, the camera timed out waiting on the storage subsystem to store the image.

The problem turned out to be the code that found the 8.3 filename: it did the longfi~1.bin, checked to see if that file existed and if so, incremented to longfi~2.bin, then checked that... but never did the checksum trick described here, just kept iterating. (bear in mind this was a tiny 8-bit microcontroller that didn't have the RAM to just read all the directory entries at once and keep them around for comparison) Finding the proper 8.3 filename this way took longer than the timeout period after 104 collisions.

Of course, we only cared about the long filename and never saw the 8.3 filename, so my fix was simply to use an appropriate hash of the long filename to ensure a good probability of uniqueness.

TheLoneWolfling · on June 10, 2015

Ah yes, the wonders of accidentally quadratic functions.

userbinator · on June 10, 2015

The question that immediately comes to mind is what happens if the checksum collides?

Quackmatic · on June 10, 2015

If the checksums collide, then the number after the tilde is incremented again, (eg. SOBC84~2.ASP). This time, it won't stop at ~4, so you can go up to ~10 and beyond. The file name will be shortened accordingly to fit the number in (eg. SOBC8~10.ASP). This was tested on Windows 7 x64.

netsec_burn · on June 10, 2015

And now the last question, how far can you take that logic? What happens at 1,000,000 (one million file names generated) when there are no more characters left to remove from the left side?

Segmentation fault?

albinoloverats · on June 10, 2015

Can FAT support that many files in a single directory?

userbinator · on June 10, 2015

A directory is stored as a linked list of clusters like regular files, and the 32-bit filesize field is irrelevant for a directory, so it theoretically could be as big as the whole volume - just keep adding clusters to the chain.

Filesystem drivers may give up long before then, and access to such a huge directory would be very slow, but there's nothing in the filesystem structures itself that would prevent it. I've written FAT code for an embedded device that I can confidently say would have no problems with large directories, since it'll just keep following the cluster chain to the end.

netsec_burn · on June 10, 2015

Actually, it doesn't look like it. It might need to be a race condition done with a program (delete the old files to keep it going to a million). Might be tricky.

Quackmatic · on June 10, 2015

I've tried it. It works (this is on NTFS) and the end result is a little anticlimactic: https://usn.pw/blog/gen/2015/06/09/filenames/#follow-on-coll...

Quackmatic · on June 10, 2015

That's worth trying. Inducing a bug in kernel code could be interesting.