
It should be noted that Windows can use either forward or backward slashes nowadays. In fact, even File Explorer allows you to use an unholy mix of the two.


Interestingly, "unholy mixes" of slashes only work reliably in old-style Windows paths! I found this out when debugging an open source project where '/file.txt' (hardcoded with a forward slash) was appended to a "\\?\" path.

Microsoft introduced "extended paths" to accommodate Unicode and paths longer than 260 characters (i.e. "C:\<256-char string>"). These can be prefixed by "\\?\", which sadly does not support slash mixes. Rust's standard library `canonicalize` function, for example, returns "\\?\" paths!

(* EDIT: I originally wrote some historical inaccuracies, see ynik's reply below for the corrections. Thank you ynik!)

From the docs: "File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix." - https://docs.microsoft.com/en-us/windows/desktop/FileIO/nami...
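A minimal Python sketch of the workaround this implies: since Win32 will not translate "/" for you once the "\\?\" prefix is present, normalise the separators yourself before appending. The helper name `append_component` is made up for illustration, and it only manipulates strings; it never touches the filesystem.

```python
EXTENDED_PREFIX = "\\\\?\\"  # the four literal characters \\?\


def append_component(base: str, child: str) -> str:
    """Append `child` to the Windows path `base` (hypothetical helper)."""
    if base.startswith(EXTENDED_PREFIX):
        # Extended-length paths skip Win32 normalisation, so rewrite
        # every forward slash in the appended part to a backslash.
        child = child.replace("/", "\\")
    # Trim separators at the join point so we never emit a doubled one.
    return base.rstrip("\\/") + "\\" + child.lstrip("\\/")


print(append_component("\\\\?\\C:\\proj", "/file.txt"))  # \\?\C:\proj\file.txt
```

For an old-style base such as "C:\proj" the appended part is left alone, since a mix of slashes is tolerated there.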


The "\\?\" extended-length paths have existed for a long time; I think since Windows 2000 at least. The Windows kernel and NTFS have always supported paths up to 32k characters; only the Win32 path normalization logic was limited to MAX_PATH. The "\\?\" prefix skips the normalization logic, thus allowing apps to use the full path length.

The Windows 10 change in 2016 is something different: if an application opts in (via manifest) AND the LongPathsEnabled registry key is set, long paths will be supported by some (not all) Windows API functions even without the "\\?\" prefix. These new long paths support a mixture of slashes just fine.


Thanks for pointing these out, very interesting! I amended the parent so others aren't misinformed (with credit to you). I discovered this because the Rust standard library's `canonicalize` function actually returns paths in extended-length path syntax. To Rust's credit, its docs actually point out that anything you append to such a path must be backslash-delimited: https://doc.rust-lang.org/std/fs/fn.canonicalize.html


To be extra pedantic, it's 32k code units, so some characters take up two.


Pretty sure all of Microsoft's own stuff is using UTF-16, so all characters take two bytes, regardless.


Indeed. Although technically it's UCS-2 as the low level stuff doesn't understand modern unicode; it just treats it all as a stream of wide chars. It's up to applications to interpret it.

According to this[0] the path limit is nearer 32739*2 bytes as a drive like "C:" gets expanded to "\Device\HarddiskVolume1".

[0] https://stackoverflow.com/questions/15262110/what-happens-in...


It's 64 kilobytes, more or less. Some characters are 2 bytes and some are 4. Also you can use invalid unicode.

And when first implemented it was UCS-2.


Only characters from the Basic Multilingual Plane. If you put an emoji in a path, for example (allowed), it will be 4 bytes.


Except if it's a compound emoji, which takes multiple code points -> more than 4 bytes. So really the takeaway is that "character" is highly ambiguous nowadays and shouldn't be used anymore. Use one of these instead:

* UTF-n code unit
* Unicode code point
* Grapheme Cluster
* Extended Grapheme Cluster

For the last one, you also need to specify the version of Unicode you're talking about, since every new compound emoji changes the definition of "Extended Grapheme Cluster".
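To make the difference concrete, here is a small Python sketch that counts the same string three ways. The function name `measure` is just for illustration; grapheme clusters are left out because they depend on Unicode segmentation rules that this sketch does not implement.

```python
def measure(text: str) -> dict:
    """Count one string as code points, UTF-16 code units, and UTF-16 bytes."""
    utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark
    return {
        "code_points": len(text),            # Python's len() counts code points
        "utf16_code_units": len(utf16) // 2,  # each code unit is 2 bytes
        "utf16_bytes": len(utf16),
    }


# man + ZWJ + woman + ZWJ + girl: one visible emoji, five code points
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

print(measure("a"))           # 1 code point, 1 code unit, 2 bytes
print(measure("\U0001F600"))  # 1 code point, 2 code units, 4 bytes
print(measure(family))        # 5 code points, 8 code units, 16 bytes
```

The 32k path limit discussed above is counted in the middle number, UTF-16 code units.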


This is some incredible level of compatibility. Kudos to MS for its stance on that subject.


As I recall, so could / would DOS - at the Int21 syscall level.

The SWITCHAR option set a value which COMMAND.COM (and other DOS utils) consulted, so that one could make them honour '-' as the flag leader and then accept '/' as a path separator.

Unfortunately not all third-party utils honoured that, so it usually ended up not being worth flipping it into Unix-compatible mode.

That said, one could take advantage of the recognition of '/' in programs one wrote, e.g. C code (so avoiding "directory\\file.txt").


INT 21h, function 37h could be used to get and set the switch character for MS-DOS, in addition to requiring the use of '/DEV/' for device filenames:

    * AL=0, return switch character in DL
    * AL=1, set switch character to character in DL
    * AL=2, return '/DEV/' setting in DL
    * AL=3, DL=0 if '/DEV/' required, DL<>0 if not.

Okay, why do I remember this stuff? I haven't programmed MS-DOS for over 25 years!


Every Microsoft operating system going back to early versions of MS-DOS can use / as the separator.

This is not necessarily true of the "user space" of Microsoft systems. Paths are not only traversed by the "kernel", but also manipulated by applications like the Explorer shell, which may or may not like forward slashes.


It's partially the opposite. Some system calls only accept backslash, but user space programs and libraries may abstract this. E.g. LoadLibrary, which is similar to execve/dlopen.


As I recall at least some parts of the WinAPI had trouble if you used long paths (via "\\?\") and forward slash. I don't know if that's still true.


Indeed. "Windows API" is very broad; it encompasses all sorts of user-space middleware, some of which could very plausibly munge paths in ignorance of forward slashes.


I accidentally typed a mix once and noticed it when I hit enter. I honestly expected a blue screen... and I probably would have accepted that as the proper outcome of crossing the streams.

But it worked. I'm still not sure how I feel about it working...


Now mount your NTFS volume on Linux and create a directory named "foo\bar", then a directory "foo" and a directory "bar" inside "foo".


What happens?


Bill Torvalds materialises and unlocks the Linus Gates.


I think "foo\bar" should resolve to a directory named "foar", for "foo<backspace>ar". ;)


Not sure. I'd expect the "foo\bar" created on that NTFS volume while mounted from Linux to be effectively unreachable when accessed from Windows (at least from the command line?). Or not. Surprise me.


You can create file names containing special characters from ntfs-3g that make it nearly impossible to access or remove from within Windows (such as those containing a "?" character). I'm sure there's a way, but neither the command prompt nor Windows Explorer know what to do. Opening such files from a variety of test applications seems to fail as well.
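As a rough illustration of which names fall into this trap, here is a Python sketch that flags characters the Win32 naming rules reject: the reserved set < > : " / \ | ? * plus control characters below 32. The function name is made up, and the check is per-character only; it ignores other rules such as reserved device names (CON, NUL, ...) and trailing dots or spaces.

```python
# Characters reserved by the Win32 file naming conventions.
WIN32_RESERVED = set('<>:"/\\|?*')


def win32_unfriendly_chars(name: str) -> list:
    """Return the characters in `name` that Win32 would refuse in a file name."""
    return [ch for ch in name if ch in WIN32_RESERVED or ord(ch) < 32]


print(win32_unfriendly_chars("what?.txt"))   # ['?']
print(win32_unfriendly_chars("report.txt"))  # []
```

A POSIX-side tool like ntfs-3g only has to avoid "/" and NUL, which is why it can happily create names that return a non-empty list here.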

According to the ntfs-3g FAQ, this is expected behavior[1] and is a consequence of NTFS' namespace implementation. Allegedly, such files could be removed via a Samba client, but I've never tried.

I confess that outside mounting the disk under Linux, I know of no first party solution to remove or rename such files; but, I'm also not especially knowledgeable on Windows internals.

[1] https://web.archive.org/web/20090207135653/http://ntfs-3g.or...


There used to be a bug in the GatorBox Mac Localtalk-to-Ethernet NFS bridge that could somehow trick Unix into putting slashes into file names via NFS, which appeared to work fine, but then down the line Unix "restore" would totally shit itself.

That was because Macs at the time (1991 or so) allowed you to use slashes (and spaces of course, but not colons, which it used as a path separator), and of course those silly Mac people, being touchy feely humans instead of hard core nerds, would dare to name files with dates like "My Spreadsheet 01/02/1991".

https://en.wikipedia.org/wiki/GatorBox

Unix-Haters Handbook

https://archive.org/stream/TheUnixHatersHandbook/ugh_djvu.tx...

Don't Touch That Slash!

UFS allows any character in a filename except for the slash (/) and the ASCII NUL character. (Some versions of Unix allow ASCII characters with the high-bit, bit 8, set. Others don't.)

This feature is great — especially in versions of Unix based on Berkeley's Fast File System, which allows filenames longer than 14 characters. It means that you are free to construct informative, easy-to-understand filenames like these:

1992 Sales Report

Personnel File: Verne, Jules

rt005mfkbgkw0.cp

Unfortunately, the rest of Unix isn't as tolerant. Of the filenames shown above, only rt005mfkbgkw0.cp will work with the majority of Unix utilities (which generally can't tolerate spaces in filenames).

However, don't fret: Unix will let you construct filenames that have control characters or graphics symbols in them. (Some versions will even let you build files that have no name at all.) This can be a great security feature — especially if you have control keys on your keyboard that other people don't have on theirs. That's right: you can literally create files with names that other people can't access. It sort of makes up for the lack of serious security access controls in the rest of Unix.

Recall that Unix does place one hard-and-fast restriction on filenames: they may never, ever contain the magic slash character (/), since the Unix kernel uses the slash to denote subdirectories. To enforce this requirement, the Unix kernel simply will never let you create a filename that has a slash in it. (However, you can have a filename with the 0200 bit set, which does list on some versions of Unix as a slash character.)

Never? Well, hardly ever.

    Date: Mon, 8 Jan 90 18:41:57 PST 
    From: sun!wrs!yuba!steve@decwrl.dec.com (Steve Sekiguchi) 
    Subject: Info-Mac Digest V8 #35

    I've got a rather difficult problem here. We've got a Gator Box
    running the NFS/AFP conversion. We use this to hook up Macs and
    Suns. With the Sun as a AppleShare File server. All of this works
    great!

    Now here is the problem, Macs are allowed to create files on the
    Sun/Unix fileserver with a "/" in the filename. This is great until
    you try to restore one of these files from your "dump" tapes,
    "restore" core dumps when it runs into a file with a "/" in the
    filename. As far as I can tell the "dump" tape is fine.

    Does anyone have a suggestion for getting the files off the backup 
    tape? 

    Thanks in Advance, 

    Steven Sekiguchi Wind River Systems 

    sun!wrs!steve, steve@wrs.com Emeryville CA, 94608

Apparently Sun's circa 1990 NFS server (which runs inside the kernel) assumed that an NFS client would never, ever send a filename that had a slash inside it and thus didn't bother to check for the illegal character. We're surprised that the files got written to the dump tape at all. (Then again, perhaps they didn't. There's really no way to tell for sure, is there now?)


It is possible to create garbage filenames with cygwin.


I'd forgotten about Cygwin! Next time I think about it, that might be capable of removing said files as well (although, I guess that qualifies as a 3rd party tool).

I have it installed on my Windows machine. I ought to give it a try.


Works without issue. Just went to the Store, downloaded Ubuntu 18.04 LTS and tried it.

Screenshot: https://1drv.ms/u/s!AqSptDJA-vctqs4OGewT6Qa-PWBvbw


I think the intention was that 'foo/bar' and 'foo' would be in the same directory, therefore when typing C:\foo/bar into the file explorer you would have an ambiguous path. Perhaps I misunderstood.


You can't access or modify "foo\bar", because it always resolves to "foo" \ "bar".
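You can see the two interpretations side by side without touching a real volume. Python ships both path flavours (`posixpath` and `ntpath`) on every platform, so this sketch just asks each one to split the same seven-character name. The wrapper `split_both` is only there for illustration.

```python
import ntpath
import posixpath


def split_both(name: str) -> tuple:
    """Split `name` into (head, tail) under POSIX rules and under Windows rules."""
    return posixpath.split(name), ntpath.split(name)


posix_view, windows_view = split_both("foo\\bar")
print(posix_view)    # ('', 'foo\\bar')  -> one file name containing a backslash
print(windows_view)  # ('foo', 'bar')    -> directory "foo", entry "bar"
```

Under POSIX rules the backslash is an ordinary character, so the whole string is one name; under Windows rules it is always a separator, so the single-name reading is never reachable.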


No tab completion in shells for forward slashes though :(


I think tab completion for forward slashes works in PowerShell.


Yes I'd noticed this once or twice and momentarily wondered why there wasn't a problem. I hadn't realised it was now officially ok


It's useful to distinguish between command line and GUI apps vs. the operating system itself.

Forward slashes and backslashes have been supported as path delimiters all the way back to MS-DOS 2.0 when hierarchical directories were first introduced. The DOS and Windows file APIs never cared which you used.

It was only command line and GUI apps that (rightly or wrongly) preferred one or the other. The only new thing here is that some of these apps have finally started to recognize either delimiter.

That is a tricky thing for a command line app where the wrong slash may be interpreted in a way the user did not mean. For GUI programs there was never a good reason to not accept both, e.g. in a file path field.


I accidentally found out something mind blowing about Mac OS X recently:

The Finder apparently lets you use "/" in file names!

I expect you won't believe me, so try it.

Now look at the file in the shell with "ls".

Go figure!

Now who's unholy???


Classic Mac OS allowed filenames to contain every character except the directory-delimiter, ":".

When Mac OS X came around, it included a POSIX API and kernel (so applications expect to be able to use ":" in filenames), but used the same filesystem as Classic Mac OS which didn't allow ":" but did allow the POSIX directory separator "/", which POSIX apps wouldn't try to use in a filename. And so at some layer they were switched around - if you try to create a filename with "/" in the GUI, it's presented as ":" to POSIX; if a POSIX app creates a filename with ":" it shows up as "/" in the GUI.
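The swap described here can be sketched in a few lines of Python. This is only a model of the mapping as described in this comment, not Apple's actual code, and the function name is invented.

```python
# Swap "/" and ":" in both directions with a single translation table.
_SWAP = str.maketrans({"/": ":", ":": "/"})


def swap_separators(name: str) -> str:
    """Translate a file name between the GUI view and the POSIX view."""
    return name.translate(_SWAP)


print(swap_separators("My Spreadsheet 01/02/1991"))  # My Spreadsheet 01:02:1991
print(swap_separators("notes: draft"))               # notes/ draft
```

Because the table is symmetric, applying it twice gives back the original name, which is what lets the same file look consistent from either side.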


It should also be noted that you should NOT get in the habit of doing this, because the command prompt is not really amused by it. (Try running foo/bar.exe.)


Try it again with PowerShell...


I confess, I use them interchangeably and don’t know which one is “correct”


Just as long as you don't say backslash when saying a URL... that is one of the things that really brings out the pedant in me...



