Normalisation is expressly done with the composition of version 3.1 for compatib...

GlitchMr · on Dec 31, 2021

The `filename-sanitizer` library you have linked has the following comment.

                # FIXME: improve HFS+ handling, because it does not use the standard NFD. It's
                # close, but it's not exactly the same thing.
                'hfs+': (255, 'characters', 'utf-16', 'NFD'),

I wonder what that means...

lilyball · on Jan 1, 2022

The technote linked by the parent has a note saying

> The characters with codes in the range u+2000 through u+2FFF are punctuation, symbols, dingbats, arrows, box drawing, etc. The u+24xx block, for example, has single characters for things like "(a)". The characters in this range are not fully decomposed; they are left unchanged in HFS Plus strings. This allows strings in Mac OS encodings to be converted to Unicode and back without loss of information. This is not unnatural since a user would not necessarily expect a dingbat "(a)" to be equivalent to the three character sequence "(", "a", ")" in a file name.

> The characters in the range u+F900 through u+FAFF are CJK compatibility ideographs, and are not decomposed in HFS Plus strings.

The bit about the u+24xx block is misleading: the decompositions of the characters I looked at there (such as ⒜) are compatibility decompositions, not canonical ones, so standard NFD leaves them alone anyway. The CJK compatibility ideographs are a working example, though. U+F902 (車) decomposes to U+8ECA (車) regardless of normalization form, but the technote says these are not decomposed in HFS Plus strings.
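For anyone who wants to check this, here is a minimal sketch using Python's standard `unicodedata` module. The first part just confirms the standard behaviour described above. The second part, `hfs_like_nfd`, is a hypothetical helper of my own, not anything from the technote or from `filename-sanitizer`: it assumes the two ranges quoted above are the only deviations from NFD, which is almost certainly too simple, since HFS+ also uses decomposition tables from an older Unicode version.

```python
import unicodedata

# Standard behaviour: U+249C (parenthesized latin small letter a) only has a
# compatibility decomposition, so NFD leaves it alone and only NFKD expands it.
assert unicodedata.normalize("NFD", "\u249c") == "\u249c"
assert unicodedata.normalize("NFKD", "\u249c") == "(a)"

# U+F902 has a canonical (singleton) decomposition to U+8ECA, so every
# standard normalization form replaces it, including NFC.
assert unicodedata.normalize("NFD", "\uf902") == "\u8eca"
assert unicodedata.normalize("NFC", "\uf902") == "\u8eca"

# Assumption: these are the only ranges HFS+ exempts from decomposition,
# taken directly from the technote text quoted above.
HFS_EXCLUDED_RANGES = ((0x2000, 0x2FFF), (0xF900, 0xFAFF))


def hfs_like_nfd(name: str) -> str:
    """Hypothetical sketch: NFD, but characters in the excluded ranges pass through."""
    out = []
    pending = []  # run of characters that should be normalized normally
    for ch in name:
        if any(lo <= ord(ch) <= hi for lo, hi in HFS_EXCLUDED_RANGES):
            # Flush the pending run through NFD, then copy the exempt
            # character unchanged.
            out.append(unicodedata.normalize("NFD", "".join(pending)))
            pending.clear()
            out.append(ch)
        else:
            pending.append(ch)
    out.append(unicodedata.normalize("NFD", "".join(pending)))
    return "".join(out)


# The compatibility ideograph survives, while ordinary accented letters
# are still decomposed.
assert hfs_like_nfd("\uf902") == "\uf902"
assert hfs_like_nfd("\u00e9\uf902") == "e\u0301\uf902"
```

One caveat with this approach: each run is normalized separately, so combining marks that directly follow an exempt character are not reordered relative to it. That is fine for illustrating the difference, but it is not a faithful reimplementation of what the filesystem does.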