
> Presumably one of those encodings is the underlying representation, and accessing it in any other representation is going to cause a horrible hit to performance as it thunks between the two.

Which usually doesn't matter: you access it in a specific representation because you need it in that representation, usually for IO. The "horrible hit" is one you'll have to eat either way. And if you're baking the implementation details of your internal strings into your IO… god help your soul.
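
A minimal Swift sketch of "pick the encoding at the IO boundary": nothing here depends on knowing what the String stores internally, and the transcoding cost is paid once, when the bytes actually leave the process. The file path is hypothetical, just for illustration.

    import Foundation

    let message = "naïve café 🙂"

    // Encode to UTF-8 explicitly for the file/socket, whatever Swift keeps inside.
    let bytes = Data(message.utf8)
    do {
        try bytes.write(to: URL(fileURLWithPath: "/tmp/out.txt"))  // hypothetical path
    } catch {
        print("write failed: \(error)")
    }

    // Decoding on the way back in is just as explicit.
    let roundTripped = String(decoding: bytes, as: UTF8.self)
    assert(roundTripped == message)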

That aside, there's not much of a horrible hit unless you're preallocating the whole output string every time. Swift has iterators/sequences built in, and (I may be mistaken, but) I believe it does the sane thing and exposes no-alloc iterable views, so you're paying for some bit-twiddling (the transcoding itself) and a stack-allocated int8/int16/int32. Not sure how good Swift's compiler is, but I know Rust's can turn such iterations into the equivalent of the corresponding C loop, so there's little to no overhead.
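
Rough sketch of the "no-alloc views" point (String.utf8 and String.utf16 are standard-library views; whether they ever allocate internally is my reading of the docs, not a guarantee): the code units are produced one at a time as you iterate, and nothing below builds a second, re-encoded copy of the whole string.

    let s = "héllo 🙂"

    // Count the non-ASCII UTF-8 bytes by walking the UTF-8 view directly.
    var nonAscii = 0
    for byte in s.utf8 where byte >= 0x80 {
        nonAscii += 1
    }

    // Count surrogate code units by walking the UTF-16 view of the same storage.
    var surrogates = 0
    for unit in s.utf16 where unit >= 0xD800 && unit <= 0xDFFF {
        surrogates += 1
    }

    print(s.utf8.count, nonAscii)     // 11 bytes total, 6 non-ASCII (precomposed é)
    print(s.utf16.count, surrogates)  // 8 code units, 2 of them the surrogate pair for 🙂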

> In practice it feels much better to define a encoding for the platform

Why? What does that give you, aside from exposing broken implementation details as the type's public interface and having unfixable strings for a decade (see: Java and everything Microsoft, because they exposed strings as O(1)-indexed UCS-2 code units early on)?
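
Here's the same trap shown with Swift's own views (sticking to one language rather than quoting Java): once a character falls outside the BMP, "index by UTF-16 code unit" stops landing on character boundaries.

    let s = "a👍b"

    print(s.count)        // 3 characters (extended grapheme clusters)
    print(s.utf16.count)  // 4 UTF-16 code units: 👍 needs a surrogate pair
    print(s.utf8.count)   // 6 UTF-8 code units

    // A UCS-2-style O(1) index into the code units lands mid-pair:
    let units = Array(s.utf16)
    print(units[1])       // 55357 (0xD83D), a lone high surrogate, not a character
    print(units[2])       // 56397 (0xDC4D), its low surrogate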

> Ideally, the encoding would be specified in the type - e.g. String<UTF16>

That's crazy talk. Why would you bake the implementation detail of the string's internal encoding into the type's public interface? I can think of a hundred things I'd put there, but the internal encoding?


