Hacker News

Out of curiosity, have you tried the UTF-8 decoder capability and stress test?

https://www.w3.org/2001/06/utf-8-wrong/UTF-8-test.html



No, why?

Thunderbird will display SJIS emails just fine. The problem with attachments comes when someone attaches a ZIP with SJIS filenames, but then it's not Thunderbird's problem; it's a problem for whatever tool you use to decompress it.

Regarding Python, the default behaviour when decoding an invalid UTF-8 string is to raise an exception. But your comment made me research it, and I just found that there is a way to replace invalid bytes with U+FFFD, so I will try it.
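
For reference, a minimal sketch of both behaviours, assuming the standard `errors="replace"` handler of `bytes.decode` is the mechanism meant here. The sample bytes and variable names are made up for illustration.

```python
# Sample input: valid UTF-8 for "café", then a stray 0xFF byte, then " ok".
# 0xFF can never appear in well-formed UTF-8.
data = b"caf\xc3\xa9 \xff ok"

# Default (errors="strict"): decoding raises UnicodeDecodeError.
strict_error = None
try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    strict_error = exc

# errors="replace": each undecodable byte becomes U+FFFD instead.
lenient = data.decode("utf-8", errors="replace")

print(type(strict_error).__name__)
print(repr(lenient))
```

The same `errors` argument is accepted by `open()` and `str.encode`, so the lenient mode can be applied when reading files too, at the cost of silently losing the original bytes.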



