Reverse Engineering Snapchat (Part II): Debofuscating the Undeobfuscatable (hot3eed.github.io)
295 points by 3eed on June 21, 2020 | 61 comments


This level of API obfuscation reminds me of forever ago when MSN Messenger figured out AOL's AIM API, so MSN Messenger could send AIM messages, which annoyed AOL. AOL would make API changes to break MSN, but MSN would update the client and stay ahead. Eventually to make the API uncloneable, AOL changed their payload to exploit a buffer overrun in their own AIM clients that wouldn't be in the MSN clients.

https://nplusonemag.com/issue-19/essays/chat-wars/


I think the most important question, which this article left out, is why exactly this makes the API uncloneable. Why couldn't MSN just emulate the buffer overflow behavior, like it had emulated everything else so far?

As the article says, the client also responded with some code. What I think was happening: the client was responding with portions of its own executable memory, which could be checked by AOL servers.

That way for MSN to emulate that behavior, it would need to have the AIM client's executable code inside itself, which would be an easy win in a copyright lawsuit.
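
A minimal sketch of that guess, assuming the server asks for a random slice of the client's executable image and compares a hash of it against its own reference copy. `CLIENT_IMAGE`, `make_challenge`, `answer_challenge`, and `verify` are hypothetical names for illustration, not anything from the real AIM protocol:

```python
import hashlib
import os
from typing import Tuple

# Stand-in for the genuine client's executable image (hypothetical bytes).
CLIENT_IMAGE = bytes(range(256)) * 16


def make_challenge(image_size: int, length: int = 64) -> Tuple[int, int, bytes]:
    """Server side: pick a random region of the image plus a fresh nonce."""
    offset = int.from_bytes(os.urandom(4), "big") % (image_size - length)
    return offset, length, os.urandom(16)


def answer_challenge(image: bytes, offset: int, length: int, nonce: bytes) -> str:
    """Client side: hash the requested slice of its own code with the nonce."""
    return hashlib.sha256(nonce + image[offset:offset + length]).hexdigest()


def verify(reference: bytes, offset: int, length: int, nonce: bytes, answer: str) -> bool:
    """Server side: recompute the expected hash from the reference copy."""
    return answer_challenge(reference, offset, length, nonce) == answer
```

Because the offset changes on every challenge, a clone can only answer reliably if it carries the whole image around, which is the copyright trap described above.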


Why not just send copyrighted code as part of the payload?


Especially trademark violations are very effective for this. For example the original GameBoy used it as DRM. The cartridge had to contain a Nintendo(R) logo which was displayed on boot to work, a legal deterrent for publishing unlicensed games that still works to this day.


Except that the use of copyrighted and trademarked data for means of enabling interoperability has been ruled fair use in the Sega v. Accolade[1] case. So I believe Nintendo's use of the logo in this way is not much more than snake oil.

[1] https://en.wikipedia.org/wiki/Sega_v._Accolade


Gentle reminder that the USA isn't the only jurisdiction. In countries without fair use, for example, this defense wouldn't even be available.


That case was decided after the initial release of the Game Boy, so it wasn't an unreasonable thing to try at the time.


The TrackIR API does something similar to lock out unauthorised third party client applications.


Interesting time that was. I don't believe that any of these internet giants would ship a feature that is effectively a hack, in this day and age.

Apple and Palm also engaged in the back-and-forth, when Palm attempted to get their OS to sync with iTunes.

https://www.wired.com/2009/10/palm-pre-itunes/


You will be scared to find out that a lot of Fintech has webscraping as an accepted part of their stack...


Yup, not only as an accepted part of their stack, but also offered as a product, one that sometimes requires users to enter their bank details into third-party applications from some fintechs.

If you look under the hood, there are a lot of grey areas being exploited by fintech, all around...


Very interesting. I think this would likely lead to lawsuits today, under a complaint alleging a DMCA violation.


Is there legal precedent for copyrighting APIs?


That's actually the central issue in the Supreme Court case between Oracle and Google right now: whether or not you can copyright APIs.


Indeed, an answer of "no, they're not copyrightable" would leave the world generally how it is today. An answer of "yes, all existing APIs are copyrightable" would be tremendously impactful in all sorts of ways I can't even imagine. Presumably someone would immediately sue somebody else because of a tenuous claim to ownership of, say, HTTP or some JavaScript extension.

Microsoft has filed an amicus brief for the "not copyrightable" side, as have the EFF, IBM, Red Hat, and a team of 83 computer scientists.

You should probably note the folks on the "yes, copyrightable" side for future reference as well, including Dolby, the Motion Picture Alliance, SAS, the DoJ, the Recording Industry Association of America, and also 4 CS professors (Dr. Spafford of Purdue, Dr. Ding of UC Davis, Dr. Hollaar at Utah, and Dr. Porter at Maryland).


This thread is about web or network service APIs, which, thanks to the CFAA, have broad leeway to dictate what client software you are legally allowed to use to speak to them. It's a grey area and some real bullshit, IMO.

You are talking about programmatic APIs, which is a horse of a different color: a copyright issue, which is still being figured out.

It's annoying that we overload the same term for both things.


That's going to be a really big decision in our world of software. I hope that SCOTUS doesn't side with the devil.


Referring to the other side in a debate as "the devil," aside from being hyperbolic to the point of inducing an eye roll in every reasonable person within earshot, is exactly how the US ended up with a reality TV show host in the White House.


Can't AOL use some kind of session token for this? Super confused.


Hey OP, since you're here:

I find this pretty hard to follow. Would you be open to writing a longform version of this aimed at the tutorial level?

Reading between the lines, I would guess you're trying to demonstrate that you really know what you're doing. Maybe as a proof of concept for possible employment opportunities. If so, that's great! Good luck.

But if I were interested in reverse engineering some other app, I don't think I could understand what you've done well enough to use these techniques on that app. Except maybe the breakpointing within `fuck_debug`, that was pretty slick and easy to follow.


If reading even the first part of this series doesn't help, read this beginner tutorial I recently wrote: https://yasoob.me/posts/reverse-engineering-android-apps-apk... It starts you off with the basics and uses smali.

After that you can explore this tutorial on frida: https://securitygrind.com/bypassing-android-ssl-pinning-with... These two techniques will give you some more basic knowledge of how app reversing is done. :)


It's true, these posts are for intermediate and advanced reverse engineers. It would really take a book to explain it from the ground up, like someone here mentioned. I suggest getting some background in assembly, then reading the OWASP guide (link in my previous HN post), and persistence.


Obviously not the OP but I think that a longform version of this would be an entire book/college level course. I wish I could learn how to reverse state of the art obfuscation in a single, long post but that's just not how it works.


I would pay for that book.


I found it fairly reasonable, although you'd have to have a general idea of the subject beforehand. I read it as being aimed at reverse engineers who are looking for some general techniques to bypass common anti-debugging/obfuscation features rather than "how to reverse engineer apps 101".


"Reasonable" is a stretch, "interesting" is the right word. Personally I'd put this in the "Oh, huh" box along with quantum crypto. It's interesting, it's complex and it's got way too many engineering hours behind it... but ultimately for 99% of people or even 99% of computer scientists or HN readers, it's just fascinating trivia.

I absolutely appreciate these posts, this guy spent WEEKS delving into the depths of SnapChat just for the joy of discovery.

Maybe a good classification would be that part 1 is detailing a number of obfuscation techniques and the key thing to take away is that all of them CAN be bypassed.


It's easier to follow if you read part I of the series first:

https://hot3eed.github.io/2020/06/18/snap_p1_obfuscations.ht...


+1. Need a simpler version if possible.


Both iOS and recent Androids have by now a form of app attestation: the server can tell if the caller is the legitimate app or not (with good enough confidence - as everything, it's not unbreakable).

Doesn't that make obfuscation kind of pointless? Even if your knock-off app knows everything about the API of the original service, it won't be able to use it because it is not the genuine app or maybe it is but it is not running in a real iOS/Android device.

Or maybe this is only meant to include non-Android certified phones (= China)?


DeviceCheck on iOS supports iOS 11 and up, which would cut off 7% of users[1], a bit extreme. But when the time comes that you don't have to cut off anyone, it'll be very interesting to see what happens on iOS. Will someone bypass it? Death of reverse engineering? Who knows. On Android, an HN user mentioned in the previous post that it's a solved problem[2].

[1]: https://developer.apple.com/support/app-store/ [2]: https://magiskmanager.com/


seems like something having a rooted os would fix pretty quickly


Seems like the creator of Magisk Manager could not get around Android's implementation: https://twitter.com/topjohnwu/status/1245956080779198464?s=1...


I tried adding SafetyNet attestation on launch for all Android clients and ran into the rate limit pretty fast. (IIRC it's about 10k/hr.)

DeviceCheck has no such problem, but it doesn't really feel designed for the use case: you need to implement an anti-replay system yourself.
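
For what "an anti-replay system" might look like on the server, here is a minimal sketch assuming a one-time nonce scheme: the server issues a nonce, the client binds its attestation token to it, and the server accepts each nonce once within a short window. `ChallengeStore` and its methods are hypothetical and in-memory only; nothing here calls the actual DeviceCheck API.

```python
import os
import time


class ChallengeStore:
    """Issues single-use nonces and rejects reused or expired ones."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._pending = {}  # nonce -> expiry timestamp

    def issue(self, now=None):
        """Create a fresh nonce for the client to embed in its request."""
        now = time.monotonic() if now is None else now
        nonce = os.urandom(16).hex()
        self._pending[nonce] = now + self.ttl
        return nonce

    def redeem(self, nonce, now=None):
        """Accept a nonce at most once, and only before it expires."""
        now = time.monotonic() if now is None else now
        expiry = self._pending.pop(nonce, None)  # pop makes it single-use
        return expiry is not None and now <= expiry
```

A real deployment would also need shared storage across servers and cleanup of nonces that are never redeemed; both are left out of this sketch.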


As someone who wrote similar obfuscators (manually) back in 2003-2006 to protect a few indie games distributed on PocketPC (ARM7/WinCE), I found it quite comforting to see that the techniques are still similar.

I wonder about something, how long did it take?


For fuckup_debugging, can't you use hardware breakpoints instead?

Also, why not patch the binary? I think iteratively patching out protections (in a repeatable, versioned way) would be my approach. It is then applicable to other binaries as well.


Hardware breakpoints are a little complicated on iOS. And patching the binary would of course only work if no other code verified the validity of the page you touched.


Are hardware breakpoints even possible on iOS? And correct, you can't patch the binary because there are many anti-tampering measures. You could probably bypass those, but that's going a different route.


Not the OP, but I can answer, I guess. Hardware breakpoints are very limited in the number you can set. Usually when you are debugging a decent target, the number of breakpoints you use easily reaches 50-60.


No doubt, but it's better than pausing every time. I guess with scripting it isn't really different.



For MBA, there's also Arybo[1] from Quarkslab. Never used it and seeing the reference to SSPAM, I assume the author is aware of the tool.

[1] https://github.com/quarkslab/arybo
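
For anyone unfamiliar with MBA (mixed boolean-arithmetic) expressions, here is a small sketch of the idea using one well-known identity, `x + y == (x ^ y) + 2 * (x & y)`. The function names and the 32-bit width are assumptions for illustration; this does not use Arybo or SSPAM, and the random check is only a sanity test, not a proof.

```python
import random

MASK = 0xFFFFFFFF  # emulate 32-bit wraparound arithmetic


def add_plain(x, y):
    """What the programmer wrote."""
    return (x + y) & MASK


def add_mba(x, y):
    """An obfuscated equivalent: XOR gives the sum without carries,
    AND shifted left by one (times 2) adds the carries back."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def equivalent(f, g, trials=10000, seed=0):
    """Compare two functions on random 32-bit inputs (sampling, not proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.getrandbits(32), rng.getrandbits(32)
        if f(a, b) != g(a, b):
            return False
    return True
```

Real obfuscators nest many such rewrites, which is why a simplifier that works on the expression symbolically is more practical than testing by hand.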


I came across Arybo while working on the binary, but I can't remember why I didn't use it; it's a vague memory now. Anyway, it does the job in one go, so I added an edit.


Shouldn’t you be able to find any code that scans for breakpoints easily and patch it to be blind?


Normally it is more like calculating a hash over a piece of code, then XORing the result with a constant and jumping to it. (Speaking generally; I've never reversed Snap.)

So usually there is nothing to patch.
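
A toy sketch of that pattern, assuming CRC32 as the hash and a dictionary standing in for the address space. `PRISTINE_CODE`, `REAL_TARGET`, and `KEY` are made-up values for illustration, not anything taken from Snapchat:

```python
import zlib

# Arbitrary stand-in bytes for the protected code region (hypothetical).
PRISTINE_CODE = bytes.fromhex("fd7bbfa9fd030091e0031f2afd7bc1a8c0035fd6")
REAL_TARGET = 0x1000  # hypothetical address of the next real block

# "Build time": pick the constant so the pristine checksum decodes to REAL_TARGET.
KEY = zlib.crc32(PRISTINE_CODE) ^ REAL_TARGET


def next_target(code: bytes) -> int:
    """Derive the jump target from a checksum of the code itself."""
    return zlib.crc32(code) ^ KEY


def run(code: bytes) -> str:
    """Jump to the computed target; anything but REAL_TARGET is garbage."""
    blocks = {REAL_TARGET: lambda: "continued normally"}
    handler = blocks.get(next_target(code))
    return handler() if handler else "jumped into garbage"
```

Note that `run` contains no "is the checksum correct?" branch to flip: the checksum is simply an input to the jump, so a modified code region (for example, one with a software breakpoint written into it) sends execution somewhere else.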


Can you calculate that hash (of the original binary) yourself and patch THAT in the function?


I’m surprised that Snapchat doesn’t check for the mere presence of a debugger and instead tries to look for breakpoints. Or perhaps you’ve already found and patched those checks out?


It does check for a debugger. But that would be through sysctl, or the csops sys call, which would be trivial to patch and a single point of failure.


Anyone else picture Deebo from “Friday” (Zeus from “No Holds Barred”) smashing apart source code after reading the title?

Prediction: Just me.

By the way, love both articles. Thanks for taking the time to share.


I wonder if the Android version uses the same technique and if not, if it would be harder/easier to break


The title is misspelled (s/Debofusc/Deobfusc/).


[flagged]


I'm not sure what you're asking. Obfuscating means making the code unreadable/unintelligible. Think minified JS on steroids. Debofuscating just means undoing the obfuscation.


'Debo' or 'Deob'?


I can't find any formal definitions after a quick google search, but there seem to be a ton of uses of "debofuscating" in different research papers.


It's just a typo, and we're all human. Maybe focus on the really interesting research instead?


i was just clarifying the poster above lol


It's a typo in the article title, relax. The <title> has the correct spelling.


He obfuscated the title!


That guy gets it


Nice contribution to the discussion /s


Maybe you don't understand Latin root words as prefixes and suffixes, in which case I highly recommend doing a bit of research into it. It really makes the English language more understandable when you can parse words based on their roots rather than on rote memorization.


They were commenting on "bofuscate" vs. "obfuscate". I suspect they've got a healthy understanding of what the prefix "de" means here.



