JSLinux rewritten to be human readable, hand deobfuscated and annotated

alevskaya · on Nov 3, 2015

Author of the repo here. I didn't expect to see this come up again; it was a weekend project done years ago while I was studying x86 internals and emulation.

There are much cleaner emulators to study in JS: for the x86 family, I recommend v86 https://github.com/copy/v86

For general emulation, RISC architectures are much simpler to understand. jor1k https://github.com/s-macke/jor1k performs very well, and can simulate the OpenRISC 1000 architecture as well as RISC-V (32-bit only). There's also the 64-bit ANGEL emulator for RISC-V http://riscv.org/angel/

All of these are probably more useful pedagogically than my hand-unrolled jslinux repo, unless you're particularly interested in that emulator.

I would love to see such JS emulators one day used in operating-systems courses, allowing one to step through bootloaders and kernel init code to get a real feel for the code, along with other dynamic illustrations of the machine state and various kernel data structures on the same page. If I weren't devoted full-time to synthetic biology I might be tempted to try such a thing myself!

s-macke · on Nov 3, 2015

Did you know that Fabrice Bellard recently rewrote his code to support asm.js? He asked me to update my small benchmark.

https://github.com/s-macke/jor1k/wiki/Benchmark-with-other-e...

mtrn · on Nov 3, 2015

Very impressive for a weekend project. May I ask, how do you approach these kinds of problems? Do you focus for 12h, do you make hand-written notes, do you use any IDE or code analysis tools, whiteboards - or what is your method in general?

alevskaya · on Nov 3, 2015

Reverse engineering binary or obfuscated code does take long chunks of unbroken time and a lot of attention. You have to fill up your short term memory with many structures seen or guessed at in order to begin making connections between them. Making diagrams can help, but mostly it’s reasoning through things step by step while jumping around the code and learning to recognize patterns. Sometimes little scripts can be useful for globally altering the code or testing an idea.

In the case of a CPU emulator, it’s actually much simpler: there’s extremely comprehensive documentation on how the 386 is supposed to behave. In this case I simply had to recognize the clever tricks used to emulate such behavior efficiently in software, e.g. using XOR operations to efficiently simulate the behavior of the translation lookaside buffer (TLB). It was mostly just reformatting, documenting and renaming things intelligently.
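
To make the XOR idea concrete, here is a minimal sketch of one way such a software TLB can work. It is an illustration under stated assumptions, not code taken from JSLinux: the names `tlb`, `tlbInsert`, `translate` and `TLB_INVALID` are hypothetical, pages are assumed to be 4 KiB, and page-table walking and permission checks are left out. Each entry stores the virtual page base XORed with the physical page base, so a hit needs only one array load and one XOR, and the page offset passes through unchanged because the low 12 bits of the entry are zero.

```typescript
// Hypothetical sketch of an XOR-based software TLB (not JSLinux's actual code).
const PAGE_SHIFT = 12;      // assumed 4 KiB pages
const PAGE_MASK = 0xfff;    // low 12 bits = offset within a page
const TLB_INVALID = -1;     // a valid entry has zero low bits, so -1 can never be valid

// One slot per virtual page of a 32-bit address space (2^20 slots).
const tlb = new Int32Array(1 << (32 - PAGE_SHIFT)).fill(TLB_INVALID);

// Cache a mapping: store (virtual page base XOR physical page base).
function tlbInsert(vaddr: number, paddr: number): void {
  tlb[vaddr >>> PAGE_SHIFT] = (vaddr ^ paddr) & ~PAGE_MASK;
}

// Translate on a hit: XOR cancels the virtual page bits and leaves
// the physical page bits, while the offset bits are untouched.
function translate(vaddr: number): number {
  const entry = tlb[vaddr >>> PAGE_SHIFT];
  if (entry === TLB_INVALID) {
    // A real emulator would walk the page tables here and refill the slot.
    throw new Error("TLB miss");
  }
  return (vaddr ^ entry) >>> 0; // >>> 0 converts back to an unsigned 32-bit value
}

// Usage: map the virtual page at 0xC0001000 to the physical page at 0x00105000.
tlbInsert(0xC0001000, 0x00105000);
const phys = translate(0xC0001234); // offset 0x234 is preserved
```

Storing the XOR rather than the physical page number itself means the fast path needs no masking or adding to recombine page and offset, which matters when every emulated memory access goes through it.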

I have to confess that I find it all therapeutic in a way. Normally I read papers and do experiments and attempt to design things in biology, where nothing is clear at all, ever. We have the genome, which looks simple, but no way to simulate anything usefully (microscopic physics is hard and traverses a vast temporal hierarchy in “running” a cell). We’re left with uncertain measurements of the behavior of enzymes and molecules to guess at what’s going on inside the stochastic, dynamic maelstrom of a cell. It’s still reverse engineering, but it’s so much slower to do on an alien, practically uncomputable architecture than in a synthetic universe made by human minds to be understandable by human minds.

mtrn · on Nov 3, 2015

Thanks for your great reply. I thought that understanding complex cell behaviour and an obfuscated OS in JavaScript that runs in the browser might share some reverse-engineering challenges. It is good to get some perspective, because sometimes even medium to large code projects start to look like maelstroms.

sedeki · on Nov 3, 2015

I'm also wondering about this.

denniskane · on Nov 3, 2015

On the somewhat related note of system-level things that can be done in web browsers... here is a project that I've been working on for several years now that is meant to be an entire desktop userland: https://yc-prototype.appspot.com

(I just submitted it to YC in the recent application batch)

There is a Terminal application with a fairly complete shell implementation.

I am currently working on integrating the *nix command line environment by bringing in PNaCl implementations of vim and python. So far, I have vim working as normal in my local development system, and python integration is underway. I should be getting a decent numpy/matplotlib developer environment up and running in the coming weeks.

It is Chrome-only, due to the dependency on the HTML5 Filesystem API and PNaCl.

P.S. If you are trying to use the Terminal and it doesn't seem to respond to key presses, a simple refresh of the page should do the trick. There is a weird bug that I have yet to start tracking down.

dang · on Nov 3, 2015

Discussed at the time: https://news.ycombinator.com/item?id=5400185.

ck2 · on Nov 3, 2015

Still blows my mind when I play with it: http://bellard.org/jslinux/

rplnt · on Nov 3, 2015

Did I just run gcc in my browser?

s-macke · on Nov 3, 2015

Yes, you did. Try this website to believe it ;)

http://s-macke.github.io/jor1k/demos/compile.html

kelvin0 · on Nov 3, 2015

I wonder why Fabrice hasn't already done this. Too busy? Does anyone know?