Show HN: 100 LOC Ruby forward proxy using only standard libraries (github.com/jamesmoriarty)
115 points by moriarty_ on Jan 18, 2021 | hide | past | favorite | 37 comments


I’m going to be that jerk today: why is it noteworthy that you can do this? It reminds me of the “HTTP server in one line” which essentially executes a pre-made HTTP server that’s part of Python’s standard library. Writing glue code that is small is a sort of expected feature. Writing one without depending on your stdlib is what’s hard.

Edit: just checked on one of my projects. Looks like about 600 lines of C for an HTTP server that only uses stdlib’s memory allocation and string manipulation functions as well as socket IO. For an additional 20 or so it works over HTTPS. That was rather difficult to write and I’m sure it has at least two bugs I am not seeing. See http.c and server.c in https://github.com/ipartola/hawkeye


I don’t think you’re a jerk, I just want to offer a different perspective.

I appreciate projects like this as a newbie. People say, “you can learn a lot by reading other people’s code.”

Unfortunately, most of the projects I try to learn from are quite large. They can be overwhelming to dissect. A project like this, which sticks to the language (with few or no additional libraries), allows me to pick apart what everything does.


And that's what these projects are great for. I have always appreciated simple things like this that allow me to learn new features of a language, or in this case the stdlib that comes with a language, without introducing too much cruft. I am still learning new tech and these kinds of demos are a great way to learn.

I am just being overly pedantic about the 100 LOC thing because it's been used over decades as a kind of badge of honor/bragging rights. If you implement something that really relies on nothing in N LOC then it's warranted. If it's just a teaching tool, like this is, I feel that it's better to present it as "Show HN: how I learned about HTTP support in the Ruby stdlib and wrote a toy proxy server". The README for this project actually makes it sound like this is something you could use in production, which you absolutely should not.


Yes, you are a jerk. Why? To quote / paraphrase Nobel laureate Richard P. Feynman: you do not understand what you cannot build yourself. No better way to learn than to build your own proxy or you name it. See https://github.com/danistefanovic/build-your-own-x


Agreed. I am not at all against people learning how to do it. But if that's the purpose then it's probably better to present it as such (like the resources at your link do). But the claim that all you need is 100 LOC to make a proxy server work is a bit bogus when you rely on Ruby's standard library. Look at the code at https://github.com/jamesmoriarty/forward-proxy/blob/main/lib.... As expected, it doesn't parse headers but uses stdlib's code to do that. Good, less likely to have bugs. But that means that you can claim that you implemented an HTTP header parser in one line of code:

  req_headers = Hash[req.header.map { |k, v| [k, v.first] }]
It's good that you figured out how to do that, and it's useful. But that is not a 1 LOC HTTP header parser. It's an invocation of an opaque header parser that you do not understand. You might not even know what an HTTP header looks like and still write and use the above code.
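To make the point concrete, here is a rough sketch of the kind of work the standard library hides behind that one line. This is not WEBrick's parser: `parse_headers` is a hypothetical helper written only for illustration, it assumes well-formed CRLF-separated input, and it ignores obsolete line folding and the other edge cases a real parser has to handle. It produces a name-to-list-of-values hash, which the one-liner quoted above then flattens.

```ruby
# Hypothetical, simplified header parser for illustration only.
# Returns { "lowercased-name" => ["value", ...] }.
def parse_headers(raw)
  headers = Hash.new { |hash, key| hash[key] = [] }
  raw.split("\r\n").each do |line|
    break if line.empty? # a blank line ends the header section
    name, value = line.split(":", 2)
    raise ArgumentError, "malformed header line: #{line.inspect}" if value.nil?
    headers[name.strip.downcase] << value.strip
  end
  headers
end

raw = "Host: example.com\r\nAccept: text/html\r\nAccept: application/json\r\n\r\n"
parsed = parse_headers(raw)

# Same shape as the quoted one-liner: keep only the first value per name.
req_headers = Hash[parsed.map { |k, v| [k, v.first] }]
```

Note that flattening to the first value silently drops repeated headers, which is one of the details the one-liner glosses over.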

Interestingly, above that code they write headers by hand. Those at least to me don't look right: they don't escape header values or handle multi-line headers. Probably should have used stdlib to write that part of the code.
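As a sketch of what "handling" header values could look like when writing them by hand: the helper below refuses any name or value containing CR or LF, since those would let a value smuggle in extra header lines. `header_line` is a hypothetical name, not something from the project or from Ruby's stdlib, and rejecting outright is just one possible policy.

```ruby
# Hypothetical helper: format a single header line, refusing input that
# could inject additional headers via CR or LF.
def header_line(name, value)
  if name.match?(/[\r\n:]/) || value.to_s.match?(/[\r\n]/)
    raise ArgumentError, "unsafe header: #{name.inspect}"
  end
  "#{name}: #{value}\r\n"
end

line = header_line("Via", "1.1 example-proxy")
```

A value such as `"ok\r\nX-Injected: 1"` raises `ArgumentError` instead of being written to the socket.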

I really don't mean to pick on this particular project. I just don't really get impressed with "X in Y LOC" when all it's doing is invoking the services of a standard library. I guess when you are coming from the NPM world where you need 1000 dependencies to concatenate two strings it seems really cool to only do something with the tools included. But better ecosystems allow you to do a lot more with the batteries included, which means your LOC will be really low until you do something more interesting.


I think the argument is that if you're depending heavily on a "batteries included" standard library of a high-level language, you in fact aren't really building it yourself.


Or at the very least that using the sort of badge of honor "in N LOC" isn't warranted.


Eh, who says it needs to be hard? As long as it solves a problem, it seems useful to me. And there may be people out there who aren't super familiar with Ruby or forward proxies who can learn a lot from this implementation.

Which isn't to say doing hard stuff is bad either. I've written simple servers in C as well, and those are fun projects that you can learn a lot doing. But I don't think that there's anything wrong with posting stuff like this.


I am not saying it has to be hard. It probably shouldn't have to be. I am asking why it is noteworthy when it's just the exact glue code you'd expect. There is something like 10k LOC that this is using underneath, so the 100 LOC claim is very misleading here.


I suppose some people might be surprised that it can be done so easily in Ruby. Seeing that it's a short example also gives them a reason to check it out, whereas if they'd assumed it was a much larger codebase they wouldn't have had time to learn from it. And all code is to some extent reliant on all sorts of hidden boilerplate: even a simple listen() call in C triggers a huge chain of function calls behind the scenes. And naturally higher-level languages have higher levels of abstraction.


Agreed. Until you’ve programmed something with a magnetized needle and a steady hand, it’s not from scratch. I think somewhere between the “here is an HTTP proxy in 0 lines of code because I just used nginx/haproxy/varnish and wrote no code”, “here it is in 10 lines of code using Twisted/Tornado/Flask/whatever framework”, “in 100 lines using Ruby’s or Python’s stdlib”, and “I wrote it in C/Rust/Go using nothing but language primitives” the real work happens. Where that line is, is subjective because we are all at different places in what we know already and what we are currently learning. To me the only issue would be if people who know more try to misrepresent something to people who know less. As long as the goods are correctly labeled, it all has a place: we all need to know how to configure nginx, how to proxy a request using Twisted or Ruby’s standard library, and how to write a basic web server. I just think we should be careful not to mix them up.


I had to Google what a forward proxy was. Apparently it's just a regular proxy server? Never heard it called that before, might be helpful to add that to the readme.

https://smartproxy.com/blog/the-difference-between-a-reverse...


There are also up, down, charmed, and strange proxies, with obvious applications in quantum computing, nonrepudiation, and dairy farming. Strange proxies were thought to be purely theoretical until an accident involving a rubber band, a liquid lunch, and a particle accelerator collided a transparent squid with varnish at relativistic velocities and the resulting core dump subsequently examined for overflows.


I think prepending “forward” was meant to avoid confusion. Reverse proxy and forward proxy, but now you’re wondering if there are 3 types of proxies.


I think people add 'forward' because reverse proxies are now a lot more common than traditional proxies, so people will assume reverse proxy if you just say 'proxy'.
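One way to see the difference from the client's side: a forward proxy is something the client is explicitly pointed at, whereas a reverse proxy sits in front of the server without the client knowing. A minimal sketch with Ruby's `Net::HTTP`, assuming a forward proxy listening on 127.0.0.1:9292 (the address and port are placeholders, not taken from the project):

```ruby
require "net/http"

# Placeholder proxy address; nothing is contacted until a request is made.
PROXY_HOST = "127.0.0.1"
PROXY_PORT = 9292

# The third and fourth arguments tell Net::HTTP to send requests for
# example.com through the proxy instead of connecting directly.
http = Net::HTTP.new("example.com", 80, PROXY_HOST, PROXY_PORT)
```

Calling `http.get("/")` would then go through the proxy, provided one is actually running at that address.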


I also landed on similar pages and now I am confused about SOCKS5 proxy vs HTTP proxy vs SSL proxy as outlined in the links above.


A reverse proxy waits for a request from another server and responds to the calling server. You can think of a reverse proxy as the "deep backend" in a web stack.

One of the first and most powerful reverse proxies was mod_perl running under httpd. Typically 5 or 6 mod_perl instances can handle requests from 300 normal httpd children.

So when somebody says "forward proxy" they usually mean "not a reverse proxy."


Mod_perl doesn't feel like a reverse proxy to me, it's just executing Perl in the Apache process directly. Fastcgi, on the other hand, is passing requests to persistently running standalone Perl instances, over a pipe, Unix domain socket, or tcp connection. Which feels more like a reverse proxy.


Mod_perl would more accurately be called a “commonly reverse proxied app server.” Typical setup was to have another Apache instance in front of it handling static requests and reverse proxying app requests.


Touting as 100 LOC using only stdlib:

* may show that Ruby can do a lot on its own.

* may show that a developer doesn't have to depend on another gem just to get the job done.

The Ruby stdlib is powerful, and depending on another gem could bite you if it were to become unmaintained at some point.

However, low LOC and use of stdlib aren't always the goal for development in general.

I want to develop solutions quickly that work well and are easy to maintain. If it's easy to learn also, even better.

In terms of the size of the code, even in text form, there's a lot I could do. I could reduce lines by replacing EOL with semicolon. I could remove unnecessary spaces. I could use shortest variable names. I could store a compressed version of the code in the file and then do eval of the decompressed version. Those don't make it a better solution, unless the goal is to obfuscate and/or minify it.

I know that's not the point, though, and I'm mentioning it only so that people will think about LOC as what it is: just a metric.


It's using something called WEBrick. It's a "default gem", so it sounds like it comes with Ruby.

"WEBrick is an HTTP server toolkit that can be configured as an HTTPS server, a proxy server, and a virtual-host server."

So I'm reading it as 100 loc of setup to use a proxy toolkit to invoke a proxy.


Actually, as of Ruby 3, webrick is no longer a "bundled gem". You'll need to add it to your Gemfile and install it in order to use it.


I made a similar proxy using Go in 2019: https://github.com/alexrsagen/alexproxy

It's got a couple more features than OP, check out the README.


I think yours does a lot more than OPs.

Go was made for this kind of problem. It is very easy to write efficient networking code in Go that can be read and understood by mere mortals.


I think this is just a tech demonstration, but I wonder how well this proxy would perform in a benchmark against a similarly small (in terms of LOC) C/Rust proxy.

BTW, Github counts 149 SLOC just for this file: https://github.com/jamesmoriarty/forward-proxy/blob/769b6424...


The thread pool added 35 LOC but can be replaced with 1 LOC `Thread.new`.
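For readers wondering what that trade-off looks like, here is a rough sketch of both options. `TinyPool` is a hypothetical, stripped-down pool written for this comparison, not the project's implementation: it uses a `nil` sentinel to stop its workers and skips error handling.

```ruby
# Hypothetical minimal worker pool: a fixed number of threads pull jobs
# (blocks) off a shared queue.
class TinyPool
  def initialize(size)
    @queue = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; nil is the shutdown signal.
        while (job = @queue.pop)
          job.call
        end
      end
    end
  end

  def schedule(&job)
    @queue << job
  end

  def shutdown
    @workers.size.times { @queue << nil }
    @workers.each(&:join)
  end
end

# Pool version: at most 4 jobs run at the same time.
results = Queue.new
pool = TinyPool.new(4)
10.times { |i| pool.schedule { results << i * 2 } }
pool.shutdown

# One-liner alternative: a new thread per job, with no upper bound.
threads = 10.times.map { |i| Thread.new { i * 2 } }
values = threads.map(&:value)
```

The extra lines buy a cap on concurrency; `Thread.new` per connection is shorter but lets a burst of clients create an unbounded number of threads.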


Here's a Go version in 41 lines of source code I wrote in 2016:

https://github.com/gophergala2016/goxy/blob/master/goxy.go

And a forward proxy that supports middleware like adding CORS headers, logging, etc...

https://github.com/montanaflynn/roxy


This is really nice boiled down to the very essentials. I am now tempted to write a Python clone and see how much different it turns out to be in terms of code and complexity while only relying on stdlib.


Nice demo. Useful for small projects and teaching.

I don't expect the performance to be great, considering it is opening up a new connection to the forwarded host for every request. https://github.com/jamesmoriarty/forward-proxy/blob/bc7d9ec1...
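One possible mitigation, sketched under heavy assumptions: cache upstream connections by host and port so repeated requests to the same origin can reuse them. `ConnectionCache` is a hypothetical class, not part of the project, and the opener is injected so the idea can be shown without touching the network.

```ruby
# Hypothetical cache of upstream connections keyed by [host, port].
# `opener` is any block that returns a connection object.
class ConnectionCache
  def initialize(&opener)
    @opener = opener
    @connections = {}
    @lock = Mutex.new
  end

  # Return the cached connection for this upstream, opening one on first use.
  def fetch(host, port)
    @lock.synchronize do
      @connections[[host, port]] ||= @opener.call(host, port)
    end
  end
end

opened = 0
cache = ConnectionCache.new do |host, port|
  opened += 1
  "connection to #{host}:#{port}" # stand-in for a real socket
end

first = cache.fetch("example.com", 80)
second = cache.fetch("example.com", 80)
```

In a real proxy the opener would be something like `TCPSocket.new(host, port)`, and the cache would also need to detect connections closed by the peer and avoid sharing one socket between concurrent requests, which this sketch does not attempt.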


I like how you shut down the threads (with throw), it was a nice touch. This whole project shows the power of IO.copy_stream imo.
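For anyone who hasn't used it, `IO.copy_stream` copies everything from one IO-like object to another and returns the byte count. A minimal sketch, with `StringIO` objects standing in for the upstream and client sockets (an assumption made so the example runs without a network):

```ruby
require "stringio"

# StringIO objects stand in for the upstream and client sockets.
upstream = StringIO.new("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
client = StringIO.new

# Copies until EOF on the source and returns the number of bytes copied.
copied = IO.copy_stream(upstream, client)
```

With real sockets, tunnelling in both directions would need two such copies running concurrently, one per direction.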


A little bit more than 100 LoC (`cloc lib bin` gives me 147 lines of Ruby) but still nice work.


That should be `cloc exe lib`. `bin` has distribution-related auto-generated scripts. The actual entry point script is in `exe`.

From the sloc count on GitHub just the server file alone is listed as 149.

This also uses WEBrick to do the HTTP parsing. The webrick gem is no longer bundled with Ruby as of 3.0.0.


It looks like it forwards TCP. Forwarding UDP would be useful also.


Forwarding UDP is a bit more complicated since you will have to keep track of "sessions" unless you are relaying data between exactly two points.
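A rough sketch of what tracking "sessions" could mean: since UDP has no connections, the relay has to remember which client address each flow belongs to and forget idle ones itself. `UdpSessions` is a hypothetical class for illustration; it only models the bookkeeping and does no socket I/O.

```ruby
# Hypothetical session table for a UDP relay: maps each client address to
# per-client state and expires idle entries.
class UdpSessions
  Session = Struct.new(:client, :last_seen)

  def initialize(ttl:)
    @ttl = ttl
    @sessions = {}
  end

  # Look up or create the session for a client, refreshing its timestamp.
  def touch(host, port, now = Time.now)
    key = [host, port]
    session = (@sessions[key] ||= Session.new(key, now))
    session.last_seen = now
    session
  end

  # Drop sessions idle longer than the TTL; returns how many remain.
  def expire(now = Time.now)
    @sessions.delete_if { |_, session| now - session.last_seen > @ttl }
    @sessions.size
  end
end
```

In an actual relay each session would presumably also hold its own upstream `UDPSocket`, so replies can be routed back to the right client.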


Possibly a dumb question, but would it not make sense to only deal with IP for a forward proxy?


You should rewrite it in Rust.


This was downvoted, but I have seen some _extremely_ fast networking software in Rust recently and wouldn't mind seeing some simple examples.

(I myself would love to code in Rust, once the spec and implementation become more stable. Given the current pace, maybe in 2030?)



