
Clojure never got the data science crowd even though the language is genuinely good for it. Always felt like a distribution problem more than a technical one.


In this very post you can see why: the dplyr code is just so much more readable. Like a lot of Python, dplyr reads almost like pseudocode: take this dataset, select the columns that start with "bill", then filter so that bill_length is less than 30. So simple and so little fluff!


> is just so much more readable

I thought that too before I learned Clojure, now I find them equally readable.


I'm very familiar with Clojure, but even I can't make a good argument that:

    (tc/select-rows ds #(> (% "year") 2008))
is more, or at least as, intuitive as:

    filter(ds, year > 2008)
as cited above. I think there's a good argument to be made that Clojure's data processing abilities, particularly around immutable data, make a compelling case in spite of the syntax. The REPL is great too, and the JVM is fast. But I still to this day imagine infix comparisons in my head and then mentally move the comparator to the front of the list to make sure I get it right.


I am really not in data science, and I have decent Clojure experience. Is there a reason anyone would pick Clojure over something like K? From what I understand, those array languages are really good for writing safe but efficient code on rectangular data.


How about this?

    (filter ds (> year 2008))
That's a trivial Clojure macro to make work if it's what you find "intuitive."
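For what it's worth, here is a minimal sketch of what such a macro could look like. It is not from any library: the name `where` is hypothetical (chosen so it doesn't shadow `clojure.core/filter`), it works on a plain sequence of maps with string keys rather than a tablecloth dataset, and it assumes that any bare symbol which is neither a local nor resolvable in the current namespace is meant as a column name.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical macro, for illustration only.
(defmacro where
  [rows pred]
  (let [row     (gensym "row")
        locals  (set (keys &env))
        ;; Assumption: an unqualified symbol that is not a local binding and
        ;; does not resolve to a var or class is treated as a column name.
        column? (fn [x]
                  (and (symbol? x)
                       (nil? (namespace x))
                       (not (contains? locals x))
                       (nil? (resolve x))))
        ;; Rewrite each column symbol into a lookup on the current row.
        body    (walk/postwalk
                  (fn [x] (if (column? x) `(get ~row ~(name x)) x))
                  pred)]
    `(filter (fn [~row] ~body) ~rows)))

;; Made-up sample rows, just to exercise the macro.
(def penguins
  [{"year" 2007 "species" "Adelie"}
   {"year" 2009 "species" "Gentoo"}])

(where penguins (> year 2008))
;; => ({"year" 2009, "species" "Gentoo"})

;; Locals are left alone, so they can be mixed with column names.
(let [cutoff 2008]
  (where penguins (> year cutoff)))
```

The obvious weakness is the guessing: if a var named `year` already exists in the namespace, the macro leaves the symbol alone, and special forms like `if` do not resolve, so they would be mistaken for columns. A real version would probably want an explicit column marker instead.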


Julia's Tidier.jl ecosystem is getting there too. It uses macros to mimic this 'special' evaluation framework of R, so the code is also readable in a similar way.


Unfortunately, having to mess around with a JVM is a tough sell for a lot of data analysis folks. I'm not saying it's rational or right, but a lot of people hear "JVM" and they go "no thank you". Personally I think it's a non-issue, but you have to meet people where they are.


The irony, given the mess that is Python setup: there are companies whose entire business is fixing Python tooling.


Oh, I completely agree. Like I said, it's not rational, but it is what it is.


I dunno, if you can slog through the Python ecosystem then the JVM is starting to look not so bad. Plus with Clojure you don't need to deal with the headache and heartache that is Maven.


I think that's true for only a limited subset of programs, though. The Clojure lib ecosystem is nowhere near the size of the broader Java ecosystem, so you frequently end up pulling Maven deps to plug holes anyway.


That is the goal of a polyglot runtime, and why Clojure was designed to be a hosted language that embraces the platform, unlike others that build their own tiny island.


Uhhh, yes, but I was trying to convey to the parent that most real-world Clojure programs won't isolate you from Maven.


It's unfortunate, but people's associations with Java the lang bleed into their beliefs about the JVM, one of the most heavily-optimized VMs on the planet.

There's some historical cruft (especially the memory model), but picking the JVM as a target is a great decision (especially with Graal offering even more options).


Exactly, especially because there isn't THE JVM, rather a bunch of versions each with their own approaches to GC, JIT, JIT caches, ahead of time compilation.

Only .NET follows up on it at scale.


Meanwhile, I find it very annoying to deal with the litany of Python versions, the distinction between global packages and user packages, and the need to manage virtual environments just to run scripts. That being said, I am not an expert, but that's always been my experience when I need to do anything Python-related.


idk, I don't think I've had to do anything beyond install the JVM to work with Clojure. I'm not really a fan of the clj command's flag choices though (-M, -X, etc. all make no sense).



