I got stuck into txr lisp a few years ago, and wrote some code for parsing bank statements. It has a builtin DSL (the @ syntax) which is great for text processing (better than perl IMO).
It's also a fun language to learn, with lots of cool features such as combinators, continuations, macros, pattern matching, range objects, JSON objects, etc.
It's more powerful than python, R or julia and would make a great successor to those languages for data science, but there are very few libraries ATM so that's a long journey. Here are some I wrote (including a zsh script for browsing the manual): https://github.com/vapniks/txr-libs
IMO a good first step would be to use the txr FFI to write a library for Apache arrow: https://arrow.apache.org/
Nice to see you here. I was casually looking through txr-libs and spotted this in the CSV library, inside a let form:
(fieldrx (regex-compile ;; using a regex object here messes with emacs syntax highlighting
"([^,\"\']*(\"([^\"]|\\\\\")*\"|\'([^\']|\\\\\')*\'|)[^,\"\']*)+"))
I understand that here using the #/.../ regex literal object didn't look right in your Emacs, so you dynamically compiled from a string literal.
But that now happens each time you call the function.
Lisp has your back here; we can hoist the compilation of the regex to load time using load-time:
(fieldrx (load-time (regex-compile "...RE...")))
load-time makes a difference in interpreted code, not just compiled files. Even in interpreted code, the regex-compile expression is evaluated only once, and the cached value is used after that.
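Here is a minimal sketch of that behaviour. The function name first-field and the simplified regex are made up for illustration (they are not from txr-libs), and the put-line call is only there to make the evaluation visible:

```lisp
;; Hypothetical helper: first-field and the "[^,]*" regex are
;; illustrative stand-ins, not the real code from the CSV library.
(defun first-field (line)
  (let ((fieldrx (load-time
                   (progn
                     ;; side effect so we can see when compilation happens
                     (put-line "compiling regex")
                     (regex-compile "[^,]*")))))
    ;; match the regex at the start of the line
    (match-regex line fieldrx)))

;; Call it more than once; the regex object is reused between calls.
(first-field "a,b,c")
(first-field "d,e,f")
```

If load-time is doing its job, "compiling regex" should show up once in total, however many times first-field is called. Without the load-time wrapper it would show up on every call.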
Thanks for that! Do you have advice for learning it? How did you get started, and what made you decide to stick with it instead of other languages?
I'm happy to try to use it for Advent of Code 2024, but it would help if I could at least get oriented so I'm not going in totally cold (and thereby likely to fall off the wagon after 3 days because each one takes 2 hours in an unfamiliar language).
The data science use case is particularly interesting and not something I had considered here. Arrow bindings would help immensely. Of course, at work I'll probably never get a chance to use anything other than Python and R for many years to come. Julia is probably the most exotic I'd ever be able to go, and that's only if I hit a performance bottleneck in Python that can't be remediated with Numba.
That gives a flavor of how those kinds of problems can be approached.
The solutions are all self-contained; they don't share any AoC-specific library or anything, and were developed with a throwaway-program mindset, where we just solve that problem and don't try to produce anything that is reusable other than by copy-paste into the next program.
If you already know a Lisp such as Scheme, much of it should already be familiar to you. You can program in a Lisp-1/Scheme-like way using square brackets, and in a Lisp-2/Common Lisp way using round brackets.
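A small sketch of the two bracket styles, as I understand them. The variable name square is just for illustration:

```lisp
;; Round brackets: Lisp-2 style. A function passed as an argument
;; has to be fetched from the function namespace with fun.
(mapcar (fun succ) '(1 2 3))

;; Square brackets: Lisp-1 style. Every position is evaluated the
;; same way, so succ can be named directly.
[mapcar succ '(1 2 3)]

;; The same applies to a function held in an ordinary variable.
(let ((square (lambda (x) (* x x))))
  (list (call square 3)   ;; round brackets need call
        [square 3]))      ;; square brackets call it directly
```

Both mapcar forms should give the same list with each element incremented, and both calls to square should return the same value.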
The thing that really makes txr lisp different is the pattern language DSL, and that was what motivated me to learn it because I had a specific use case (processing my personal bank statements).
The manpage is huge due to the large number of builtin functions and features that make up for the lack of external libraries, but there is a lot of overlap with other Lisps.
I learnt by reading the manpage, looking at examples on Stack Overflow and Rosetta Code, and playing around with it in the REPL. The pattern matching language takes some getting used to, and it helps to have a good understanding of what it's doing under the hood, otherwise you can get very frustrated.
Also, it uses non-standard regular expressions which lack features such as counted repetition (e.g. {m,n}) and capture groups, but you can work around this (e.g. by using the pattern language).
I recommend finding some non-time-critical use case to motivate you, and working on it in your spare time. Learn how to use the txr REPL & command syntax, and create a directory containing some example text files to quickly test things and get a better understanding of how it works. Also create a file for keeping notes about txr & its quirks which you can easily refer to.
Just to clarify, the pattern language is a more powerful alternative to using regexps (but you can mix them). My bank statements are PDFs, which can be converted to plain text using pdftotext. However, this destroys the structure of the documents, which makes extracting data with regexps (even PCREs) very difficult, but it is much easier with txr.
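To give a rough idea, here is a minimal sketch of a pattern-language query. The line layout (date, description, the literal text GBP, then an amount) and the variable names are invented for illustration; real pdftotext output is messier than this and my actual scripts look different:

```txr
@(collect)
@date @desc GBP @amount
@(end)
@(output)
@(repeat)
@date,@amount,@desc
@(end)
@(end)
```

The variables play the role that capture groups would in a regexp: date takes the text up to the first space, desc takes everything up to " GBP ", and amount takes the rest of the line. Lines that don't fit the pattern are skipped by the collect. Assuming the query is saved as extract.txr (a hypothetical filename), running `txr extract.txr statement.txt` on a line like `01/02/2024 COFFEE SHOP GBP 3.50` should print it as a CSV-style row.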