I got stuck into txr lisp a few years ago, and wrote some code for parsing bank ...

kazinator · on Dec 10, 2023

Nice to see you here. I was casually looking through the txr-libs there and happened to spot this in the CSV library, in a let form:

  (fieldrx (regex-compile ;; using a regex object here messes with emacs syntax highlighting
             "([^,\"\']*(\"([^\"]|\\\\\")*\"|\'([^\']|\\\\\')*\'|)[^,\"\']*)+"))

I understand that here using the #/.../ regex literal object didn't look right in your Emacs, so you dynamically compiled from a string literal.

But that now happens each time you call the function.

Lisp has your back here; we can hoist the compilation of the regex to load time using load-time:

  (fieldrx (load-time (regex-compile "...RE...")))

load-time makes a difference in interpreted code, not just compiled files. Even in interpreted code, the regex-compile expression will be evaluated once only and then the cached value used:

  1> (dotimes (i 5) (prinl "hi"))
  "hi"
  "hi"
  "hi"
  "hi"
  "hi"
  nil
  2> (dotimes (i 5) (load-time (prinl "hi")))
  "hi"
  nil
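To show the same idea inside a function, here is a minimal sketch. The function name split-simple and the simplified regex are made up for illustration; they are not taken from the CSV library.

```txr
;; Hypothetical helper, not from the CSV library.
;; The regex is a simplified stand-in for the real field regex.
(defun split-simple (line)
  ;; load-time evaluates regex-compile once; later calls reuse the
  ;; cached regex object instead of recompiling the string.
  (let ((fieldrx (load-time (regex-compile "[^,]+"))))
    ;; tok-str returns the list of substrings of line that match fieldrx
    (tok-str line fieldrx)))

(prinl (split-simple "a,b,c"))
(prinl (split-simple "x,y"))
```

Both calls go through the same regex object, so the cost of compiling the string is paid only once.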

kazinator · on Dec 11, 2023

Of course, you can also just initialize a top-level variable with the object returned by regex-compile and use that.
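A rough sketch of that alternative, again with made-up names (*simple-fieldrx*, split-simple-global) and a simplified regex:

```txr
;; Hypothetical names; the regex is a simplified stand-in.
;; regex-compile runs once, when this top-level form is evaluated.
(defvar *simple-fieldrx* (regex-compile "[^,]+"))

(defun split-simple-global (line)
  ;; Reuses the already compiled regex held in the global variable.
  (tok-str line *simple-fieldrx*))

(prinl (split-simple-global "a,b,c"))
```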

vcdimension · on Dec 11, 2023

Cheers Kaz, I'll make the changes. There are still quite a few TODOs in that code that I intend to fix up at some point.

nerdponx · on Dec 8, 2023

Thanks for that! Do you have advice for learning it? How did you get started, and what made you decide to stick with it instead of other languages?

I'm happy to try to use it for Advent of Code 2024, but it would help if I could at least get oriented so I'm not going in totally cold (and thereby likely to fall off the wagon after 3 days because each one takes 2 hours in an unfamiliar language).

The data science use case is particularly interesting and not something I had considered here. Arrow bindings help immensely. Of course at work I'll probably never get a chance to use anything other than Python and R for many years to come; maybe Julia is the most exotic I think I'd ever be able to go, and that's only if I ever hit a performance bottleneck in Python that can't be remediated with Numba.

kazinator · on Dec 8, 2023

There are some Advent of Code 2021 solutions in TXR:

https://www.kylheku.com/cgit/advent/tree

That gives a flavor of how those kinds of problems can be approached.

The solutions are all self-contained; they don't share any AoC-specific library or anything, and were developed with a throwaway-program mindset, where we just solve that problem and don't try to produce anything that is reusable other than by copy-paste into the next program.

vcdimension · on Dec 9, 2023

If you already know a Lisp such as Scheme, much of it should be familiar. You can program in a Lisp-1/Scheme-like way using square brackets, and in a Lisp-2/Common Lisp way using round brackets. The thing that really makes TXR Lisp different is the pattern language DSL, and that was what motivated me to learn it, because I had a specific use case (processing my personal bank statements).

The manpage is huge because of the large number of built-in functions and features that make up for the lack of external libraries, but there is a lot of overlap with other Lisps. I learnt by reading the manpage, looking at examples on Stack Overflow and Rosetta Code, and playing around with it in the REPL. The pattern matching language takes some getting used to, and it helps to have a good understanding of what it's doing under the hood; otherwise you can get very frustrated. Also, it uses non-standard regular expressions which lack some features, such as certain repetition operators and capture groups, but you can work around this (e.g. by using the pattern language).

I recommend finding some non-time-critical use case to motivate you, and working on it in your spare time. Learn how to use the txr REPL and command syntax, and create a directory containing some example text files so you can quickly test things and get a better understanding of how it works. Also create a file for keeping notes about txr and its quirks which you can easily refer to.
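To make the square-bracket versus round-bracket point concrete, here is a small sketch (my own example, not from the txr-libs code) that applies the built-in succ function to a list in both styles:

```txr
;; Lisp-2 style with round brackets: functions live in their own
;; namespace, so a function passed as an argument is named via fun.
(prinl (mapcar (fun succ) '(1 2 3)))

;; Lisp-1 style with square brackets: every position is evaluated the
;; same way, so succ resolves to the function without (fun ...).
(prinl [mapcar succ '(1 2 3)])
```

Both forms should produce the same list, each element incremented by one.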

vcdimension · on Dec 11, 2023

Just to clarify, the pattern language is a more powerful alternative to regexps (though you can mix the two). My bank statements are PDFs, which can be converted to ASCII using pdftotext; however, this destroys the structure of the documents, which makes extracting data with regexps (even PCREs) very difficult, but much easier with txr.
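For a flavor of what that looks like, here is a minimal sketch of a TXR query. The input layout (three labelled lines per transaction) is invented for illustration and is not what a real pdftotext dump of a bank statement looks like; the file names below are hypothetical too.

```txr
@# Hypothetical input: each transaction is three consecutive lines,
@#   Date: 2023-12-01
@#   Description: Coffee shop
@#   Amount: -3.50
@# with arbitrary unrelated text in between.
@(collect)
Date: @date
Description: @desc
Amount: @amount
@(end)
@(output)
@(repeat)
@date,@desc,@amount
@(end)
@(end)
```

Saved as, say, extract.txr, this could be run as `txr extract.txr statement.txt`. The collect block scans the input for every place where the three lines occur together and skips everything else, and the output block writes one comma-separated line per match. Matching multi-line structure like this is the part that is awkward to express with a single regexp.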