Have you tried lua-cjson, out of curiosity? They use a cjson.null (or something like that) value for nulls in JSON. But yeah agree with what you've said in general, if I had to do some API JSON manipulation Lua isn't what I'd jump to, even though I'm pretty familiar with it. Do Python or JS actually take that much longer to start up for a script? Node starts up pretty quickly on my system.
Now that I think of it, cjson.null might be workable even with data-transformation libraries, since it's some Lua meta-voodoo-value that can't come in the input from other systems. But then, of course, there's the problem that other data-exchange libs invent their own voodoo-values for null, like ngx.null. Precisely because they need it for actual integration use-cases.
As for the performance, I happen to use an underpowered machine, which is how I discovered the drastic difference with Lua. However, performance still matters beyond just ‘pretty quickly’ if you consider desktop software, especially productivity software—which has to be snappy. Scripting your productivity software is a textbook case for dynamic languages, and almost the exact niche for Lua. Even on my machine, there was a moment when I had to make sure that Alfred actually called my Lua script—because the results popped up immediately (despite going through the command-line interpreter, not Luajit). Meanwhile, people are making productivity software with Electron or Python, which to me verges on ridiculous.
Yeah, I definitely care about performance more in that tradeoff. In my experience, when performance matters, details like which null representation you use, and what different representations lend themselves to, also tend to matter, vs. gluing libraries to libraries willy-nilly. At that point this null issue is far from the main thing that comes up, bc. you care about data layout enough that you have control over that representation.
One example I'm running into in practice these days: I'm using the chipmunk2d game physics engine in a library. It does heap allocations, while all the component data for other stuff in my game tends to be well managed into pools that I can walk through contiguously. Even if I use a custom allocator for it, it uses pointers to refer between objects internally, so I can't "move" instances of its eg. `cpBody` object around in memory. Would've been so nice to just own its data and let the library be layout-agnostic, but alas.
But yeah those are the data-data glues I've been interested in--libraries that kind of own representation but not location I guess.
For your particular issue though, I think it really is bc. Lua is a language that out of the box doesn't tend to own representation, to the point of not eg. having its own class system etc., and it mostly shines when used in ways that play into that strength (eg. you want to wrap your own object ontology for ppl to script). The best libraries for Lua have tended to invent their own representation and expect you to care about and own the mappings between libraries. Esp. for JSON, the impedance mismatch with Lua tables puts it in this weird uncanny valley.
Tbh my fav method of using JSON is to not have an in memory document format at all and just treat it as reader / writer objects directly into domain data so that you don't keep paying this intermediate representation conversion cost. eg. https://github.com/beached/daw_json_link is interesting here. In my own game stuff I just have game components accept reader/writer objects to read from / write to, and each component tends to have domain logic about what it means to be missing something, what some defaults are, etc.
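To make the reader/writer idea a bit more concrete, here's a rough sketch in Python (not what daw_json_link does, and not my actual game code). `DictReader`, `DictWriter` and `Position` are made-up names for illustration. Note that `json.loads` still builds an intermediate dict here; a real reader would pull values straight off the token stream. The point is only the shape of the component-side API, where each component decides what a missing or null field means.

```python
import json
from dataclasses import dataclass


class DictReader:
    """Hypothetical reader: hands out one field at a time, with a fallback."""

    def __init__(self, data):
        self._data = data

    def num(self, key, default):
        value = self._data.get(key)
        # A missing key and an explicit null are treated the same way here.
        return default if value is None else float(value)

    def text(self, key, default):
        value = self._data.get(key)
        return default if value is None else str(value)


class DictWriter:
    """Hypothetical writer: collects fields for serialization."""

    def __init__(self):
        self.out = {}

    def put(self, key, value):
        self.out[key] = value


@dataclass
class Position:
    # Made-up game component; the defaults are its own domain logic.
    x: float = 0.0
    y: float = 0.0
    label: str = "unnamed"

    def read(self, reader):
        self.x = reader.num("x", self.x)
        self.y = reader.num("y", self.y)
        self.label = reader.text("label", self.label)

    def write(self, writer):
        writer.put("x", self.x)
        writer.put("y", self.y)
        writer.put("label", self.label)


pos = Position()
pos.read(DictReader(json.loads('{"x": 3, "label": null}')))

writer = DictWriter()
pos.write(writer)
encoded = json.dumps(writer.out, sort_keys=True)
```

The component never sees a generic "document" value, so the nil-vs-null question gets answered once per field, in the place that knows what the field means.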
> when performance matters, details like which null representation you use, and what different representations lend themselves to, also tend to matter, vs. gluing libraries to libraries willy-nilly
Well, Lua juuuust about hits the sweet spot of a good ‘layman’ dynamic language with great performance, and my hope is that perhaps it could be moved a bit to the generic-scripting-language side without losing performance. Basically, since libraries invent magic values to use instead of null, presumably Lua could provide such a value for them all to agree on.
> for JSON, the impedance mismatch with Lua tables puts it in this weird uncanny valley
Not seeing the mismatch here: to my knowledge, tables can be employed as (untyped) arrays or dictionaries, exactly the structures in JSON—and what I've long been using in PHP, JS and Python. If I'm not forgetting something, null is the only thing missing from making this triumvirate an integrated quartet.
> Tbh my fav method of using JSON is to not have an in memory document format at all and just treat it as reader / writer objects directly into domain data so that you don't keep paying this intermediate representation conversion cost
As we're discussing this on a post about a Lispy thing, I personally can't back your approach, for I'm lately buying into functional transformations big-time. Here I'm more on the convenience side: it seems to work fast enough for interactive desktop cases, though I've heard that the in-place method works wonders for busy web apps. In fact, afaik some high-level languages/environments kinda do the in-place thing implicitly by not copying strings between structures or when extracting substrings, and possibly by doing COW. Dunno if Lua does any of that (afaik it copies pointers between structures, not the data—this behavior is pretty much expected of runtimes these days).
However, I occasionally do wish that Clojure-style lazy structures were more widely employed, so I could use transformations without worrying that they might crunch some stuff needlessly.
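A small illustration of the "don't crunch stuff needlessly" part, using Python generators as a stand-in for lazy sequences (just a sketch; `expensive` and `lazy_pipeline` are made-up names). The `calls` list records which elements were actually processed.

```python
from itertools import islice

calls = []


def expensive(x):
    calls.append(x)  # record every element that really gets processed
    return x * x


def lazy_pipeline(items):
    """Map then filter, but nothing runs until a consumer asks for values."""
    squared = (expensive(x) for x in items)
    return (y for y in squared if y % 2 == 0)


# Ask for only the first two results out of a million candidates.
first_two = list(islice(lazy_pipeline(range(1_000_000)), 2))
```

Only the elements needed to produce those two results go through `expensive`; the rest of the range is never touched.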
https://luafun.github.io/intro.html is the best I've found for performant stream-y transforms. But I tend to just believe in always writing for-loops because they are how computers actually work and IMO it's easier to reason about what they are doing (of course, encapsulating logic per element and so on into their own functions). C++'s STL stuff I think is actually also good / ok here and one exception I make because you can see the code generated and things get inlined, and things like `std::remove_if` or `std::lower_bound` are actually well implemented. The rest of the functional shenanigans are cool I think for their theory (eg. as far as applicative functors for parsing and whatnot); but I think if you care about performance, the data layout in memory for cache utilization seems to be what matters now. Clojure is ok yeah but IMO you should know that it's impl'd with structural sharing + GC and decide that that's the behavior you want. Here's eg. a library that does persistent data structures of that sort in C++: https://sinusoid.es/immer/ (no transients for maps though)
Re: the Lua / JS mismatch -- I think the main thing is that you kind of have to decide if a table is an array or not, and in a lot of cases there's no sensible thing here necessarily, especially if there are a lot of `nil`s in the array -- so is it a sparse JSON array or an object with those keys (stringified -- which is gnarly) or what? etc. You can see settings for this in lua-cjson [1], and the fact that the settings exist is the issue. It's not as bad for pure JSON as when you start wanting to do things like generate a diff from changes on the Lua side (that's when you start getting sparse arrays) and apply those to the same document in JS, or the other way round and so on. I wrote a real-time automatic diff sync'ing thing once that was something like this, and that was the main place where the impedance mismatch came up. nil-null basically didn't matter -- I decided on one conversion and stuck with it (I basically decided we would not use `null` for anything -- just use something that represents the actual semantic thing you want instead).
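Here's the array-or-object decision spelled out as a sketch, with a Lua table modelled as a Python dict whose keys are ints or strings. The function names and the three-way policy are made up for illustration and are not how lua-cjson decides.

```python
def classify(table):
    """Guess how a Lua-style table (modelled as a dict) maps onto JSON."""
    if not table:
        return "ambiguous"  # an empty table could be [] or {}; someone must pick
    keys = list(table)
    all_positive_ints = all(
        isinstance(k, int) and not isinstance(k, bool) and k >= 1 for k in keys
    )
    if all_positive_ints:
        # Keys are unique, so max == count means they are exactly 1..n.
        return "array" if max(keys) == len(keys) else "sparse"
    return "object"


def encode_sparse_as_array(table):
    """One policy: pad the holes with None (JSON null)."""
    return [table.get(i) for i in range(1, max(table) + 1)]


def encode_sparse_as_object(table):
    """The other policy: stringify the integer keys."""
    return {str(k): v for k, v in sorted(table.items())}


dense = {1: "a", 2: "b", 3: "c"}
patch = {1: "a", 3: "c"}  # e.g. a diff that only touched elements 1 and 3
```

Both encodings of `patch` are defensible, which is exactly why the receiving side can't know which one it got without an out-of-band agreement.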
Also FWIW, I don't find CL super 'functional'. Scheme and Clojure a little bit, sure. The most functional stuff I feel like is Haskell / ML. If it's just about closures--I feel like every language except C, Zig and Ada has closures these days.
Lua strings are pretty simple -- just an immutable array of bytes (there's no COW because there is no W), and interned (always in 5.1 and LuaJIT; afaik newer PUC Lua versions only intern short strings). You have to build a new string with the modification you want. For a 'builder' approach usually the pattern is to accumulate the pieces in a table then call `table.concat`. Usually I manage fine with `str = str .. 'new stuff'` though unless it's in some hot code path.
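Python strings are immutable too, so the same two patterns look like this there (a sketch by analogy; `"".join(parts)` plays the role of Lua's `table.concat`, and the function names are made up):

```python
def build_naive(chunks):
    """Repeated concatenation: conceptually, each step makes a new string."""
    out = ""
    for chunk in chunks:
        out = out + chunk
    return out


def build_with_buffer(chunks):
    """Builder pattern: collect the pieces, then join once at the end."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)


chunks = [str(i) + "," for i in range(5)]
original = "abc"
extended = original + "d"  # builds a new string; `original` is untouched
```

Both functions produce the same result; the buffer version just avoids re-copying the growing prefix on every step, which is what matters in a hot path.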