Generating code is fine, if the generated code strictly never evolves independently of what it is generated from. For instance generating libraries from .proto files (or other declarative schema definition solutions) works really well. If the schema changes, you throw away the old generated code and generate brand new code, no problem.
But if you want to make even a single tiny modification to one of the generated files, you're busted, you need a different solution.
Generated code is fine if it's newly generated on every build. If you're going to have to maintain the generated code, it's not generated code anymore, but duplicated code.
Seconded. How many in this thread have found generated code in source control? My trophy case includes artifacts produced by: flex, bison, gperf, swig, and one particularly nasty CORBA stub generator.
The original article isn't very convincing though. I mean, I fully believe the single abstract super controller was a bad idea, but there are far better options than that and duplicate code. He's just comparing two of the worst ways to do it.
> But if you want to make even a single tiny modification to one of the generated files, you're busted, you need a different solution.
Not totally true: if you can robustly express your tiny change as a `sed` or `awk` script, you can just append it to the generator pipeline. Speaking from experience, do not condone, etc.
I think GP means "make a tiny change [after generation, outside of the generator, and persist that change independent of the generator code]", which is where all the demons are waiting
Modifying the generator itself so it does something different every time, and then doing GP's stated "regenerate and throw away the old stuff", is in line with that.
It's not modifying the generator. The generator may be a proprietary black box. It's wrapping the generator in a bash script that pipes the result through AWK, etc.
As other commenters have noted, if the awk script is just a pure function of the output of the black-box generator to a new output, then I would consider this a modification to the generator, and no problemo.
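A minimal sketch of what that looks like in practice. The generator here is a stub function standing in for any opaque black-box tool (the real one might be a proprietary binary); the `sed` fix is a pure function of the generator's output, so every build regenerates from scratch and nothing hand-maintained accumulates:

```shell
#!/bin/sh
set -eu

# Stand-in for the black-box generator; in reality this would be
# something like an opaque protoc plugin or vendor codegen tool.
generate() {
    printf 'class FooClient:\n    TIMEOUT = "30"\n'
}

# The "tiny modification": fix an obviously invalid string.
# Appended to the pipeline, it runs on every regeneration.
generate | sed 's/TIMEOUT = "30"/TIMEOUT = 30/' > client_generated.py

cat client_generated.py
```

The key property is that `client_generated.py` is never edited by hand, so it can stay out of source control and be rebuilt at will.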
However, if your awk script requires the current state of the generated code as input in addition to the output of the black-box generator, and tries to reconcile a diff between the two things, then yep, I consider that busted.
Sure, that's orthogonal. If you wrap the generator in your build system and still always regenerate, it's effectively the same. And also, I think, not what GP was talking about
Pedantic. There's a world of difference between grokking a new code generation DSL+codebase and a shell one-liner that fixes a string that is obviously invalid.
Since the issue is the maintenance of such systems, it is absolutely relevant.
> For instance generating libraries from .proto files (or other declarative schema definition solutions) works really well.
...does it? Generated libraries always feel mismatched with the host language's paradigms. Maybe that's just my nightmares of dealing with the generated vomit hose of a library that is MS Graph...
Sure, that's true, I'm a heavy user of the standard protobuf library in python, and you certainly won't catch me singing its praises for its style.
But that's a different (and less important) kind of problem. It does not exhibit the huge issue with generated-and-then-modified code where you have to maintain all the generated code rather than just the source from which it was generated.
It's trading the time a few developers would waste manually writing a client for the time wasted by tens of thousands of developers using a generated client that doesn't fit the language well.
Ha, yeah, though I would say that the Lisp solution has a different downside: it's really nice to be able to see what all the post-generation code looks like. None of the Lisps I've used have made that as easy for their macro expansions as I would like.
There was an editor (for CMUCL maybe?) that would macroexpand in a tooltip on hover and macroexpand-1 on right click (or maybe the opposite) on an s-expression. I'm surprised something like that didn't make it into SLIME, though I think you can macroexpand to the minibuffer. But, yeah, that's why it rewards writing macros in small pieces.
I absolutely prefer code generation over macros. It is a general solution that works for all languages, databases, protocols etc. And you can easily inspect the code generated.
Why would you want to do that? That would be adding unnecessary compile time overhead. And (again) code generation works for any language/framework/OS/… Not just for Lisp.
Code generation isn't well defined beyond being a computation whose end product is code in a certain language. The input could be any format and the transformation could be anything.
Macro processing is one way to constrain and refine the concept of code generation, in a particular direction.
Arbitrary code generating programs have the problem that they don't play along. Alice has a code generator and so does Bob. Both of these generate the same language. Alice and Bob want to work on the same files of the same project, each using their code generator. The only way it can work is if their code generators recognize only certain delineated syntax and pass everything else through, so that Alice's code generator can be applied first to code that also needs Bob's generator or vice versa. Suppose Bob uses his generator in such a way that Alice syntax comes out of a construct. If the Alice generator has run first, that won't work. Running both generators repeatedly, until a fixed point is reached, might work. This will likely not scale nicely beyond a small number of code generators.
If Alice's and Bob's code generation are only doing text substitution, it would be a lot better if Alice and Bob used a common textual preprocessor and wrote their respective parts as macros. Then their work integrates and is expanded by a single application of the tool, and any number of team members can write macros independently.
It depends on the preprocessor.
Recently, I converted a C-preprocessed file into code generation.
C macros don't have good tooling. They are not supported well under debugging, and even basic text navigation tools don't work well with them. Vim with tags will jump to the macro definition, but not to the definition of a function defined with a macro.
No. I'm not gonna sit there writing another tool to augment ctags so that the editor will know that sha256_init is written by the chksum_impl(sha256, SHA256_t, "SHA-256", SHA256_DIGEST_LENGTH, SHA256_init, SHA256_update, SHA256_final); macro call, and jump to that line.
So the switch isn't motivated by a philosophical position on macros versus code gen, but by the concrete specifics of the state of the tooling, and the nature of the desired transformation. (E.g. are we generating large boilerplate spanning multiple functions? Or are we making syntactic sugar for walking over a list?)
This is a Lisp project; I'm not macro ignorant.
Other Lisp people have used code generation at the implementation level. The CLISP project dates back to the late 1980's, long before there was C99. Many of the C sources have .d suffixes, and are preprocessed by a script called "varbrace", which turns mixed declarations and statements into C90: statements before declarations (by emitting brace-enclosed blocks). E.g. puts("hello"); int foo = 42; becomes puts("hello"); { int foo = 42; ... }. There is no way you could do that with C macros, in any way that would be remotely reasonable, and not stray so far from the objective as to be comical.
Code generation is never off the table when we deal with languages that don't have good macros. But new languages should make provisions so that users don't have to resort to it. When users resort to code gen, that's a sign the language has failed in some way.
E.g. C failed for Bjarne Stroustrup by not providing support for OOP, or good enough macros to do it nicely, so he wrote a "C with Classes" code generator. Not everyone agrees; plenty still use "C without Classes" forty years later.