The future of programming with certified program synthesis (gopiandcode.uk)
84 points by tluyben2 on July 6, 2021 | hide | past | favorite | 92 comments


OK, so you write pre and post conditions for a function, and the function gets generated.

OK, so now we're programming at the specification level. I've written formal specifications for an operating system. They're harder to write than code for non-trivial programs.

Some things are trivial to specify but hard to code. A hash table, for example. It's easy to specify what a hash table does, but that gives no guidance on how to implement one. You might get out something that does a linear search. File systems and databases are similarly not too hard to specify - the specification describes the abstract brute-force approach.
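To make that concrete, here's a hypothetical sketch (mine, not from the article): a tiny C "table" whose behavioural spec, "after put(k, v), get(k) returns v", is fully satisfied by a linear scan. Nothing in the functional specification rules it out.

```c
#include <stddef.h>

/* Hypothetical sketch: this "table" satisfies the spec
 * "after put(k, v), get(k) returns v" -- yet lookup is a
 * plain O(n) linear scan, not O(1) hashing. */
#define CAP 64

struct table {
    size_t n;
    int keys[CAP];
    int vals[CAP];
};

void put(struct table *t, int key, int val) {
    for (size_t i = 0; i < t->n; i++)
        if (t->keys[i] == key) { t->vals[i] = val; return; }
    t->keys[t->n] = key;    /* append new key (assumes n < CAP) */
    t->vals[t->n] = val;
    t->n++;
}

int get(const struct table *t, int key, int missing) {
    for (size_t i = 0; i < t->n; i++)   /* O(n), not O(1) */
        if (t->keys[i] == key) return t->vals[i];
    return missing;
}
```

A synthesiser searching only against that behavioural spec is free to emit exactly this.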

They need to get much further along before announcing victory.


You indirectly get at an important point:

A complete specification must include best/worst/amortised performance constraints, and a lot of other details we often leave unstated.

And then it gets a lot harder...


A complete specification also must include assumptions about external systems and limits.

Here the assumptions are infinite stack (or item count < stack size limit) and a malloc that never fails.

Incomplete specs are just as bad as buggy code. I'd even argue that incomplete specs are worse in this scenario, because the "proven correctness" gives a false sense of security.


Performance is not a part of behavior. Specification is about specifying logical behavior and that's it. I don't know of any specification framework that supports saying anything about performance.


This is a very relevant point about program synthesis in general, but if we look at an example given here, the linked_list spatial predicate, the specification is at the much lower level of abstraction of pointers and memory addresses. I would guess that a corresponding specification for a hash table would at least preclude candidate solutions that use linear searching - though there are other ways to make hash tables inefficient, such as a poor hashing function or an inefficient way to handle collisions.

The purpose of this article seems to be to introduce us to separation logic for reasoning about programs with pointers, and in particular the separating conjunct, which "restores compositionality of analysis" (which is lost when spatial assertions assert that some particular property holds over the entire heap.) It goes on to explain that "As it turns out, the compositionality and modularity of separation logic means that it can actually be quite easily adapted to fit into a deductive-reasoning-based automated synthesis strategy."

The title, therefore, should not be read as a claim about the future of programming in general.


In predicate calculus specification languages, you can write

    X such that P(X)
where P is a test predicate. Doesn't mean you have any idea how to generate X. A long time ago, when this idea was going around for the first time, I was using a formal specification language called SPECIAL. I wrote the statement of Fermat's Last Theorem in SPECIAL to demonstrate that merely being able to formally specify something doesn't necessarily lead to a solution.
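For illustration, in modern notation rather than the original SPECIAL syntax, the statement is just:

```latex
% Fermat's Last Theorem as a specification: trivial to state,
% no hint of how to prove it or how to search for a counterexample.
\forall n \in \mathbb{Z},\; n \ge 3 .\;
  \neg\, \exists\, x, y, z \in \mathbb{Z}_{>0} :\; x^n + y^n = z^n
```

Asking a synthesiser for a witness of the negation is asking it to search for a counterexample, with no guidance at all.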


Indeed, but as far as I can tell, the particular specification language here is at a much lower level of abstraction, one where we have only slightly-abstracted pointers, and consequently, I suspect, a specification using this particular logic will have no choice but to say how to generate X (or perhaps I should say that the intention is that one should say how to generate X, even if, formally, one can avoid doing so.)


In programming, coding is a tool for discovering the under-specified parts of the specification.


> Some things are trivial to specify but hard to code. A hash table, for example. It's easy to specify what a hash table does, but that gives no guidance on how to implement one. You might get out something that does a linear search. File systems and databases are similarly not too hard to specify - the specification describes the abstract brute-force approach.

One of the next steps for formally proven software development is automated analysis of algorithms. The same procedure that generates a proof of a correct hash table implementation could be extended to take an axiom of randomness for the hashing function (an assumption the metaprogrammer supplies, trading the true worst-case analysis for trust in the hash function and in the absence of adversaries) in order to prove the amortised search/insertion time to be O(1) per element, assuming the implementation is free to resize and rehash the table as it grows. The synthesizer could further search for implementations that minimize amortised/average cost using the available axioms and predicates, and benchmark them on a particular architecture much as JIT runtimes do now.
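For what it's worth, the amortised O(1) claim under resize-and-rehash is the standard doubling argument:

```latex
% With capacity doubling, n insertions trigger rehashes of sizes
% 1, 2, 4, \dots, 2^{\lfloor \log_2 n \rfloor}, so the total rehash work is
\sum_{k=0}^{\lfloor \log_2 n \rfloor} 2^k
  \;=\; 2^{\lfloor \log_2 n \rfloor + 1} - 1 \;<\; 2n ,
% i.e. O(n) overall, or O(1) amortised per insertion.  The randomness
% axiom is still needed to keep the expected probe cost per lookup O(1).
```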


There's just always a "who watches the watcher" recursive paradox with formal specification / verification. Who ensures the specification is a good one?

I believe it is just that - a logical paradox. It cannot be answered. We will always have to be double-checking the specifications, perhaps via model checking.

All that's to say, I agree with you, it's not like this is going to solve all problems just by using formal specs.


>> Some things are trivial to specify but hard to code.

Well, that's the point of automating the coding effort. The specification guides a search for a program that satisfies it. In your hash table example there's no need to tell the synthesiser how to code the hash table, only what the hash table has to do.

The search of course can get very expensive and that's an important bottleneck in program synthesis. But, in principle, if the specification is correct, then a correct program is somewhere in the program search space and a search will eventually find it.


Did I accidentally say something offensive in the above comment?


Just reading the thread now, and the silent downvoters might be objecting to the "eventually" part. "Find the busy beaver program for a 20 state Turing machine" is easy to specify, but could take quadrillions of years to return a result.


Thanks, that could be the reason. But it's also a truism. The only guarantee that can be given for a search procedure is that if a target exists, the search will eventually find it. An infinite search space is an infinite search space and there is no search that can exhaust it in finite time.

Anyway, those are the rules of the game for search-based program synthesis. The best that can be done is to guarantee that the search space contains the program. It's an important bottleneck and in my opinion is part of what has kept the field back from fully achieving its goals (which are diverse, and there are diverse sub-fields also).


The code snippet is incorrect. I do not see a check on malloc returning null. In general that would be okay, but in an embedded environment with limited memory, that would be a serious problem.

Also, using recursion is bad, because copying a long list will lead to a stack overflow.
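For comparison, a hypothetical hand-written version addressing both complaints (it checks malloc and iterates instead of recursing) might look like this, using a plain struct rather than the article's void-pointer encoding; listcopy_checked is my name, not the article's:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Hypothetical iterative copy: frees any partial copy and reports
 * failure if an allocation fails, and uses a loop so a long list
 * cannot overflow the stack. */
struct node *listcopy_checked(const struct node *src, int *ok) {
    struct node *head = NULL;
    struct node **tail = &head;     /* where the next copied node goes */
    *ok = 1;
    for (; src != NULL; src = src->next) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL) {            /* allocation failure: clean up */
            while (head != NULL) {
                struct node *next = head->next;
                free(head);
                head = next;
            }
            *ok = 0;
            return NULL;
        }
        n->value = src->value;
        n->next = NULL;
        *tail = n;
        tail = &n->next;
    }
    return head;
}
```

On failure it frees the partial copy and reports via the ok flag, so the caller can distinguish "empty list" from "out of memory".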


It's never OK. Machines with lots of memory run out of memory too.

To present a piece of research in such trollish terms, accusing "stubborn C developers" of causing security vulnerabilities, and then present an example of code for which "we didn't even have to bother reasoning about how exactly it did its pointer manipulations" that segfaults if it runs out of memory (it dereferences the result of malloc and writes to that location 2 lines down) is beyond hilarious.


The tone and example code is ludicrous. If the expressed attitude is how they perceive what they are doing, their attitudes are going to be the reason their project fails.


On Linux machines malloc()'s returning NULL doesn't mean you're out of memory, it means you're out of address space which is very different.


In a system with virtual memory there are a variety of limits you can hit that will prevent your process allocating more memory and cause malloc to return zero. It's normal to informally lump all these together as "running out of memory" when the distinction doesn't matter, as here.

Address space size is one of these limits, not the only one. Hitting it is rare on 64-bit operating systems, common on 32-bit ones.


Unfortunately, I’ve yet to come across a major application on Linux that handles this condition gracefully.


If I understand it correctly, it is even worse, because malloc could return a non-NULL pointer and the process could still crash when you try to write to the 'allocated' memory. See for example https://linuxembedded.fr/2020/01/overcommit-memory-in-linux


The first line doesn’t check whether the void **r pointer is null before the dereference. The incorrectness starts there.


If the precondition states that the pointer is valid (which I expect it does), then this is not incorrect. If it does not state anything about the pointer, then just checking for NULL isn't sufficient. It'd have to explicitly state "the pointer is either valid for the duration of the function, or NULL" for this to be a worthwhile check to perform.


Yes, that is correct. The precondition for the function states that the pointer should not be null.

The specification for listcopy is listed slightly lower down in the article as follows:

  // r :-> x ** linked_list(x,S)
  void listcopy(void **r)
  // r :-> y ** linked_list(y,S) ** linked_list(x,S)

As the precondition explicitly states that r points to a value x, it cannot be NULL, so no NULL checks are needed.

I'm not sure what the sibling post is trying to imply with recursive calls though.


> I'm not sure what the sibling post is trying to imply with recursive calls though.

I think that is based on a misunderstanding of what is being passed in at the point listcopy() is called in the body of listcopy(). The pointer passed in there can't be NULL due to how it is obtained, so no further checks are necessary.


AFAICT the recursion doesn't mean the function can segfault due to a null pointer. It does however mean that it will cause a stack overflow if the list is sufficiently large, which makes it a bad implementation. It can also segfault if malloc returns NULL, as discussed above.

Apart from these serious runtime behaviour flaws the code is also violently unreadable, which you strangely use as a way of attacking C programmers generally, as if this code is representative of human-coded C. This implementation would fail code review for numerous reasons anywhere I've ever worked.

The article itself looks detailed and very interesting and I'll have a proper read of it if I get the time. You didn't need to make it so aggressive and clickbaity.


The problem with generated code is that it is rarely easy to understand or maintain. Suppose you need to add new features or fix a bug, to do so you will need all of the original specifications and code generation tools. Will those tools work on other platforms? Do they support all of the low level needs of C? For example, do they support bit-level layout for structs or inline assembly?

I believe that Rust will make more inroads to replace C in many domains. This will eliminate many of C's known issues that Certified Program Synthesis seeks to address.

If you need formal verification or proofs, SPARK https://en.wikipedia.org/wiki/SPARK_(programming_language) is a mature and effective tool with a proven track record. It lets developers write code that is easy to understand and maintain compared to generated code and can formally prove the correctness of programs.


There's nothing wrong with generated code if it's never manually modified. Think of binaries: we never manually edit them but they are essentially "generated".

More modern build systems automate this process of hiding the intermediate steps. One great example of this is the way Thrift and gRPC work. You never edit the stubs or server code. You just implement an interface.


What's wrong is that the code is being generated, shown, pushed to the repository, and distributed (for the purpose of human editing). With a compiler, the binary output is not shown and not maintained. The source code is. The generated code is no longer source code. Giving it the appearance of source code, and even distributing it as source code, is wrong.

The source code for the mentioned example is:

    ...
    [copy linked list: annotations]
    ...
I am not saying that the compiled/generated output shouldn't be human readable -- I strongly support readable intermediate outputs. But we must always recognize what is the real source code and what is the tool. Confusing them inflates the complexity and promotes the wrong attitude, such as not taking responsibility for one's code or not caring about understanding the code.


> The generated code is no longer source code. Giving it an appearance of source code and even distributed as source code, is wrong.

Yes, exactly.


This did occur to me while reading the post - why not just generate assembly from the specifications? What's the point of the intermediate translation to C?


Assembly is specific to the architecture, potentially in non-trivial ways. They either have to manually write support for each CPU architecture, or they need a higher level abstraction language. C is basically that higher level abstraction language.

The tooling also helps. C compilers do some pretty fancy optimizations. Again, they can either re-implement that inside their tool, or they can just output C and use the existing tools.


So each tool can stack on top of another tool.


>> There's nothing wrong with generated code if it's never manually modified.

Agreed. The issue is that I have never encountered code that did not need to be changed later.

I have encountered generated code for which the code generation system was no longer available and it was a tremendous pain to understand and modify.


The reason this is a pain is the projects doing that were already starting from a broken position. The code gen should be entirely managed by the build system. You should never need to modify or look at the generated code. If you need to change the generation output you should modify the "compiler" (tool doing the generating) or the input source you are using to generate.

Examples:

If you want to add an int32 into a protobuf you modify the .proto file with your message.

If you want to change `int32` to `int32_t` instead of `int` to make the code generator more multi-arch friendly you'd change the protoc plugin for your language.

"I have never encountered code that did not need to be changed" is the wrong mindset to have here. You're thinking of the generated code as source code instead of an internal build step (same thing as Clang or GCC's IR).

Also if you're in a situation where "code generation system was no longer available" then your build would never succeed. This should be like saying "gcc was no longer available". It shouldn't be possible and in reality the generated code shouldn't even be checked into the source tree.


>> "I have never encountered code that did not need to be changed" is the wrong mindset to have here. You're thinking of the generated code as source code instead of an internal build step (same thing as Clang or GCC's IR).

>> Also if you're in a situation where "code generation system was no longer available" then your build would never succeed. This should be like saying "gcc was no longer available". It shouldn't be possible and in reality the generated code shouldn't even be checked into the source tree.

No. All I had was the generated code. It was not a "wrong mindset", it was the reality of the situation.

The generated code was all that was available. The code generation system was proprietary and ran on a VAX. I did not have the model specifications used to generate the code. I did not have the proprietary code generation system. I did not have a VAX system to do the code generation. All I had was the generated code because that was all that was available.

The generated code needed to be ported to a different machine architecture and new operating system (Unix). Compilers for the generated code were available and some of the generated code would build fine while other parts needed to be adapted for machine word size, endianness, etc.

>> the projects doing that were already starting from a broken position

Sometimes you don't get to choose the situation. Maintaining legacy code and updating it to run in modern environments is not easy, but it does pay well.

I completely agree that the original specification and code generation system are the "one true source" and that the generated source code should not have been kept, but those decisions were made long before I entered the picture.


It's odd, people freak out all the time with hiring saying "hiring a poor candidate does more damage than hiring no one" but spewing ML generated noise directly into your repo is somehow considered reasonable.


> Suppose you need to add new features or fix a bug, to do so you will need all of the original specifications and code generation tools.

So, nothing changes compared to using existing high level language compilers, then?


>> So, nothing changes compared to using existing high level language compilers, then?

On the contrary, it acts as an additional layer of complexity.

Looking at the example generated code snippet from the article:

  void listcopy(void **r) {
    void *x2 = *r;
    if (x2 == NULL) {
      return;
    } else {
      int vx22 = *(int *)x2;
      void *nxtx22 = *((void **)x2+1);
      *r = nxtx22;
      listcopy(r);
      void *y12 = *(void **)r;
      void *y2 = (void *) malloc(2 * sizeof(void *));
      *r = y2;
      *((int *)y2) = vx22;
      *((void **)y2+1) = y12;
      return;
    }
  }
If you don't have the specification for listcopy or the code generation tools, but only the code shown, and you need to make changes, then you are in a more difficult situation than with normal, human-written code.


So, as I said, nothing changes compared to using existing high level language compilers. You wouldn't normally hand-edit .S files from your C compiler either, why would you do it with this intermediate form?


Sometimes you don't get to choose the situation: the intermediate form / generated source code might be all that is available.

See https://news.ycombinator.com/item?id=27750554


True, but this situation has always existed with any higher level languages. Ultimately someone will have to write patches for binaries, even.


>> Ultimately someone will have to write patches for binaries, even.

Yes, but it is not as easy to patch binaries compared to patching the source code.


> Suppose you need to add new features or fix a bug, to do so you will need all of the original specifications and code generation tools.

Yes, that is the point of code generation. You had a positive sentiment towards Rust - Rust has a compiler. In order to work with Rust, you have to have "all of the original specifications and code generation tools." Source code is just the specification of the program, and the compiler is the code generation tool.

Why do you trust compilers any more than this?


>> Source code is just the specification of the program, and the compiler is the code generation tool.

Compilers generate machine code. When I said "all of the original specifications and code generation tools", I was referring to tools that generate SOURCE code using models or specification languages such as:

https://en.wikipedia.org/wiki/Business_Process_Model_and_Not...

https://en.wikipedia.org/wiki/Systems_Modeling_Language

https://en.wikipedia.org/wiki/Unified_Modeling_Language

These tools use models or specifications as input and generate source code output.

Frequently these are expensive, proprietary tools that are intended to let semi-technical domain experts build solutions without needing to know how to program computers. They are the previous iteration of "no code" / "low code" tools.

What happens if the model tool company goes out of business?

What happens if the model tool company gets acquired and the new owner decides to raise the prices too much and some of their customers cannot afford to pay?

What happens to all of the generated source code produced by these tools when the business using the generated source code needs changes made, but no longer has the models or tools?


> Suppose you need to add new features or fix a bug, to do so you will need all of the original specifications and code generation tools.

Specs = source code.

Code generation tools = compiler.

How is this different from any other code? If you need to modify it, you need the source and a compiler.



Listing a bunch of specification languages and toolchains is not, at all, an answer to how needing to have both the source code and tooling for something coded from specs is different from having to do the same for any compiled language.


> What happens if the model tool company goes out of business?

> What happens if the model tool company gets acquired and the new owner decides to raise the prices too much and some of their customers cannot afford to pay?

> What happens to all of the generated source code produced by these tools when the business using the generated source code needs changes made, but no longer has the models or tools?

Yes, you need both specs and tools. Vendor lock-in can really bite, especially if your business depends on tools that are no longer available.


So, the problem isn't “generated code”, but “obscure, proprietary toolchain stacks”.

Which is also a problem for compiled code in not-widely-supported forms (language/target combos).


“obscure, proprietary toolchain stacks” are half of the problem.

Understanding, updating, and maintaining algorithmically-generated source code is the other half of the problem.

If a human is writing the code, hopefully variables have sensible names, there is logical organization of the code, and there are some comments to explain.

(Of course, some humans write code that is worse than algorithmically-generated code. // shudder)


> Understanding, updating, and maintaining algorithmically-generated source code is the other half of the problem.

There is no such thing: algorithmically generated code that isn’t in final executable form is not source code, it’s an intermediate representation. The fact that it’s in a language that some source is written in doesn't change that.

If you are maintaining anything other than the source form, you are doing it wrong (and if the tool is half-baked so you have to manually massage the generated code, that's definitely a problem with the tool.)


When I first read the snippet, I thought, "What a terrible piece of code." Later on, I read: "In fact, the code written above was actually entirely automatically synthesised by our tool, along with a proof certificate, formally guaranteeing its correctness."

So it looks like the code is terrible, and they only give an example for a simple use case. So I'm not very enthusiastic about using this technique in practice.

The rest of the post looks very informative though, although I haven't completely read it yet.


> So it looks like the code is terrible, and they only give an example for a simple use case.

Yes, I'm very skeptical about this being "the future of coding!" (quoting the OP). These tools typically get stuck as soon as one needs to deal with non-trivial loops or recursive code, because it's not clear how to "synthesize" the correct loop invariant or inductive hypothesis; in particular, this cannot be done "compositionally" or "step-by-step" as in the simple case OP demonstrates. Typically, a developer has to step in and provide the right invariant so that the tool can get unstuck.

Proof synthesis has exactly the same issue, so generating proof-carrying code is also something that cannot thus far be automated except in very simple cases.

Having said that, separation logic can be a valuable low-level tool and has been used to assess the correctness of, e.g., Rust's borrow checking, which aims to provide lightweight, composable guarantees that can scale up to encompass larger programs. Even there, though, we're nowhere near an actual proof, and soundness issues have in fact cropped up in the past as new features got added.


>> These tools typically get stuck as soon as one needs to deal with non-trivial loops or recursive code, because it's not clear how to "synthesize" the correct loop invariant or inductive hypothesis; in particular, this cannot be done "compositionally" or "step-by-step" as in the simple case OP demonstrates.

By "inductive hypothesis" do you mean the inductive case in a recursion?

If so, I'm curious: why do you say this? My experience is that the problem with learning recursion is constructing the "base case", particularly when there is no example of it. Do you have specific examples where the inductive case is hard to find when learning from specifications?


Any tool is welcome in scenarios where there is no alternative but to keep writing C code, but for everything else, the best option is not to write C to start with.

In any case, very interesting paper. However, I feel this kind of tooling can only be enforced in scenarios where code certification is already taking place, e.g. MISRA and AUTOSAR.


I can't comment on whether or not this will definitely be the future of programming, though I do think it's interesting.

The more interesting thing to me is just the idea of separation logic. The functional programming community has this idea that functional programming is the only sound programming model from a logical perspective, because it is very historically rooted in math and logic. Turing machines, by contrast, are much more 'made-up' constructions - they have weird, non-mathematical sounding components like heads and tapes, etc.

But, for whatever reason, we can produce extremely fast and small processors that pretty closely resemble Turing machines, and C-style imperative, memory-based code flows naturally from that. And, with separation logic, it can also be firmly rooted in math and logic.

That was always my feeling with the functional programming zealots, like they were just ignoring other possible universes. The lambda calculus is one formally sound model of computation, sure, but surely it's not the only one.


This is all exciting and relevant progress, but the technique does not seem to converge towards a real solution in the long run.

We developed higher-level languages like C to specify and generate machine code more easily and with fewer errors. Then it turned out that with the higher-level language it's easy to make mistakes too, so let's put another layer of specification language on top to avoid that (e.g. let's do Java). Eventually it will turn out that it's not easy to avoid mistakes in that language as well, so we put another layer on top, and again and again.

The gist is that computation is not an easy task and we can't avoid the halting problem, but still it is very good to have tools available that make negotiating through bug space easier.


>> Eventually it will turn out that it's not easy to avoid mistakes in that language as well, so we put another layer on top, and again and again.

That is a risk: if the specification is the entire program written in the specification language, then the same opportunities to make mistakes present themselves when writing the specification, as when writing the program by hand.

In this case though the specification language has the advantage of being purely declarative: it's only a statement of pre- and post-conditions without any assumptions about implementation. Such a specification leaves little room for mistakes.


Are you aware that a huge percentage of accepted math proofs have errors in them? https://lamport.azurewebsites.net/tla/proof-statistics.html.

The worm at the core of all formal verification is that we mistakenly believe that a result is absolutely correct because formal logic was employed. Logic can be subtly wrong, even though the argument appears correct. The very act of using logic does not imply actual correctness.

This is not a dismissal of the field - just a warning. There are many mistakes that can still be made when formally specifying and verifying. To believe otherwise is pure hubris.


Sorry for not replying. I don't feel comfortable replying to comments that begin with turns of phrase like "are you aware", as in your comment or "do you realise" etc. I find this style of conversation coercive and adversarial and it puts my back up, and I don't want to have internet conversations where I'm made to feel that way.


The example they use to show how bad C code is and how complicated pointers can be: that's just bad C, really bad. And how do I know if he removed a line? You could ask the same question about code in any language.


The example is called listcopy, so assuming it does that, I don't think there are many modern languages where you could remove a line and have other (experienced) programmers not pick up on that. Unless that line had nothing to do with the actual task, but that's not the case here.

Agreed it is bad code though. A good C example and then synthesised code of the same task would have been better imho.


> The example is called listcopy, so assuming it does that, I don't think there are many modern languages where you could remove a line and have other (experienced) programmers not pick up on that.

Thing is, if the hypothesis is that the complexities of pointers make C harder to read than other languages, this example does not prove that, because its use of pointer features is completely gratuitous.

Below I attempted to write a saner version of the same C function. I removed the pointer arithmetic and casts in favor of just using a struct, not because I wanted to avoid those features for demonstration purposes, but because that's what any sane C code would do. I also changed it from accepting a pointer-to-pointer, where the pointee is originally the input value and is then replaced with the output value, to just taking the input as an argument and returning the output – again, because that's the simple and obvious choice.

As it happens, the resulting code's use of pointers is limited enough that it could be directly transliterated to Java or any other high-level imperative language. Change the `malloc` to a `new`, fix up the syntax, and you're done. Therefore, whatever lack of clarity remains, it can't be due to anything peculiar to C. (I'd say at this point that the code is clear enough that most people would notice if you removed a line, but YMMV. To be fair, my version also shares some flaws with the original such as not checking for allocation failure.)

That said, there are plenty of algorithms which are complex or subtle enough that even in a well-written implementation, someone might not notice a line being removed. But that can happen in any language, at least to some extent.

    struct list {
        int value;
        struct list *next;
    };

    struct list *
    listcopy(struct list *list) {
        if (list != NULL) {
            struct list *copied = malloc(sizeof(*list));
            copied->value = list->value;
            copied->next = listcopy(list->next);
            return copied;
        } else {
            return NULL;
        }
    }


Looks great. One question though (and this isn't criticism of the research, but just about the presentation/intended audience): why would anyone who's been so 'stubborn' (as they say) so far about sticking with C finally have a change of heart because of this? They clearly prefer to write C themselves, otherwise they'd have already moved on, right? It's not like having a C++-to-C or Rust-to-C compiler would've changed the game for them.


(not one of the researchers, but I submitted this because I like the research)

I have not tried Rust for a while so this might have changed, but on the tiny embedded devices I work with, where your 'free' memory is for instance 24 KB and the MCU runs at a few MHz, there really is only C/asm. You need to mess around with every single byte to make it fit and run fast enough. There are surprisingly many of these MCUs all around us. And you (usually) cannot update their firmware remotely (or at all, actually), so it's important that it doesn't have massive bugs.


People say this all the time, and maybe it's just my naivete, but I just don't understand why at least something like C++ is not an option in embedded spaces. Sure you might need to avoid operator new() and exceptions or whatever, but you still have so much more power than in plain C even with half the language removed, and avoid so many of the C pitfalls just by that. I know people kept telling me you "need" C for a kernel... and I just ignored them and wrote drivers and even a basic from-scratch kernel in C++, and they all worked just fine, and with so much less pain than C. And honestly, if C++ somehow (I don't understand how, but let's say somehow) has too much overhead... why would another language compiled to C be any better? It just doesn't make sense to me.


C++ is no issue, but to make very small binaries packed with the features we need, you need a lot of fiddling, and in the end it does not look like C++. Not to mention that most embedded devs (at least in Asian, African and EU embedded corps which make deployments of millions of MCUs, not home Arduino users: there might be more of those) work mostly with C. I think most are waiting to skip a generation to something like Rust, but that does not fit yet. So this solution at least gives some handles for adding verification.

That said, there are more of these projects. I think there is an F* project too.

> why would another language compiled to C be any better? It just doesn't make sense to me.

It seems that the formalization in the article allows for the very low-level operations you would use in C, without writing C. As said, sure, you can do that in C++, but then you gain little or nothing. I have not tried this; I only posted it because I would like for something robust like it to exist.

Now we use Coq (or, more recently, Idris) to write things and hand-translate them to MCU asm/C.


You can use C++ on even an Arduino. In fact, Arduino development is full C++. You just have to stick to the non-C-lib features.


Messing around with every single byte is a property of low-level languages in general. There is absolutely nothing C does differently from C++ or Rust. Hell, it may even be slower, since the lack of proper abstraction powers leads programmers to lazily choose the easier route with e.g. linked lists. Also, due to Rust's semantics, it can get better optimizations thanks to the absence of pointer aliasing.


I will try it again then. What is your experience with it on very small devices? From what I still read online, the binaries would not fit in the MCUs we use. Maybe I am missing something that has changed, or I just missed it.

I would love to see a realistic (not hello world) example on a 32 KB MCU, with a significant number of function points: we do decryption, signing, OTP generation, a display driver, a keyboard, and a lot more. People told me, and I read online, that this would not work well with Rust, but maybe that is outdated, or maybe the code would just end up looking like C? I don't know of any realistic examples. I think that would turn me, and more people, around.


If binary size is that important, then there are even better options than C: Zig.


Well, every $0.01 counts for these components, as orders are in the millions or tens of millions. Everything that can be cut gets cut. And now lead times count as well, since there are shortages and some factories have many wafers of these low-spec parts to sell.

I tried Zig on Linux and it is nice: are there any resource-constrained embedded examples? I will try both Rust and Zig when I get home to see if it could even work.


Not really, there are options available for those that care to look around.

https://www.mikroe.com/mikropascal

https://www.mikroe.com/mikrobasic

Mikroe is just one possible vendor, there are others.


Sure, you can: Forth works too. But C is the standard (in my world anyway), so it would be C++ or Rust. C++ is already used, but Rust I have not seen yet.


Sure but you worded it as universal condition, hence my remark.

If anything, just to raise awareness for others.

Even C++ alone is already an improvement, provided it isn't just C compiled with a C++ compiler, e.g. AUTOSAR.


Yep, I agree, but we never could make that fit. When it did fit, it was pretty much just C compiled with a C++ compiler. And I have never met an experienced C++ embedded dev in my industries: all are C/asm, because we always use the cheapest (and thus most underpowered) part we can get away with, given the power-consumption demands.


Devil's advocate question for a person who likes this research: What formal definition of a "program synthesizer" would exclude C compilers from being program synthesizers? Similarly, what precisely distinguishes C or any other programming language from a "specification" language?


As someone who has to write C for their day job, I am a little bit offended by the introduction of this article, but I guess that's the point.

The code presented is trash and doesn't follow best practices - using the type system, descriptive variable names, etc. Void pointers for everything, are you serious? Of course it's hard to understand. Revealing that it's the generated code doesn't make me any more comfortable with it.

Yes, I get that you shouldn't care what the generated code looks like if you trust your input, but I don't find the predicate any easier to understand. How do I know that's correct? What if I need to debug this generated output?


I have seen code written like that by humans, namely offshored projects.


I've taken "Goodbye C developers" out of the title above, since it's pretty linkbaity.


> a simple, standard, procedure to copy a linked list:

Using the same argument for the input and output, when making a copy, is idiotic, not standard.

Well, maybe idiotic is standard, but that's another discussion.

Copying linked lists is not a widespread practice in C programs, by the way; if we don't count C programs that are Lisp interpreters. It creates memory handling problems. In this example, the nodes carry data consisting of integer values. These are referenced by pointers. The copy is shallow-copying these pointers, which could be an issue. If the integers are dynamically allocated, who cleans them up? They have to persist as long as either list is still reachable.

The average embedded C programmer's imagination doesn't even extend to the idea that a linked list would ever be copied.

"Like whaddya mean? The list nodes are embedded in the task control blocks ... you'd have to duplicate the actual processes themselves. It makes no sense."

Idiomatic C code uses structs for linked lists, not void * pointers to binary cells consisting of two void * pointers, one of which is actually an int value.

I also don't understand why the example function is casting (void **) r when r already has that type.

Here is something more reasonable, still with the shallow-copy issue of the pointers-to-integer, and no handling of a null out of malloc. The code is very easy to follow:

  struct cons {
    int *car;
    struct cons *cdr;
  };

  struct cons *listcopy(struct cons *list)
  {
    if (list == 0) {
      return list;
    } else {
      struct cons *ncell = malloc(sizeof *ncell);
      ncell->car = list->car;
      ncell->cdr = listcopy(list->cdr);
      return ncell;
    }
  }
If the lists can be long, the stack growth can be a problem; this isn't tail-recursive. You want something iterative. One possibility is a two-pass version:

  struct cons *listcopy(struct cons *list)
  {
    struct cons *copy = 0;
    struct cons *out = 0;
    struct cons *cdr;

    /* walk list, allocating new nodes, pushing
     * them onto copy stack
     */
    for (; list; list = list->cdr) {
      struct cons *ncell = malloc(sizeof *ncell);
      ncell->car = list->car;
      ncell->cdr = copy;
      copy = ncell;
    }
  
    /* Reverse the copy stack in place by popping
     * all nodes and pushing them onto the out
     * stack.
     */
    for (; copy; copy = cdr) {
      cdr = copy->cdr;
      copy->cdr = out;
      out = copy;
    }

    return out;
  }
Here we can separately verify whether the copy stack is correctly produced, and then correctly reversed into out.

The average seasoned C pro will probably jump to this sort of solution, but not use any cons/car/cdr terminology he or she never heard of:

  struct cons *listcopy(struct cons *list)
  {
    struct cons *copy = 0;
    struct cons **pcdr = &copy;

    for (; list; list = list->cdr) {
      struct cons *ncell = malloc(sizeof *ncell);
      ncell->car = list->car;
      ncell->cdr = 0;
      *pcdr = ncell;
      pcdr = &ncell->cdr;
    } 
  
    return copy;
  }
Using a dummy node on the stack to serve as the list parent, we can notationally eliminate the ** pointer:

  struct cons *listcopy(struct cons *list)
  { 
    struct cons dummy = { 0 }, *tail = &dummy;
  
    for (; list; list = list->cdr) {
      struct cons *ncell = malloc(sizeof *ncell);
      ncell->car = list->car;
      ncell->cdr = 0;
      tail->cdr = ncell;
      tail = ncell;
    } 
  
    return dummy.cdr;
  }


I don't get it. The "spatial predicate" spec given in the post is just code: at another level, but still just code. You just move the complexity one level up. Sure, for some scenarios writing the spec is way easier than writing plain code; but for other scenarios, the spec will be as complex as the problem requires... and so we are back at the same problem: how to write correct specs (or code).

We already tried this in the past:

machine code -> assembly -> C -> Haskell -> ...


That's very nice, but what I always want to see in program synthesis work is a proof of soundness and completeness: does the synthesiser provably produce only correct programs, and can it produce every correct program?

The abstract of the (POPL) paper claims that "the derived programs are ... correct by construction" but what about the transformations to the specification, particularly the post-conditions? That's a bit scary. When and how can post-conditions change?


Or you could, you know, use Rust, Go, C#, Java, or some other modern language.

C does need to die, but until fairly recently there were not many viable replacements. D isn't bad but isn't "better enough."


Java and C# are not capable of programming hardware in the sense that C and C++ are. There's nothing "non-modern" about C or C++, and LLVM was written in such a language that compiles to literal processor instructions. The JVM and .NET runtimes were also written in a lower-level language with semantics for memory access via pointers. Despite all the hullabaloo about dangers and footguns, embedded dev, HALs, device drivers, OS schedulers and sundry are all written in C or C++. I'm not even bringing up garbage collection, but it's also a whole big dealbreaker for any low-level systems programming. One cannot use malloc inside boot code or inside a low-latency system call, etc.

It turns out that Rust has gained some attention and is capable of operating in this space. The inertia of Linux kernel code needing to interlink, and being largely in C, is one reason why you don't see random bits of Rust in Linux, but this is not to preclude designing low-level systems in Rust.


You can’t write high performance code in Java/C#. That’s why the JVM is not written in Java.


You can in Rust.


As you state the problem, this is the wrong solution. All programs have bugs, all compilers have bugs. Bits randomly flip in memory, disk, etc. Stop blaming the wrong things.

You ran a program, you gave it all your resources, it got confused, and ran amok. It's YOUR fault.

You shouldn't have given it all your resources. You should have only given it access to the minimum set of resources required to do the job.

Now, generating provable correct code does have its merits, the article should have started with those.


Nope. CompCert (a C compiler) has zero bugs. It is formally proven correct down to assembler level.


I just cringe at the thought of making that bold of a claim on anything computer related, let alone something as complex as a compiler.

So, as long as the specifications, and CPU documentation are correct, it will work as intended, a high percentage of the time. (Random bit flips, disk errors, etc. are still a thing)


I highly recommend learning more about what formal software verification actually is and what it guarantees and what it doesn’t.



