Yes, but there are a hundred megabytes of system behind that: tens of thousands of lines of Starlark code, and tens (maybe hundreds) of thousands of lines of Java code besides.
A lot of work goes into those 4 lines. This is evidenced by a simple question: how do you properly set up a custom compiler for that code? The answer is a couple hundred lines of Starlark code, for what in other build systems is effectively `CC=clang`. (To be fair, Bazel allows this, but it does not support or encourage it, especially for anything beyond toy builds.)
I like Bazel (a lot!) but you are oversimplifying it here.
Provided the abstraction doesn't leak I really don't care, in the same way that end users of Microsoft Word don't care about the algorithms that kern their text, about how DirectWrite does subpixel rendering, or anything else.
Yes, it's all fiercely large and complex. The original user is pointing out that this is all hidden for simple projects, waiting in the wings for when you need it, and therefore doesn't matter.
Yes, but that minimizes the complexity of the cliff. The original author they were responding to was describing the static/dynamic discrepancy and how it requires a lot of work.
The author I was replying to points out how simple the syntax is. But that misses the point. I provided a direct counterexample:
Adding a new compiler to Bazel is very complicated, and by "new compiler" I include something as simple as asserting "all of my C++ code is C++17" at the project level. Done the correct way (not one of the many hacky, failure-prone ways), that is a 100+ line toolchain definition, because C++17 should be treated as an entirely new compiler from Bazel's perspective; treating it as a command-line argument gets you into a bad place where you hit weird errors.
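For concreteness, the command-line-argument shortcut being warned against here is a one-liner in `.bazelrc` (`--cxxopt` is a real flag, but this is exactly the fragile route):

```
# .bazelrc -- the quick route: force -std=c++17 onto every C++ compile.
# Fragile: C++17 is not modeled as a distinct toolchain, so external
# deps, mixed toolchains, and feature detection can all go wrong.
build --cxxopt=-std=c++17
```

The "correct" route is a full `cc_toolchain`/toolchain-config definition, which is where the 100+ lines come from.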
To draw an analogy, it's like saying that C++ code like:
some_matrix[i, j];
is great syntax and cross-platform! When in reality it involves (in user-space code) at least three complex classes, overloading the comma operator(!), implicit casts from builtin types, and probably hundreds of lines of template code (if you want any sort of extensibility). True, it is cross-platform and great syntax. But it obfuscates the amount of code and understanding required to do anything beyond what the most basic syntax allows, for example extending the system in any way.
Simple things like this are simple in all C/C++ build systems, the question is how the complex things are handled (multi-platform support, detecting and selecting different versions of the same compiler, cross-compiling, IDE support, multi-stage builds where source files are generated in the build which need to be inputs to later build stages, custom build steps, or bootstrapping tools which need to be used as compilers or code-generators for later build stages, etc etc etc...)
OP was suggesting that systems like Bazel are harder to get started with but pay off with larger, more complex builds. However, I am claiming that Bazel is also good for the simple cases.
The issue is that declarative build systems ALWAYS need an escape hatch for the exceptions--and the exceptions grow over time.
We've been here before. It was called Make. And then it was called Ant. And then Maven. And Google still built their own system. And they will build another again.
Nobody ever learns the lesson that a build system is a program--period. Sorry to burst your bubble: there will be circular dependencies--you will have to deal with it. There will be two different C compilers on two different architectures. You will have to deal with it. Libraries will be in the wrong place--you will have to deal with it. These libraries come from the system but these libraries come from an artifact--you will have to deal with it.
The only way to deal with it is to have a genuine programming language underneath.
> The only way to deal with it is to have a genuine programming language underneath.
I disagree. Having a full programming language is great for flexibility, but it causes lots of issues at scale. The restrictions in Bazel are critical for good performance on a large codebase, and for keeping the codebase maintainable.
With restrictions, we can provide strong guarantees (e.g. determinism, hermeticity), get better tooling, we can query the graph (without running the tools), we can track all accesses to files, etc. We can also make large-scale changes across the codebase.
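A toy illustration of the "query the graph without running the tools" point (not Bazel's implementation, just the idea, with hypothetical labels): because every target declares its inputs and dependencies up front, transitive and reverse-dependency queries are pure graph walks.

```python
# Toy model of a Bazel-style target graph (labels hypothetical).
graph = {
    "//app:main": ["//lib:foo", "//lib:bar"],
    "//lib:foo": ["//lib:base"],
    "//lib:bar": ["//lib:base"],
    "//lib:base": [],
}

def deps(target, g):
    """Transitive dependencies, like `bazel query 'deps(//app:main)'`."""
    seen, stack = set(), list(g[target])
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(g[t])
    return seen

def rdeps(target, g):
    """Reverse dependencies, like `bazel query 'rdeps(//..., //lib:base)'`."""
    return {t for t in g if target in deps(t, g)}

print(sorted(deps("//app:main", graph)))   # everything main pulls in
print(sorted(rdeps("//lib:base", graph)))  # everything a base change affects
```

That second query is what makes large-scale changes tractable: no compiler ever runs, yet you know exactly which targets are affected.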
Note also that the Bazel language is not truly declarative, but it encourages a separation between the declarations (BUILD files) and the logic (.bzl files).
> The restrictions in Bazel are critical for good performance on a large codebase, and for keeping the codebase maintainable.
Note the vocabulary--"restrictions". Your build system isn't solving a technical problem--it's solving a political one and trying to cloak it in technical terms.
We already have a problem. Your build system is now IN THE WAY if I'm not at scale. Any build system that makes people take notice of it is an a priori failure.
Thanks for posting this though. I've had a nagging irritation with so many of these "scalable" things, and this is the first time it has really coalesced that "scale" is almost always intertwined with "political control".
You say that like bazel hasn't already been proven out on a few billion LoC :). You're quite right that exceptions will exist - your job is to either fix them or squeeze them into the build system. Both are well trod paths. You're quite right that these edge cases aren't _easy_ - but in what system would they be?
> there will be circular dependencies--you will have to deal with it.
Add them both to the same compilation unit (cc_library, whatever). Or extract an ABI (e.g. Turbine) and compile against that.
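A minimal sketch of the first option, with hypothetical target and file names: if `a.cc` and `b.cc` include each other's headers, fold them into one compilation unit instead of two mutually dependent targets.

```python
# BUILD sketch (names hypothetical): one cc_library instead of a cycle.
cc_library(
    name = "ab",
    srcs = ["a.cc", "b.cc"],   # the mutually dependent sources
    hdrs = ["a.h", "b.h"],     # both headers exported together
)
```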
> There will be two different C compilers on two different architectures. You will have to deal with it.
Poke around https://github.com/tensorflow/tensorflow/tree/master/third_p... . I see arm and x86, Windows and many variants of Linux, multiple versions of GCC, multiple versions of clang, multiple versions of cuda, python 2 and 3, and names I don't even recognize.
> Libraries will be in the wrong place--you will have to deal with it.
Just write the path to the file, or write a build rule that makes it visible in a standard location, or make it part of the platform definition, or use something like https://docs.bazel.build/versions/master/be/workspace.html#b... to alias it into your workspace. (Not recommended, but supported.)
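One hedged sketch of the "alias it into your workspace" option, using Bazel's `new_local_repository` workspace rule (paths and names hypothetical):

```python
# WORKSPACE sketch: expose a library installed outside the workspace
# as if it were a normal Bazel package.
new_local_repository(
    name = "vendor_ssl",
    path = "/opt/vendor",          # the "wrong place"
    build_file_content = """
cc_library(
    name = "ssl",
    srcs = ["lib/libssl.so"],
    hdrs = glob(["include/**/*.h"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)
""",
)
```

Targets can then depend on `@vendor_ssl//:ssl` like any in-tree library.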
> These libraries come from the system but these libraries come from an artifact--you will have to deal with it.
I won't pretend bazel can express _everything_, but there's little you can't hack in _somehow_ with sufficient motivation, and moving towards the bazel-recommended patterns brings growing peace-of-mind (and faster builds).
(Disclaimer: Googler, work closely with the bazel team. I like bazel).
I love this example, but more because it shows just how obtuse make is even for simple programs. There's one self-explanatory line in this whole program (`SRCS = foo.cpp bar.cpp baz.cpp`), and everything else is magic you'd just have to cargo-cult forward.
I _think_ that it's also broken, if I copy-paste it: your comment renders with spaces in the `clean :` stanza, but I believe make requires that to be a tab character?
While certainly simplistic, the bazel example shows one obscure feature (glob) that's both still fairly obvious, and unnecessary for a direct comparison to your example. The rest reads clean, and could be replicated/modified by someone with no clue fairly straightforwardly.
Don't get me wrong, bazel BUILD files will often hide a whole lot of magic. But the benefit is, for a newcomer the files they have to interact with on a day-to-day basis are friendly. Take tensorflow, one of the most complicated bazel repos I know - browse to a random leaf BUILD file and you'll probably find it quite readable. (I randomly clicked to https://github.com/tensorflow/tensorflow/blob/master/tensorf... , seems pretty clear - especially when I consider it's compiling kernels for GPUs, which I know nothing about.)
(Disclaimer: Googler, work closely with the bazel team. I like bazel.)
The `SRCS = foo.cpp bar.cpp baz.cpp` was my fav part. I like to put every file used into a Makefile instead of using globs, but my example also showed how without editing the file someone can build with globs instead ;) Similar approach if you need the program to be named winApp.exe instead... That's another 'beauty' of make, you don't have to edit the Makefile.
There's really not much cargo cult in my example. Everything would be in the most rudimentary doc or tutorial, certainly less reading than if you needed to use git for the first time.
And yes there is supposed to be a tab there, some copy-paste thing munged that.
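For anyone copy-pasting a Makefile: the recipe line under a target must begin with a literal tab character, or make fails with a "missing separator" error. Variable names here are illustrative:

```make
# The indented line below starts with a hard TAB, not spaces.
clean:
	rm -f $(OBJS) app
```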
What does POSIX compliant even mean for make? The POSIX standard doesn't even require support for C++! That's a VERY portable Makefile; it would work with any make from the late seventies (assuming you also had a C++ compiler and system makefiles that knew how to use it). It only uses a small subset of the rules in the standard.
Are you kidding? I have no clue what half of those lines are doing. And only any idea about the other half because I've come across make before. I also highly doubt that this will run on windows, whereas the Bazel build probably will.
Although I'd strongly suggest not doing that: Bazel works better if you have one library per header (or per source file for non-C languages). It helps to have tools to manage the deps for you, though.
You could have a cc_binary for your main foo_main.cc file that depends on cc_library targets separately specified for each bar_lib.h/bar_lib.cc.
That makes the graph more granular, e.g. if you just update the foo_main.cc you don't need to recompile the libraries. Or you can reuse bar_lib in a different binary.
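Using the file names from the comment above, that layout might look like this in a BUILD file (a sketch, not the commenter's actual code):

```python
cc_library(
    name = "bar_lib",
    srcs = ["bar_lib.cc"],
    hdrs = ["bar_lib.h"],
)

cc_binary(
    name = "app",
    srcs = ["foo_main.cc"],
    deps = [":bar_lib"],   # editing foo_main.cc relinks app but
                           # does not recompile bar_lib
)
```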
> if you just update the foo_main.cc you don't need to recompile the libraries.
Are you sure this is required? I just tried with a test repo and Bazel only performed two actions after I updated foo_main.cpp (compile foo_main.cpp and link app.)
I just tested this and I think you are incorrect. Bazel will appear as if it's rebuilding all of them (it lists a dozen files), but it's really just checking the caches for them. Try running `bazel build -s` and watch what commands it actually runs.
Note that adding and removing files (including through a glob) does always cause a rebuild though (because it's a change of the rule). This is a deficiency of bazel.
That's not correct, or at least it's not always correct. C may have some special support, but other languages don't.
The other place this is obvious is with tests. If you have a unit test per source file, a change to any source in a coarse library reruns all of that library's tests; splitting the library reduces this.
Then that is a problem with those languages' implementations. Bazel allows for it; it is on those language implementations to provide support. Notably, I am not aware of many languages that easily expose their dependency graphs (especially without third-party tools). To me this seems more of an issue with the languages in question than with Bazel.
But again, bazel doesn't want you to do that. The more information bazel has, the more it can do for you. If you stick to 1-lib-per-source-file, dead code elimination, across your entire codebase, however large it is, can be done by a single reverse dependency query across every language, even those that don't have compilers.
In other words, correctly using bazel gives you access to all the cool magic linkers do without having to do the work to implement it. And it adds value even for the ones that do (like C++, again: you'll run fewer tests).
I mean sure, I'm not necessarily disagreeing that many people could do with fewer source files per library. But I also think you are pathologically misusing the tool along the lines of creating an NPM library for the left-pad function.
Bazel has other tools for mitigating excessive test running (like test size/time descriptions and parallel test running); running too many tests has never been a problem I have encountered, even with my dozen-source-file Bazel libraries. Bazel also has smart test runners that can split up the tests in a test binary and run them in parallel, and I don't have to write a dozen lines for every C++ library.
> I mean sure, I'm not necessarily disagreeing that many people could do with fewer source files per library. But I also think you are pathologically misusing the tool along the lines of creating an NPM library for the left-pad function.
I'm literally quoting Google's best practices for using bazel/blaze.
> Bazel has other tools for mitigating excessive test running (like test size/time descriptions and parallel test running); running too many tests has never been a problem I have encountered, even with my dozen-source-file Bazel libraries. Bazel also has smart test runners that can split up the tests in a test binary and run them in parallel, and I don't have to write a dozen lines for every C++ library.
Right, but here's the key thing: You're still running the test. My way, you just don't, and you lose nothing. You use less CPU, and less time.
I mean the best practices on the bazel website include:
> To use fine-grained dependencies to allow parallelism and incrementality.
> To keep dependencies well-encapsulated.
If I need five source files in a library to keep it well encapsulated, I'm doing that instead of making five libraries that are a rat's nest of inter-dependencies. And repeating all the headers, deps, and specific command-line arguments would be unreadable.
Then you aren't keeping your dependencies fine grained.
> If I need five source files in a library to keep it well encapsulated, I'm doing that instead of making five libraries that are a rat's nest of inter-dependencies.
If you can't form your dependency tree into a DAG, you have larger design issues. This is yet another thing that bazel does a good job of uncovering. Libraries with cyclic dependencies aren't well encapsulated, and you should refactor to remove that.
I recognize that at a small scale this doesn't matter. But to be frank, there are parts of my job that I literally would be unable to accomplish if my coworkers did what you suggest.
Sure, I'll just go ahead and stick my flyweight class and its provider class in two different libraries, and have one of them expose their shared internal header that is not meant for public usage. That's not going to cause any issues at all (sarcasm).
One source file per library and one source file per object is an example of two policies that will conflict here (and I'm not trading code organization for build organization when I can just as easily use the features intended for this situation).
Meanwhile in the real world limiting the accessibility of header files not meant for public use prevents people from depending on things they shouldn't. Organizing libraries on abstraction boundaries regardless of code organization allows for more flexible organization of code (e.g. for readability and documentation). And so on.
This is why these features exist, and why Google projects like Tensorflow and Skia don't follow the practices you are espousing here pathologically.
> But to be frank, there are parts of my job that I literally would be unable to accomplish if my coworkers did what you suggest.
Then you are incompetent and bad at your job. To be blunt, I would recommend firing an engineer who pathologically misused a build tool in ways that encourage hard-to-read, hard-to-document code, while also making the build bloated and more complicated, all in the name of barely existent (and not at all relevant) performance improvements, and then said they couldn't do their job unless everyone conformed to that myopic build pattern.
It's like what, an extra 4 characters to get the list of code files in a library from a bazel query? What in the world could you possibly be doing that having to iterate over five files rather than one makes your job impossible.
Bazel supports visibility declarations: your private internal provider can be marked package-private, or even library-private.
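For example (a sketch with hypothetical names): listing the internal header in `srcs` keeps it out of the library's public interface, and `visibility` limits who may depend on the target at all.

```python
cc_library(
    name = "provider",
    srcs = [
        "provider.cc",
        "shared_internal.h",   # in srcs, so dependents cannot include it
    ],
    hdrs = ["provider.h"],     # the only public header
    visibility = ["//myproject/flyweight:__pkg__"],  # package-private
)
```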
> Tensorflow
Tensorflow is a perennial special case at Google. It's a great tool, but its consistent disregard for internal development practices is costly, a cost I've had to pay personally before.
> Then you are incompetent and bad at your job
No, I just don't have the time or interest to reimplement language analysis tools when I don't need to.
But it doesn't actually have that. It has that for one very specific use case, sometimes. It's very much not generic or intentionally built into Bazel, and I am almost certain that there are more complex cases where that caching will break down (for example when the various deps import each other, or, as mentioned, tests). Especially when it's easier to have tooling automatically manage build files for you, so you don't even have to do it by hand!
> But it doesn't actually have that. It has that for one very specific use case, sometimes.
I mean, not really; it's there enough that for all practical purposes it effectively exists. I would be extremely surprised if that wasn't intentional, when every other C++ build tool makes use of it extensively. I would have laughed Bazel out of the room if it didn't make use of the fact that nearly all C++ compilers emit nicely formatted lists of the files a given file depends on.
Also it does work for tests perfectly fine.
> Especially when it's easier to have tooling automatically manage build files for you, so you don't even have to do it by hand!
And this managing build file tooling is public? Because I am not at all aware of any automatic tools for bazel along those lines.
Which really is my frustration with Bazel's Google developers: their views are eternally myopic about how other people use their tools (e.g. using C++17 properly means rewriting the whole toolchain from scratch, have fun!), yet those views are based on a closed and mostly unpublished ecosystem.
Let alone how terribly Bazel handles dynamic libraries on Windows (each cc_library then outputs a DLL! yay!).
I don't think you realize how for anyone outside of Google bazel's tagline might as well be "The best terrible option".
It does, this is testable a number of ways. The funniest one is to do a `#include "foo.bar"` (and make sure the file exists and is in `srcs`!) and watch bazel freak out with "I have no clue what kind of file that is, how do I analyze that!"
Something to be aware of, though, is that Bazel does do some stupid things around this space, like the fact that adding and removing files causes the build rule to change (if using a glob), which will trigger a full rebuild of the library.