Beware of the Train

Falsehoods programmers believe about build systems [Dec. 6th, 2012|09:45 pm]

Inspired by Falsehoods Programmers Believe About Names, Falsehoods Programmers Believe About Time, and far, far too much time spent fighting autotools. Thanks to Aaron Crane, totherme and zeecat for their comments on earlier versions.

It is accepted by all decent people that Make sucks and needs to die, and that autotools needs to be shot, decapitated, staked through the heart and finally buried at a crossroads at midnight in a coffin full of millet. Hence, there are approximately a million and seven tools that aim to replace Make and/or autotools. Unfortunately, all of the Make-replacements I am aware of copy one or more of Make's mistakes, and many of them make new and exciting mistakes of their own.

I want to see an end to Make in my lifetime. As a service to the Make-replacement community, therefore, I present the following list of tempting but incorrect assumptions various build tools make about building software.

All of the following are wrong:
  1. Build graphs are trees.
  2. Build graphs are acyclic.
  3. Every build step updates at most one file.
  4. Every build step updates at least one file.
  5. Compilers will always modify the timestamps on every file they are expected to output.
  6. It's possible to tell the compiler which file to write its output to.
  7. It's possible to tell the compiler which directory to write its output to.
  8. It's possible to predict in advance which files the compiler will update.
  9. It's possible to narrow down the set of possibly-updated files to a small hand-enumerated set.
  10. It's possible to determine the dependencies of a target without building it.
  11. Targets do not depend on the rules used to build them.
  12. Targets depend on every rule in the whole build system.
  13. Detecting changes via file hashes is always the right thing.
  14. Detecting changes via file hashes is never the right thing.
  15. Nobody will ever want to rebuild a subset of the available dirty targets.
  16. People will only want to build software on Linux.
  17. People will only want to build software on a Unix derivative.
  18. Nobody will want to build software on Windows.
  19. People will only want to build software on Windows.
    (Thanks to David MacIver for spotting this omission.)
  20. Nobody will want to build on a system without strace or some equivalent.
  21. stat is slow on modern filesystems.
  22. Non-experts can reliably write portable shell script.
  23. Your build tool is a great opportunity to invent a whole new language.
  24. Said language does not need to be a full-featured programming language.
  25. In particular, said language does not need a module system more sophisticated than #include.
  26. Said language should be based on textual expansion.
  27. Adding an Nth layer of textual expansion will fix the problems of the preceding N-1 layers.
  28. Single-character magic variables are a good idea in a language that most programmers will rarely use.
  29. System libraries and globally-installed tools never change.
  30. Version numbers of system libraries and globally-installed tools only ever increase.
  31. It's totally OK to spend over four hours calculating how much of a 25-minute build you should do.
  32. All the code you will ever need to compile is written in precisely one language.
  33. Everything lives in a single repository.
  34. Files only ever have their timestamps set by a single machine.
  35. Version control systems will always update the timestamp on a file.
  36. Version control systems will never update the timestamp on a file.
  37. Version control systems will never change the time to one earlier than the previous timestamp.
  38. Programmers don't want a system for writing build scripts; they want a system for writing systems that write build scripts.
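
To make items 13, 14 and 35 to 37 a bit more concrete, here is a minimal Python sketch (not taken from any real build tool) of the two usual staleness checks disagreeing with each other. The helper names `changed_by_mtime`, `changed_by_hash` and `demo` are made up for illustration, and the timestamp check here is "differs from what was recorded" rather than Make's "newer than the target".

```python
import hashlib
import os
import tempfile


def file_hash(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def changed_by_mtime(path, recorded_mtime_ns):
    """Timestamp check: dirty if the mtime differs from the recorded one."""
    return os.stat(path).st_mtime_ns != recorded_mtime_ns


def changed_by_hash(path, recorded_hash):
    """Content check: dirty if the contents hash differently."""
    return file_hash(path) != recorded_hash


def demo():
    """Return (mtime_says_dirty, hash_says_dirty) for two awkward cases."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "common.h")
        with open(path, "w") as f:
            f.write("#define N 1\n")
        mtime = os.stat(path).st_mtime_ns
        digest = file_hash(path)

        # Case 1: the file is touched but its contents are unchanged.
        later = mtime + 2 * 10**9
        os.utime(path, ns=(later, later))
        touched = (changed_by_mtime(path, mtime),
                   changed_by_hash(path, digest))

        # Case 2: the contents change, but something (a VCS checkout, say)
        # resets the timestamp to the old value.
        with open(path, "w") as f:
            f.write("#define N 2\n")
        os.utime(path, ns=(mtime, mtime))
        restored = (changed_by_mtime(path, mtime),
                    changed_by_hash(path, digest))
        return touched, restored
```

In the first case the timestamp check asks for a rebuild that the hash check would skip; in the second the timestamp check misses a real change. Neither check says anything about a tool that cares about the timestamp itself, which is one reason hashing is not always the right answer either.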

[Exercise for the reader: which build tools make which assumptions, and which compilers violate them?]


From: ungratefulninja
2012-12-06 10:11 pm (UTC)
- All build systems look like make.
From: ungratefulninja
2012-12-06 10:12 pm (UTC)
And on a related, $DAYJOB-aggravating note:

- Tools always exit with non-zero status on failure.
- Tools never exit with non-zero status on success.
From: gareth_rees
2012-12-06 11:39 pm (UTC)

  • A build always runs on a single computer.
  • There's always a human available to interact with the build system.
From: pozorvlak
2012-12-07 12:13 am (UTC)
Good ones!
(Deleted comment)
From: pozorvlak
2012-12-07 12:15 am (UTC)
I'm inclined to say that the less scripting your build manager has to do the better - though a tool that doesn't require scripting in any situation sounds like an impossibility (and a tool that doesn't allow scripting sounds like it would eventually become painful).
From: senji
2012-12-07 03:14 am (UTC)
  • Build steps are idempotent
  • It can be known in advance how many times a particular build step should be executed
  • Build steps do not modify the repository
  • It is possible to determine whether the build will succeed
  • It is possible to determine whether the build will even halt

Edited at 2012-12-07 03:21 am (UTC)
From: pozorvlak
2012-12-08 11:46 am (UTC)
*smacks forehead* - all good ones.
From: (Anonymous)
2012-12-07 04:12 am (UTC)
What is the point of this?
From: (Anonymous)
2012-12-07 01:31 pm (UTC)
From: (Anonymous)
2012-12-07 09:15 am (UTC)
maven sux.

svn sux.

python sux.

lets agree on this first

From: Jens Timmerman
2012-12-07 10:35 am (UTC)
* Users will only use one specific compiler, library and flags, so you can hardcode them in your build scripts.
* Users will always agree with the location you want to install stuff to.

Shameless plug:
At my current job (HPC system administrator at Ghent University) we have been building a lot of software which makes these assumptions.
So to automate all of that we created a framework, EasyBuild (http://hpcugent.github.com/easybuild/), which is a layer on top of all these build systems that tries to correct their mistakes (being the human who answers questions during the installation, patching makefiles/code to work with different compilers, installing under a prefix and generating module files...) and automates the process of building the software.
This is not useful for programmers, but for end users who want to install the software.

Edited at 2012-12-07 10:36 am (UTC)
From: (Anonymous)
2012-12-10 03:06 am (UTC)
build systems are a symptom of software languages that are not designed to build software systems
From: (Anonymous)
2013-09-11 04:40 am (UTC)
Greate post. Keep writing such kind of info on your blog.
Im really impressed by your blog.
Hello there, You have performed an incredible job. I'll certainly digg it and in my opinion recommend to
my friends. I am confident they will be benefited from this web site.
From: (Anonymous)
2013-10-15 11:38 am (UTC)
Thanks for sharing this great content, I really enjoyed the insign you bring to the topic, awesome stuff!
water systems (http://articlestwo.appspot.com/article/water-filtration-systems)
From: (Anonymous)
2018-11-30 11:56 am (UTC)

I believe in the death penalty for spammers.

I'm otherwise opposed to it.
From: edhorch
2013-10-22 09:05 am (UTC)
Javac is responsible for a lot of these. Why this may or may not be a Good Thing is a debate that could rage for years.

BTW, I'm working now with a horribly botched and broken build system that illustrates many of the listed bad assumptions, even though it's all C/C++ code. Surprisingly few of the problems are the fault of make itself.
From: (Anonymous)
2016-10-22 11:38 pm (UTC)

Fun with java

At a previous job many years ago, someone on the team took the time to write enough code to figure out what files would actually be output by the java compiler. It's really amazing, especially when you get into tricks such as having private classes hidden inside another class. The java compiler can produce some amazingly arbitrary output. It also has a habit of figuring out on its own if it wants to go build something else it happens to notice you need.

Try integrating that with make.
From: edhorch
2013-10-22 09:08 am (UTC)
One more thing: there is a special place in hell for designers of products that can only be built or configured through a GUI.
From: livejournal
2015-07-30 12:24 pm (UTC)

Falsehoods Programmers Believe

User nancylebov referenced to your post from Falsehoods Programmers Believe saying: [...] Build systems. [...]
From: (Anonymous)
2018-01-21 01:35 pm (UTC)
> It's possible to determine the dependencies of a target without building it.

Nixpkgs (a meta build-system) assumes this. Build systems that violate the rule include, oh, Gradle.

Makes Java packaging fun.
From: pozorvlak
2018-01-22 10:21 am (UTC)
Java gleefully violates a lot of these assumptions, because why would you ever want to leave the Java ecosystem or integrate a Java component with code written in other languages?*

* I am in exactly this position at the moment :-(
From: (Anonymous)
2019-01-13 09:21 pm (UTC)

cycle and trees

I know this is 5+ years old, but I'm hoping you're still around to reply :)

I was wondering what

1. Build graphs are trees.
2. Build graphs are acyclic.

these 2 statements mean. How can a build system not detect cycles in dependencies? Wouldn't that cause a build loop?

Do you have examples of where this is wrong?
From: pozorvlak
2019-01-14 11:22 am (UTC)

Re: cycle and trees

Hi! Yep, I'm still around, though I don't post much these days :-(

To get a build graph that isn't a tree, you just need a diamond:
  • A depends on B and C
  • B depends on D
  • C depends on D
For instance,
  • a.out depends on foo.o and bar.o
  • foo.o depends on common.h
  • bar.o depends on common.h
Note that this is a diamond whether we consider build graphs top-down (starting from the ultimate target to be built) or bottom-up (starting from the dirty files).

I can't, off the top of my head, think of a build tool that makes this error (recursive Make, possibly?), but I've certainly seen it among people writing about build tools. The downside of making this error would be unnecessary rebuilds of clean targets, and possibly race conditions.
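
As a rough illustration of that downside, here is a small Python sketch using the example graph above. It is hypothetical rather than modelled on any real tool: `visit_as_tree` and `visit_as_dag` are names I've made up, and "visiting" a node stands in for whatever work the tool does on it (checking, rebuilding, scheduling).

```python
# The diamond from the example: both object files depend on common.h.
DEPS = {
    "a.out": ["foo.o", "bar.o"],
    "foo.o": ["common.h"],
    "bar.o": ["common.h"],
    "common.h": [],
}


def visit_as_tree(target, deps, log):
    """Walk dependencies as if the graph were a tree: no memoisation."""
    for dep in deps[target]:
        visit_as_tree(dep, deps, log)
    log.append(target)


def visit_as_dag(target, deps, log, done=None):
    """Walk dependencies as a DAG: each node is processed at most once."""
    if done is None:
        done = set()
    if target in done:
        return
    for dep in deps[target]:
        visit_as_dag(dep, deps, log, done)
    done.add(target)
    log.append(target)


tree_log, dag_log = [], []
visit_as_tree("a.out", DEPS, tree_log)
visit_as_dag("a.out", DEPS, dag_log)
print(tree_log)  # common.h appears twice
print(dag_log)   # common.h appears once
```

With one diamond the duplicated work is trivial, but stacked diamonds multiply it, and if the duplicated visits run in parallel you get the race conditions mentioned above.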

Cycles are rarer, but as so often with build-system weirdness, LaTeX has us covered. Suppose `paper.tex` contains cross-references ("see Equation 3.2.4 on p123"). You want to generate a compiled document, `paper.pdf`. The command to do that is `pdflatex paper.tex`, which reads `paper.tex` and `paper.aux`, and updates `paper.pdf`; but if the target of a cross-reference has changed, it will also update `paper.aux`. So you have to repeatedly run `pdflatex` until `paper.pdf` and `paper.aux` stop changing. But this is not guaranteed to happen! I believe that it is possible to construct a pathological document in which the new width of a cross-reference pushes the target to a different page, which updates the .aux file again, which causes a non-convergent build cycle; however, I can't find a reference for this right now :-(

Build cycles that provably reach a fixed point if run enough times are merely infuriating; build cycles that are not guaranteed to terminate are worse; build cycles where you couldn't even get started (which I think is what you're asking about?) would be worst of all, but fortunately I'm not aware of any examples of that - since software does ultimately get built, I think we can rule it out as a case we need to handle. My point was really that if you assume or enforce acyclicity (which seems like a harmless safety measure), then your tool will be unable to correctly handle builds that rely on rerunning a compiler until a fixpoint is reached, like LaTeX.

[How do existing build tools handle this? I think they either handle the fixpoint calculation properly, and accept the possibility of infinite loops, or run the compiler a bounded (or worse, fixed) number of times.]
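
For what it's worth, the bounded version of that strategy might look something like the Python sketch below. This is my own illustration, not how any particular tool does it: `run_until_stable` and `max_runs` are invented names, and `step` stands in for one compiler run (for LaTeX it could run `pdflatex paper.tex` and return the new contents of `paper.aux`; here it's a plain function so the sketch runs without LaTeX installed).

```python
def run_until_stable(step, state, max_runs=5):
    """Rerun `step` until its output stops changing, or give up.

    `step` takes the previous auxiliary data and returns the new
    auxiliary data. Returns (final_state, runs, converged).
    """
    for runs in range(1, max_runs + 1):
        new_state = step(state)
        if new_state == state:
            # Nothing changed on this run: we have reached a fixed point.
            return new_state, runs, True
        state = new_state
    # Bound hit: report non-convergence rather than looping forever.
    return state, max_runs, False


def converging_step(aux):
    """Toy compiler whose output settles after the first run."""
    return b"page=123"


def oscillating_step(aux):
    """Toy compiler whose output flips between two values forever."""
    return b"page=10" if aux == b"page=9" else b"page=9"


print(run_until_stable(converging_step, b""))
print(run_until_stable(oscillating_step, b""))
```

The first call converges on its second run; the second hits the bound and says so. Dropping the bound gives you the "proper" fixpoint calculation, infinite loops and all.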