Redo from scratch - Beware of the Train
pozorvlak

Redo from scratch [Jan. 12th, 2011|08:28 pm]
pozorvlak

I think that djb redo will turn out to be the Git of build systems.

What do I mean by that? Well, think about version control systems before the cryptographic-DAG systems, and especially Git, came along. Everyone (everyone you'd want to work with, at least) agreed that a version control system was an essential part of a self-respecting coder's toolchain, but all the systems available sucked. Managing your VC repository required effort and care, and a lot of coders (myself, I'm ashamed to say, included) just learned the basics because nontrivial version-control tasks just got so complex, so fast, it was usually quicker to go outside the system or even redo your work. Even thinking about what was going on was hard. Large teams sometimes had someone whose entire job was to nursemaid the version control system. Clearly version control was a Serious, Complex Problem. Our best hope was a bunch of guys with serious beards who were developing an impenetrable mathematical formalism for attacking the problem; unfortunately, their code was far too slow, and their formalism at best delayed the onset of head-breaking complexity.

Then Git came along and blew away the old order. Once you've grokked the Git worldview, previously intractable problems dissolve before your eyes. It turns out that version control can actually be simple. There was a bit of kicking and screaming, but the open-source community is well on the way to adopting Git as a de facto standard for version control - removing, as a side-effect, one annoying hurdle to making your first contribution to any given open-source project. There are still holdouts, who use systems which do some things differently to the way Git does them; this is because they haven't yet realised that Git's way is right. Git is, quite simply, the Right Thing.

Now think about build systems. Everyone (who you'd want to work with...) agrees that a build system is an essential part of your toolchain, but all the available systems suck. Make, in particular, has three main problems:
  1. It's Yet Another Goddamn Syntax you have to learn, with its own stupid quoting and whitespace rules.
  2. Recursive make sucks.
  3. Nonrecursive make sucks.
There are numerous make-replacements and make-addons and Makefile generators and they all add complexity, often for little obvious benefit: many of them just fix things that weren't broken in the first place.

I've spent a few hours over the last couple of days reading about and playing with redo, an implementation by Avery Pennarun of a design by the famously uncompromising genius Daniel J. Bernstein. And I've been feeling the same excitement as I felt when I started to understand Git. Redo has an incredibly simple design, but that design makes previously complex problems dissolve. I was initially sad to see that it didn't have fabricate.py's cross-language automatic dependency detection, but then I thought for a few minutes and realised that it would be trivial to add that to your build. If that's what you wanted, of course - core redo (unlike core make) is very, very small, but the community is planning to provide a library of commonly-desired features which you can turn on if (and only if) you want.

So I seriously hope that redo will do what Git did and take over the world, to the benefit of all. But it faces some obstacles.

Firstly, it's closely tied to Unix, both in its implementation and in its use of shell scripts as the default build-script language. From the documentation, it's not even clear if it's been tested on Windows. More recent versions of redo allow you to write do-files in any language that supports the #! convention, so writing cross-platform build scripts won't be that hard, but it will cause additional friction: shell, IMHO, is mostly a good language for this kind of thing. Pennarun talks about linking a POSIX shell directly into redo to provide Windows support, but that kinda misses the point - any nontrivial shell program typically relies on a host of standard Unix utility programs.

Git also suffered from poor Windows support early on, but in Git's case the Unix-centricity was less fundamental.

Secondly, there's the migration problem. It was possible to write importers and exporters allowing conversion between different VCSes, so established projects could move over to Git without losing history. Is it even possible to do that with a build system of any complexity? Redo has been carefully written so it can coexist with make, allowing piecemeal conversion, but that conversion will always, AFAICT, be a manual process. The mailing list is full of success stories about astonishing code savings made by moving to redo, but, again, someone has to understand the existing build system and write the new one. Hopefully the redo community will build up some expertise at handling migrations as time goes on - the redo project is only nine weeks old!

Thirdly, there's the fact that everyone already knows make. One Reddit user asked "What features make up for this not being make(1)?" The clear implication being that make is the standard, everyone already knows it, and redo has to be pretty exceptional for make not to be a better choice.

But does everyone really know make, or do they just kinda know the basics, like everyone used to "know" Subversion? Let me tell you a story. About six years ago, when I was working for a Dilbert-esque systems integrator, I was sent on a training course to learn about the new (and, as it turned out, spectacularly bad) version control system that my team was being ordered to adopt. Everyone in the room was the build manager of their respective project. The instructor asked us "OK, who's used make?" We all put our hands up. "Now, who's written a Makefile from scratch?" Only my hand remained up.

I confidently predict that if the portability and migration problems can be solved, then the community will fall on redo like a pack of starving wolverines.

[I gave a short talk about redo at the January 2011 Glasgow.pm technical meeting. My notes are available here.]

Comments:
From: cyocum.myopenid.com
2011-01-12 08:49 pm (UTC)

Cool!

I wish I could see your talk and have a pint after but I will definitely check out redo.
From: pozorvlak
2011-01-14 03:17 pm (UTC)

Re: Cool!

Notes are up now, for what it's worth - they don't add much to the README, but it was helpful for me to produce them :-)
From: ciphergoth
2011-01-12 08:59 pm (UTC)
The *right* solution to the portability problem is to abandon the shell and write in a better language, like Python...
From: pozorvlak
2011-01-12 09:53 pm (UTC)
Well, yes and no. While Python is preferable to shell for anything remotely complex, it suffers from this problem. Perhaps if Python had an equivalent of Perl's qw// construct...
From: pozorvlak
2011-01-12 09:57 pm (UTC)
I imagine that once redo's internal APIs stabilise someone will write Python/Perl/Ruby modules to make writing do-scripts in Real Scripting Languages saner.
From: pozorvlak
2011-01-13 10:27 am (UTC)
There was an excellent thread about just this question on the redo mailing list this morning. Interestingly, it turns out that since git is largely written in shell, the git team have had to do a lot of the work necessary to provide a drop-in POSIX environment on Windows. Hence a possible solution to the portability problem is to just bundle busybox with redo, and have your do-scripts use that by default.
From: necaris
2011-01-14 12:57 am (UTC)
Looking forward to your notes -- I haven't heard much about redo but it sounds fascinating! I too have written a Makefile from scratch, but only for toy projects, and non-trivial Makefiles terrify me...
From: optimusclimb
2011-01-14 07:36 pm (UTC)

Don't understand why people "like"/want to rely on shell syntax

As a developer that finally has had to learn sh/bash scripting (only recently was able to break free from the Win32 world), I find it ironic that one of the complaints about make is, "It's Yet Another Goddamn Syntax you have to learn, with its own stupid quoting and whitespace rules.", and yet later go on to say, "shell, IMHO, is mostly a good language for this kind of thing."

All I could think while reading my bash book was, "Seriously, I have to know this ridiculous arcane syntax now? Ughhh." It's 2010. If anything, I've noticed a renaissance of newer developers finally breaking past old barriers, creating things like git/hg, GSD quite fast with python and ruby, etc. Languages people are very likely to know or have learned would be python, ruby, the C's, javascript, java, possibly some lisp variation. Why oh why should we clutter newer developers' heads with shell syntax if we can avoid it? It's easier than ever to embed interpreters these days.

Now, being 27, I'd be tempted to at least argue for Perl as being much better than shell, and the cadillac of quick and dirty, but I just don't think newer devs bother with it. So, screw it. Harness py, rb, or js, and if you want to handle the cross platform issue, just build a library of routines that would be commonly used for the tasks at hand. On *nix, perhaps they could just shell out to the appropriate utilities.
From: pozorvlak
2011-01-15 12:34 am (UTC)

Re: Don't understand why people "like"/want to rely on shell syntax

I agree with you about shell syntax: though I've written shell scripts, my "just rewrite the damn thing in Perl" threshold is deliberately low, and I have to look things up almost every time. But shell is at least better than shell + make, and shell is very good for one thing: running programs, which is something build scripts generally have to do a lot.

I have yet to be convinced on this either way, tbh.
From: der_ak
2011-01-15 03:47 am (UTC)
I don't think you're right in your analogy with git and the conclusions, because I found scenarios where redo clearly lacks flexibility compared to make. I wrote them down in my blog: http://synflood.at/blog/index.php?/archives/789-Why-djb-redo-wont-be-the-Git-of-build-systems.html
From: pozorvlak
2011-01-19 10:11 am (UTC)
Thank you! Very interesting.
From: (Anonymous)
2011-01-15 05:36 am (UTC)

Converting make to redo?

I haven't looked at redo yet, so I can't comment on its usefulness.

That said, many projects use this thing called Automake. It's a Perl script that reads a definition file and spits out a Makefile. You could probably rewrite it to spit out redo files, and get something that works out of the box for thousands of projects.

Side note: In most languages, trying to do fine-grain dependency tracking outside the compiler is a lost cause. Being 99% correct means you are horribly broken for many users. A good build system should have a mechanism for using compilation output to add more dependencies into the definition file. See automake's use of deps.
From: pozorvlak
2011-01-15 02:21 pm (UTC)

Re: Converting make to redo?

I've heard of automake, but never actually used it. If it's possible to input automake syntax and push out .do files (and right now I have no idea how big an "if" that is) it might help solve the migration problem.

After-the-fact dependency checking: you'll be delighted to learn that this was one of the guiding principles of redo's design. See my talk notes.
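
To make "after-the-fact" a bit more concrete, here is a rough sketch in Python of the bookkeeping involved, assuming the compiler has already written out a Make-style dependency file (the kind gcc produces with -MD). The function name parse_depfile and the sample file contents are made up for illustration; the idea is only that the build script reads what the compiler says it used and then declares those files as dependencies, for instance by passing them to redo-ifchange.

```python
def parse_depfile(text):
    """Return the prerequisites listed in a Make-style dependency file.

    Illustrative sketch only: it does not handle escaped spaces or
    paths that themselves contain a colon.
    """
    # Join backslash-continued lines into one logical line.
    joined = text.replace("\\\n", " ")
    deps = []
    for line in joined.splitlines():
        # Each rule looks like "target: prereq1 prereq2 ..."
        target, sep, prereqs = line.partition(":")
        if not sep:
            continue  # blank line, or something that is not a rule
        for name in prereqs.split():
            if name not in deps:
                deps.append(name)
    return deps


# Hypothetical contents of foo.d after compiling foo.c
example = "foo.o: foo.c foo.h \\\n  bar.h\n"
print(parse_depfile(example))  # ['foo.c', 'foo.h', 'bar.h']
```

The point of doing it this way round is that the compiler, not the build script, decides what the dependencies are, so the list cannot drift out of date.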