|Gay marriage: the category-theoretic perspective
||[Nov. 24th, 2008|09:16 pm]
Over on qntm.org, the always-entertaining Sam Hughes presents us with an essay on gay marriage: the database engineering perspective. It's a discussion of the real obstacle to legalising gay marriage, namely migrating the schemas of all the government databases so that they can accommodate the idea. Worth a read, perhaps especially if you don't know much about how databases work.
As Sam progressively broadens the allowable definition of marriage, he quickly ends up with the problem of representing an arbitrary undirected graph in a relational database. Recall that a graph is something like a half-filled-in join-the-dots puzzle: more formally, it's a collection of vertices and a collection of edges between pairs of vertices (or if you prefer, some dots, and lines connecting some of the dots), and that a graph is directed if the edges have a direction associated with them, and undirected otherwise. Here, the vertices are people, and the edges are marriages between people. It shouldn't come as too much of a surprise that graphs show up in this problem: graphs arise all over the place in mathematics and computing (another obvious one: the machines on your network, and the cables connecting them). And so Sam runs into one of the classic problems: how do you represent something fundamentally unordered (the endpoints of an edge, here the spouses in a binary marriage) using something that's fundamentally ordered (the memory of a computer)? More concretely, how do you decide who gets put in the spouse1 column of your marriages table and who gets put in spouse2?
Sam's solution is expedient if inelegant: since every person in his database has a unique identifying number (good DB design practice, but still, shudder), he can simply make spouse1 the partner with the lower CitizenNumber. But there's a more common solution which I'd like to talk about. Since it's easy to represent directed edges ("A is married to B"), we represent an undirected edge by two directed edges going in opposite directions ("A is married to B, and B is married to A"). When I first encountered this trick, I thought it was an ugly hack, but it turns out that it actually makes more sense than you might think.
First, we'll need just a little category theory, which is the language we use to talk about classes of mathematical objects (such as graphs) and the connections between different classes. A category is just a directed graph (see, I told you they showed up everywhere) with a bit of extra structure that we don't need to worry about now. For any given type of mathematical object (graphs, sets, groups, rings, spaces, Galois connections...) we can form a category whose vertices are those objects, and whose edges are structure-preserving transformations of those objects. For instance, there's a category of sets and functions, a category of groups and group homomorphisms, and a category of spaces and continuous maps. There are other categories: for instance, for any given programming language, there's a category whose vertices are the types of data allowable in that language (strings, integers, real numbers, network sockets, Facebook friend requests...) and whose edges A → B are subprograms taking data of type A and outputting data of type B: this is why the functional programming crowd get so excited about categories. More interestingly for our purposes, there's a category Graph whose vertices are undirected graphs and whose edges are the only sensible things¹, and a category Digraph whose vertices are directed graphs.
We can talk about structure-preserving transformations of categories²; such things are called functors. A functor is just a graph transformation that preserves the extra structure of a category. For instance, if you have a group (a set of things and a rule for combining two things together to get a new thing), you can forget about the rule and just keep the set of things. Given a transformation between groups, you can easily form a function between their underlying sets. These operations give us a "forgetful functor" from Group to Set, which we often call U for "underlying set".
This functor, as it turns out, has a kind of opposite, or dual, usually called F: a functor that takes any set X and constructs a group out of it, called the "free group on X", or just FX. To make the free group on X, start with the elements of X and add a new element called "xy" for every x and y in X, then repeat the process with all the new elements you just added, and so on³. If X is the one-point set, then FX is the group of integers and addition. There's a kind of symmetry here: U throws structure away, and F adds it back in in the simplest way possible4. We say that F and U form an adjoint pair, or an adjunction, or that F is left adjoint to U. Adjunctions are one of the most pervasive and useful notions in modern mathematics: I've been to several lectures whose entire contents could have been proved in a sentence if the lecturer had spotted the existence of a handy adjunction. If you've heard of monads, you'll be pleased to hear that adjunctions and monads are intimately related: if F is left adjoint to U, then the composite functor UF is a monad (and FU is a comonad), and it can be shown that all monads arise this way.
We can repeat this trick whenever we have two types of structure, one of which is a more complex version of the other: we can form a forgetful functor from the category of complex things to the category of simple things, and (under very general conditions) we can construct a free functor from simple things to complex things which is left adjoint to that forgetful functor. This free functor always works in the same general way: if you need something to exist, just freely add it in. The resulting free object is usually very large, but we don't really care about that.
I described the relationship between the two members of an adjoint pair as "symmetry", but the symmetry's not perfect: there's usually one functor which is the right adjoint and one which is the left adjoint, and they have quite different properties. For instance, it's possible to define the notion of the product of two vertices in a category (generalising the usual notions of product for sets, groups, topological spaces, etc), and one can show that right adjoints always preserve products, whereas it's quite rare for left adjoints to do so.
Let's come back to graphs and directed graphs. There's a functor Digraph → Graph, "forget the directions on the edges". At first glance, you might think this is a forgetful functor. But no! A few minutes' calculation shows that it doesn't preserve products, and thus can't be the right adjoint of anything. It does, however, preserve things called coproducts, which suggests that it might be the left adjoint of some functor. And indeed it is; its right adjoint is the functor which takes some undirected graph and produces a directed graph with the same vertices, and replaces each edge with a pair of directed edges going in opposite directions. This, if you remember, was the trick we used to represent undirected graphs at the start of this post.
The free/forgetful characterisation of adjoint pairs strongly suggests that it's this functor that behaves more like forgetting structure, and the "forget the directions" functor that behaves more like adding structure in. So we should think of undirected graphs as being directed graphs with the extra property of invertibility: every edge in an "undirected" graph has an inverse that goes in the opposite direction. In this picture, the "forget the directions" functor takes a directed graph and adds an inverse to every edge, and the "replace undirected edges with opposing pairs of directed edges" functor takes an undirected (= known invertible) graph and returns an identical directed graph that just happens to be invertible.
¹ If G and H are graphs, then a graph morphism f : G → H is
such that for any edge e with endpoints a and b, the vertices fv(a) and fv(b) are the endpoints of fe(e).
- a function fv from the vertices of G to the vertices of H;
- a function fe from the edges of G to the edges of H
² Yes, there's a category of categories, though you have to fudge things a bit to work around Russell's Paradox.
³ There's really a bit more to it than that, because I haven't told you the whole definition of a group. For those that care, the full details can easily be found on Wikipedia.
4 We can make this more formal: F is left adjoint to U if the edges from FX to Y are in one-to-one correspondence with the edges from X to UY, "naturally" in X and Y, where "naturally" has a technical definition but really means something like "without making any arbitrary choices".
TL;DR version: Very often in mathematics, we have a pair of operations, one of which forgets about extra structure on some object and one of which adds that structure in simple-mindedly. We capture this notion using adjunctions. Applying this framework to undirected and directed graphs, we discover that the well-known trick for representing an undirected graph as an invertible directed graph actually behaves more like forgetting structure than adding it back in. Thus undirected graphs "really are" invertible directed graphs.