| Git and directory renames |
[Dec. 13th, 2009|03:16 pm] |
We've been using git for version control at work for the last couple of months, and I'm really impressed with it. My favourite thing about git is what might be termed its unfuckability: no matter what you do to a repository to fuck it up, it seems, it's always possible to unfuck¹ it, usually simply by keeping calm and reading the error messages. I've managed to lose data with an ill-advised git reset --hard, but that was before I knew about the reflog, and I've always been able to recover "lost" work in every other case. And then there's the rest: cheap local branching², the index, the raw speed, git-bisect, git-gui and gitk (which has rapidly become an indispensable part of my development toolchain)³.
The merge algorithm pretty much Just Works: we get the occasional merge conflict, sure, but (so far) never without good reason. So I was surprised to learn (from Mark Shuttleworth) of a really simple case where git's merge algorithm ( does the Wrong Thing ).
Bazaar obviously gets this right, otherwise Mark Shuttleworth wouldn't have written his post. Commenters there suggest that Darcs gets this right too, but after spending a while looking through the Darcs wiki I discover that I really, really can't be arsed to work out how to do the necessary branching to test it. Hopefully some helpful Darcs user (I know you're still out there...) will be able to post the relevant transcript in a comment. [Edit: I realised belatedly that you don't need branches for this. Transcript here.]
Overall, I don't think this is a show-stopper, or even a reason to seriously think about switching to another DVCS, but it's certainly worth remembering and watching out for.
¹ Why yes, I have been watching the excellent Generation Kill. How did you guess? :-) ² If you're not used to systems in which this is possible, it probably sounds really scary and difficult, and like the kind of thing you'd never need or use. It's not. It's actually really simple and incredibly useful. The article that made it click for me, Jonathan Rockway's Git Merging By Example, is no longer online, but I'm sure there are equally good ones out there. You'll probably find Aaron Crane's "branch name in your shell prompt" utility helpful. ³ Just to clarify: I'm not saying that Your Favourite VCS sucks, or that it's impossible to get these features using it: I'm just saying that git has them, and they're really, really helpful. |
|
|
| Project Euler in Factor, part 1 |
[Nov. 17th, 2009|11:48 pm] |
In a (probably futile) attempt to acquire Real Ultimate Power, I have been trying some Project Euler problems in Factor. Here are my first five solutions, along with my comments: I was going to post my first ten solutions, but this was getting too long.
Spoilers herein, obviously.
( Problem 1 )
( Problem 2 )
( Problem 3 )
( Problem 4 )
( Problem 5 )
Overall comments
- The dev tools are nice, once you work out what everything's called - the debugger's called the "walker", for instance. I particularly like the way that the listener pops up a stack effect declaration in a minibuffer when you type in a word's name - since the conventions are strictly followed, this is often all the documentation you need. When you need more, it's easy to get help and see source code using the
help and see words, or using the browser (Alt+B). I haven't yet worked out how to take best advantage of the editor integration, but I'm sure that will come. - For getting started, I found the Factor cookbook invaluable.
- #concatenative is one of the friendliest tech forums I've encountered, and hands down the most helpful. In every case my (admittedly n00bish) questions were answered before I'd finished posing them.
- I'm finding programming without variables to be really hard. Factor does have facilities for both lexically- and dynamically-scoped variables, but I've been mostly avoiding their use, on the grounds that I'll have to learn how to use the stack some time and I might as well do so now.
Any comments or questions, on the code or the presentation, are very welcome!
Edit: thanks once again to the members of #concatenative, particularly slava and erg, for comments on this post. |
|
|
| (no subject) |
[Sep. 27th, 2009|12:00 pm] |
It occurred to me recently that programming comes in several flavours, and that how much I enjoy a programming task depends strongly on which flavour predominates. The flavours I've identified so far, in descending order of how much I like them, are the following:
Data munging: You have a large mass of data in some fairly generic form, and must fold, spindle and mutilate it with well-understood tools to extract the information of interest to you. Writing Unix scripts is the classic example, but list manipulation in Haskell or Lisp and array manipulation in J or APL have this flavour too.
Clever algorithms: You have some calculation or task to perform, and brute force has proved inadequate. Now you must apply the Power Of Your Mind to find a cunning and better approach. I haven't actually done very much of this stuff, but I have had to solve a couple of problems of this nature at my current employer, and have another one waiting for me on Monday morning.
Twisty if-statements, all alike: You want to zom all the glaars, but only if (the moon's in (Virgo or Libra), unless of course the Moon's in Libra and Hearts are playing at home), or (the engine-driver's socks are mismatched xor (the year ends in a seven and the month contains an R)). And you meanwhile want to wibble the odd-numbered spoffles if the Moon's in Libra, Hearts are playing at home and (the year ends in a seven or the engine-driver has mismatched socks). The challenge lies in making sure you've identified all the exceptions and special cases, and in actually coding them up correctly. Not remotely elegant, but better than...
Doctor X-style wizardry: Making a system do things that it was never intended to do. If you squint at the problem just right you have all the tools you need to do the job, sort of, but it's at best a witty hack and at worst a horrible bodge, and certainly not something you'd want to put much weight on. All non-trivial TeX programming has this nature. This kind of thing is quite fun when you're doing it as a joke or a proof-of-concept, but it's downright horrible when you need to do it to get something important done. But it's still far more fun than...
API spelunking. You have a candle, a slice of cheese, and a pair of old boots. You need a fork-handle. Is it even possible to construct one out of what you have? Is there a chain of method calls and constructors that will lead you from what you have to what you need? And if you succeed in constructing your fork-handle, is it the fork-handle you need? Or some nearby, but completely inappropriate, fork-handle? This is in some sense dual to data-munging. The worst examples I've encountered have actually not been in relation to documented APIs (though try reading lines out of a zipped text file in Java, if you want to see what I'm talking about), but rather in large crufty systems with complicated and ill-thought out data models. The Law of Demeter addresses this problem, but I have yet to work on a project that sticks to it. I'd settle for some sort of coherence theorem, but that would require coherent API design, which is kind of the problem to begin with... |
|
|
| Conversation in the office the other day... |
[Sep. 11th, 2009|08:17 pm] |
Me: You know, we really want some way of representing this problem that lets us have WAAFs pushing markers around on a giant diagram to keep track of our state. Joe: Yeah. On green baize. That bit's important. Me: Or we could just get a full-sized snooker table for the office. Joe: And an office that's large enough to fit a full-sized snooker table in it. Me: I've worked in offices with table football tables, but never one with a snooker table... Joe: Or we could just move the office into a pool hall. Me: Yeah, I like it. Joe: How cyberpunk would that be? We'd be an AI consultancy based out of a pool hall! Me: I think we'd need at least one cyborg. Does anyone have a pacemaker? Joe: I've got a false tooth... Me: Is it Turing complete, though? It doesn't count unless you can play Tetris on it.
... and, not entirely coincidentally, we're both hiring and looking for a new office. As well as the web developer role listed at that link, we're also looking to recruit more NLP and machine-learning experts. Pool skills and cybernetic body parts not essential ;) |
|
|
| Violet |
[Jul. 7th, 2009|10:30 pm] |
A friend asked me to beta-test his entry for the 2009 Interactive Fiction Competition, and doing so reminded me that hey, I really enjoy text adventures, even though I suck at them. So I spent a few hours on Sunday playing some of last year's IFComp entries. In particular, I played through Jeremy Freese's Violet, last year's winner, which I hereby recommend to you all. Download just Violet here, or all the entries here.
Violet endeared itself to me from the start by being about a graduate student's struggle to write 1000 words of his/her¹ dissertation, in the face of difficulties and distractions; but I would have loved it anyway, because it's so well written and so clever and so funny. Seriously, it's worth playing just for the descriptions of the tracks on your iPod. I found most of the puzzles were just difficult enough to be fun without being frustrating: the game's good about warning you if you're about to do something deadly or solution-preventing, which also helps prevent frustration. It's obvious that a lot of care was put into the game's construction - there was almost no guess-the-verb nonsense (more precisely, there were a couple of instances of guess-the-preposition, and one case of guess-that-the-game-wants-the-plural-of-this-noun, which meant that the first solution I'd thought of worked after all). And the way that the backstory was incrementally revealed was very nice. Violet, by the way, is the name of your girlfriend, who narrates the game: as a trick to motivate yourself into writing, you're imagining her voice encouraging you. Meanwhile, the real Violet is back at your flat... but enough of that. Play the game yourselves.
Unfortunately, getting Violet to work under Linux was more challenging than some of the puzzles. Most IF these days is distributed in a format called Z-code, which you play by opening it in a Z-code interpreter (much as you open JPEGs in an image viewer), but Violet, for reasons that presumably make sense to the author, is distributed in an extended form of Z-code called Blorb, which isn't supported by most of the common Z-code interpreters. If you're on Windows or Mac OS, you shouldn't have any problems: just download the interpreter bundles from the IFComp 2008 download page. On (Ubuntu) Linux, however, the procedure is as follows:- Download the file
nitfol-0.5_linux.tgz from here. - Unpack it, either from a graphical file-manager or by typing
tar xvzf nitfol-0.5_linux.tgz at a command prompt (having first cd'd to wherever you saved the file). - Put the contents of the zipfile somewhere useful.
sudo mv *nitfol /usr/local/bin
sudo mv nitfol-0.5.so /usr/lib - Try to run Nitfol now and it will complain that it can't find
libpng.so.2. This is a shared library file that contains code for reading and writing PNG image files. Look in your shared libraries directory to see if there's anything promising.cd /usr/lib
ls *png* - In my case, that came up with
libpng12.so.0 and libpng12.so.0.27.0, the one being a symbolic link to the other. Let's cross our fingers, make libpng.so.2 a symlink to libpng12.so.0.27.0, and see what happens.sudo ln -s libpng12.so.0.27.0 libpng.so.2 By the way: ever have trouble remembering the order of arguments to ln? It's the same as cp and mv: "from" then "to". - Change directory to wherever you installed Violet, and invoke Nitfol on the file
violet.zblorb.cd ~/if/Comp08/zcode/violet
nitfol violet.zblorb - All being well, you should now be looking at the start of the game. Happy adventuring!
Edit: or you could just play this Flash version. Thanks to mrkgnao for the link!
¹ The default is to play as male, but you can become female using the FEMALE command, or the slightly classier HETERONORMATIVITY OFF. The text and backstory are altered to fit, and some of the puzzles are slightly different. |
|
|
| Are you ready to get pumped? |
[Jun. 26th, 2009|12:02 am] |
Hi, this post is about Factor, real Factor. This post is awesome. My name is pozorvlak and I can't stop thinking about Factor. Factor is cool, and by cool, I mean totally sweet.
Facts: 1. Slava Pestov is a mammal. 2. Slava Pestov hacks on Factor ALL the time. 3. The purpose of Slava Pestov is to flip out and kill people.
Weapons and Gear: 1. Concatenative (stack-based, Forth-like) language. 2. Dynamic types. 3. First-class functions. 4. Object-orientation. 5. Real macros. 6. Batteries included. 7. The listener: debugger/REPL/help browser/etc. 8. Optimizing, self-hosting native-code compiler. 9. Ninja stars.
Testimonial: Factor programmers can kill anyone they want! Slava cuts off heads ALL the time and doesn't even think twice about it. The guys on #concatenative are so crazy and awesome that they flip out ALL the time. I heard that littledan was eating at a diner. And when some dude dropped a spoon littledan GC'ed every object in memory. My friend Mark said that he saw a Factor programmer totally uppercut some kid just because the kid opened a gl-window.
And that's what I call REAL Ultimate Power!!!!!!!!!!!!!!!!!!
If you don't believe that Factor programmers have REAL Ultimate Power you better git clone their repository right now or they will chop your head off!!! It's an easy choice, if you ask me.
Writing OpenGL code in Factor: Step 1: Look for some Factor OpenGL documentation. Step 2: Fail to find any. Step 3: Get really super pissed. Step 4: Get some C++ OpenGL documentation instead. Step 5: Put something slippery on it, like butter or cream. Step 6: Bend it to fit (this is crucial). Step 7: Keep folded and insert into listener hard. Step 8: Push hard until you can't see it. Step 9: Wait. Step 10: Die.
If you succeed, everybody will be like “Holy Crap!”
Update: by popular demand, a picture of Slava wailing on guitar. |
|
|
| (no subject) |
[Jun. 6th, 2009|10:33 pm] |
I've got a problem: I've had an idea I want to write about, but it depends on two or three other ideas I wanted to write about but never got around to. So I'm going to write this post in a top-down, Wirthian fashion, stubbing out those other posts: maybe, if there's enough interest, I'll come back and write them properly and replace the stubs here with links. OK with everyone?
Right, on with the motley.
Stub post no. 1 Extreme Programming (XP), whether intentionally or unintentionally (and my money is on "intentionally, but try getting them to admit it") is really good for getting work out of people who are bright but have short attention spans. This is a Good Thing. It's most obvious in the case of pair programming - imagine saying to your partner "Y'know, this is kinda hard. Let's surf Reddit for a while" - but actually, most XP practices have this benefit. Short feedback cycles, concrete rewards, definite "next moves" (given by failing tests and the "simplest thing that could possibly work" approach) - all of these things have the effect of maintaining flow and reducing the incentive to slack off. It's programming as a highly addictive game. Dynamic languages work well with this approach, because they make it as easy as possible to get something up and running, and to test the things you've written.
Stub post no. 2 Haskell is the opposite. It encourages deep thinking, and everything about the language makes it as hard as possible to get something running unless it's just right. Screw up, and you're not presented with a running program and a failing test that you can run in the debugger; you're presented with an unfriendly compiler message and a bunch of dead code that you can't interrogate in any meaningful way. After a morning hour few minutes of this (usually involving no small loss of hair), the consultant Barbie that lives in my head invariably says "Statically-typed pure functional programming is hard. Let's go shopping!" And I, fed up and mindful of my scalp, agree. This is why I am no good at Haskell.
Stub post no. 3 Everything I read by or about the climber Dave MacLeod (blog) makes me more inspired by him. Partly for his visionary climbs, but mostly for his approach to choosing, training for and tackling really hard problems, which I think should generalise really well, if only I could put my finger on what exactly it is. It helps that he's a really friendly, pleasant guy in person. Check out the BAFTA-winning film Echo Wall that he and his wife made about his preparation for his first ascent of the trad route of the same name. If you're in Edinburgh, you can borrow my DVD, I'm positively eager to lend it out.
Anyway, something Dave wrote about training (which I can't be arsed to find right now) said that in order to train effectively, you have to be constantly pushing yourself in some way: either in terms of power, or stamina, or technique, or fear¹, or whatever. You have to find your comfort zone and then consciously go beyond it, in whichever direction you wish to improve. As you improve, your comfort zone shifts, and you need to keep pushing yourself harder and harder in order to continue to improve. But (and here's the interesting bit), he said that if you do this for long enough, your whole conception of comfort shifts, and you start to feel uncomfortable if you aren't pushing yourself in some way.
So, here's the thought I had. Maybe all the Haskellers have been training themselves Dave MacLeod-stylee, and now only feel comfortable pushing themselves really hard, and that's why they like using such a bloody difficult language.
¹ About a year and a half ago, I was seconding a route called Dives/Better Things (Hard Severe) in Wales, and got to a bit that was a bit too hard and a bit too scary. I floundered around for a bit, getting more and more freaked out, and then said to myself "What would Cale Gibbard do? He'd pause for a bit, think really hard, work out exactly what to do next, and then do that. OK. Do that, then." I have no idea if Cale climbs, but it did the trick. Cale, if you're reading, thanks for that :-) |
|
|
| #amazonfail, as the cool kids are apparently calling it. |
[Apr. 14th, 2009|09:49 am] |
Looks like the recent Amazon-delisting-LGBT-books flap may have been the result of Amazon getting trolled. weev has claimed responsibility. Now, I think weev is probably just trolling (about trolling Amazon, if that makes any sense), and apparently his sample code doesn't actually work as described (I haven't tried it), but the attack vector he describes sounds plausible.
Short version:- The attacker does a search for anything tagged "gay and lesbian", "rape survivor", etc.
- The attacker reports a few of them as "adult content", and deduces the form of the (generally underused) "flag as adult" URL.
- By some nefarious means (cross-site request forging and/or hiring third-world captcha solvers to create zillions of accounts), the attacker creates many "flag as adult" requests for every LGBT book in Amazon's system. The crucial point is that the requests have to come from many different accounts.
- Amazon's automated systems observe that many users have reported the books as adult content, and remove them from search listings.
- Lulz ensue.
If that's how it worked, it was an elegant piece of trolling, and I (briefly) applaud the perpetrator(s).
Anyway, I now feel a lot better about the big Amazon order I put in a few days ago...
Edit: bronxelf_ag001 makes an excellent point in the comments thread of tehdely's post. Read the whole thread.
Further edit: truth is more boring than fiction. Seems someone at Amazon France got confused and thought everything tagged "sexuality" should be classified under "adult". Meh. |
|
|
| Even a stopped clock is right twice a day |
[Mar. 30th, 2009|05:08 pm] |
I just noticed that my clock stopped forty minutes ago. "Hang on, it can't still be twenty-five past four!" I thought to myself, before looking at my watch and confirming that no, it was five past five.
The weird thing is that this isn't an analogue clock we're talking about. It's the time readout in the corner of my Windows taskbar¹. Clicking on it to bring up the calendar reveals that the computer knows fine well what time it really is, but for some reason it hasn't updated the display for nearly three quarters of an hour.
Weird.
¹ Yes: I, a confirmed *nix elitist, have to use Windows at work. Laugh all you want. If you're wondering how I cope, the answer is "Firefox, cygwin and XMing". |
|
|
| (no subject) |
[Mar. 11th, 2009|11:45 pm] |

Please, think of the narwhals. |
|
|
| Automating automation |
[Dec. 20th, 2008|05:34 pm] |
Whenever I spot myself doing something repeatedly on a computer, I try to automate it. Consequently, I have a directory full of random little scripts, and a long list of aliases in my .bash_profile, all of which arose because I had to do something lots of times. Often, these scripts and aliases call other ones which I wrote earlier, as I spot higher-level patterns in what I'm doing. But every time, I have to fight against the False Laziness that tells me not to bother, I'll only have to do it once or twice more, and it's not worth the effort of doing it right.
So yesterday I added the following line to my .bash_profile:alias viprof="vim ~/.bash_profile && source ~/.bash_profile" We'll see how it works out. |
|
|
| Gay marriage: the category-theoretic perspective |
[Nov. 24th, 2008|09:16 pm] |
Over on qntm.org, the always-entertaining Sam Hughes presents us with an essay on gay marriage: the database engineering perspective. It's a discussion of the real obstacle to legalising gay marriage, namely migrating the schemas of all the government databases so that they can accommodate the idea. Worth a read, perhaps especially if you don't know much about how databases work.
As Sam progressively broadens the allowable definition of marriage, he quickly ends up with the problem of representing an arbitrary undirected graph in a relational database. Recall that a graph is something like a half-filled-in join-the-dots puzzle: more formally, it's a collection of vertices and a collection of edges between pairs of vertices (or if you prefer, some dots, and lines connecting some of the dots), and that a graph is directed if the edges have a direction associated with them, and undirected otherwise. Here, the vertices are people, and the edges are marriages between people. It shouldn't come as too much of a surprise that graphs show up in this problem: graphs arise all over the place in mathematics and computing (another obvious one: the machines on your network, and the cables connecting them). And so Sam runs into one of the classic problems: how do you represent something fundamentally unordered (the endpoints of an edge, here the spouses in a binary marriage) using something that's fundamentally ordered (the memory of a computer)? More concretely, how do you decide who gets put in the spouse1 column of your marriages table and who gets put in spouse2?
Sam's solution is expedient if inelegant: since every person in his database has a unique identifying number (good DB design practice, but still, shudder), he can simply make spouse1 the partner with the lower CitizenNumber. But there's a more common solution which I'd like to talk about. Since it's easy to represent directed edges ("A is married to B"), we represent an undirected edge by two directed edges going in opposite directions ("A is married to B, and B is married to A"). When I first encountered this trick, I thought it was an ugly hack, but it turns out that it actually makes ( more sense than you might think. )
TL;DR version: Very often in mathematics, we have a pair of operations, one of which forgets about extra structure on some object and one of which adds that structure in simple-mindedly. We capture this notion using adjunctions. Applying this framework to undirected and directed graphs, we discover that the well-known trick for representing an undirected graph as an invertible directed graph actually behaves more like forgetting structure than adding it back in. Thus undirected graphs "really are" invertible directed graphs. |
|
|
| (no subject) |
[Oct. 3rd, 2008|12:44 pm] |
wormwood_pearl: Fsck! We're at war with Israel!
pozorvlak: What?! How? When? Why?
wormwood_pearl: Again!
pozorvlak: What do you mean, "again"? What? Oh... you're playing Civilization, aren't you?
For some reason she keeps getting drawn against Israel, who (ironically?) are always aggressive and genocidal in FreeCiv. |
|
|
| Callbacks in TeX |
[Jun. 20th, 2008|12:37 pm] |
In my thesis, there are a lot of places where I write expressions likef1 x ... x fn or (Σ y1, ... , Σ yn) or something similar. The code to generate this is repetitive. It would be nice to abstract out the general pattern of "list of things indexed from 1 to n, combined using some binary operation". But there's a problem: the indexing can occur anywhere in what might be quite a complicated expression. Rather than handing our "one to n" macro an ordinary TeX expression, we need to hand it an expression with placeholders: that is, we need to hand it a macro. Then our one-to-n macro can invoke the macro we handed it with the arguments "1" and "n" to generate the code we want.
This is quite a common thing to want to do in computer programming. It goes under various names, but probably the most common is "callback". More precisely, the callback is the piece of code accepted by the one-to-n macro, which it then invokes. Callbacks are very generally useful - they're the standard way of writing code for graphical user interfaces (if you've ever written an event handler in VB or an ActionListener in Java, you were writing callbacks), but they're also used for XML processing (via SAX) and all sorts of other things. Ruby's iterators and Smalltalk's if-statements take callbacks. If you've ever used the map function in Perl or Lisp or Python or Haskell, you were writing a callback. But can we write callback-based code in TeX?
Well, ( yes, as it turns out. ) |
|
|
| Further proof that GCSE metalwork was the most valuable part of my formal education. |
[Jun. 9th, 2008|03:29 pm] |
Yak-shaving at its finest:
- This weekend, I wanted to work on my thesis.
- But my laptop refused to boot up (inevitable, really, after this post). I had backups of the thesis, but they were a few days old.
- Edit: First, I tried booting from CD.
- But that didn't work.
- So I tried some other CDs.
- But they didn't work either.
- So I tried the boot CDs in a different machine.
- They worked, so I concluded that the problem was with the machine and not the CDs.
- So I tried to remove the hard drive and mount it in a USB drive cradle.
- But the screw heads on the drive mount sheared off when I tried to unscrew them.
- So I tried all the screwdrivers in my toolbox until I found one that fitted better.
- I found one, but the first two screw heads were now too damaged to use.
- So I tried removing the screws with pliers.
- But the pliers skittered off before they managed to turn the screws.
- So I tried to file down the sides of the screw heads into rounded squares, so the pliers would get better grip.
- But the protruberances on the sides of the drive enclosure got in the way of the file.
- So I had to file them down first.
That, I think, was the high-point of the call stack: I've elided various extra steps and dead ends. In the end, I was able to (a) recover all the data (including over a year of un-backed-up mountain photographs), and (b) mount the hard drive in another laptop which had a broken hard drive, thus cobbling together one working computer out of two broken ones. But overall, a less than productive weekend. |
|
|
| Counting in hexadecimal on your fingers |
[Jun. 2nd, 2008|07:01 pm] |
A simple and occasionally handy trick which deserves to be better known.
Use your thumb as a pointer, and your finger joints and fingertips to represent the numbers. Thus:- Thumb pointing to base of index finger = 0
- Thumb pointing to first joint of index finger = 1
- Thumb pointing to second joint of index finger = 2
- Thumb pointing to tip of index finger = 3
- Thumb pointing to base of middle finger = 4
... - Thumb pointing to tip of little finger = F = 15
By using both hands, you can either have a counter with two hex digits (so you can count from 0 to 255), or two separate counters. You can use the same trick to count in base 12: ignore the fingertips, and just count on the joints. A nice side-effect of counting like this (as I discovered this afternoon) is that to casual observers you look like you're meditating: hold your hands palm upwards and stare at something for the full effect.
I learned this trick from an early chapter of Georges Ifrah's monumental Universal History of Numbers, the bulk of which has sat accusingly unread on my bookshelves since roughly 2002. While it looks fascinating, it's also about six inches thick (it's spread over three volumes), and thus is rather intimidating. The trick was listed as one of the ways in which pre-literate peoples count: the general rule is "by enumerating parts of their bodies in a predefined sequence", but the number and sequence of the parts varies from people to people (though less than you'd think). Base sixteen isn't as common a number base as 10 or 12, but it's far from unknown. Another interesting titbit from the book was the list of independent reinventions of "Roman" numbers: it seems that they're essentially the obvious way of counting using notches on a stick after "one notch per item" becomes unwieldy.
[Totally confused by this entry? Hexadecimal notation (or base 16) is a way of writing numbers down. In ordinary decimal notation, each column describes numbers ten times the size of the numbers in the column to its right; the sequence is ... thousands, hundreds, tens, units, tenths, hundredths, ... . In hexadecimal, each column describes numbers sixteen times the size of the numbers in the column to its right. This has all sorts of uses in computing and electronics. For more, see here.]
Edit: Reddit thread. And it seems I was totally wrong in thinking this technique was confined to preliterate people - it's extremely common throughout the Indian subcontinent, and probably further afield! I do like it when I learn something new from comments on a blog post. See the comments below for more finger-counting techniques (and yes, I already know about the sodding binary technique, I just choose not to use it because it's extremely uncomfortable for me. Happy? :-) ). |
|
|
| (no subject) |
[Jun. 2nd, 2008|12:09 pm] |
Random thought: I'm sure people who actually know about this stuff will be able to tell me why it won't work, or, alternatively, cases where it's already been done.
Recall the problem with the naive Haskell mean programmean xs = sum xs / length xs which when asked to calculate the mean of [1 .. 1e9] allocates 8GB of RAM and brings the computer to a grinding halt. The problem there isn't really to do with naiveté: a similarly naive Ruby version works fine and runs in constant space, albeit slowly. The problem is garbage collection, or rather the lack of it. The Haskell runtime calculates the list [1.. 1e9] bit-by-bit so it can be summed, and normally would be able to throw every element away when it was done with it; but because the elements of the list will later need to be counted, there's still a (very long) pointer chain from the top-level to any given element of the list. So none of it can be garbage-collected.
In this case, this is a bad strategy: the effects of swapping out that much memory totally eliminate any benefit from caching the list. It would be quicker to throw the list away and calculate it again from scratch. Since Haskell's a purely functional language, we could in principle do this without worrying about any side-effects of the calculation. We could set up our garbage collector to become more aggressive as memory got shorter: when it starts to use swap, we could throw away any pure data that's more than N indirections from the toplevel, where N is varied dynamically to maximise performance. But what if our program is, say mean $ take 1000000000 mersennePrimes? Then it will take much longer to re-calculate the values than it would to swap them in and out of core. You'd need to keep track of how long each function call takes, and how long each bit of data took to calculate, and estimate re-calculation times when throwing data away.
But this is exactly the kind of profiling data that you need to track for the smart optimizing VMs that Steve Yegge talks about here. So, it seems that there's a synergy here. Does anyone know of a Sufficiently Smart VM that does this kind of aggressive garbage collection? Has it been tried before? Is there some obvious problem with the idea that I'm not seeing? |
|
|
| navigation |
| [ |
viewing |
| |
most recent entries |
] |
| [ |
go |
| |
earlier |
] |
| |
|
|