Log in

On Haskell - Beware of the Train [entries|archive|friends|userinfo]

[ website | My Website ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

[Links:| My moblog Hypothetical, the place to be My (fairly feeble) website ]

On Haskell [Dec. 4th, 2006|09:02 pm]
[Tags|, , , ]

As I've mentioned a few times, I'm (re)-learning Haskell. Here are my current thoughts on the language. No doubt, when I have drunk of the Kool-aid in fullness, I shall look back on them and laugh/cringe at my naiveté. Please try not to get upset about them: if it helps, think of them as having big <at-my-current-state-of-knowledge> tags around them.
  • Haskell's syntax is nice. Clear and concise. As part of my Haskell-learning regime, I've been going through Paul Graham's paean to the Lisp macro, On Lisp1, re-writing the example code in Haskell. Thus far (ie, in the early, functional chapters), the Haskell code is generally shorter and clearer than the Lisp code it replaces, sometimes dramatically so. And recall that the Lisp code was written by an expert who's fanatical about brevity and writing for expositional purposes.
  • In particular, pattern-matching is a Good Thing, as is the (+3) notation for operators. I note that Paul Graham wants to add something like the latter to Arc, in the form of [_ + 3], but of course you could implement it fairly trivially in Common Lisp with read-macros.
  • On the other hand, Haskell's implementation of syntactic whitespace is needlessly complicated and unintuitive. Do you know the exact rules Python uses for syntactic whitespace? No, neither do I. That's because you don't have to.
  • Pervasive lazy evaluation is a major win (but see also...). A lot of the On Lisp macro examples can be seen as workarounds for the lack of lazy evaluation.
  • Pervasive partial application allows for some very nice effects, and could thus be construed as a win. However, it disallows optional and keyword arguments, prevents the existence of functions with the same name and different numbers of arguments, and in general rules out variadic functions (unless all the arguments are of the same type). All of these are Good Things. I haven't made up my mind which I'd rather have yet.
  • Haskell's type system is not a major win. It may in fact be a win, but so far I have seen little evidence to support this. So far, it's caused me nothing but pain while writing code, and it prevents you from expressing lots of useful and well-defined ideas as Haskell code. Yes, it helped me find a bug the other day, but probably not as quickly as I could have found it in a weakly-typed language. It's less expressive than the type system Ada had in 1987 - in particular, the lack of types parametrised by integers is a huge sucking chest wound disfiguring the language. It's possible Data.Generic may help here - I haven't got my head around that yet.
    On the other hand, hackers whose opinions I respect assure me that some day I will come to see the type system as a friend and ally, so I'm trying to keep an open mind about this point in particular.
  • On a similar note, type classes are an ugly crock.
  • Haskell's documentation is inadequate. I've written about this before: then, I'd had a particularly Bad Documentation Day and was ranting, but I stand by the substance of my comments. It occurs to me that this may reflect a cultural difference. I come from Perl, which (intentionally) makes very few guarantees, and encourages the wide use of downloaded libraries. Some Perl libraries are well-known, but most are very specialised, and you might well download a library from CPAN for one specific project, hack with it for a day or two, and never use it again. In this environment, good documentation is extremely important, and thus community standards for docs are very high. Occasionally, they're even satisfied :-)
  • Debugging tools are not luxury items, guys. I had a useful chat with Duncan over the weekend, and now intend to check out Buddha and Hat, but really, this stuff should come with the distribution.
  • Hugs and GHCi are the two least useful toplevels I've ever had the misfortune to use.
  • Referential transparency is a Seriously Good Thing, but then I thought that anyway. I'm not entirely convinced that the language should be enforcing it: it's often possible to write a side-effect free function that works by manipulating some state. Compiler optimisations, I suppose, and it's good to have "Here Be Dragons" signs around the non-transparent bits, in the form of the rather ugly monad syntax.
  • I miss variable interpolation. I tried to add it using Template Haskell, but ran into some problems: I'll post about that some time. I also miss having hashes2 at my fingertips. They're wonderfully useful things, and having them on a par with lists allows for some extremely cool/simple solutions to some problems. Syntax is not trivial.

By the way, I've fixed the juggling program. But then I was reading Burkard Polster's The Mathematics of Juggling on Saturday, and came across a much simpler algorithm: a siteswap [a_0,...,a_n-1] is juggleable iff the function (i |-> i + a_i mod n) is a permutation of [0..n-1] (by finiteness, iff it's a surjection). In Haskell (untested),

isJuggleable ss = and (map (`elem` test) [0 .. n])
        where test = map (`mod` n) $ zipWith (+) [0 .. n-1] ss
              n    = length ss
Or in Perl:
sub is_juggleable {
        $hit{$_ + $i++ % ($#_+1)}++ foreach @_;
        return scalar(keys %hit) > $#_;
See what I mean about the hashes? :-) Tests: er, that has a bug in it, but my girlfriend has been wanting to go home for a while now. I'll fix it tomorrow.

1I strongly recommend this exercise to everyone, actually - writing code helps you to absorb the ideas, and thinking about why the code's different in the two languages helps you to see the tradeoffs between their design decisions. And On Lisp is a great book, with rather more than its fair share of Keanu moments - those moments when you sit back and say "Whoa." The chapter on anaphoric macros is particularly cool.
2By "hashes", I mean associative arrays/dictionaries/etc, I'm not bothered about the implementation. Data.Map qualifies, but it's a lot less convenient than just being able to say $foo{$bar} = $fred, or $foo{bar} = $fred (barewords are treated as strings in hash indexes). Consider how ugly most Haskell code would be if there were no syntactic sugar for lists...

[User Picture]From: wormwood_pearl
2006-12-05 01:36 pm (UTC)
Woo! Yay! First comment! I'm so cool. Do I get a prize?
(Reply) (Thread)
[User Picture]From: totherme
2006-12-05 02:26 pm (UTC)
On learning to love the type system:

Something just occurred to me, which should have been obvious, and may in fact have been obvious to everyone but me. On the off-chance that it wasn't, I'll relate it here.

Just as you can't get the benefits of functional programming by doing procedural programming in ML (in fact, you'll just get annoyed by the way you keep having to fight the language to make the programme do what you want), I suspect you can't get the real benefits of static type checking just by running the type checker over code that you wrote as if it were dynamic. All it will ever do in that case is tell you what you can't do. If you're lucky, it might find a bug a bit faster than if you'd had to test for it.

The flip side of those things that you can't do, are the things you don't have to do. The type checker doesn't just provide rules, it provides guarantees. There are certain things that dynamic programmers have to think about, which static programmers do not - because the type checker does it for them. If a dynamic programmer codes in a static language, there's nothing to prevent him from continuing to duplicate the work of the type checker in his head, but if he does this, he doesn't feel the full benefit of having it there. This, then, is an argument that static typing is good, because it frees your mind from mundane things that the machine can think about for you - so that you can concentrate on more interesting higher level problems.

It's quite difficult to explain how to let go of the thoughts you don't need when the machine is watching your back - it's actually quite hard even to explain precisely which ones they are. I suspect I could give you examples if I spent a lot of time thinking about it, but I don't think they'd be very illustrative, and I suspect there's a short-cut we can use instead.

(ok, this next bit is potentially really condescending - it's not intended to be. I know that anyone reading this already knows how to program, and it looks like I'm trying to teach "how to program". What I actually want to do is describe a mindset that I suspect leans towards static typing - and the best way I know how to do that is by teaching novices how to code with it from scratch. I ask your indulgence in looking for that mindset in the following, rather than the face value "this is how to code".)

When we teach programming (particularly in haskell) to first year CS students, we always tell them to start with the type:

"Write a function sentances :: -> String -> [String] that takes a String, and returns a list of 'sentance' strings where each sentance begins with a capital letter, ends in a full stop, and contains precisely one full stop"

pozorvlak : I know that you were never a first year CS student - you skipped that bit, and went straight on to writing Hard Code ;) ...so I figure there's a chance that you were never encouraged to adopt this mindset. If you do want to see the type system from the (loving, awed) point of view of Duncan or myself, and if this isn't something you've tried already, then this might be a useful exercise to try:

Spend some time (a day, week, month, whatever) deliberately, and slavishly writing down the type of any code (in any language) that you want to think about. Do this before thinking about how the code behaves, or what it does. The process for discovering code to do what you want should hopefully become something vaguely like:

  • What is the type of the input?

  • What is the type of the output?

  • (write them down)

  • What are the properties of the input and output?

  • What is the relationship between data with these input properties, and data with these output properties?

This is the logical conclusion of the school of thought that teaches "Get your data structures right first, and the algorithm will naturally follow" - applied everywhere.

To put it another way: The type of your code is a specification, not a side effect. Specify first, then implement.

(max comment length)
(Reply) (Thread)
[User Picture]From: totherme
2006-12-05 02:27 pm (UTC)

This actually makes top-down development more natural too, I guess... You want prog A, you write A :: Type_A then you decide that probably A = B.C where B :: Type_B, C :: Type_C, etc. You don't actually have to code it top-down (although you can if you want) - you can code the easy bits first, making the assumption that you'll be able to find/write functions to fit the other bits of the spec.

Once this mindset is adopted, hoogle suddenly seems much more powerful and useful. The first thing you think about a problem is the type. You enter that type into hoogle, and it tells you that someone's already done it. Sorted ;)

...anyway, my hope is that there is a connection between this type/data first approach to programming, and a willingness to let the compiler provide the guarantees it can, without worrying about how. My further hope is that by learning to adopt this approach and that mindset at will, one can learn to co-exist peacefully (in fact - productively!) with the haskell type system - and actually feel the benefit it can provide.

I know that I've written these thoughts down rather clumsily, and I know that to type or not to type is a religious issue, so I hope I haven't annoyed anyone. Perhaps I'll take some time at a later date to try to write this all more clearly....
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 03:13 pm (UTC)
Interesting. But I'm not sure I agree with one of your axioms:
The type checker doesn't just provide rules, it provides guarantees. There are certain things that dynamic programmers have to think about, which static programmers do not - because the type checker does it for them.

Actually, the major problem is that I'm not used to thinking about type at all, because if I hand a Perl function a variable of the wrong type, it will perform the appropriate coercion behind the scenes and Do What I Meant (TM), often magically so (C++ does something similar). From that perspective, there's no difference in meaning between "1", 1 and (1), just a difference in efficiency. The "if it walks like a duck, and quacks like a duck, but you need a gorilla, then stick some fur on it and give it a banana" approach to typing :-)

This is something that puzzles me, actually: I can see the rationale behind strong static typing (efficiency, correctness; it's just that I think there are better ways of achieving the desired effects), and I can see the rationale behind Perl-style DWIMmy weak typing, but I can't see the rationale behind Python/Lisp(?)-style dynamic strong typing, where there are no guarantees provided by the compiler, but type errors can still kill your program. "4" + 3 in Python will die with a type error, for instance (partly because they've used + for string concatenation, but still). Does anyone here get this?
(Reply) (Parent) (Thread)
[User Picture]From: totherme
2006-12-05 03:49 pm (UTC)

I'm not sure I agree with one of your axioms ... the major problem is that I'm not used to thinking about type at all

Yeah - I didn't communicate it very well. I don't (and didn't) believe that you're thinking about types when the compiler could be doing it for you, but that you're thinking about other things which give you the same guarantees, when the compiler could be doing it for you, via the abstraction of types.

Does that make any more sense? I know what I mean, but the words are coming slowly today ;)

Python will die with a type error ... Does anyone here get this?

So, I think this may be part of the same thing, but first I'll answer the question directly:

The python programmer wants "4" + 3 to die, because he might have meant 4 + 3 (7) or he might have meant "4" + "3" ("43"). For every compiler which assumes one of the two, there's a legion of programmers who expected the other in some circumstance, and a universe of bugs caused by that expectation.

The python language deals with the ambiguity by disallowing it, while the perl language deals with the ambiguity by declaring one of them to be correct, and forces all perl programmers to remember which one. Python programmers don't have to remember a list of default behaviours - an error is an error. If your programme dies because of that error, you know why - the error contains the line number. Make the same mistake in perl and you get a weird behaviour a long way down the line, which you have to spend time and effort tracing back to that mistake.

The advantage of dynamic strong typing over static strong typing is that sometimes you know what the type of a variable will be, but you can't convince the compiler of it at compile time (remember casting in java before the generic collections came along?). I've never run into this problem in haskell - the polymorphism works really well.

The reason that I think it's a part of the same thing is this:

With my strongly typed hat on, "4" + 3 is just wrong. It doesn't make any sense. This has nothing to do with how it'll behave when the programme runs. I'm not worrying about behaviour, or errors, or bugs. I'm not even thinking about running the programme yet - I'm thinking about the data. It's just a grammatically incorrect non-sentence. I'm not using the type system as a safety net to catch mistakes in a programme which I've written by thinking about behaviour - instead I'm writing a programme by thinking about types.

I think this is closer to the python mindset. Whether that "wrongness" is best revealed to them at compile time or test time is a whole other issue - the key thing is the undeniable wrongness of the statement.

Does that make sense?
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 03:56 pm (UTC)
Yes, I suppose so.

I still think a cleaner solution in this particular case is to have a different operator for string concatenation. Then "4".3 is unambiguously "43", and "4" + 3 is unambiguously 7 :-)

Remembering lists of default behaviour: this reflects a difference in the intended audiences for Perl and Python. Perl's really intended for people who use it every day, and so don't have to look up this stuff.
(Reply) (Parent) (Thread)
[User Picture]From: totherme
2006-12-05 04:19 pm (UTC)
I like that "different operator" idea :)

If you've got

(.) :: String -> String -> String
(+) :: (Num a) => a -> a -> a

then you could indeed have your language automatically massage the arguments to those functions into the correct type wherever possible.

(Of course, in some cases, it might still not be possible to do in a sane way... Converting a GTK window into an IRC command isn't something I'd want to leave to the compiler - and I certainly don't want to be forced to do it myself when I declare the GTK window type, just because someone else declared the IRC type first - so I think there's still a place for type errors. Of course, if someone else wants to define how it should work - I'm happy to let them.)

Ok - so we'd like to massage the arguments to our functions into the correct type wherever possible. Perhaps a good way to do this would be with user defined conversion functions (so that the conversion can be as intelligent as possible at every stage - and do the right thing more often than not).

In this case, we want f :: Num -> String in order for (.) to do what we want. In other cases, we'll probably want f :: x -> String and we'll need a way for the compiler to know what function to call in order to do this massaging.

Let's rename the function f to something more intuative, like show, and say that all types x for which show x makes sense are of class Show. Now we can have:
(.) :: (Show a) => a -> a -> a

...will that do what you want? ;)
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 05:50 pm (UTC)
In this specific case, no, because show "Fred" is "\"Fred\"" :-)

[And the type of (.) should be (Show a, Show b) => a -> b -> String :-) ]
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 06:22 pm (UTC)
More generally: the duck-typing approach could be seen as something like implicit declaration of all possible type classes and instances. So if we needed a function to be able to call show and +, we don't need to declare a new typeclass ShowAndPlus, and go around declaring hundreds of things to be members of this class (and recall that we can't simply say instance (Show a, Num a) => ShowAndPlus a).

Hmmmm. It ought to be possible to do a lot of that at compile-time...
(Reply) (Parent) (Thread)
[User Picture]From: totherme
2006-12-05 06:53 pm (UTC)
We don't need a new type class ShowAndPlus, because type classes have multiple inheritance:
> class Gds1 a where
>         gds1 :: a -> String

> class Gds2 a where
>         gds2 :: a -> String

> data SomeType = A | B deriving Show

> instance Gds1 SomeType where
>         gds1 x = "gds1! "++(show x)

> instance Gds2 SomeType where
>         gds2 x = "gds2! "++(show x)

> f :: (Gds1 a , Gds2 a) => a -> String
> f x = (gds1 x) ++ (gds2 x)

*Main> f A
"gds1! Agds2! A"
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-07 07:58 pm (UTC)
OK, I think I get you. But you still can't use (+) unless something's of class Num, which implies the existence of bits of interface you might not need.
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 06:15 pm (UTC)
Your example of converting a GTK window into an IRC command is suggestive. It suggests that my mental model is mostly in terms of code that manipulates basic datatypes like strings, integers, floats, and closures, and lists/hashes/sets/etc of them, whereas you're thinking of code that manipulates more complex opaque datatypes representing higher-level constructs. It is of course possible to implement complex data in terms of simple data, but are you then doing work in your head that the compiler could be doing for you? When I'm coding in Perl, I'm thinking about what the variable contains, rather than how it's represented. "1" contains 1, albeit as a string. $cmd might contain an IRC command, hopefully represented with some nice OO interface provided by Net::IRC::Command or whatever. Passing it to a function which expects a GTK window would be fine at the time, but would cause a runtime error when the code tries to invoke an undefined method on it. Does this count as a type error? I suppose it counts as a "Does not walk like a duck" error...

When I read your comment, my first reaction would be "why would anyone try to do that?". But now I'm assuming a programmer who knows when a variable represents an IRC command (or, more generally, which implements the bit of the interface that $_ wants to use), which is effectively one who has some sort of type-checking subroutine running somewhere in $_'s head.

From man perltoot:
Some languages provide a formal syntactic interface to a class's methods, but Perl does not. It relies on you to read the documentation of each class. If you try to call an undefined method on an object, Perl won't complain, but the program will trigger an exception while it's running. Likewise, if you call a method expecting a prime number as its argument with a non-prime one instead, you can't expect the compiler to catch this. (Well, you can expect it all you like, but it's not going to happen.)

C++ functions, by the way, if handed arguments of the wrong type, will attempt to invoke constructors on their arguments to coerce them into the right types. IIRC, this only works with classes, not basic types. And I don't think it does this recursively, though that would be pretty cool :-)
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 03:19 pm (UTC)
Another thing: the first language I learned was BBC Basic (types implicit in variable names), but the second language I learned was Pascal. I remember an unpleasant "lack of safety net" feeling from when I first started to use dynamic languages, and when I used VB I would religiously declare types for everything. More experience with dynamic languages led me to learn that I didn't actually need to do this :-)
(Reply) (Parent) (Thread)
[User Picture]From: totherme
2006-12-05 03:51 pm (UTC)
So, you learned to programme by thinking about behaviour.

It's still possible to programme by thinking about typed data though :)
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 04:00 pm (UTC)

The (perhaps unclear) caveat to everything I've said above is that I'm aware that there's a typeful way of thinking about code, and by learning Haskell I'm trying to add it to my repertoire/learn if there's anything in it. But it ain't easy :-)
(Reply) (Parent) (Thread)
[User Picture]From: totherme
2006-12-05 04:27 pm (UTC)
Yes indeed :)

I'm hoping that talking about it like this, makes it easier to learn what that way of thinking is. An annoying thing about ways of thinking, is that they're hard to communicate.
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-05 06:24 pm (UTC)
Yeah. No, that was really helpful, and I'll try to force myself to think about types first for a while :-)

[BTW, remember that 20-line Lisp macro I thought I'd boiled down to 2 lines of Haskell? Turns out I hadn't read the code carefully enough: it was variadic, so it would actually require infinitely many lines of Haskell to implement. You can still do it in about three lines with generalized zipWith, though...]
(Reply) (Parent) (Thread)
[User Picture]From: totherme
2006-12-06 08:51 am (UTC)
(Reply) (Parent) (Thread)
[User Picture]From: pozorvlak
2006-12-07 07:21 pm (UTC)
Nice. I'll have a look. Interestingly, I note that the quote at the top of the first page is by michiexile, in response to something I wrote!
(Reply) (Parent) (Thread)
From: sable_veins
2006-12-05 08:38 pm (UTC)
Ha! "$hit" and "%hit" both look a bit like... shit!
(Reply) (Thread)
[User Picture]From: pozorvlak
2006-12-07 07:50 pm (UTC)
That thought had occurred to me, I confess.
(Reply) (Parent) (Thread)