?

Log in

No account? Create an account
Type warnings - Beware of the Train [entries|archive|friends|userinfo]
pozorvlak

[ website | My Website ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

Links
[Links:| My moblog Hypothetical, the place to be My (fairly feeble) website ]

Type warnings [Aug. 19th, 2007|12:09 am]
pozorvlak
[Tags|, , , , , , , , ]

Here's something that occurred to me the other day: consider duck-typed languages like Python. The idea of duck typing is that if something walks like a duck, and quacks like a duck, then it might as well be a duck. Or rather, if something walks and quacks in a sufficiently ducklike manner, it doesn't really matter for your purposes whether it actually is a duck or not. Less metaphorically, we don't specify that the arguments to our functions can only accept (say) Integers, we only specify that they have to have an add() method. In fact, duck typing as commonly understood goes further: we don't explicitly specify a necessary interface at all, we specify it implicitly by the methods we call on our arguments.

An example is probably in order. Consider the Python code
def foo(bar, baz) :
    bar.spoffle()
    bar.buffy(baz.willow())
    baz.xander(bar.angel())
From that code, we can deduce that the first argument to foo must have methods called spoffle, buffy and angel, and the second argument must support methods called willow and xander. Now, here comes the clever bit. That's all that's necessary for foo to work. Supposing some clever hacker comes along later and invents a new datatype which it would make sense to foo, all they need to do to allow your code to work on theirs is to implement those methods in a sensible way. This can be done without any knowledge or forethought on your part. Maybe you honestly believe that fooing is something that can only be done to ScoobyGang objects, but in fact the algorithm is much more general, and will work with any TeenSuperHeroTeam. It doesn't matter: your code will still work. To achieve the same effect in Java or Haskell, you'd have to have defined a Fooable interface explicitly, and users of your library would have to start their code with a raft of
instance Fooable TheThem where ...
instance Fooable AquaTeenHungerForce where ...
instance Fooable PowerRangers where ...
type stuff. This is in fact how most of my Haskell code ends up looking: I find typing it almost physically painful. Until the RSI starts to kick in, at which point it becomes actually physically painful. And that's in the good case, where the author of the library I'm trying to use has thought about the kind of generalisation I want to make and defined the relevant interfaces.

OK, so far so standard. Now, most duck-typed languages are dynamic, which means that we only try to determine if bar has a spoffle method at runtime, and die with an error message when it doesn't (possibly after trying some error recovery, e.g. by calling an AUTOLOAD method or similar). But it occurred to me that in simple cases (which is to say, the majority), we could work out most of this stuff statically. For each function definition, see what methods are called on its arguments. Recurse into functions called from that function. Now, each time a function is called, try to work out what methods the arguments will support, and see if that includes the interface required by the function. Issue an error if not. Thus we get the code reuse benefits of duck typing, and the correctness benefits of static checking. If the static checking is slowing down your development cycle too much, drop back to fully dynamic checking, and only run the static checks on your nightly builds or something.

This also cleared up something else I'd been vaguely wondering about. In his splendid Drunken Blog Rant Rich Programmer Food, Steve Yegge says
Another problem is that they believe any type "error", no matter how insignificant it might be to the operation of your personal program at this particular moment, should be treated as a news item worthy of the Wall Street Journal front page. Everyone should throw down their ploughshares and stop working until it's fixed. The concept of a type "warning" never enters the discussion.
I'd wondered at the time what a type warning would mean. When is a type error mild enough to only warrant a warning? Here's one idea: a type warning should be issued when it cannot be proved that an argument to a function implements the interface that function needs; a type error should be issued when it can be proved that it doesn't.

This all seemed very interesting, and struck me as potentially a fun and reasonably easy hacking project, at least to get something workable going. But if it's occurred to me, it has probably occurred to someone else, so I asked the Mythical Perfect Haskell Programmer if he was aware of any work that had been done on static duck-typed languages. "Oh yes," he said, "O'Caml's one." Buhuh? Really? Well, that's O'Caml moved a couple of rungs up my "cool languages to learn" stack...
linkReply

Comments:
[User Picture]From: metamoof
2007-08-21 12:59 pm (UTC)
Pychecker and associated tools do a limited form of type checking, but they only take it so far, because the python type system, when you come down to it, is Turing-complete.

Consider the following interactive session:

>>> class scary(object):
... 	def __getattr__(self, name):
... 		try:
... 			return getattr(self, name)
... 		except:
... 			setattr(self, name, self._genmethod(name))
... 			return getattr(self, name)
... 	def _genmethod(self, name):
... 		def _amethod(*args, **kw):
... 			sargs = [repr(arg) for arg in args]
... 			sargs += ["%s=%s" % (key, repr(val)) for (key, val) in kw.items()]
... 			print "called %s(%s)" % (name, ", ".join(sargs))
... 		return _amethod
...
>>> sc = scary()
>>> [method for method in dir(sc) if not method.startswith('_')]
[]
>>> sc.foo('a', 'b')
called foo('a', 'b')
>>> sc.foo('a', 'b', 'c')
called foo('a', 'b', 'c')
>>> [method for method in dir(sc) if not method.startswith('_')]
['foo']
>>> sc.bar('d')
called bar('d')
>>> [method for method in dir(sc) if not method.startswith('_')]
['bar', 'foo']
>>> sc.baz(harry="potter", ron="weasley", hermione="granger")
called baz(hermione='granger', ron='weasley', harry='potter')
>>> [method for method in dir(sc) if not method.startswith('_')]
['bar', 'baz', 'foo']
>>> 


The method for method bit removes all the attributes on the class that are built-in (start with __) or internal (start with _) by convention. Also, never use a straight except: in real code.

How do you type check that without actually running it?

You can't. So why bother?

The whole point is to ignore the type of the object. Your test are supposed to catch the common errors, and likely as not, the uncommon cases are being done by developers getting thing wrong or testing your code anyway. And they should test their code to catch these common errors.

Given that it's impossible to prove a python program's type correctness, we prefer to not bother, and get on with stuff that actually matters. Yes, having type warnings from pychecker et al can sometimes help, but normally they are more of a hindrance, and the sort of type errors that actually worry you won't be picked up by such tools.
(Reply) (Thread)
[User Picture]From: pozorvlak
2007-08-22 11:39 am (UTC)
PyChecker sounds like the kind of thing I had in mind, and I was really only thinking of this as an academic/hobby exercise - I'm a confirmed drinker of the dynamic Kool-Aid myself :-)

And as for
The whole point is to ignore the type of the object. Your test are supposed to catch the common errors, and likely as not, the uncommon cases are being done by developers getting thing wrong or testing your code anyway. And they should test their code to catch these common errors.

Given that it's impossible to prove a python program's type correctness, we prefer to not bother, and get on with stuff that actually matters. Yes, having type warnings from pychecker et al can sometimes help, but normally they are more of a hindrance, and the sort of type errors that actually worry you won't be picked up by such tools.
I can only say: "well said!". The trouble is that this is such a horribly polarised debate: those who believe that compilation is just a weak form and frustrating of testing on one hand (you, me, Steve Yegge, etc) and those who believe that testing is just a weak and labour-intensive form of compilation on the other (the Haskell, Java etc communities - and have you ever tried to argue with Haskell users? The dead, staring eyes, the moans of "monadsss... monadssss..." *shudder*. Almost as bad as Mac users :-) )
(Reply) (Parent) (Thread)