Monday, 18 July 2011

Adding Type Overloading Methods to Python

Python does not have type-based method signatures. One consequence of this IMHO is that a lot of 'dirty code' is being written that would be much improved with the presence of type-signatures. Now I started to write this with a whinge about the liberal use of if isinstance() or if type() statements within Python methods/functions, poor implementation habits, lack of understanding of clean code design and testing, blah, blah, blah... But I think that's just a waste of time for me to write and you to read.  So I'm just going to cut to the chase:

I've implemented a Metaclass that enables type-dispatch on method signatures. What's that mean? Well, here's an example:

from typed_calls import *

class Printer:
    __metaclass__ = ClassWithTypedCalls
 
    def show(self,an_object):
        print str(an_object)
  
    @when('show',list)
    def show_list(self,a_list):
        print 'A list:'+str(a_list)
  
    @when('show',int)
    def show_int(self,an_int):
        print 'An integer:'+str(an_int)
  
if __name__=='__main__' :
    p = Printer()
    p.show('Just a String')
    p.show([1,2,3])
    p.show(67)
    p.show(p)

When this runs you get the following output:
Just a String
A list: [1, 2, 3]
An integer:67
<__main__.Printer object at 0x100497d90>

This syntax/mechanism allows you to override methods based on the type of the first argument. In this case there are overrides in place for list and int but everything else uses the standard method.

Why is this useful?
  • The code is clean (both in the implementer and the user)
  • The unit testing is simple
  • You don't have to add code to other classes (which might be out of your control)
  • It makes writing Visitor patterns much cleaner
An example of Visitors using this code would be:

from typed_calls import *
from copy import copy

class A(object):pass
class B(object):pass

class NodeWithChildren(object):
    def __init__(self,*children):
        self.children=children

class CopyingVisitor():
    __metaclass__ = ClassWithTypedCalls
 
    @when('do',object)
    def make_copy(self,an_object):
        return copy(an_object)
 
    @when('do',NodeWithChildren)
    def do_children(self,a_node_with_children):
        new_node = self.make_copy(a_node_with_children)
        new_node.children = [self.do(c) for c in a_node_with_children.children]
        return new_node
 
class TreePrinter():
    __metaclass__ = ClassWithTypedCalls
  
    def __call__(self,an_object):
        self.depth=0
        self.buffer=''
        self._print(an_object)
        print self.buffer
 
    @when('_print',object)
    def default_print(self,an_object):
        self.buffer += '  '*self.depth
        self.buffer += str(an_object)
        self.buffer += '\n'
  
    @when('_print',NodeWithChildren)
    def node_print(self,a_node):
        self.default_print(a_node)
        self.depth+=1
        for c in a_node.children : self._print(c)
        self.depth-=1
  
if __name__=='__main__' :
    child3 = A()
    child4 = B()
    child1 = NodeWithChildren(child3,child4)
    child2 = A()
    top = NodeWithChildren(child1,child2)
 
    print_out = TreePrinter()
    copier = CopyingVisitor()
 
    print 'Original'
    print_out(top)
    copy_of_top = copier.do(top)
    print
    print 'Copy'
    print_out(copy_of_top)

There are two Visitors in here the CopyVisitor ad the TreePrinter. Note the TreePrinter is implemented as a Functor, too (using the standard Python __call__ as the initiator). I like Visitor Functors as they allow you to implement clean and simply testable approaches to execution-state-based behaviour (e.g. then indentation depth of the tree) without compromising the implementation of the data-structure being walked or having to pass long lists of parameters.

Now you might think, "Isn't this just multi-methods?", and in a way it is except this implementation allows for inheritance - both in the argument types and in the implementers. That is to say, if you implement an override with @when and an object is a subclass it will be recognized. Secondly, the mechanism will look back up the implementation tree to find an appropriate method. Essentially the rule goes: look for a typed method the specific typeof the argument in this class, then in its parent classes, then look for a method for the next abstract type of the argument in this class and its parent classes, [repeat up the argument abstract types....]. An example is easier (and check out the tests):

from typed_calls import *

class A(object):pass
class B(A):pass
class C(B):pass

class BaseClass:
    __metaclass__ = ClassWithTypedCalls
 
    @when('method_call',B)
    def call_in_base(self,argument):
        return 'call_in_base'
 
class TestClass1(BaseClass):
 
    @when('method_call',A)
    def call_in_test_class_1(self,argument):
        return 'call_in_test_class_1'
 
class TestClass2(TestClass1):
 
    @when('method_call',B)
    def call_in_test_class_2(self,argument):
        return 'call_in_test_class_2'
 
test_object1 = TestClass1()
assert test_object1.method_call(A()) == 'call_in_test_class_1'
assert test_object1.method_call(B()) == 'call_in_base'
assert test_object1.method_call(C()) == 'call_in_base'

test_object2 = TestClass2()
assert test_object2.method_call(A()) == 'call_in_test_class_1'
assert test_object2.method_call(B()) == 'call_in_test_class_2'
assert test_object2.method_call(C()) == 'call_in_test_class_2'

Caveats:
  • this only dispatches on the type of the first argument. You can pass more arguments through but they will not effect the dispatching (exercise for reader to extend...)
  • mechanism is relatively efficient, but will not be as fast as doing a hand-written double-dispatch/Visitor pattern
  • when debugging you will see and extra level of stack per call - can be off-putting to some
  • all instances of a class share the same typed-methods. Cannot (currently) override on a per instance basis like you can with normal Python functions
Now as much as I've written this and am sharing it, I would say that this is done as a 'patch' to provide a better/cleaner way for developers who cannot/do not want to write better code.  If you think you have need of this, please consider changing your code to be better by:
  • implementing double-dispatch on your classes. Makes thing much easier to test and allows greater extendability to other programmers without having to change your code
  • not using base types (int, list, dict, etc.) as primary types in your model. Wrap basic types and use trivial sub-classing. In concert with double dispatch makes your code more testable, understandable and extendable
If you do want to use/try this download typed_calls.py and the accompanying tests _test_typed_calls.py

0 comments:

Post a Comment