Bob Ippolito (@etrepum) on Haskell, Python, Erlang, JavaScript, etc.

Pure python "structseq" factory


This is a pattern I've been playing with lately (to be used in the next version of aeve, in the aeut/aete decoder). Basically it's a factory for "tuples with a little metadata that can be used a little bit like dicts too". Yes, I know the dict-like methods (keys, items, iteritems, values) will stomp on field names if there's a clash, but I don't have any use cases where that happens, so my implementation doesn't handle it.

import itertools

def newNamedTupleType(typename, names):
    """Return a subtype of tuple with named accessors"""
    # one read-only property per field; _i=_i binds the index when the lambda is created
    dct = dict([(_name, property(lambda self,_i=_i:self[_i], doc=_name)) for (_i, _name) in enumerate(names)])
    dct['__names__']    = names
    # dict-like helpers, assigned last, so they replace any field with the same name
    dct['keys']         = lambda self: self.__names__
    dct['items']        = lambda self: zip(self.__names__, self)
    dct['iteritems']    = lambda self: itertools.izip(self.__names__, self)
    dct['values']       = lambda self: self
    return type(typename, (tuple,), dct)
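
For anyone reading along on Python 3, here is a rough port of the factory with a quick demonstration, including the name clash mentioned earlier. Treat it as a sketch: the snake_case name `new_named_tuple_type` and the `Point` and `Clash` types are made up for illustration, and `iteritems` is dropped because `itertools.izip` no longer exists there.

```python
def new_named_tuple_type(typename, names):
    """Python 3 port of the factory above (hypothetical name)."""
    names = tuple(names)
    # One read-only property per field; _i=_i freezes the index per lambda.
    dct = {name: property(lambda self, _i=_i: self[_i], doc=name)
           for _i, name in enumerate(names)}
    dct['__names__'] = names
    # The dict-like methods are assigned last, so they win any name clash.
    dct['keys'] = lambda self: self.__names__
    dct['items'] = lambda self: list(zip(self.__names__, self))
    dct['values'] = lambda self: self
    return type(typename, (tuple,), dct)


Point = new_named_tuple_type('Point', ('x', 'y'))
p = Point((3, 4))            # built from one iterable, like tuple itself
print(p.x, p[1])             # 3 4
print(dict(p.items()))       # {'x': 3, 'y': 4}
print(p == (3, 4))           # True: still an ordinary tuple

# The clash: a field called "keys" loses to the dict-like method.
Clash = new_named_tuple_type('Clash', ('keys', 'size'))
c = Clash(('k', 10))
print(c.keys())              # ('keys', 'size'), not the field value
print(c[0])                  # k, still reachable by index
```

Note that the type is constructed from a single iterable, exactly like `tuple`, which is why it pairs so naturally with the return value of `struct.unpack`.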

It's primarily useful in conjunction with the struct module. For example, to unpack a version record, which is a pair of 16-bit words, you could use something like the following:

import struct
VersionRecord = newNamedTupleType('VersionRecord', ('major', 'minor'))

def decodeVersionRecordFromPackedString(s):
    return VersionRecord(struct.unpack('HH', s))

def prettyPrintVersionRecordFromPackedString(s):
    rec = decodeVersionRecordFromPackedString(s)
    print "%(major)s.%(minor)s" % dict(rec.iteritems())

Yes, the example is contrived. But it really does make things easier when you're passing non-trivial versions of these around. Instead of having to remember the index of a field, you can just remember the name. In my typical use case, these names come from C structures, so it makes the Python code closer to what a C implementation would be like (in a good way).
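
To make the "names come from C structures" point a bit more concrete, here is a sketch of a slightly larger record. It is not the factory from this post: it swaps in the standard library's `collections.namedtuple`, which covers similar ground in current Python. The C layout, `RecordHeader`, `HEADER_FORMAT`, and `decode_record_header` are all invented for the example.

```python
import struct
from collections import namedtuple

# Hypothetical C layout, invented for this example:
#   struct record_header {
#       uint16_t major; uint16_t minor;
#       uint32_t length; uint32_t flags;
#   };
HEADER_FORMAT = '<HHII'   # explicit little-endian, no padding
RecordHeader = namedtuple('RecordHeader', 'major minor length flags')


def decode_record_header(data):
    """Unpack the packed bytes and attach the C field names."""
    return RecordHeader._make(struct.unpack(HEADER_FORMAT, data))


packed = struct.pack(HEADER_FORMAT, 1, 2, 512, 0x10)
hdr = decode_record_header(packed)
print(hdr.length)                              # 512, no index to remember
print("%(major)s.%(minor)s" % hdr._asdict())   # 1.2
```

The field names line up one-to-one with the C declaration, so the Python side reads much like the C code that would consume the same bytes.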

I have a more generic (but mutable and not nearly as lightweight) solution to this problem (making structures easier to work with), currently called "ptypes" since it is so much like ctypes Structures API-wise. It's currently being used in a pure Python package (macholib, formerly potool) for reading and writing Mach-O headers (Mach-O being the OS X analog of Linux's ELF or Win32's PE). macholib is useful for dependency walking and for automating install_name_tool-like functionality. However, since it also understands symbol tables, it would be useful for determining whether a particular source file contains Objective-C class definitions, which is good to know because Objective-C classes cannot be unloaded at runtime. It could even be used to write a symbolic debugger if one were so inclined. I have written a largely untested symbolic PowerPC disassembler in Python, so it would be neat to hook the two together eventually.
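
For a feel of the API style being compared against, here is what a plain ctypes Structure for the 32-bit Mach-O header might look like. To be clear, this is ctypes itself, not ptypes or macholib, whose actual interfaces aren't shown in this post. The field list follows `struct mach_header` from `<mach-o/loader.h>`; the class name `MachHeader` and the sample values are made up.

```python
import ctypes
import struct


class MachHeader(ctypes.LittleEndianStructure):
    """32-bit Mach-O header, fields as in struct mach_header."""
    _fields_ = [
        ('magic', ctypes.c_uint32),
        ('cputype', ctypes.c_int32),
        ('cpusubtype', ctypes.c_int32),
        ('filetype', ctypes.c_uint32),
        ('ncmds', ctypes.c_uint32),
        ('sizeofcmds', ctypes.c_uint32),
        ('flags', ctypes.c_uint32),
    ]


MH_MAGIC = 0xfeedface

# Sample bytes for the demo; the values after the magic are arbitrary.
raw = struct.pack('<IiiIIII', MH_MAGIC, 7, 3, 2, 12, 1024, 0x85)

hdr = MachHeader.from_buffer_copy(raw)
print(hex(hdr.magic), hdr.ncmds)   # 0xfeedface 12

# Unlike the tuple-based records, fields can be modified in place
# and the structure written back out as bytes.
hdr.ncmds += 1
print(bytes(hdr) == raw)           # False
```

The trade-off is the one described above: a mutable, read/write structure with more machinery behind it, versus an immutable tuple with a few properties bolted on.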