16 November 2009 @ 06:01 pm
Static analysis of CPython's .c code  
For a while now I've been hearing good things about Coccinelle, a tool for matching and transforming C code.

For example, it's been used for automating tedious (and error-prone) work on the Linux kernel.

I decided it was time to take it for a test-drive on CPython code.

I occasionally run into problems with the PyArg_ParseTuple API. It's a convenient API: a format-string mini-language makes it very easy to marshal the objects passed as parameters to a Python function call into their C equivalents. The downside is that the output pointers are passed as variadic arguments, so the compiler can't check that their types match the format string, and that makes it an area where bugs can lurk.
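To make the hazard concrete, here is a minimal sketch in Python using ctypes. It is an illustration only, not code from CPython: the function name store_int_into_ulong is made up, and the memmove call stands in for a parser that has been told "i" (so it writes sizeof(int) bytes) while the caller actually handed it the address of an unsigned long.

```python
import ctypes


def store_int_into_ulong(value):
    """Mimic a parser told "i" that is handed the address of an unsigned long.

    Hypothetical helper for illustration; it only copies bytes the way a
    format-driven parser would, based on the format code alone.
    """
    # Start with every bit set, standing in for an uninitialised variable.
    slot = ctypes.c_ulong(-1)
    src = ctypes.c_int(value)
    # The "parser" trusts the format code, so it writes exactly sizeof(int)
    # bytes, regardless of how large the destination variable really is.
    ctypes.memmove(ctypes.addressof(slot),
                   ctypes.addressof(src),
                   ctypes.sizeof(ctypes.c_int))
    return slot.value


if __name__ == "__main__":
    # Where long is wider than int, the untouched bytes keep their old
    # contents, so the caller does not read back the value it expected.
    print(hex(store_int_into_ulong(80)))
```

On platforms where int and long have the same size the mismatch happens to be harmless, which is one reason bugs like this can go unnoticed for a long time.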

So I've written a tool which can detect such problems. You can download it from my Fedora people page using this command:
git clone git://fedorapeople.org/home/fedora/dmalcolm/public_git/check-cpython.git

or simply read the code online here:

To run it you'll need Coccinelle (which includes the "spatch" tool). On Fedora you can install it using this command:
yum install coccinelle

You should then be able to run it thus:
spatch -sp_file pyarg-parsetuple.cocci buggy.c

init_defs_builtins: /usr/share/coccinelle/standard.h
HANDLING: buggy.c
buggy.c:13:socket_htons:Mismatching type of argument 1 in ""i:htons"": expected "int *" but got "unsigned long *"

This correctly finds the bug (an old one, fixed in http://svn.python.org/view?view=rev&revision=34931).
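The idea behind the check can be pictured as a table lookup: each format code implies a C pointer type, and each argument's declared type is compared against it. The following is a rough sketch of that idea in Python, not the Coccinelle script from the repository. The names EXPECTED and check_format are hypothetical, the table covers only a few single-character codes, and the caller is assumed to supply the declared types as strings.

```python
# Expected C pointer type for a few PyArg_ParseTuple format codes.
# Deliberately incomplete: multi-character codes such as "s#" or "O!"
# consume more than one argument and are not handled by this sketch.
EXPECTED = {
    "h": "short *",
    "H": "unsigned short *",
    "i": "int *",
    "I": "unsigned int *",
    "l": "long *",
    "k": "unsigned long *",
    "f": "float *",
    "d": "double *",
}


def check_format(fmt, arg_types):
    """Return mismatch messages for a format string and declared arg types."""
    # Drop the trailing ":funcname" or ";message" part and the "|" marker
    # that separates optional arguments; neither consumes an argument.
    codes = fmt.split(":", 1)[0].split(";", 1)[0].replace("|", "")
    problems = []
    for index, (code, actual) in enumerate(zip(codes, arg_types), start=1):
        expected = EXPECTED.get(code)
        if expected is None:
            continue  # code not in the table: this sketch skips it
        if expected != actual:
            problems.append(
                'Mismatching type of argument %d in "%s": '
                'expected "%s" but got "%s"' % (index, fmt, expected, actual)
            )
    return problems


if __name__ == "__main__":
    for message in check_format("i:htons", ["unsigned long *"]):
        print(message)
```

The hard part, which this sketch skips entirely, is extracting the format string and the argument types from real C source; that is the job Coccinelle does.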

Early days yet, but this seems promising. Does anyone know of any other non-proprietary tools that can do this kind of thing?

(I've posted more info to the python-dev list here)