June 24th, 2011

Verifying the more awkward parts of the CPython API

I've been running my Python extension module static analyser over CPython itself (the latest in the 2.7 hg branch, specifically).

I'm pleased to say that the project's mailing list received the first patch to the checker from someone other than me (thanks Tom!). He's been running it over the sources of gdb (which embeds Python).

These make for good torture tests for the analyser, and I'm pleased with how well it held up. Some bugs do remain - in the checker, that is.

There are quite a few places where CPython calls one of the PyArg_ functions with a format code that expects a "const char*", but passes a "char*". I think the ones in CPython are all false positives, so we're going to need to make that check configurable.

Perhaps the most fiddly part of checking this API is the "O&" conversion code - I wasn't able to handle this in my previous Coccinelle-based approach to this problem.

Here's an example:
    69	extern int convert_to_ssize(PyObject *, Py_ssize_t *);
    71	PyObject *
    72	buggy_converter(PyObject *self, PyObject *args)
    73	{
    74	    int i;
    76	    if (!PyArg_ParseTuple(args, "O&", convert_to_ssize, &i)) {
    77	        return NULL;
    78	    }
    80	    Py_RETURN_NONE;
    81	}

The idea is that you supply a conversion callback, which receives the Python object and writes the converted value through the pointer given as the next argument.

The above example has a bug (can you see it?)

After a fair amount of coding today, the checker is now able to detect it:

[david@fedora-15 gcc-python]$ ./gcc-with-python cpychecker.py $(python-config --cflags) demo.c
demo.c: In function ‘buggy_converter’:
demo.c:76:26: error: Mismatching type in call to PyArg_ParseTuple with format code "O&" [-fpermissive]
  argument 4 ("&i") had type
    "int *" (pointing to 32 bits)
  but was expecting
    "Py_ssize_t *" (pointing to 64 bits) (from second argument of "int (*fn) (struct PyObject *, Py_ssize_t *)")
  for format code "O&"
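
To show why the size mismatch in that report matters, here is a rough Python model of the buggy call using ctypes. It is only a sketch: the Frame structure and call_with_int_slot are hypothetical names invented for this illustration, and the "neighbour" field merely stands in for whatever happens to sit next to the int in memory. The converter here is a Python stand-in for the C callback, not the real thing.

```python
import ctypes


class Frame(ctypes.Structure):
    # Hypothetical stand-in for the memory holding `int i` plus
    # whatever happens to live directly after it.
    _fields_ = [("i", ctypes.c_int), ("neighbour", ctypes.c_int)]


def convert_to_ssize(value, out_ptr):
    """Model of an "O&" converter: store the result through out_ptr
    and return 1 to signal success."""
    out_ptr.contents.value = value
    return 1


def call_with_int_slot(value):
    """Model of the buggy call: an `int *` is handed to a converter
    that writes a whole Py_ssize_t through it."""
    frame = Frame(i=0, neighbour=1234)
    # Reinterpret the address of `i` as a Py_ssize_t pointer, which is
    # effectively what the C code does when it passes &i.
    as_ssize_ptr = ctypes.cast(ctypes.pointer(frame),
                               ctypes.POINTER(ctypes.c_ssize_t))
    convert_to_ssize(value, as_ssize_ptr)
    return frame.i, frame.neighbour
```

On a platform where Py_ssize_t is 64 bits wide and int is 32 bits wide, call_with_int_slot(-1) returns (-1, -1): the write spills past i and overwrites the neighbouring value. Where the two types happen to have the same size, the neighbour keeps its original 1234.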

Notice how the checker used the type of the callback to figure out what the type of the next argument must be.
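
The rule can be sketched in a few lines of Python. This is not the checker's actual code, just a minimal illustration under simplifying assumptions: types are compared as plain strings, whereas the real checker works with the types GCC reports, and the names FunctionType and check_o_ampersand are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FunctionType:
    """Hypothetical, string-based description of a C function type."""
    return_type: str
    param_types: Tuple[str, ...]


def expected_type_after_converter(converter: FunctionType) -> str:
    """For "O&", the argument that follows the converter must have the
    same type as the converter's second parameter."""
    if len(converter.param_types) != 2:
        raise ValueError("an O& converter takes exactly two parameters")
    return converter.param_types[1]


def check_o_ampersand(converter: FunctionType,
                      actual_arg_type: str) -> Optional[str]:
    """Return a diagnostic string on mismatch, or None if the types agree."""
    expected = expected_type_after_converter(converter)
    if actual_arg_type != expected:
        return ('Mismatching type for format code "O&": had type "%s" '
                'but was expecting "%s"' % (actual_arg_type, expected))
    return None


# The converter from the example above, described by hand.
convert_to_ssize = FunctionType("int", ("struct PyObject *", "Py_ssize_t *"))
```

With this, check_o_ampersand(convert_to_ssize, "int *") produces a diagnostic naming "Py_ssize_t *" as the expected type, while passing "Py_ssize_t *" returns None.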

I've also reformatted the error messages slightly, adding newlines and indentation to try to make them easier to grok.

Hopefully we'll shake out the rest of the bugs soon, and then on to reference-count checking...

It's still rather rough around the edges, but if you want to try running it on your extension module, then come and join us.