BMI Students

Friday, August 25, 2006

Mixing Python and C

There are a bunch of ways to mix Python and C, including Pyrex (nice technology, but non-standard, so hard to distribute), Boost.Python (never tried it, looks OK), SWIG (good, but requires some heavy lifting; better suited to large-scale projects), and PyCXX, which Zach mentioned on this blog before (never tried it).

First, before resorting to C, try the excellent psyco module, which gets you a free speedup and requires no work (and if you like the cut of its jib, google for PyPy). The only catch is that psyco is i386-specific.

My preferred way to use C with Python is to write the boilerplate C myself. This sounds stupid/hairy, but once you have the minimal code in place it becomes quite easy to extend. This is especially true if you are doing what I imagine to be the typical Python/C mix: calling a C function from Python with an array to operate on, and getting an array or a number in return (e.g. replacing a slow matrix-operation loop). Smith-Waterman would be a good example: write it in Python, then replace the Smith-Waterman function with C, and verify the C version by comparing its output to that of the Python version, which I assume is correct but slow (for instance, it might use strings that are easy for a human to parse). I am also assuming that you are using Numeric/numpy arrays and not Python lists, which is likely/advisable for these kinds of number-crunching tasks.
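The "write it in Python first, then check the C against it" workflow can be sketched roughly as follows. This is only an illustration under my own assumptions: the scoring values (match 2, mismatch -1, linear gap -1), the function names, and the stand-in for the C version are all hypothetical, and the code targets a current Python 3 rather than the Python 2 used elsewhere in this post.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    # prev holds the previous row of the scoring matrix; row 0 is all zeros.
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always zero
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def disagreements(candidate, reference, cases):
    """Return the inputs on which candidate and reference give different results."""
    return [args for args in cases if candidate(*args) != reference(*args)]


if __name__ == "__main__":
    cases = [("ACGT", "ACGT"), ("AAAA", "TTTT"), ("GATTACA", "GCATGCU")]
    # fast_score stands in for the C version; here it is just the reference itself.
    fast_score = smith_waterman_score
    print(disagreements(fast_score, smith_waterman_score, cases))
```

An empty list from disagreements means the replacement matched the slow reference on every test case; anything else is the exact input to debug.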

In that spirit, and to save others the time I have wasted, below is a very small example C program, a Python program that calls it, and a "setup.py" file to build the C shared object that Python imports.


First, the C code. This code is very simple. It takes the Python Numeric/numpy array as an argument, along with its length (you could also null-terminate the array instead). The C file needs two headers: Python.h, which should be on your include path, and arrayobject.h, a Numeric header that may not be (you can copy it into the directory for testing).

Note how the C array is just the data part of the Numeric array cast as int* (c_segs_array = (int *)segs_array->data;). At the end of the function a "PyArrayObject" is built from this C array and returned using "Py_BuildValue". The ease of translation between C arrays and Numeric arrays is key, and simplifies the whole process.

Note that c_segs_array must be cast as "char*" for the "PyArray_FromDimsAndData" function.
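If the int*/char* casting seems abstract, here is a rough analogy you can try without compiling anything. It is not Numeric: it uses the standard library's array module and memoryview.cast, so it assumes Python 3.3 or later, and the variable names are just for illustration. The point is only that one block of memory can be viewed as ints or as raw bytes without copying.

```python
from array import array

# A contiguous block of C ints, much like the data part of a 1D integer array.
values = array("i", [3, 1, 4, 1, 5])

ints = memoryview(values)   # a view of the same buffer, no copy is made
raw = ints.cast("B")        # the same memory seen as bytes (like char*)
ints_again = raw.cast("i")  # and reinterpreted as C ints again (like int*)

ints_again[0] = 42          # writing through the view changes the original
print(values[0], len(raw) == len(values) * values.itemsize)
```

Because nothing is copied, a change made through any of the views shows up in the original array, which is the same reason the C function can hand its int* straight back to Python.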

The method table and the module init function that follow mintest are boilerplate, and won't change much. No doubt some of this C file is mysterious, but any function that takes a Numeric array or number as input and returns an array or number can just be slotted into the mintest function.




#include "Python.h"
#include "Numeric/arrayobject.h"


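static PyObject *
mintest(PyObject *self, PyObject *args, PyObject *kwargs) {

    //-----------------------
    //List arguments/keywords
    //-----------------------
    static char *kwlist[] = {"py_segs", "num_segs", NULL};

    int i;
    int num_segs;
    int dims[1];

    PyObject *py_segs;
    PyArrayObject *segs_array;
    PyArrayObject *return_array;
    int *c_segs_array;

    //---------------
    //Parse the input
    //---------------
    //The name after the colon is the one used in error messages
    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "Oi:mintest", kwlist,
                                     &py_segs, &num_segs)) {
        return NULL;
    }

    //-------------------------------------------
    //Make C arrays from my python numeric arrays
    //-------------------------------------------
    //The last two arguments are the minimum and maximum number of dimensions
    segs_array = (PyArrayObject *)PyArray_ContiguousFromObject(py_segs, PyArray_INT, 1, 1);
    if (segs_array == NULL) {
        return NULL;
    }
    //Refuse a length that would read past the end of the array
    if (num_segs < 0 || num_segs > segs_array->dimensions[0]) {
        Py_DECREF(segs_array);
        PyErr_SetString(PyExc_ValueError, "num_segs does not fit the array");
        return NULL;
    }
    c_segs_array = (int *)segs_array->data;

    for (i = 0; i < num_segs; i++) {
        fprintf(stderr, "C testing %d\n", c_segs_array[i]);
    }

    //----------------
    //Return the array
    //----------------
    //segs_array is not DECREF'd here because return_array borrows its buffer;
    //this leaks a reference, and a more careful version would copy the data
    dims[0] = num_segs;
    return_array = (PyArrayObject *)PyArray_FromDimsAndData(1, dims, PyArray_INT, (char*)c_segs_array);
    if (return_array == NULL) {
        return NULL;
    }
    //"N" hands our reference to the returned tuple ("O" would leak one)
    return Py_BuildValue("Ni", return_array, num_segs);
}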


static PyMethodDef mintestMethods[] = {
    {"mintest", (PyCFunction)mintest, METH_VARARGS|METH_KEYWORDS,
     "HELP for minimal_test\n"},
    {NULL, NULL, 0, NULL} /* Sentinel -- don't change */
};

PyMODINIT_FUNC
initmintest(void) {
    (void) Py_InitModule("mintest", mintestMethods);
    import_array();
}


Now setup.py. This is simply a distutils file that tells Python how to build the C file. As with any Python module, you type "python setup.py build" to build it, and "python setup.py install" to install it. For testing, I usually just build it (which makes a build directory), then make a symbolic link in the main directory (ln -s build/lib.linux/mintest.so mintest.so; the exact name of the lib directory depends on your platform and Python version).



from distutils.core import setup,Extension

module1 = Extension('mintest',sources=['mintest.c'])

setup(name = 'mintest',
version = '1.0',
description = 'minimum C test',
ext_modules = [module1])


#To pass compiler flags, add e.g. extra_compile_args = ["-O4"] to the Extension(...) call above.
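As an alternative to the symbolic link mentioned above, a test script could look for the build directory itself and put it on sys.path before importing the module. This is just a sketch: it assumes distutils' usual build/lib.<platform>-<version> layout, and both helper names are made up for illustration.

```python
import glob
import os
import sys


def find_build_lib_dirs(build_root="build"):
    """Return the lib.* directories found under build_root, sorted by name."""
    return sorted(glob.glob(os.path.join(build_root, "lib.*")))


def add_build_dirs_to_path(build_root="build"):
    """Put any lib.* directories under build_root at the front of sys.path."""
    found = find_build_lib_dirs(build_root)
    # Insert in reverse so the first directory found ends up first on the path.
    for directory in reversed(found):
        if directory not in sys.path:
            sys.path.insert(0, directory)
    return found
```

Calling add_build_dirs_to_path() at the top of the test script, before "import mintest", would then pick up whatever was built most recently, with no link to maintain.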


Finally, the Python program, which is hopefully self-explanatory.


import random
import Numeric as N
import mintest

#Make a 1D array of length 10
pyarray_length = 10
pyarray = N.array([random.randrange(100) for i in range(pyarray_length)])

#Print out the array as Python sees it
print "Python printing array", type(pyarray), pyarray, pyarray_length

#Get the same array after passing it to C and back
carray, carray_length = mintest.mintest(pyarray, pyarray_length)

#Finally print out the returned array
print "Array after going through C", type(carray), carray, carray_length


And that's it! Pretty easy once you know how.

Tuesday, August 22, 2006

infosthetics

Following on from Zach's junk charts post, this infosthetics blog is sweet.
www.infosthetics.com

Saturday, August 05, 2006

ANSI Escape Codes in Python

ANSI escape codes are surprisingly useful. In Python, each code starts with the escape sequence "\x1b[". Here is an example loading bar. This is a Unix thing; it won't work on Windows.



import sys

sys.stderr.write("\x1b[34mloading[" + " "*10 + "]\x1b[0m\r")
sys.stderr.write("\x1b[8C")
for i in range(10):
    sys.stderr.write('.')
sys.stderr.write('\n')

Here "\x1b[34m" sets the foreground colour to blue (31 would be red), and "\x1b[8C" moves the cursor right 8 columns, to just after the "[".

It prints out something like this, but with "loading[" and "]" in blue:
loading[..........]
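
If you use these codes a lot, it may be worth wrapping them in a couple of small helpers so the magic numbers live in one place. A minimal sketch follows; the helper names are my own invention, and the numbers are the standard foreground colour codes 30 to 37.

```python
import sys

ESC = "\x1b["        # every code starts with this sequence
RESET = ESC + "0m"   # back to the terminal's default attributes

# Standard foreground colour codes.
COLOURS = {"black": 30, "red": 31, "green": 32, "yellow": 33,
           "blue": 34, "magenta": 35, "cyan": 36, "white": 37}


def coloured(text, colour):
    """Wrap text in the codes that set the colour and then reset it."""
    return "%s%dm%s%s" % (ESC, COLOURS[colour], text, RESET)


def cursor_right(columns):
    """Return the code that moves the cursor right by the given number of columns."""
    return "%s%dC" % (ESC, columns)


if __name__ == "__main__":
    sys.stderr.write(coloured("loading", "red") + "\n")
```

With these, the first line of the loading bar becomes coloured("loading[" + " "*10 + "]", "blue") + "\r", followed by cursor_right(8).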

http://en.wikipedia.org/wiki/ANSI_escape_code