Python is a Very High Level Language (VHLL) with powerful facilities for dynamic typing, reflection, interpretation, and modularity. It is also, however, very slow. While Python code performs much faster than shell scripting, it is slower than, say, Perl equivalents. It runs very quickly, though, when it calls pre-compiled C code in built-in and extension modules.
Python is very easy to extend in C, and building C modules is a very important skill for a Python programmer to have. Once you know how to write special sections in C, Python becomes an excellent 'glue' language in the tradition of Tcl/Tk, and it offers a powerful paradigm: prototype everything in Python, then switch in C where necessary for library interfacing or for greatly increased performance.
You can create Python objects in C, but I will focus here on how to make simple, imperative module functions. I would recommend making the most of the object-oriented facilities of Python rather than creating your objects in C. But any time you need to do something very quickly, seriously conserve memory, perform low-level functions not available in Python, or access a C API, I would advise doing it with a C module.
Here is a simple example that creates the module mymod, which contains the function test. test prints out "Hello!" upon invocation:
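#include "Python.h"

/* Called when Python code runs mymod.test(). */
PyObject * mymod_test(PyObject *self) {
    puts("Hello!");
    Py_INCREF(Py_None);
    return Py_None;
}

/* Method table: name, C function, calling convention, help string. */
static PyMethodDef mymod_methods[] = {
    {"test", (PyCFunction)mymod_test, METH_NOARGS, "Prints test string.\n"},
    {NULL, NULL, 0, NULL}   /* empty entry marking the end of the table */
};

/* Invoked automatically on "import mymod". */
DL_EXPORT(void) initmymod(void)
{
    Py_InitModule3("mymod", mymod_methods, "Provides a test function.\n");
}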
mymod_test is the actual function invoked when test is called from Python. All Python functions return a Python object. We don't have anything important to return, so we just return None, which is available in C as Py_None. Python employs reference counting for garbage collection. There is only one None object, and returning it hands a reference to the caller. If the caller does not retain the result (i.e. if no one says "x = mymod.test()"), that reference is dropped right away and the count is decreased. Once an object's reference count gets to zero, the object is released. So, to prevent the release of None, we need to increment its reference count to balance the decrement that will happen to it. Understanding this is important. If you are concentrating on the C code and just using Python objects as transports, you aren't likely to get burned by this, though.
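The counting described above can be watched from the Python side. The following is only a minimal sketch and not part of the module: it assumes CPython and uses sys.getrefcount, and the names Payload and holder are hypothetical, chosen for the illustration. Absolute counts vary between interpreter versions, so only the differences are meaningful.

```python
import sys


class Payload(object):
    """Hypothetical placeholder class, used only for this illustration."""


obj = Payload()
before = sys.getrefcount(obj)         # absolute value is interpreter-specific

holder = [obj]                        # the list retains one more reference
after_retain = sys.getrefcount(obj)

del holder                            # dropping the list releases that reference
after_release = sys.getrefcount(obj)

print("change while retained: %d" % (after_retain - before))
print("change after release:  %d" % (after_release - before))
```

A C function that returns an object plays the role of the list here: it has to own the reference it hands out.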
Once you are in C, you are pretty much free to do anything you want. You can call C APIs, allocate and free memory, etc. In the case of our example we just call puts to print some text. Your function gets registered in a method table as the module is initialized. The example shows the basic sequence for doing this. The string explanations are actually help strings that will be visible from Python. Note that your PyMethodDef table needs to have an empty entry at its end. The name initmymod is "init" followed by whatever the name of your module is. This needs to be consistent, as Python invokes the function automatically when the module is imported.
To compile the module, you can build it into Python or build it as an extension module. The details vary quite a bit, especially on Windows, but a basic command for building an extension module with GCC is (for a source file named "mymod.c"):
gcc -shared -fPIC -I/usr/include/python2.3 mymod.c -o mymod.so
In the Python interpreter, you can now load the module by entering "import mymod" while in the same directory as the module (you can also put it in your Python library directory or add its location to the PYTHONPATH environment variable). You can run the function by entering "mymod.test()" and you can see the help strings by entering "help(mymod)".
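As a rough sketch of that session in script form, the following tries to import the module and falls back to a message if it has not been built. The helper load_mymod is a hypothetical name introduced here; the sketch assumes mymod.so, if present, was built from the example above, so that the help strings show up as the usual __doc__ attributes.

```python
import importlib


def load_mymod():
    """Return the compiled mymod module, or None if it cannot be imported here."""
    try:
        return importlib.import_module("mymod")
    except ImportError:
        return None


mymod = load_mymod()
if mymod is None:
    print("mymod not found; build mymod.so and run this from its directory")
else:
    mymod.test()                  # runs the C function mymod_test
    print(mymod.__doc__)          # help string passed to Py_InitModule3
    print(mymod.test.__doc__)     # help string from the method table
```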
Okay, now handling arguments in the function is just a small modification.
PyObject * mymod_test(PyObject *self) {
/* .... */
static PyMethodDef mymod_methods[] = {
{"test", (PyCFunction)mymod_test, METH_NOARGS, "Prints test string.\n"},
becomes
PyObject * mymod_test(PyObject *self, PyObject *args) {
/* .... */
static PyMethodDef mymod_methods[] = {
{"test", mymod_test, METH_VARARGS, "test(int, object)\n"},
Now we are ready to handle arguments.
Note that every Python object is represented as a PyObject struct. In the case of arguments passed to a C function, args is actually a Python tuple. You can get the tuple's size with the PyTuple_GET_SIZE macro (e.g. "int size = PyTuple_GET_SIZE(args);"). Extracting the arguments from the tuple is usually done with the PyArg_ParseTuple function. That function takes a format string and the addresses of the variables that receive the tuple's data:
PyObject * mymod_test(PyObject *self, PyObject *args) {
    int x;
    PyObject *obj;
    if (! PyArg_ParseTuple(args, "iO", &x, &obj)) {
        return NULL;
    }
    printf("%d\n", x);
    /* "O" yields a borrowed reference; take our own before returning it. */
    Py_INCREF(obj);
    return obj;
}
This takes two arguments, an integer (the "i" in the format string) and an object (the "O"). It sets x to the integer, prints it, and returns the same object that was given to it as the second argument. The object pointer filled in by "O" is a borrowed reference, so its reference count is incremented before it is returned, for the same reason as with Py_None. PyArg_ParseTuple automatically prepares a reasonable Python exception if the tuple doesn't parse correctly, and returning NULL tells the Python engine that an exception occurred. We can also raise exceptions explicitly like this:
if (x > 1000) {
    PyErr_SetString(PyExc_Exception, "The number is too large!");
    return NULL;
}
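A convenient function, PyErr_SetFromErrno, sets up an exception from the standard C library's errno. It always returns NULL, so its result can be returned directly (open and O_RDONLY come from fcntl.h):
if (open("d3iohwdkionc290j4fj", O_RDONLY) < 0) {
    /* Sets the exception from errno and hands back NULL. */
    return PyErr_SetFromErrno(PyExc_OSError);
}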
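From the Python side, an errno-based exception carries the error number and its message. The sketch below does not use mymod; it assumes that CPython's own os.open reports a failed open in this errno-based fashion, and try_open is a hypothetical helper name. The file name is the one from the C snippet and is assumed not to exist.

```python
import errno
import os


def try_open(path):
    """Return (errno, message) if opening path fails, else None."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return exc.errno, exc.strerror
    os.close(fd)
    return None


result = try_open("d3iohwdkionc290j4fj")
print(result)
if result is not None:
    print(errno.errorcode.get(result[0]))   # symbolic name of the error number
```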
Additional exception types can be found in pyerrors.h, in the directory where your Python include files are located (e.g. /usr/include/python2.2). Lots of format codes are available for PyArg_ParseTuple. For example, "s" is a char * string, "d" is a double, and "s#" is a useful format that fills in two variables, the char * string and an int length (this way you can receive strings with embedded NUL characters).
Now, there is a function that is the converse of extracting arguments; it prepares Python objects for returning. This function is Py_BuildValue. It takes a format string followed by values and returns a PyObject pointer. The format codes are largely the same as before. If two or more format codes are used (i.e. if there are two or more values being stored), it returns a tuple containing objects for those values. Note that Py_BuildValue takes the actual values of ints, doubles, etc. rather than their addresses. The following returns two int objects in a tuple. Their value is the same as was passed in args:
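int x;
PyObject *num;
PyObject *tuple;
if (! PyArg_ParseTuple(args, "i", &x)) {
    return NULL;
}
num = Py_BuildValue("i", x);
if (num == NULL) {
    return NULL;
}
/* "O" adds its own reference to num, so release ours afterwards. */
tuple = Py_BuildValue("iO", x, num);
Py_DECREF(num);
return tuple;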
An interesting feature of the format strings is the ability to nest containers. Parentheses work for both parsing and building: "(ii)" handles a tuple of two integers in either direction. Py_BuildValue additionally understands brackets and braces for lists and dictionaries. For example, "[i,i,i]" builds a list of three integers, and "{s:i}" builds a dictionary with a char * key and an int value. PyArg_ParseTuple has no list or dictionary codes, so take such arguments with "O" and examine the object yourself. Cool beans!
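As a quick reference for the building direction, the table below pairs a few Py_BuildValue format strings with the kind of Python value a caller would get back. It is plain Python data, not a call into the C API, and the concrete values are made up for illustration.

```python
# Py_BuildValue format string -> shape of the Python value the caller receives.
# The values themselves are illustrative only.
built = [
    ("i",       7),
    ("ii",      (7, 8)),       # two or more codes produce a tuple
    ("(ii)",    (7, 8)),       # explicit tuple
    ("[i,i,i]", [1, 2, 3]),    # list
    ("{s:i}",   {"key": 4}),   # dictionary with a string key
]

for fmt, value in built:
    print("%-10s %-6s %r" % (fmt, type(value).__name__, value))
```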
Although Python is multithreaded, threads are switched at the bytecode level. Only one thread is actually inside the interpreter, and so inside your C function, at a time, unless you relinquish the interpreter lock temporarily (as some blocking I/O functions do). You do this with the macros Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. To leave such a block in the middle, you must first insert Py_BLOCK_THREADS:
Py_BEGIN_ALLOW_THREADS
    /* .... */
    if (surprise_condition) {
        Py_BLOCK_THREADS
        PyErr_SetFromErrno(PyExc_IOError);
        return NULL;
    }
    /* .... */
Py_END_ALLOW_THREADS
These macros are normally harmless if you don't have threading enabled; they just don't release the thread. Threading inside a Python function is pretty complex, but you can easily handle a blocking function with those macros. Note that they don't take a semicolon; the begin and end macros actually open and close their own bracketed block. Py_BLOCK_THREADS takes the interpreter lock back, which you need to do before calling the Python API again or returning from the middle of that block.
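The effect of releasing the thread around a blocking call can be observed from Python. This sketch assumes CPython, where time.sleep gives up the interpreter lock while it waits; blocking_call is a hypothetical stand-in for a C function wrapped in the macros above. Four half-second waits in four threads should then take roughly half a second in total rather than two seconds.

```python
import threading
import time


def blocking_call(seconds):
    # Stand-in for a C function that releases the interpreter lock while it blocks.
    time.sleep(seconds)


start = time.time()
threads = [threading.Thread(target=blocking_call, args=(0.5,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

print("4 threads x 0.5 s took %.2f s" % elapsed)
```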
One thing that the C bridge is really useful for is calling C libraries. For example, this will produce an SDL window on the screen (this will require that the SDL libraries are installed):
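#include "SDL.h"

SDL_Surface *sdl_screen;

PyObject * mymod_test(PyObject *self, PyObject *args) {
    int x;
    if (! PyArg_ParseTuple(args, "i", &x)) {
        return NULL;
    }
    /* SDL_Init returns a negative value on failure. */
    if (SDL_Init(SDL_INIT_VIDEO) < 0) {
        PyErr_SetString(PyExc_RuntimeError, SDL_GetError());
        return NULL;
    }
    /* Open an x-by-x window with 8 bits per pixel. */
    sdl_screen = SDL_SetVideoMode(x, x, 8, SDL_SWSURFACE);
    if (sdl_screen == NULL) {
        PyErr_SetString(PyExc_RuntimeError, SDL_GetError());
        return NULL;
    }
    Py_INCREF(Py_None);
    return Py_None;
}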
To build it, you need to make sure you link in the library you are using (the build won't complain if you omit the "-lSDL"; the error only appears when you actually load your module in Python):
gcc -shared -fPIC -I/usr/include/python2.3 -I/usr/include/SDL mymod.c -lSDL -o mymod.so
Well, you now know enough to build your own C-based Python modules. It really is quite easy to do. The Python C library is fairly extensive, and you'll find there are a variety of ways to do things. You can find the API reference here:
http://www.python.org/doc/current/api/api.html
Happy hacking!