Design Document for Porting EMAN to Python


Introduction

Reference

Unresolved Issues

Design

  1. Nothing implemented in this project should change the existing EMAN C++ code.

    Classes that have no pointers in their public interfaces or constructors are wrapped directly with Boost.Python. No change needs to be made to the EMAN C++ code.

    For classes that do have pointers in their public interfaces or constructors, define corresponding public subclasses. In each subclass, replace those pointers with smart pointers and references, then wrap the subclass with Boost.Python.
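
    The subclass approach can be sketched in plain C++, without any Boost.Python code. The names Image, PyImage and copy_to are hypothetical stand-ins, not EMAN classes; the sketch only shows a public subclass that offers a reference-based version of a pointer-taking function while the base class stays untouched.

```cpp
#include <cassert>

// Hypothetical stand-in for an EMAN class with a pointer in its public interface.
class Image {
public:
    Image() : mean_(0.0f) {}
    virtual ~Image() {}
    void set_mean(float m) { mean_ = m; }
    float mean() const { return mean_; }
    // Pointer-based interface: copies this image's state into *dest.
    void copy_to(Image* dest) const { dest->mean_ = mean_; }
private:
    float mean_;
};

// Public subclass: the same operation with the pointer replaced by a reference.
// The base class is not modified; only this subclass would be wrapped.
class PyImage : public Image {
public:
    void copy_to(Image& dest) const { Image::copy_to(&dest); }
};

// Exercises the reference-based interface and returns the copied value.
float demo_copy() {
    PyImage src;
    src.set_mean(2.5f);
    PyImage dst;
    src.copy_to(dst);
    return dst.mean();
}
```

    Because PyImage only forwards to the base class, the original pointer-based function remains available to existing C++ callers.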

  2. The Python classes should have the same names as the C++ classes.

    When wrapping a subclass of an EMAN class, expose the subclass under its base class's name. For example, EMAN has an EMData class, and its subclass is called PyEMData. Here is how they are wrapped in Boost.Python:

    boost::python::module_builder pyEM("pyEM");
    boost::python::class_builder<EMData> _EMData(pyEM, "_EMData");
    boost::python::class_builder<PyEMData> EMData_class(pyEM, "EMData");
    EMData_class.declare_base(_EMData, boost::python::without_downcast);
    	
    In the above code, a module called 'pyEM' is defined. The C++ class EMData is then added to the 'pyEM' module as '_EMData' in Python, and its subclass PyEMData is added as 'EMData'. Both are needed because PyEMData overrides only those functions of the C++ class EMData that have pointers in their signatures. PyEMData therefore ports the overridden functions, and EMData ports the functions that are not overridden. In the Python module, users should use only the 'EMData' class, never the '_EMData' class.

  3. A native pointer or reference parameter that a C++ function uses to return new values should behave the same way in Python.

    Because Python has no pointers, a value can be returned through a function argument only by reference. In addition, Python doesn't support references to native types (float, int, string, etc.): these objects are immutable, so a function cannot change the caller's value. To port C++ functions that return values through native-type pointer arguments, wrapper classes replace the native types. For example, the following code:

    void foo(float* num_ptr);

    will be changed to

    void py_foo(Float num_obj);

    On the Python side, users pass 'Float' objects to receive the values returned through function arguments.
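
    A minimal sketch of such a wrapper class follows, in plain C++ with no Boost.Python code. The class name Float comes from the text above; its get() and set() members, and passing it by reference, are assumptions made for illustration only.

```cpp
#include <cassert>

// Original-style function: returns a value through a native pointer.
void foo(float* num_ptr) { *num_ptr = 3.0f; }

// Wrapper class holding one float. The members shown here are assumptions;
// the design only states that a 'Float' class replaces the native type.
class Float {
public:
    explicit Float(float v = 0.0f) : value_(v) {}
    float get() const { return value_; }
    void set(float v) { value_ = v; }
private:
    float value_;
};

// Wrapper function: receives the Float object by reference (an assumption),
// calls the original function, and stores the result back in the object.
void py_foo(Float& num_obj) {
    float tmp = num_obj.get();
    foo(&tmp);
    num_obj.set(tmp);
}

// Shows that the caller's Float object carries the returned value.
float demo_float() {
    Float f;
    py_foo(f);
    return f.get();
}
```

    The object is mutable, so the value written by foo() stays visible to the caller after py_foo() returns.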

  4. An object held through a pointer in C++ should be portable to Python without creating new copies of the object.

    Use boost::shared_ptr to implement this.
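
    The sharing behavior this relies on can be sketched as follows. The sketch uses std::shared_ptr as a stand-in for boost::shared_ptr, and Volume and share() are hypothetical names; it only shows that a second owner refers to the same object rather than to a copy.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a large EMAN object.
struct Volume {
    int nx;
    explicit Volume(int n) : nx(n) {}
};

// Returns another owner of the same object; the Volume itself is not copied.
std::shared_ptr<Volume> share(const std::shared_ptr<Volume>& v) { return v; }

// True when both owners point at one object and a change made through
// the second owner is visible through the first.
bool demo_shared() {
    std::shared_ptr<Volume> a = std::make_shared<Volume>(64);
    std::shared_ptr<Volume> b = share(a);
    b->nx = 128;
    return a.get() == b.get() && a->nx == 128 && a.use_count() == 2;
}
```

    The object is destroyed only when its last owner goes away, so neither side has to decide who frees it.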

  5. EMAN's built-in C++ List should be ported to a Python list.

  6. A C++ array should be ported to a Python list.

    Here is an example of how this is implemented.

    EMAN code:

    void foo(float* f_array, int array_sz);

    Wrapper function code (pseudocode):

    void py_foo(boost::python::list f_list, int list_sz) {
        // create float array f_array with size = f_list.size();
        // copy f_list to f_array;

        // call foo(f_array, list_sz);

        // copy f_array to f_list;
        // free f_array;
    }
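
    The copy-in, call, copy-out steps above can be sketched in standalone C++. Here std::vector<float> plays the role of the Python list, which is an assumption made so the sketch compiles without Boost.Python, and the body of foo() is invented for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an EMAN function that modifies a C array in place.
void foo(float* f_array, int array_sz) {
    for (int i = 0; i < array_sz; i++) {
        f_array[i] *= 2.0f;
    }
}

// Wrapper: std::vector<float> stands in for the Python list (an assumption).
void py_foo(std::vector<float>& f_list) {
    if (f_list.empty()) {
        return;  // nothing to copy or process
    }
    const std::size_t n = f_list.size();
    float* f_array = new float[n];          // create the temporary C array
    for (std::size_t i = 0; i < n; i++) {   // copy the list into the array
        f_array[i] = f_list[i];
    }
    foo(f_array, static_cast<int>(n));      // call the original function
    for (std::size_t i = 0; i < n; i++) {   // copy the results back
        f_list[i] = f_array[i];
    }
    delete[] f_array;                       // free the temporary array
}

// Runs the wrapper on a two-element list and returns the modified list.
std::vector<float> demo_list() {
    std::vector<float> v;
    v.push_back(1.0f);
    v.push_back(2.5f);
    py_foo(v);
    return v;
}
```

    In this sketch the element count is taken from the container itself, so the caller cannot pass a size that disagrees with the data.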

  7. The public functions should keep the same signatures, with C++ types mapped to the corresponding Python types.

    The following is the mapping between the original C++ types and the new C++ types:

    Original Type                   New Type
    float* (pointer to a number)    Float class
    float* (pointer to an array)    boost::python::list
    float []                        boost::python::list
    char* (string)                  const char*
    vector                          boost::python::list
    EMData*                         boost::shared_ptr<EMData>

  8. EMAN classes should support pickle/unpickle.

    Let's use the PyEuler class as an example.
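
    The PyEuler code itself is not shown here. As a rough, standalone illustration of the round trip that pickling depends on (capture an object's state, then rebuild an equal object from it), consider the sketch below. EulerState and all of its members are hypothetical and are not EMAN's Euler or PyEuler classes; getstate() and setstate() merely mirror the roles of Python's __getstate__ and __setstate__.

```cpp
#include <cassert>
#include <vector>

// Hypothetical three-angle class used only for illustration.
class EulerState {
public:
    EulerState(float alt = 0.0f, float az = 0.0f, float phi = 0.0f)
        : alt_(alt), az_(az), phi_(phi) {}

    // Role of __getstate__: capture everything needed to rebuild the object.
    std::vector<float> getstate() const {
        std::vector<float> s;
        s.push_back(alt_);
        s.push_back(az_);
        s.push_back(phi_);
        return s;
    }

    // Role of __setstate__: restore the object from a captured state.
    // States of the wrong length are ignored.
    void setstate(const std::vector<float>& s) {
        if (s.size() != 3) {
            return;
        }
        alt_ = s[0];
        az_ = s[1];
        phi_ = s[2];
    }

    bool equals(const EulerState& o) const {
        return alt_ == o.alt_ && az_ == o.az_ && phi_ == o.phi_;
    }

private:
    float alt_, az_, phi_;
};

// True when an object rebuilt from a captured state equals the original.
bool demo_roundtrip() {
    EulerState a(0.1f, 0.2f, 0.3f);
    EulerState b;
    b.setstate(a.getstate());
    return a.equals(b);
}
```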

  9. The EMData class's data should be exported as a Numeric array in Python.

    Here is how to support Numeric arrays in Boost.Python:

    1. foo.h

      #include <Python.h>

      class Foo {
      public:
          PyObject* get_num_array();
      private:
          float* array;
      };
      		
    2. foo.C

      #include <Python.h>

      #define PY_ARRAY_UNIQUE_SYMBOL Py_Array_test_foo
      #define NO_IMPORT_ARRAY
      #include "Numeric/arrayobject.h"
      #include "foo.h"

      PyObject* Foo::get_num_array()
      {
          int dims[1];
          int ndim = 1;
          int dim = 10;
          dims[0] = dim;

          // Allocate a new one-dimensional Numeric array of floats.
          PyArrayObject* num_array =
              (PyArrayObject*) PyArray_FromDims(ndim, dims, PyArray_FLOAT);

          // Copy the C++ data into the Numeric array's buffer.
          float* num_array_data = (float*) num_array->data;

          for (int i = 0; i < dim; i++) {
              num_array_data[i] = array[i];
          }

          return (PyObject*) num_array;
      }
      	
    3. foo_wrap.C

      #include "boost/python/class_builder.hpp"

      #define PY_ARRAY_UNIQUE_SYMBOL Py_Array_test_foo
      #include "Numeric/arrayobject.h"
      #include "foo.h"

      namespace python = boost::python;

      // The init name must match the module name passed to module_builder.
      BOOST_PYTHON_MODULE_INIT(foomod)
      {
          python::module_builder foo_module("foomod");
          python::class_builder<Foo> foo_class(foo_module, "Foo");
          foo_class.def(&Foo::get_num_array, "get_num_array");

          // Initialize the Numeric C API for this extension module.
          import_array();
      }
      		

lpeng@bcm.tmc.edu
Last modified: Wed Mar 20 11:30:22 CST 2002