3D -> 2D

In the 3D object to 2D image file, we first read all of the lines of an object file and construct a list of vertices (each composed of three coordinates: x, y, and z) and a list of faces (each defined by three vertices). In terms of parallelization, this is most easily done in MPI by assigning the processor of rank k every line for which '(line % size) == k'. It would be extremely contrived to parallelize this code with CUDA. PyCUDA, as it is currently implemented, cannot read directly from a file stored on the HDD. Furthermore, one of the intricacies of this image decomposition algorithm is that the order of lines in the input file matters, which forces either sequential reads or massive memory transfers and extra computation on the GPU.
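The serial version of this reading step can be sketched as follows. This is only an illustration: it assumes the input follows the common Wavefront OBJ conventions ('v x y z' for a vertex, 'f i j k' for a triangular face with 1-based vertex indices), and the name parse_obj is hypothetical rather than taken from the project code.

```python
def parse_obj(lines):
    """Build vertex and face lists from the lines of an object file.

    Assumption: 'v x y z' lines describe vertices and 'f i j k' lines
    describe triangular faces using 1-based vertex indices.
    """
    vertices = []
    faces = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "v":
            vertices.append(tuple(float(c) for c in parts[1:4]))
        elif parts[0] == "f":
            # Entries may look like '3/1/2'; keep only the vertex index
            # and convert it to a 0-based index into the vertex list.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:4]))
    return vertices, faces


sample = [
    "# a single triangle",
    "v 0.0 0.0 0.0",
    "v 1.0 0.0 0.0",
    "v 0.0 1.0 1.0",
    "f 1 2 3",
]
vertices, faces = parse_obj(sample)
```

Because each face refers to vertices by their position in the vertex list, the vertex lines have to end up in file order, which is the ordering constraint mentioned above.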

In the parallelized MPI input-reading code, each processor creates a dictionary that maps the number of each line it reads to the contents of that line. The dictionaries are then sent to the root processor, which broadcasts the combined result. The order of the face list does not matter, so building it is an embarrassingly parallel problem: each processor reads its assigned lines and stores them in an array, then all of the processors perform a gather to obtain the full face list.
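The round-robin read and the merge at the root can be illustrated without an MPI installation by simulating the ranks in a loop. The sketch below is an assumption about how the pieces fit together, not the project's code: read_share and merge_shares are hypothetical names, and in a real MPI program each rank would call read_share once and the combination step would be a gather followed by a broadcast.

```python
def read_share(lines, rank, size):
    """Return {line_number: line} for the lines owned by this rank."""
    return {i: line for i, line in enumerate(lines) if i % size == rank}


def merge_shares(shares):
    """Combine the per-rank dictionaries and restore the original order."""
    merged = {}
    for share in shares:
        merged.update(share)
    return [merged[i] for i in sorted(merged)]


lines = ["v 0 0 0", "v 1 0 0", "v 0 1 0", "f 1 2 3", "f 3 2 1"]
size = 3  # simulated number of processes

# Each list entry stands in for what one rank would compute on its own.
shares = [read_share(lines, rank, size) for rank in range(size)]

# The root would perform this merge before broadcasting the result.
restored = merge_shares(shares)
```

Keying each entry by its line number is what allows the root to rebuild the lines in file order even though the ranks read them interleaved.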

In the next major section of code, we iterate through the slice planes, find the line segments where the faces intersect each plane, and write them out to an image. All implementations iterate through slice depths. In the serial case, we also iterate through every face in the face list for each slice depth.

There are 5 possibilities for a given face and slice plane:

  • All of the face's vertices are on one side of the slice plane. If so, then this face does not intersect with the slice plane and nothing should be shown on the corresponding image.
  • If one vertex is on the slice plane, then we must find the other point of intersection (if there is one), find the line segment connecting the two points, and write it to the slice image.
  • If two vertices are on the slice plane, then we draw the line segment connecting these two vertices on the corresponding image.
  • If two vertices are on one side of the slice plane and the third vertex is on the other, we must find the line segment that describes the intersection between the face and the plane.
  • If all three vertices are on the slice plane, then the entire face lies in the slice plane and we draw a shaded triangle on the corresponding image.
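The five cases above can be handled by one small routine. The following is a minimal sketch under stated assumptions: the slice plane is taken to be horizontal at height z, a vertex counts as "on the plane" when it is within a tolerance eps of it, and the function name slice_face and its return format are hypothetical.

```python
def slice_face(tri, z, eps=1e-9):
    """Intersect one triangle with the plane at height z.

    tri is a sequence of three (x, y, z) vertices.
    Returns ("none", []), ("segment", [p, q]) or ("triangle", [a, b, c]).
    """
    d = [v[2] - z for v in tri]  # signed distance of each vertex
    on = [v for v, dist in zip(tri, d) if abs(dist) <= eps]
    if len(on) == 3:
        return "triangle", list(tri)  # whole face lies in the plane
    if len(on) == 2:
        return "segment", on  # an edge lies in the plane
    # Find edges whose endpoints lie strictly on opposite sides.
    crossings = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        da, db = d[i], d[(i + 1) % 3]
        if abs(da) > eps and abs(db) > eps and (da > 0) != (db > 0):
            t = da / (da - db)  # interpolation parameter along edge a -> b
            crossings.append(tuple(a[k] + t * (b[k] - a[k]) for k in range(3)))
    points = on + crossings
    if len(points) == 2:
        return "segment", points  # one on-plane vertex plus a crossing, or two crossings
    return "none", []  # all on one side, or touching at a single vertex


tri = ((0, 0, 0), (2, 0, 2), (0, 2, 2))
kind, points = slice_face(tri, 1.0)
```

With this triangle and z = 1.0, one vertex is below the plane and two are above, so the routine interpolates along the two crossing edges and returns a segment.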

Parallelization Schemes

In MPI, we can assign the kth processor every slice for which '(slice % size) == k'. Thus, we iterate over the slice planes (in parallel), iterate over the face list, find the line segment where each face intersects the plane, and write the segments to an image. Each MPI process saves a different image to disk.
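The per-process loop structure of the MPI scheme might look like the sketch below, again simulating the ranks in a loop rather than calling MPI. Everything named here is hypothetical: run_rank, the output file names, and toy_intersect, which is a stand-in that treats a "face" as a (z_low, z_high) range instead of a real triangle so the example stays short.

```python
def slices_for_rank(num_slices, rank, size):
    """Indices s with s % size == rank (round-robin assignment)."""
    return list(range(rank, num_slices, size))


def run_rank(rank, size, depths, faces, intersect):
    """Work done by one process: one output per slice it owns."""
    outputs = {}
    for s in slices_for_rank(len(depths), rank, size):
        segments = []
        for face in faces:  # every process still loops over all faces
            seg = intersect(face, depths[s])
            if seg is not None:
                segments.append(seg)
        # Stands in for saving an image; the file name is made up.
        outputs["slice_%04d.png" % s] = segments
    return outputs


def toy_intersect(face, z):
    """Stand-in intersection test on a (z_low, z_high) range."""
    lo, hi = face
    return (face, z) if lo <= z <= hi else None


depths = [0.0, 0.5, 1.0, 1.5, 2.0]
faces = [(0.0, 1.0), (0.75, 2.0)]
size = 2  # simulated number of processes
results = [run_rank(r, size, depths, faces, toy_intersect) for r in range(size)]
```

Since no two ranks own the same slice index, each process writes to its own set of files and no coordination is needed during the slicing loop.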

In Cheetah'ed CUDA, we assign each face to a thread. We iterate through all the slice depths; at each depth, every thread simultaneously calculates the intersecting line segment for its face and sends it back to the CPU. This eliminates the need to iterate through the faces.
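The one-thread-per-face layout can be emulated in plain Python to show the data flow. This is not PyCUDA code and not the project's kernel: face_kernel and slice_all are hypothetical names, the inner loop over tid stands in for threads that would run concurrently on the GPU, the slice plane is assumed horizontal, and for brevity only faces that strictly cross the plane are handled (vertices lying exactly on the plane are ignored).

```python
def face_kernel(tid, vertices, faces, z, out, hit):
    """Body executed by 'thread' tid: it handles exactly one face."""
    tri = [vertices[i] for i in faces[tid]]
    points = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        da, db = a[2] - z, b[2] - z
        if da * db < 0:  # endpoints strictly on opposite sides
            t = da / (da - db)
            points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    if len(points) == 2:
        out[tid] = (points[0], points[1])  # each thread writes only its own slot
        hit[tid] = True


def slice_all(vertices, faces, depths):
    per_slice = []
    for z in depths:  # serial loop over slice depths on the host
        out = [None] * len(faces)  # one output slot per face
        hit = [False] * len(faces)
        for tid in range(len(faces)):  # concurrent threads on a real GPU
            face_kernel(tid, vertices, faces, z, out, hit)
        # "Copy back" step: keep only the faces that produced a segment.
        per_slice.append([seg for seg, h in zip(out, hit) if h])
    return per_slice


vertices = [(0, 0, 0), (2, 0, 2), (0, 2, 2), (0, 0, 3)]
faces = [(0, 1, 2), (1, 2, 3)]
per_slice = slice_all(vertices, faces, [1.0, 2.5])
```

Giving every face a fixed output slot plus a flag means threads never write to the same location, at the cost of transferring a buffer sized for all faces at every depth.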