Genome-wide epistatic study using high-performance computing graphics cards and multi-/manycore systems
Background: In this project we want to learn GPU programming languages in order to implement our parallel algorithm on such hardware. We also want theoretical training in parallel computing on GPUs, with a view to restructuring our algorithm for building decision trees into fully independent evaluations that fit the parallel paradigm on large data sets. A third goal is an introduction to the different approaches (CUDA, OpenCL, CUDA via PGI) so that we can choose the most appropriate one.
Results: As an initial step, we implemented an MPI version of our code, which demonstrated that the methods applied can be parallelized with minimal overhead. Performance scaled almost linearly for up to 16 cores.
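As an illustration of why independent evaluations parallelize with little overhead, the following minimal sketch splits a set of tasks into contiguous blocks, one per worker. It is not the project's code: the helper name `block_range` and its parameters are hypothetical, and in an MPI program `rank` and `n_ranks` would typically come from `MPI_Comm_rank` and `MPI_Comm_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the project): compute the half-open range
   [begin, end) of task indices assigned to worker `rank` out of `n_ranks`.
   Any remainder is spread over the first workers, so block sizes differ
   by at most one task. */
void block_range(size_t n_tasks, size_t n_ranks, size_t rank,
                 size_t *begin, size_t *end)
{
    assert(n_ranks > 0 && rank < n_ranks);
    size_t base  = n_tasks / n_ranks;   /* tasks every worker gets */
    size_t extra = n_tasks % n_ranks;   /* leftover tasks to distribute */
    *begin = rank * base + (rank < extra ? rank : extra);
    *end   = *begin + base + (rank < extra ? 1 : 0);
}
```

Because each worker only needs its own range, no communication is required until the results are collected, which is consistent with the near-linear scaling reported above.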
The most straightforward way to extend this to GPU computing seemed to be via the PGI compiler. This approach extends standard C code with compiler directives for automatic parallelization (either multi-core or GPU), rather than requiring a different implementation model, as would be necessary for either CUDA or OpenCL. While this idea holds some promise, we encountered some difficulties in the details of the process. To take advantage of the compiler's parallelization features for GPUs, various adjustments to the original code were necessary.
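The directive-based approach can be sketched as follows. This is an illustrative example only: the function `evaluate_all` and the per-element computation are hypothetical stand-ins for one independent evaluation, and the directive uses OpenACC syntax, which PGI compilers later supported. The directives actually used in the project are not given here and may have differed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example (not the project's code): apply an independent
   computation to every element. The loop body stays standard C. */
void evaluate_all(const float *in, float *out, int n)
{
    assert(in != NULL && out != NULL && n >= 0);

    /* OpenACC-style directive (assumed for illustration): asks a supporting
       compiler to offload the loop and copy the arrays to and from the
       device. Compilers without OpenACC support ignore the pragma and run
       the loop serially. */
    #pragma acc parallel loop copyin(in[0:n]) copyout(out[0:n])
    for (int i = 0; i < n; ++i) {
        out[i] = in[i] * in[i] + 1.0f;  /* placeholder per-task work */
    }
}
```

The same source compiles as ordinary serial C when the directive is not recognized, which is the main attraction of this model over a separate CUDA or OpenCL implementation.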
In the end we achieved a GPU-parallel version of our algorithm showing considerable speed-up compared to a CPU version.
- KONWIHR funding: two months during Multicore-Software-Initiative 2009/2010
- Dr. Benno Pütz, Max-Planck-Institut für Psychiatrie