ParBaum-II: High Performance Computing & Emerging Parallel Architectures for Evolutionary Bioinformatics
The Maximum Likelihood (ML) criterion for phylogenetic (evolutionary) inference has repeatedly been shown to be one of the most accurate models for phylogeny reconstruction. Recent advances in search algorithms and high-performance computing have led to a new generation of programs for ML-based phylogenetic inference that scale well up to several thousand sequences. However, because novel wet-lab techniques such as pyrosequencing are driving a continuously accelerating accumulation of sequence data, algorithmic and technical development can hardly keep pace with the forthcoming data flood. The main goal of this project will be to devise and implement novel parallelization strategies and to assess emerging parallel hardware architectures and programming paradigms for large-scale phylogeny reconstruction. Apart from the development of proof-of-concept parallelizations and system-level mechanisms to facilitate parallelization, particular emphasis will be put on developing fully usable and accessible open-source bioinformatics tools for production-level runs, i.e., tools that help generate novel biological and medical insights. Therefore, another important part of this project will focus on joint interdisciplinary large-scale real-world phylogenetic analyses, carried out in collaboration with biologists on the HLRB II and on an IBM BlueGene/L system located at the San Diego Supercomputer Center.
- ParBaum-II is a follow-up project of ParBaum
- KONWIHR funding of ParBaum-II: 9/2008 - 8/2011
- Dr. Alexandros Stamatakis, The Exelixis Lab, Lehrstuhl für Bioinformatik (I12), Fakultät für Informatik, TU-München
- Prof. Arndt Bode, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Fakultät für Informatik, TU-München
- Dipl.-Inf. Michael Ott, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Fakultät für Informatik, TU-München
Publications and presentations
- S. Berger, D. Krompass, A. Stamatakis: Performance, accuracy, and web server for evolutionary placement of short sequence reads under maximum likelihood, Systematic Biology, 60(3) (2011) 291.
- E. Lee, A. Cibrián-Jaramillo, S.-O. Kolokotronis, M. Katari, A. Stamatakis, M. Ott, J. Chiu, D. Little, D. Stevenson, R. McCombie, R. Martienssen, G. Coruzzi, R. DeSalle: Functional Phylogenomics of the Origin and Diversity of Seed Plants, PLOS Genetics, (2011).
- S. Berger and A. Stamatakis: Aligning short reads to reference alignments and trees, Bioinformatics, 27:(15) (2011) 2068-2075.
- F. Izquierdo-Carrasco and A. Stamatakis: Computing the phylogenetic likelihood function out-of-core, Workshop Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, (2011).
- S. Berger, A. Stamatakis: Accuracy and performance of single versus double precision arithmetics for maximum likelihood phylogeny reconstruction, Lecture Notes in Computer Science, Springer, 6068 (2010) 270-279.
- M. Ott: Inference of Large Phylogenetic Trees on Parallel Architectures, Dissertation, Fakultät für Informatik, Technische Universität München, (2010).
- W. Pfeiffer, A. Stamatakis: Hybrid MPI/Pthreads Parallelization of the RAxML Phylogenetics Code, Workshop Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, (2010).
- A. Stamatakis, N. Alachiotis: Time and Memory efficient Likelihood-based Tree Searches on gappy Phylogenomic Alignments, Proceedings of the 18th Annual International Conference on Intelligent Systems for Molecular Biology, (2010).
- A. Stamatakis, Z. Komornik, S. Berger: Evolutionary Placement of Short Sequence Reads on Multi-Core Architectures, Proceedings of the 8th ACS/IEEE International Conference on Computer Systems and Applications, (2010).
- A. Hejnol, M. Obst, A. Stamatakis, M. Ott, G. W. Rouse, G. D. Edgecombe, P. Martinez, J. Baguñà, X. Bailly, U. Jondelius, M. Wiens, W. E. G. Müller, E. Seaver, W. C. Wheeler, M. Q. Martindale, G. Giribet, C. W. Dunn: Assessing the root of bilaterian animals with scalable phylogenomic methods, Proceedings of the Royal Society B: Biological Sciences, 276:(1677) (2009) 4261-4270.
- M. Ott, A. Stamatakis: Preparing RAxML for the SPEC MPI Benchmark Suite, Transactions of the Fourth Joint HLRB and KONWIHR Review and Results Workshop (Ed: S. Wagner, M. Steinmetz, A. Bode, and M. M. Müller), Springer, (2009).
- A. Stamatakis, M. Ott: Load Balance in the Phylogenetic Likelihood Kernel, Proceedings of the 38th International Conference on Parallel Processing, (2009).
- A. Stamatakis, M. Ott: Efficient Computation of the Phylogenetic Likelihood Function on Multi-Gene Alignments and Multi-Core Architectures, Philosophical Transactions of the Royal Society B, 363:(1512) (2008) 3977-3984.
- A. Stamatakis, M. Ott: Exploiting Fine-Grained Parallelism in the Phylogenetic Likelihood Function with MPI, Pthreads, and OpenMP: A Performance Study, Lecture Notes in Computer Science (Ed: M. Chetty, A. Ngom, and S. Ahmad), Springer, 5265 (2008) 424-435.
- See ParBaum for the full list of publications and presentations related to this project and its predecessor.