Many big data analyses exploit the advantages of computing cluster architectures in combination with the MapReduce paradigm. The data explosion has also reached the life sciences, particularly since the advent of Next Generation Sequencing (NGS). NGS allows decoding all of the approximately 3.2 billion positions of a human genome in feasible time, which has resulted in an unprecedented increase in the available data. Due to the nature and size of the produced raw data, the MapReduce paradigm has proven to be a well-fitting model for analyzing NGS data. Unfortunately, smaller genetic centers and individual investigators can rarely afford to acquire and maintain their own computer systems with adequate performance. A possible solution comes in the form of virtualized computing infrastructure resources, or cloud computing. However, formidable challenges must be addressed to make these resources available to individual investigators. This thesis introduces an execution framework for deploying MapReduce-based applications in virtualized infrastructures, which forms the backbone of this work. In addition, a number of computational modeling approaches have been devised that cover the pre-processing of sequencing data, the mapping of many short sequencing reads to a reference genome, and the detection of new genetic risk factors using data from the 1000 Genomes Project. In this context, a data caching mechanism is presented that utilizes the distributed file system, simulates the behavior of makefiles, and yields runtime cost savings in the cloud. In summary, the presented framework introduces a novel method for delivering MapReduce cluster infrastructures to domain experts in genomics, greatly increasing the accessibility of the MapReduce paradigm in public clouds, while the devised modeling solutions show promising data parallelization results.
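To make the map-shuffle-reduce pattern referred to above concrete, the following is a minimal, self-contained sketch of counting k-mers in short sequencing reads in that style. It is an illustration only: the function names, the k-mer example, and the input reads are assumptions for exposition and are not taken from the thesis itself, where the actual processing runs on a distributed MapReduce framework rather than in a single process.

```python
from collections import defaultdict

def map_phase(read, k=3):
    # Map: emit a (k-mer, 1) pair for every k-mer in one read.
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]

def shuffle(pairs):
    # Shuffle: group emitted values by key, as the framework
    # would do between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each k-mer.
    return {kmer: sum(counts) for kmer, counts in groups.items()}

# Illustrative input: two short reads.
reads = ["ACGTAC", "GTACGT"]
pairs = [pair for read in reads for pair in map_phase(read)]
counts = reduce_phase(shuffle(pairs))
```

Because each read is mapped independently and each key is reduced independently, both phases parallelize naturally across cluster nodes, which is why the paradigm fits read-level NGS workloads so well.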