Using CUDA to Enhance Data Processing of Variant Call Format Files for Statistical Genetic Analysis
Document Type
Oral Presentation
Campus where you would like to present
Ellensburg
Event Website
https://digitalcommons.cwu.edu/source
Start Date
18-5-2020
Abstract
GPU parallel processing with CUDA can significantly speed up the processing of Variant Call Format (VCF) files and the statistical analysis of genomic data. A software package designed for this purpose would benefit genetic researchers by saving time that they could spend on other aspects of their research. A data set containing genetic data from a study of trichome production in Mimulus guttatus, the yellow monkey flower, was used to develop a package that tests the effectiveness of GPU parallel processing against serial execution. After a serial version of the code was written and benchmarked, OpenACC, using the Portland Group's PGI compiler with CUDA, was applied to the parallelizable parts, and the program's run time was recorded for comparison with the serial execution. To make this program more accessible to researchers in the biological field, the accelerated functions are written in C and compiled as a driver file to be used from R.
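As an illustration of the approach the abstract describes, the following is a minimal sketch of a C function annotated with OpenACC and shaped for R's .C calling convention, in which every argument arrives as a pointer and the function returns void. It is not the presenter's code. The function name alt_allele_freq, the genotype coding (0, 1, or 2 alternate alleles, with -1 for a missing call), the diploid assumption, and the -1.0 sentinel for variants with no called samples are all assumptions made for this example.

```c
#include <stddef.h>

/*
 * Hypothetical example: alternate-allele frequency per variant.
 *
 * geno       genotype matrix, one block of n_samples values per variant,
 *            so variant v, sample s is geno[v * n_samples + s].
 *            Values: 0, 1, 2 = alternate allele count; -1 = missing (assumed coding).
 * n_variants number of variants (pointer, as R's .C interface passes it)
 * n_samples  number of samples (pointer, same reason)
 * freq       output, one frequency per variant; -1.0 if no sample was called
 */
void alt_allele_freq(const int *geno, const int *n_variants,
                     const int *n_samples, double *freq)
{
    const int nv = *n_variants;
    const int ns = *n_samples;

    /* Each variant is independent, so the outer loop is the parallel part.
       A compiler without OpenACC support ignores the pragma and the loop
       runs serially, which gives the serial baseline from the same source. */
    #pragma acc parallel loop copyin(geno[0:nv * ns]) copyout(freq[0:nv])
    for (int v = 0; v < nv; v++) {
        int alt = 0;     /* alternate alleles seen for this variant */
        int called = 0;  /* samples with a non-missing genotype */

        for (int s = 0; s < ns; s++) {
            int g = geno[(size_t)v * ns + s];
            if (g >= 0) {
                alt += g;
                called++;
            }
        }

        /* Diploid assumption: two alleles per called sample. */
        freq[v] = (called > 0) ? (double)alt / (2.0 * called) : -1.0;
    }
}
```

Compiled into a shared library, a function of this shape could be loaded in R with dyn.load and invoked through .C, passing the genotype matrix as an integer vector. An R matrix with samples as rows and variants as columns is stored column by column, which matches the layout assumed here.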
Recommended Citation
McKinnon, Heather, "Using CUDA to Enhance Data Processing of Variant Call Format Files for Statistical Genetic Analysis" (2020). Symposium Of University Research and Creative Expression (SOURCE). 50.
https://digitalcommons.cwu.edu/source/2020/COTS/50
Department/Program
Computer Sciences
Additional Mentoring Department
https://cwu.studentopportunitycenter.com/2020/04/using-cuda-to-enhance-data-processing-of-variant-call-format-files-for-statistical-genetic-analysis/
Faculty Mentor(s)
Donald Davendra