Using CUDA to Enhance Data Processing of Variant Call Format Files for Statistical Genetic Analysis

Document Type

Oral Presentation

Campus where you would like to present

Ellensburg

Event Website

https://digitalcommons.cwu.edu/source

Start Date

18-5-2020

Abstract

GPU parallel processing with CUDA can significantly speed up the processing of Variant Call Format (VCF) files and the statistical analysis of genomic data. A software package designed for this purpose would benefit genetic researchers by saving them time that they could spend on other aspects of their research. A data set of genetic data from a study of trichome production in Mimulus guttatus, the yellow monkey flower, was used to develop a package that tests the effectiveness of GPU parallel processing against serial execution. After a serial version of the code was written and benchmarked, OpenACC, using the Portland Group's PGI compiler with CUDA, was applied to the parallelizable parts, and the program's run time was recorded for comparison with the serial execution. To make the program more accessible to researchers in the biological field, its accelerated functions are written in C and compiled as a driver file to be used from R.
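As a rough illustration of the approach the abstract describes, the sketch below shows how a C loop over variants might be annotated with OpenACC so that a compiler such as PGI can offload it to a GPU. It is not the authors' code. The function name `alt_allele_freq`, the genotype encoding (alternate-allele counts 0, 1, or 2 for diploid samples, with -1 for a missing call), and the choice of statistic are all assumptions made for the example. Every argument is a pointer so that the function could, in principle, be called through R's `.C()` interface; the abstract does not say which R interface the package actually uses.

```c
#include <stddef.h>

/*
 * Hypothetical example, not taken from the presented package.
 *
 * Computes the alternate-allele frequency of each variant.
 *   gt         row-major matrix, n_variants x n_samples, holding the
 *              alternate-allele count per sample (0, 1, 2), or -1 if missing
 *   n_variants pointer to the number of rows (variants)
 *   n_samples  pointer to the number of columns (samples)
 *   freq       output array of length n_variants; -1.0 marks a variant
 *              with no called genotypes
 *
 * Without an OpenACC-enabled compiler the pragma is ignored and the
 * loop runs serially, which gives the baseline for a timing comparison.
 */
void alt_allele_freq(const int *gt, const int *n_variants,
                     const int *n_samples, double *freq)
{
    const int nv = *n_variants;
    const int ns = *n_samples;

    /* Each variant is independent, so the outer loop can run in parallel. */
    #pragma acc parallel loop copyin(gt[0:nv*ns]) copyout(freq[0:nv])
    for (int v = 0; v < nv; ++v) {
        int alt = 0;     /* alternate alleles observed for this variant */
        int called = 0;  /* samples with a non-missing genotype */

        for (int s = 0; s < ns; ++s) {
            int g = gt[(size_t)v * ns + s];
            if (g >= 0) {
                alt += g;
                called += 1;
            }
        }

        /* Assumes diploid samples: two alleles per called genotype. */
        freq[v] = called > 0 ? (double)alt / (2.0 * called) : -1.0;
    }
}
```

Because each variant's result depends only on its own row of the matrix, the outer loop needs no synchronization, which is the property that makes this kind of per-variant statistic a natural candidate for GPU offloading. From R, a shared library built from such a file would typically be loaded with `dyn.load()` and called with `.C("alt_allele_freq", ...)`, passing integer and double vectors for the four arguments.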

Faculty Mentor(s)

Donald Davendra

Department/Program

Computer Sciences

Additional Mentoring Department

https://cwu.studentopportunitycenter.com/2020/04/using-cuda-to-enhance-data-processing-of-variant-call-format-files-for-statistical-genetic-analysis/

May 18th, 12:00 PM

https://digitalcommons.cwu.edu/source/2020/COTS/50