Massively Parallel k-NN using CUDA and MARS

Presenter Information

Joshua Smithrud
Patrick McElroy

Document Type

Oral Presentation

Campus where you would like to present

SURC 140

Start Date

16-5-2013

End Date

16-5-2013

Abstract

In pattern recognition, the k-nearest neighbor algorithm (k-NN) is a method for classifying objects based on closest training examples in the feature space. k-NN is a type of instance-based learning, where the function is only approximated locally, and all computation is deferred until classification. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k being a positive integer, typically small). For big datasets, this algorithm can become very slow. One way to increase its efficiency is to use a parallel implementation on Graphic Processing Units (GPUs). CUDA is a parallel computing platform and programming model, developed by Nvidia, which enables dramatic increases in computing performance by harnessing the power of the GPU. Our contribution is a massively parallel implementation using Nvidia GPUs with the CUDA Application Programming Interface (API) and the MARS MapReduce libraries.

Faculty Mentor(s)

Razvan Andonie

Additional Mentoring Department

Computer Science

This document is currently not available here.

Share

COinS
 
May 16th, 9:00 AM May 16th, 9:20 AM

Massively Parallel k-NN using CUDA and MARS

SURC 140

In pattern recognition, the k-nearest neighbor algorithm (k-NN) is a method for classifying objects based on closest training examples in the feature space. k-NN is a type of instance-based learning, where the function is only approximated locally, and all computation is deferred until classification. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k being a positive integer, typically small). For big datasets, this algorithm can become very slow. One way to increase its efficiency is to use a parallel implementation on Graphic Processing Units (GPUs). CUDA is a parallel computing platform and programming model, developed by Nvidia, which enables dramatic increases in computing performance by harnessing the power of the GPU. Our contribution is a massively parallel implementation using Nvidia GPUs with the CUDA Application Programming Interface (API) and the MARS MapReduce libraries.