Document Type

Thesis

Date of Degree Completion

Winter 2018

Degree Name

Master of Science (MS)

Department

Computational Science

Committee Chair

Dr. Razvan Andonie

Second Committee Member

Dr. Boris Kovalerchuk

Third Committee Member

Dr. Szilard Vajda

Abstract

Visualization of multidimensional data is a long-standing challenge in machine learning and knowledge discovery. The problem arises as soon as a fourth dimension is introduced, since we live in a 3-dimensional world. Existing methods can visualize multidimensional data, but loss of information and clutter remain problems. General Line Coordinates (GLC) can losslessly project n-dimensional data into 2 dimensions. A new method based on GLC, called GLC-L, is introduced. This method supports interactive visualization, dimension reduction, and supervised learning. One application of GLC-L is the transformation of vector data into image data. This novel approach of transforming vector data into images using lossless visualization introduces a new method for classifying data in vector format. Because the resulting images are in raster format rather than vector format, they can be classified with a Convolutional Neural Network (CNN). Experiments conducted on datasets of different sizes show that these artificially created images provide useful information for the CNN. The CNN classifies these images with results competitive with those of other analytic machine learning algorithms for vector data. The artificially created images were also classified with a Support Vector Machine (SVM) and a Multilayer Perceptron (MLP).
