Visualization of Decision Trees based on General Line Coordinates to Support Explainable Models
Document Type
Oral Presentation
Event Website
https://source2022.sched.com/
Start Date
18-5-2022
End Date
18-5-2022
Keywords
Visual Analytics, Interpretability, Decision Trees
Abstract
Visualizing Machine Learning (ML) models is an important part of the ML process because it can improve both the interpretability and the prediction accuracy of the models. This paper proposes a new method, SPC-DT, to visualize Decision Trees (DTs) as interpretable models. The method uses a version of General Line Coordinates called Shifted Paired Coordinates (SPC). In SPC, each n-D point is visualized as a directed graph in a set of shifted pairs of 2-D Cartesian coordinates. The new method expands and complements the capabilities of existing methods for visualizing DT models. It shows: (1) relations between attributes, (2) individual cases relative to the DT structure, (3) data flow in the DT, (4) how tight each split is relative to the thresholds in the DT nodes, and (5) the density of cases in parts of the n-D space. This information is important for domain experts who evaluate and improve DT models, including their performance and the avoidance of overgeneralization and overfitting. The benefits of the method are demonstrated in case studies using three real datasets.
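The SPC mapping described in the abstract can be illustrated with a minimal sketch. It is not the SPC-DT implementation. The function names `spc_vertices` and `spc_edges`, the `shifts` parameter, and the handling of an odd number of attributes (repeating the last value) are all assumptions made for illustration. Each consecutive pair of attributes becomes one 2-D vertex in its own shifted coordinate system, and consecutive vertices are joined by directed edges.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def spc_vertices(point: Sequence[float], shifts: Sequence[Point2D]) -> List[Point2D]:
    """Map an n-D point to 2-D vertices, one per shifted pair of coordinates.

    `shifts` holds the (dx, dy) origin offset of each pair's coordinate system
    (a hypothetical parameter; the source does not specify how shifts are chosen).
    """
    values = list(point)
    if not values:
        raise ValueError("point must have at least one coordinate")
    # Assumption: with an odd number of attributes, repeat the last value
    # so that every attribute belongs to a pair.
    if len(values) % 2 == 1:
        values.append(values[-1])
    # Group attributes as (x1, x2), (x3, x4), ...
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    if len(shifts) != len(pairs):
        raise ValueError("need exactly one shift per coordinate pair")
    # Place each pair in its own shifted 2-D Cartesian system.
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(pairs, shifts)]


def spc_edges(vertices: Sequence[Point2D]) -> List[Tuple[Point2D, Point2D]]:
    """Directed edges joining consecutive vertices of one n-D point."""
    return list(zip(vertices, vertices[1:]))


if __name__ == "__main__":
    # A 4-D case drawn in two coordinate systems, the second shifted right by 10.
    verts = spc_vertices([1, 2, 3, 4], shifts=[(0, 0), (10, 0)])
    print(verts)
    print(spc_edges(verts))
```

In a sketch like this, a DT node that tests a single attribute against a threshold would correspond to a horizontal or vertical line in the coordinate pair containing that attribute, which is one way to see how close individual cases fall to a split.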
Recommended Citation
Worland, Alex, "Visualization of Decision Trees based on General Line Coordinates to Support Explainable Models" (2022). Symposium Of University Research and Creative Expression (SOURCE). 107.
https://digitalcommons.cwu.edu/source/2022/COTS/107
Department/Program
Computer Science
Additional Mentoring Department
Computer Science
Faculty Mentor(s)
Boris Kovalerchuk