Abstract
In data analysis, visualization through dimensionality reduction (DR) is one of the most effective ways to understand a dataset. However, the quality of a visualization is hard to evaluate quantitatively, and the hyperparameters of visualization algorithms can be difficult for end-users to tune. This article proposes a score for visualization assessment that can be used to ease the choice of hyperparameter values for widely used DR methods such as $t$-distributed stochastic neighbor embedding (t-SNE), LargeVis, and uniform manifold approximation and projection (UMAP). We present the constraint preserving score, a computationally efficient score to measure visualization quality. The idea is to measure how well a visualization preserves the information encoded in pairwise constraints, such as group information or similarity/dissimilarity relationships between instances. Based on this quantitative measure, we use Bayesian optimization to efficiently explore the space of possible visualizations and find the most suitable one. The proposed score is flexible, as it can measure quality in different ways depending on the constraints provided. Experiments show its usefulness for end-users, its complementarity with existing visualization quality measures, and its flexibility in expressing different aspects of quality.
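As a rough illustration of scoring an embedding against pairwise constraints, the Python sketch below rewards similar pairs that lie close together in the 2-D visualization and penalizes dissimilar pairs that do. It is not the article's definition of the constraint preserving score. The function name `pairwise_constraint_score`, the Student-t style affinity, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def pairwise_constraint_score(embedding, similar_pairs, dissimilar_pairs):
    """Hypothetical proxy score: higher means constraints are better preserved.

    Assumption: affinity between two embedded points is the unnormalized
    Student-t kernel 1 / (1 + squared distance); this is an illustrative
    choice, not necessarily the one used in the article.
    """
    emb = np.asarray(embedding, dtype=float)

    def mean_log_affinity(pairs):
        idx = np.asarray(pairs, dtype=int)
        diff = emb[idx[:, 0]] - emb[idx[:, 1]]
        sq_dist = np.sum(diff ** 2, axis=1)
        # log(1 / (1 + d^2)) == -log1p(d^2)
        return float(np.mean(-np.log1p(sq_dist)))

    # Similar pairs should have high affinity, dissimilar pairs low affinity.
    return mean_log_affinity(similar_pairs) - mean_log_affinity(dissimilar_pairs)


# Toy data: four points, two constrained groups {0, 1} and {2, 3}.
similar = [(0, 1), (2, 3)]
dissimilar = [(0, 2), (1, 3)]

# Embedding that keeps each group together.
good = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
# Embedding that mixes the groups.
bad = np.array([[0.0, 0.0], [5.0, 5.0], [0.1, 0.0], [5.1, 5.0]])

good_score = pairwise_constraint_score(good, similar, dissimilar)
bad_score = pairwise_constraint_score(bad, similar, dissimilar)
print(f"good embedding: {good_score:.3f}, bad embedding: {bad_score:.3f}")
```

A score of this kind could serve as the objective of a hyperparameter search: compute one embedding per candidate setting (for example, per perplexity value) and keep the setting with the highest score.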
Original language | English |
---|---|
Pages (from-to) | 269 - 282 |
Number of pages | 14 |
Journal | IEEE Transactions on Artificial Intelligence |
Volume | 2 |
Issue number | 3 |
Publication status | Published - 7 Jul 2021 |
Keywords
- Bayesian Optimization
- Dimensionality Reduction
- Pairwise Constraints
- Visualization
- Hyperparameter Tuning
- Machine Learning