The Art of Visualizing High Dimensional Data

Vincent Granville
Jun 10, 2022

Finally published! This article discusses enriched visualizations, with a focus on animated GIFs and videos built in Python. For instance, the comet video can convey several dimensions that are difficult to show in a static picture: the location of each comet at any given time, its relative velocity, its change in velocity (acceleration), its change in size when approaching the sun, the interactions between comets (the apparent collisions), and more. Such a video can easily display 17 dimensions, as discussed in the paper.
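To make the idea concrete, here is a minimal sketch of how one frame of such an animation can carry several dimensions at once. It is not the code from the paper: the toy physics (a single comet, a sun at the origin, unit gravitational constant), the function names `simulate_comet` and `render_gif`, and the choice of Pillow for rendering are all assumptions made for illustration.

```python
import math

def simulate_comet(n_frames=200, dt=0.02):
    """Toy comet orbiting a sun at the origin (inverse-square attraction, GM = 1)."""
    x, y = 1.0, 0.0      # initial position
    vx, vy = 0.0, 0.7    # below circular speed, so the orbit is an ellipse
    frames = []
    for _ in range(n_frames):
        r = math.hypot(x, y)
        ax, ay = -x / r**3, -y / r**3            # acceleration toward the sun
        vx, vy = vx + ax * dt, vy + ay * dt      # semi-implicit Euler step
        x, y = x + vx * dt, y + vy * dt
        frames.append({
            "pos": (x, y),                        # dimensions 1-2: location
            "speed": math.hypot(vx, vy),          # dimension 3: velocity
            "accel": math.hypot(ax, ay),          # dimension 4: acceleration
            "size": min(12.0, 3.0 / math.hypot(x, y)),  # dimension 5: grows near the sun
        })
    return frames

def render_gif(frames, path="comet.gif", px=200):
    """Draw one image per frame and save them as an animated GIF."""
    from PIL import Image, ImageDraw  # assumption: Pillow is installed
    top = max(f["speed"] for f in frames)
    c = px / 2
    images = []
    for f in frames:
        img = Image.new("RGB", (px, px), (0, 0, 0))
        draw = ImageDraw.Draw(img)
        draw.ellipse([c - 4, c - 4, c + 4, c + 4], fill=(255, 200, 0))  # the sun
        cx = c + f["pos"][0] * px * 0.4
        cy = c - f["pos"][1] * px * 0.4
        red = int(255 * f["speed"] / top)         # color encodes speed
        s = f["size"]                             # radius encodes proximity to the sun
        draw.ellipse([cx - s, cy - s, cx + s, cy + s], fill=(red, 80, 255 - red))
        images.append(img)
    images[0].save(path, save_all=True, append_images=images[1:], duration=40, loop=0)
```

Calling `render_gif(simulate_comet())` writes a looping GIF in which position, speed (color) and distance to the sun (size) are all visible at once; acceleration is computed per frame and could be mapped to another channel, such as a tail length.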

The PDF document (6 pages + code + illustrations, 11 MB) focuses on four applications: prediction intervals in any dimension, supervised classification, convergence of algorithms such as gradient descent when dealing with chaotic functions, and spatial time series (the comet illustration). All visualizations use the RGB color model, and one uses RGBA for special and particularly useful effects, by playing with the transparency level. In essence, it allows you to perform supervised classification using image techniques only, after mapping your dataset onto an image.
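As a rough illustration of classifying with an image, the sketch below paints each training point onto a small RGB grid as a translucent disc, then labels a new point by reading the dominant color channel at its pixel. This is an assumption-laden toy, not the method from the paper: the names `paint` and `classify`, the limit of three classes (one per channel), and the additive blending (simpler than standard alpha compositing) are all choices made here for brevity.

```python
def paint(train, size=64, radius=6, opacity=0.3):
    """Blend labeled points onto an RGB grid as translucent discs.

    train: list of (x, y, label) with x, y in [0, 1) and label in {0, 1, 2}.
    The label selects the channel (R, G or B) that the disc adds to.
    """
    img = [[[0.0, 0.0, 0.0] for _ in range(size)] for _ in range(size)]
    for x, y, label in train:
        cx, cy = int(x * size), int(y * size)
        for i in range(max(0, cy - radius), min(size, cy + radius + 1)):
            for j in range(max(0, cx - radius), min(size, cx + radius + 1)):
                if (i - cy) ** 2 + (j - cx) ** 2 <= radius ** 2:
                    # Additive blending: overlapping discs of one class reinforce each other
                    img[i][j][label] += opacity
    return img

def classify(img, x, y):
    """Return the label of the dominant channel at (x, y), or None if unpainted."""
    size = len(img)
    pixel = img[int(y * size)][int(x * size)]
    return None if max(pixel) == 0 else pixel.index(max(pixel))

# Hypothetical training set: class 0 in one corner, class 1 in the opposite one
train = [(0.20, 0.20, 0), (0.25, 0.20, 0), (0.20, 0.28, 0),
         (0.80, 0.80, 1), (0.75, 0.82, 1), (0.85, 0.78, 1)]
img = paint(train)
label = classify(img, 0.22, 0.25)
```

The disc radius plays the role of a smoothing parameter: larger discs label more of the image but blur the boundary between classes, and pixels that no disc reaches stay unclassified.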

Image compression and anti-aliasing techniques are included in the Python code. They require only a simple call to a library function. The code is also on GitHub, and the videos are on YouTube. The document also presents surprising data in number theory and experimental math. It leads to interesting machine learning problems: boundary and hole detection, and convergence acceleration for chaotic iterations.
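For readers curious about what such a call does, here is a small sketch. The first function shows one common form of anti-aliasing (rendering at a higher resolution, then averaging blocks of pixels); the second shows how the same idea, plus compression, might look with Pillow. The article does not say which library the paper uses, so Pillow and the helper names `box_downsample` and `save_antialiased` are assumptions.

```python
def box_downsample(pixels, factor):
    """Average each factor x factor block of a grayscale grid (list of rows)."""
    h, w = len(pixels) // factor, len(pixels[0]) // factor
    return [[sum(pixels[i * factor + di][j * factor + dj]
                 for di in range(factor) for dj in range(factor)) / factor ** 2
             for j in range(w)]
            for i in range(h)]

def save_antialiased(img, path, factor=2):
    """Library version, assuming `img` is a Pillow image drawn at `factor` times the target size."""
    from PIL import Image  # assumption: Pillow is installed
    small = img.resize((img.width // factor, img.height // factor), Image.LANCZOS)
    small.save(path, optimize=True)  # asks the encoder for a smaller file (e.g. PNG, JPEG)
    return small

# A hard diagonal edge, rendered at twice the target resolution
edge = [[255,   0,   0,   0],
        [255, 255,   0,   0],
        [255, 255, 255,   0],
        [255, 255, 255, 255]]
smooth = box_downsample(edge, 2)
```

After downsampling, the pixels that straddle the edge take an intermediate gray value instead of jumping from black to white, which is what removes the staircase effect.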

Read the full article and access the free PDF here.

Vincent Granville

Founder, MLtechniques.com. Machine learning scientist. Co-founder of Data Science Central (acquired by TechTarget).