Distributed Stochastic Neighbor Embedding

11 Jan 2018

If you want to see multidimensionally data, for example the MNIST dataset, there is a machine learning algoritm created by Geofrey Hinton and Laurens van der Maarten by 2008.

Basically comprises two stages:

  • calculates the probability distribution of that each point par is similar $p_ij$.
  • define a similar probability distribution on a low-dimensional map, and minimizes the diference between the two distributions using gradient descent.

Algorithm is free of use for non commercial use.

Python version (and another languages) is downloadable here

Python version is an example of MNIST dataset that has the following result:

MNIST result

Best way to use it is scikit-learn that has up to eight manifolds (tsne is one of them) to show high dimensional datasets.

manifold



who am i

Engineer in Barcelona, working in BI and Cloud service projects. Very interested in the new wave of Machine-Learning and IA applications

what is this

This is a blog about software, some mathematics and python libraries used in Mathematics and Machine-Learning problems

where am i

github//m-alcu
twitter//alcubierre
linkedin//martinalcubierre
facebook//m.alcubierre
2017 by Martín Alcubierre Arenillas.
Content available under Creative Commons (BY-NC-SA) unless otherwise noted.
This site is hosted at Github Pages and created with Jekyll.