Dimensionality Estimation Diary
Contents
Papers
 Belkin, Mikhail, and Partha Niyogi. "Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering."
 von Luxburg, Ulrike. "A Tutorial on Spectral Clustering."
 Fowlkes, Belongie, Chung and Malik. "Spectral Grouping Using the Nyström Method."
Paper summaries
 A Tutorial on Spectral Clustering
 Spectral Grouping Using the Nyström Method
 Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
Implementation Steps
 Take a dataset: ISOMAP, turning teapot, or synthetically generated
 For synthetically generated data:
 Choose a random orientation of the circle in high dimension
 Produce points on the circle by uniformly sampling angles along the chosen orientation in high dimension
 Add Gaussian noise
 Construct a graph from the data by connecting the k nearest neighbors of each node
 Compute the graph Laplacian, solve its eigenproblem, and project the data onto the low-dimensional embedding
 Apply KMeans++ to identify structure in low dimension
 Using the structure in low dimension, construct the graph and solve the Graph Laplacian as illustrated above
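The embedding steps above can be sketched in code. This is a minimal sketch, not the exact experimental setup: the neighbor count (k=10), embedding dimension (2), and the use of the unnormalized Laplacian are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import distance_matrix

def spectral_embedding(X, k=10, dim=2):
    """Embed X (n x D) into `dim` dimensions via the graph Laplacian.

    Assumed settings (k, dim, unnormalized Laplacian) are illustrative.
    """
    n = X.shape[0]
    D = distance_matrix(X, X)
    # Connect each node to its k nearest neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[1:k + 1]   # skip self at position 0
        W[i, idx] = 1.0
    W = np.maximum(W, W.T)                # symmetrize: undirected graph
    L = np.diag(W.sum(axis=1)) - W        # unnormalized Laplacian L = D - W
    vals, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    # Skip the constant eigenvector (eigenvalue 0); keep the next `dim`.
    return vecs[:, 1:dim + 1]

# Toy usage: a noisy circle in the plane.
theta = np.random.default_rng(0).uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)]
Y = spectral_embedding(X, k=10, dim=2)
print(Y.shape)  # (200, 2)
```

KMeans++ can then be run directly on the rows of `Y`.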
Spectral Clustering
We performed spectral clustering on a few toy datasets, such as concentric circles, half moons, and the swiss roll, to ensure the correctness of the spectral clustering implementation. Below are the plots of the clustered data. The plots illustrate that the clustering is based on connectivity rather than proximity.
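A minimal version of this sanity check can be reproduced with scikit-learn on the concentric-circles set. The parameters here (400 points, noise 0.05, a 10-nearest-neighbor affinity) are assumptions for illustration, not the exact settings we used:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.cluster import SpectralClustering
from sklearn.metrics import adjusted_rand_score

# Two concentric rings; points on opposite sides of one ring are far
# apart, so proximity-based clustering (plain KMeans) would fail here.
X, y = make_circles(n_samples=400, factor=0.5, noise=0.05, random_state=0)

# Nearest-neighbor affinity clusters by connectivity along each ring.
labels = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                            n_neighbors=10, random_state=0).fit_predict(X)

print(adjusted_rand_score(y, labels))  # close to 1.0 = rings recovered
```

The nearest-neighbor affinity is what makes the clustering connectivity-based: within-ring neighbor distances are much smaller than the gap between rings, so the graph splits into two components.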
Spectral Analysis
To further verify the spectral analysis, we tested it on two datasets, Swiss Roll and Turning Teapot, to ensure that the projections make sense.
Swiss Roll
The above plot represents the projection of the second eigenvector on the swiss roll dataset. That is, the eigenvector corresponding to the eigenvalue <math>\lambda_1</math>. The values of the projection are constant along one dimension, which implies that that dimension has been reduced.
The above plot represents the projection of the first eigenvector on the swiss roll dataset. That is, the eigenvector corresponding to the eigenvalue <math>\lambda_0 = 0</math>. This shows that the projection of the data onto the first eigenvector produces an almost constant value all along.
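The flat first-eigenvector plot is expected: for any connected graph, the all-ones vector is a Laplacian eigenvector with eigenvalue 0. A small sketch verifying this on hypothetical data (random Gaussian points, not the swiss roll):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))

# Build a symmetrized 5-NN adjacency matrix.
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
W = np.zeros_like(D)
for i in range(len(X)):
    W[i, np.argsort(D[i])[1:6]] = 1.0
W = np.maximum(W, W.T)

L = np.diag(W.sum(axis=1)) - W          # unnormalized Laplacian
vals, vecs = np.linalg.eigh(L)

# L @ 1 = degree - row-sum = 0, so the constant vector is always in the
# null space; hence the first eigenvalue is 0.
print(np.allclose(L @ np.ones(len(X)), 0.0))  # True
print(abs(vals[0]) < 1e-8)                    # True
```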
Turning Teapot
The above plot represents the second eigenvector of the spectral decomposition, plotted in 2D Euclidean space. Each point is annotated with its corresponding image. As we can see, the points have been ordered by the angle of rotation even though the initial data was shuffled. (Note: missing annotations do not mean the corresponding image is absent; the number of annotations shown has been reduced so the images remain visible.)
Dimensionality Estimation
Turning Teapot
Yoavfreund 20:25, 13 February 2017 (PST) It is hard to say that there is a downward slope in any of these graphs. It looks more like a number of downward steps. I think that in order to understand what is going on we need a much larger number of datapoints. Instead of real datasets it might be better to generate a dataset that corresponds to a 2D circle embedded in a high dimensional spaces.
We used the centroid initialization technique of KMeans++ to estimate the dimensionality of the data. We compute two measures and use them to produce a log-log plot, which in turn is used to estimate the dimensionality:
  Number of cluster representatives chosen so far
  Running average of the distances from each chosen point to its closest previously chosen cluster representative
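The measurement procedure can be sketched as follows. This is our reading of the two measures (number of representatives, and the running average of chosen-point distances), applied here to a plain 2-D unit circle purely for illustration; the sample sizes are assumptions:

```python
import numpy as np

def kmeanspp_trace(X, k, rng):
    """KMeans++-style D^2 seeding, recording the running average of the
    distance from each newly chosen point to its nearest representative."""
    c0 = rng.integers(len(X))
    d2 = np.sum((X - X[c0]) ** 2, axis=1)    # squared dist to nearest rep
    chosen, eps = [], []
    for _ in range(1, k):
        j = rng.choice(len(X), p=d2 / d2.sum())  # sample next rep prop. to d^2
        chosen.append(np.sqrt(d2[j]))   # its distance to the nearest rep
        eps.append(np.mean(chosen))     # running average of those distances
        d2 = np.minimum(d2, np.sum((X - X[j]) ** 2, axis=1))
    return np.array(eps)

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 2000)
X = np.c_[np.cos(theta), np.sin(theta)]   # unit circle, intrinsic dimension 1
eps = kmeanspp_trace(X, 64, rng)
# The slope of log(eps) against log(k) is what the plots below examine.
print(len(eps))  # 63
```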
We add Gaussian noise to distort the data and observe its effect on the KMeans++ centroid initialisation. Below are the log-log plots of the number of representatives vs. the average distance, for different levels of noise: Yoavfreund 08:59, 22 February 2017 (PST)
 I don't understand your method for creating a circle. It seems too complex. I would generate a circle in <math>\mathbb{R}^{1000}</math> using the first two of the 1000 coordinates, and then multiply by a random orthonormal matrix to rotate it to a different direction.
 What is the scaling of log(epsilon) in the second plot? And why is it so different from the scaling in the first plot?
 Please specify the actual noise amounts of "little noise" and "some noise"
 Please generate the graph also for "no noise"
High dimensional circle
We generated a circle in high-dimensional space using the following method:
 Generate points on an n-sphere using Marsaglia's method
 Sample two points <math>u</math> and <math>v</math> from the n-sphere. These will determine the plane the circle lies on.
 Make them unit vectors by dividing by their norms
 Make <math>v</math> perpendicular to <math>u</math> by Gram-Schmidt: <math>v \leftarrow v - (v \cdot u)u</math>, then re-normalize
 Use the parametric equation of the circle to sample points: <math>x(\theta) = \cos(\theta)\,u + \sin(\theta)\,v</math>, where <math>\theta \sim \mathrm{Uniform}[0, 2\pi)</math>
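The construction above can be sketched as follows. The ambient dimension (1000), sample count, and noise level are assumed for illustration; Marsaglia's method is realized here as normalizing standard Gaussian vectors, which samples the n-sphere uniformly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # assumed ambient dimension

# Two directions on the n-sphere: normalize standard Gaussian vectors.
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

# Gram-Schmidt: make v perpendicular to u, then re-normalize.
v -= (v @ u) * u
v /= np.linalg.norm(v)

# Parametric circle x(theta) = cos(theta) u + sin(theta) v.
theta = rng.uniform(0, 2 * np.pi, 500)
X = np.outer(np.cos(theta), u) + np.outer(np.sin(theta), v)

# Optional Gaussian noise (sigma = 0.01 is an assumed level).
X_noisy = X + 0.01 * rng.standard_normal(X.shape)

print(abs(u @ v) < 1e-10)                              # u, v orthonormal
print(np.allclose(np.linalg.norm(X, axis=1), 1.0))     # points on unit circle
```

Every noiseless point has unit norm because <math>\cos^2\theta + \sin^2\theta = 1</math> and u, v are orthonormal.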
Below are the plots obtained from running KMeans++ as mentioned above:
Running Kmeans++ with different levels of noise
Below are the plots obtained from running KMeans++ as mentioned above, for different levels of noise:

 Noise is sampled from a zero-mean Gaussian distribution
Yoavfreund 16:22, 22 March 2017 (PDT) The step from no noise to 1x noise is devastating: from a slope of (19-9)/5 = 2, you get to a slope of (10.97-10.85)/6 ≈ 0.02.
It is also clear that the noise dominates the circle because the distances are so much larger.
I would try much smaller levels of noise: 0.01,0.001,... and see what level of noise is small enough that the slope is around 2.