Skip to content

Latest commit

 

History

History
122 lines (79 loc) · 4.73 KB

README.mkd

File metadata and controls

122 lines (79 loc) · 4.73 KB

nca

An implementation of neighborhood components analysis in C++.

nca uses stochastic gradient ascent for maximization.

Read the original paper: http://www.cs.toronto.edu/~hinton/absps/nca.pdf

Requirements

Eigen: http://eigen.tuxfamily.org

(optional) boost: http://www.boost.org

boost is required if you want to run the test program.

Usage

If you want to use nca in your own project, just copy over the nca.hpp header into your include directory.

If you want to run the test program, you may need to modify the Makefile to reflect the location of your eigen headers. After that, you can run it on some sample datasets, such as:

Iris: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

Wine: http://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data

For command-line usage:

usage: ./nca csv_file label_index iterations learning_rate

    csv_file: the input file. It must be in csv format.
    label_index: the index (starting from 0) of the class label on each line of the input file.
    iterations: how many iterations to run sgd. Make this at least 100000 for good results.
    learning_rate: how fast it learns. 0.01 or 0.001 is usually a good value.

An example:

make
./nca iris.data 4 100000 0.01

-0.09935 -0.2215 0.3383 0.443
0.2532 0.5835 -0.8461 -0.8915
-0.729 -0.6386 1.767 1.832
-0.9405 -0.8461 2.281 2.794

Nearest neighbors on raw data:
Got 144 correct out of 150

Nearest neighbors on scaled data:
Got 143 correct out of 150

Nearest neighbors on nca data:
Got 146 correct out of 150

It first prints the resulting transformation matrix, and then it compares results from raw, scaled, or nca nearest neighbors (k=1).

Another example:

./nca wine.data 0 100000 0.001

0.4978 0.2596 0.08335 -0.08415 0.01434 0.01025 -0.005508 0.002091 -0.09629 0.4064 -0.06766 0.003626 0.01129
0.2134 0.5756 0.05306 0.1198 -0.01342 0.04069 -0.06715 -0.02467 0.005831 0.2541 -0.09105 0.06598 -0.02819
0.1268 0.1405 0.6284 -0.08583 0.003564 0.02861 0.1232 -0.002581 -0.04287 0.2555 -0.009432 0.07747 0.01096
-0.005172 0.0633 -0.02238 0.07806 -0.006145 0.009894 -0.05917 -0.008921 0.02883 -0.05833 -0.01572 0.0164 -0.01219
0.00425 0.01535 -0.00187 0.009342 0.000703 -0.004693 -0.02994 -0.001181 -0.01185 0.01151 -0.006103 -0.007321 -0.0005278
-0.01359 0.03761 0.01087 0.02303 -0.005879 0.3658 0.1907 0.0133 0.1144 -0.2291 0.04946 0.1213 -0.01147
-0.0318 -0.03219 0.05605 -0.105 -0.02715 0.04992 0.6728 0.02288 0.2075 -0.1793 0.07372 0.1541 -0.007872
0.07661 -0.09324 0.007211 -0.1328 0.0343 0.02169 0.07863 1.889 -0.08345 0.146 0.01709 0.03459 0.0254
-0.04669 0.007582 -0.007626 0.06529 -0.04801 0.01435 0.2305 0.0005779 0.4935 -0.2362 0.02504 0.06895 -0.03655
0.1796 0.1617 0.05787 -0.07707 0.02035 -0.04257 -0.1533 -0.004989 -0.1726 0.427 -0.06306 -0.08069 0.01949
-0.2027 -0.2696 -0.02322 -0.05194 0.001799 0.06445 0.3156 0.02274 0.1676 -0.3912 0.9321 0.159 0.005724
-0.05804 0.03536 0.03448 0.07081 -0.01617 0.1041 0.4157 0.02553 0.2509 -0.3545 0.09938 0.6347 -0.02614
0.004366 0.008813 3.738e-05 0.001056 0.001054 -0.005621 -0.02075 -0.0006036 -0.01226 0.01737 -0.004586 -0.009052 0.0005957

Nearest neighbors on raw data:
Got 137 correct out of 178

Nearest neighbors on scaled data:
Got 169 correct out of 178

Nearest neighbors on nca data:
Got 177 correct out of 178

Programming

You can best understand how to use nca.hpp by reading the nca.cpp file.

Include the required headers:

#include "nca.hpp"
#include <Eigen/Core>

Initialize your data:

std::vector<Eigen::VectorXd> input;
std::vector<std::string> label;
int iterations = 100000;
double learning_rate = 0.001;

Add entries to input and label. The indexes for input and label should correspond with one another.

To get the transformation matrix:

Eigen::MatrixXd A = neighborhood_components_analysis(input, label, scaling_matrix(input), iterations, learning_rate);

scaling_matrix() returns a matrix that scales the input so that the difference between the minimum of a column and the maximum of a column are equivalent for all columns. It's supplied as the third argument because neighborhood components analysis requires an initial transformation matrix. The best initial matrix often treat all columns equally.

If you want the precision matrix (or inverse covariance matrix) for the Mahalanobis distance:

MatrixXd precision = A.tranpose()*A;

To rescale your input for nearest neighbors or support vector machines (rbf kernel):

std::vector<Eigen::VectorXd> nca_input(scale(A, input));

To test your matrix, you probably want to run nearest neighbors:

nearest_neighbors(nca_input, label);