LTI-Lib latest version v1.9 - last update 10 Apr 2010

lti::kNNClassifier Class Reference

Implements a k nearest neighbors search based classifier. More...

#include <ltiKNNClassifier.h>



Classes

class  parameters
 the parameters for the class kNNClassifier More...
struct  pointInfo
 Information about a feature point. More...

Public Member Functions

 kNNClassifier ()
 kNNClassifier (const kNNClassifier &other)
virtual ~kNNClassifier ()
virtual const char * getTypeName () const
kNNClassifier & copy (const kNNClassifier &other)
kNNClassifier & operator= (const kNNClassifier &other)
virtual classifier * clone () const
const kNNClassifier::parameters & getParameters () const
virtual bool train (const dmatrix &input, const ivector &ids)
virtual bool train (const dmatrix &input, const ivector &ids, const ivector &pointIds)
virtual bool trainObject (const dmatrix &input, int &id)
virtual bool trainObject (const dmatrix &input, int &id, const ivector &pointIds)
virtual bool trainObjectId (const dmatrix &input, const int id)
virtual bool trainObjectId (const dmatrix &input, const int id, const ivector &pointIds)
virtual bool classify (const dvector &feature, outputVector &result) const
virtual bool classify (const dmatrix &features, outputVector &result) const
virtual bool classify (const dmatrix &features, dmatrix &result) const
int getColumnId (const int columnId) const
virtual bool classify (const dvector &feature, outputVector &result, std::vector< pointInfo > &points) const
virtual bool nearest (const dvector &feature, pointInfo &nearestPoint) const
virtual bool write (ioHandler &handler, const bool complete=true) const
virtual bool read (ioHandler &handler, const bool complete=true)
void build ()
void clear ()

Protected Types

typedef std::map< int, int > idMap_type
typedef kdTree< dvector, std::pair< int, int > > treeType

Protected Member Functions

void buildIdMaps (const ivector &ids)
void defineOutputTemplate ()
bool classify (const dvector &feature, outputVector &output, std::multimap< double, treeType::element * > &resList) const

Static Protected Member Functions

Reliability weighting functions



static double linear (const double &ratio, const double &threshold)
static double exponential (const double &ratio, const double &threshold)

Protected Attributes

idMap_type idMap
idMap_type rIdMap
int nClasses
treeType databaseTree
std::vector< double > classWeight
int minPointsPerClass
int maxPointsPerClass

Detailed Description

Implements a k nearest neighbors search based classifier.

The simplest case of a k-nearest neighbor classifier is k=1, also known as a nearest neighbor classifier, which assigns to a test point x the class of the nearest sample point.

For k>1, a k nearest neighbor classifier assigns to a point x the class most represented among its k nearest neighbors. In the simplest case, each of the k nearest sample points votes with the same weight for its respective class. In more sophisticated cases, each point votes with a weight that depends on the total number of sample points of its class and/or on the ratio between the distance from the test point to the winner sample and the distance from the test point to the nearest sample point belonging to another class.

At this point, only the Euclidean distance is supported.

This classifier uses a kd-tree to perform the search efficiently, but it therefore also shows the drawbacks of a normal kd-tree: it is not suitable for high-dimensional spaces. If you work in a high-dimensional space, you should consider increasing the bucket size or activating the best-bin-first mode, a suggestion by David Lowe to obtain an approximate (but usually accurate enough) solution in much less time.
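
As a sketch only: both behaviours are controlled through the parameters class of this classifier. The member names used below (kNN, bucketSize, bestBinFirst) are assumptions for illustration and should be checked against ltiKNNClassifier.h of your LTI-Lib version; only setParameters() is taken from the documented interface.

 lti::kNNClassifier::parameters par;
 par.kNN = 5;              // assumed field: number of neighbors that vote
 par.bucketSize = 16;      // assumed field: larger kd-tree buckets for high dimensions
 par.bestBinFirst = true;  // assumed field: approximate best-bin-first search

 lti::kNNClassifier knn;
 knn.setParameters(par);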

This classifier differs a little from the other LTI-Lib classifiers. Since the whole training set is stored as sample points, it is useful in many applications to obtain, besides the winner class, the exact winner samples. Therefore, this class manages two sets of ID numbers: one set for the class IDs, which is used the same way as in all other LTI-Lib classifiers, and a second set with an ID for each sample point. This second set can be given explicitly or generated automatically. You can then, for example, keep tables with additional information for each winner point, accessed via the point ID.

Example:

 #include <iostream>
 #include <fstream>

 #include "ltiKNNClassifier.h"
 #include "ltiLispStreamHandler.h"

 // ...

 double inData[] = {-1,-1,
                    -1, 0,
                    -1,+1,
                    +0,+1,
                    +1,+1,
                    +1,+0,
                    +1,-1,
                    +0,-1,
                    +0,+0};

 lti::dmatrix inputs(9,2,inData);     // training vectors

 int idsData[] = {1,0,2,1,0,2,1,0,1}; // and the respective ids
 lti::ivector ids(9,idsData);

 lti::kNNClassifier knn;  // our nearest neighbor classifier

 lti::kNNClassifier::parameters param;
 param.useReliabilityMeasure = false;
 knn.setParameters(param);

 // we want to see some info while training
 lti::streamProgressInfo prog(std::cout);
 knn.setProgressObject(prog);

 // train the network
 knn.train(inputs,ids);

 // let us save our classifier for future use
 // in the file called knn.dat
 std::ofstream out("knn.dat");
 lti::lispStreamHandler lsh(out);

 // save the classifier
 knn.write(lsh);
 // close the file
 out.close();

 // show some results with the same training set:

 lti::kNNClassifier::outputVector outv; // here we will get some
                                        // classification results
 std::cout << std::endl << "Results: " << std::endl;

 int id;
 lti::dvector sample(2,0.0);
 // generate some points and check which would be the winner class
 for (sample.at(0) = -1; sample.at(0) <= 1; sample.at(0)+=0.25) {
   for (sample.at(1) = -1; sample.at(1) <= 1; sample.at(1)+=0.25) {
 
     knn.classify(sample,outv);
     std::cout << "Input " << sample << " \tOutput: ";
     outv.getId(outv.getWinnerUnit(),id);
     std::cout << id;
     std::cout << std::endl;
   }
 }

Member Typedef Documentation

typedef std::map<int,int> lti::kNNClassifier::idMap_type [protected]

Type for maps mapping ids from internal to external and vice versa.

typedef kdTree< dvector, std::pair<int,int> > lti::kNNClassifier::treeType [protected]

Exact kd-Tree type used for the database.

The data in the tree is composed of two id numbers:

  • The first component contains the object or class id
  • The second component contains the point id, since using kNN it can be necessary to know exactly which points in the training set were the winners.

Constructor & Destructor Documentation

lti::kNNClassifier::kNNClassifier (  ) 

default constructor

lti::kNNClassifier::kNNClassifier ( const kNNClassifier & other )

copy constructor

Parameters:
other the object to be copied
virtual lti::kNNClassifier::~kNNClassifier (  )  [virtual]

destructor


Member Function Documentation

void lti::kNNClassifier::build (  ) 

Finish a training process.

If you used the methods trainObject() or trainObjectId() you must call this method in order to complete the training process.

If you used one of the train() methods, you must avoid calling this method.

Remember that the "incremental" training mode with trainObject() or trainObjectId() cannot be combined with an "at-once" training step using the method train().

See also:
trainObject(), trainObjectId(), reset()
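
The following sketch illustrates this incremental mode using only the methods documented here (trainObject(), build(), classify()); the feature values are made up for illustration:

 // incremental training: one call of trainObject() per class, then build()
 double classAData[] = {0.0,0.0,  0.1,0.0,  0.0,0.1};
 double classBData[] = {1.0,1.0,  0.9,1.0,  1.0,0.9};
 lti::dmatrix classA(3,2,classAData);
 lti::dmatrix classB(3,2,classBData);

 lti::kNNClassifier knn;
 int idA, idB;
 knn.trainObject(classA, idA);  // class id assigned automatically and returned
 knn.trainObject(classB, idB);
 knn.build();                   // finish the incremental training

 lti::kNNClassifier::outputVector outv;
 lti::dvector sample(2,0.05);
 knn.classify(sample, outv);    // should vote for the class with id idA
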
void lti::kNNClassifier::buildIdMaps ( const ivector & ids ) [protected]

Builds the maps between external and internal class ids from the given ids.

bool lti::kNNClassifier::classify ( const dvector & feature, outputVector & output, std::multimap< double, treeType::element * > & resList ) const [protected]

Helper for classification.

virtual bool lti::kNNClassifier::classify ( const dvector & feature, outputVector & result, std::vector< pointInfo > & points ) const [virtual]

Classification.

Classifies the feature and returns the outputVector object with the classification result.

NOTE: This method is NOT really const. Although the main members of the kNNClassifier are not changed, some state variables used for efficiency are. Thus, it is not safe to use the same instance of the kNNClassifier in two different threads.

Parameters:
feature pattern to be classified
result of the classifications as a classifier::outputVector
points vector sorted in increasing order of the distances to the feature point; each pointInfo entry contains two ID numbers (the first is the class id, the second the point id) as well as a const pointer to the corresponding feature point of the training set and the distance to that point
Returns:
true if the classification has been successful
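
A usage sketch, assuming knn is an lti::kNNClassifier that has already been trained; the pointInfo member names used below (classId, pointId, distance) are assumptions based on the description above and should be checked against the pointInfo documentation:

 // inspect the nearest training samples for one feature vector
 lti::kNNClassifier::outputVector outv;
 std::vector<lti::kNNClassifier::pointInfo> points;
 lti::dvector feature(2,0.0);

 if (knn.classify(feature, outv, points)) {
   // points is sorted by increasing distance to the feature vector
   for (unsigned int i=0; i<points.size(); ++i) {
     std::cout << "class "     << points[i].classId    // assumed member name
               << " point "    << points[i].pointId    // assumed member name
               << " distance " << points[i].distance   // assumed member name
               << std::endl;
   }
 }
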
virtual bool lti::kNNClassifier::classify ( const dmatrix & features, dmatrix & result ) const [virtual]

Classification.

Classifies all features (the rows of the matrix) and returns for each of them a vector of unnormalized probabilities, coded in the rows of the result matrix.

Since no classifier::outputVector is constructed, only the classification "raw data" is produced.

This method is used in recognition tasks based on many local hints, for which the individual classification of each feature vector would cost too much time.

Each column of the output matrix represents one object. To get the id represented by a column you can use the outputTemplate of the classifier:

 kNNClassifier knn;

 knn.train(data,ids);

 int columnId = knn.getOutputTemplate().getIds().at(columnNumber);

or the shortcut method getColumnId() of this class.

NOTE: This method is NOT really const. Although the main members of the kNNClassifier are not changed, some state variables used for efficiency are. Thus, it is not safe to use the same instance of the kNNClassifier in two different threads.

Parameters:
features patterns to be classified, each row is one feature vector
result matrix with the unnormalized class probabilities; each row corresponds to one input feature and each column to one object (see getColumnId())
Returns:
true if the classification has been successful
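
A short sketch of this batch interface, using only classify(const dmatrix&,dmatrix&) and getColumnId(); it assumes knn has been trained and features has been filled with one local feature vector per row:

 lti::dmatrix features;   // each row is one local feature vector
 lti::dmatrix result;     // one row per feature, one column per object

 if (knn.classify(features, result)) {
   for (int col=0; col<result.columns(); ++col) {
     const int objectId = knn.getColumnId(col);  // object id of this column
     double sum = 0.0;
     for (int row=0; row<result.rows(); ++row) {
       sum += result.at(row,col);                // accumulate the unnormalized votes
     }
     std::cout << "object " << objectId << ": " << sum << std::endl;
   }
 }
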
virtual bool lti::kNNClassifier::classify ( const dmatrix & features, outputVector & result ) const [virtual]

Classification.

Classifies all features (the rows of the matrix) and returns the outputVector object with the classification result.

The classification will be the accumulation of the voting for all given points, assuming that they all belong to the same class.

NOTE: This method is NOT really const. Although the main members of the kNNClassifier are not changed, some state variables used for efficiency are. Thus, it is not safe to use the same instance of the kNNClassifier in two different threads.

Parameters:
features patterns to be classified, each row is one feature vector
result of the classifications as a classifier::outputVector
Returns:
true if the classification has been successful
virtual bool lti::kNNClassifier::classify ( const dvector & feature, outputVector & result ) const [virtual]

Classification.

Classifies the feature and returns the outputVector object with the classification result.

NOTE: This method is NOT really const. Although the main members of the kNNClassifier are not changed, some state variables used for efficiency are. Thus, it is not safe to use the same instance of the kNNClassifier in two different threads.

Parameters:
feature pattern to be classified
result of the classifications as a classifier::outputVector
Returns:
true if the classification has been successful

Implements lti::supervisedInstanceClassifier.

void lti::kNNClassifier::clear (  ) 

Resets all values and deletes the content.

If you want to forget the sample points and start giving new points with trainObject(), you need to call this method first.

virtual classifier* lti::kNNClassifier::clone (  )  const [virtual]

returns a pointer to a clone of this classifier.

Implements lti::classifier.

kNNClassifier& lti::kNNClassifier::copy ( const kNNClassifier & other )

copy data of "other" classifier.

Parameters:
other the classifier to be copied
Returns:
a reference to this classifier object

Reimplemented from lti::supervisedInstanceClassifier.

void lti::kNNClassifier::defineOutputTemplate (  )  [protected]

Define the output template.

int lti::kNNClassifier::getColumnId ( const int  columnId  )  const [inline]

Shortcut method to comfortably access the object id for the column of the result matrix of the classify(const dmatrix&,dmatrix&) method.

It returns a negative value if the input column index is invalid.

References lti::genericVector< T >::at(), lti::classifier::outputTemplate::getIds(), lti::classifier::getOutputTemplate(), and nClasses.

const kNNClassifier::parameters& lti::kNNClassifier::getParameters (  )  const

returns used parameters

Reimplemented from lti::supervisedInstanceClassifier.

virtual const char* lti::kNNClassifier::getTypeName (  )  const [virtual]

returns the name of this type ("kNNClassifier")

Reimplemented from lti::supervisedInstanceClassifier.

virtual bool lti::kNNClassifier::nearest ( const dvector & feature, pointInfo & nearestPoint ) const [virtual]

Get only the nearest point to the given vector.

Sometimes it is not necessary to compute the probability distribution over the objects with the classify() methods; only the nearest point may be of interest. This method provides an efficient way to search just for the nearest point and obtain its data.

Parameters:
feature reference multidimensional point
nearestPoint nearest point in the training set to the presented point.
Returns:
true if search was successful, false otherwise.
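
A small usage sketch, assuming knn has already been trained; the pointInfo member names (classId, distance) are assumptions based on the description of pointInfo above:

 // find only the single nearest training sample to a query point
 lti::dvector query(2,0.0);
 lti::kNNClassifier::pointInfo info;

 if (knn.nearest(query, info)) {
   std::cout << "nearest sample belongs to class " << info.classId  // assumed member name
             << " at distance " << info.distance                    // assumed member name
             << std::endl;
 }
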
kNNClassifier& lti::kNNClassifier::operator= ( const kNNClassifier & other )

alias for copy member

Parameters:
other the classifier to be copied
Returns:
a reference to this classifier object

Reimplemented from lti::supervisedInstanceClassifier.

virtual bool lti::kNNClassifier::read ( ioHandler & handler, const bool complete = true ) [virtual]

read the classifier from the given ioHandler

Parameters:
handler the ioHandler to be used
complete if true (the default) the enclosing begin/end will also be read, otherwise only the data block will be read.
Returns:
true if the read was successful

Reimplemented from lti::classifier.
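
The classifier saved in the example of the detailed description can be restored like this (a sketch; it assumes the file knn.dat was written with a lispStreamHandler as shown there and that <fstream> and "ltiLispStreamHandler.h" are included):

 // load a previously saved classifier from knn.dat
 lti::kNNClassifier knn;
 std::ifstream in("knn.dat");
 lti::lispStreamHandler lsh(in);

 if (knn.read(lsh)) {
   // ready to classify without retraining
 }
 in.close();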

virtual bool lti::kNNClassifier::train ( const dmatrix & input, const ivector & ids, const ivector & pointIds ) [virtual]

Supervised training.

The vectors in the input matrix are arranged row-wise, i.e. each row contains one data vector. The ids vector contains the class label for each row.

This is an alternative method to trainObject(). You cannot add further objects after you have called train(), nor can you call train() after calling trainObject(), since all data provided with trainObject() would be removed. In other words, you must decide if you want to supply all objects separately or if you want to give them all simultaneously, but you cannot combine both methods.

As id for each feature point the corresponding entry of pointIds will be used.

Parameters:
input the matrix with the input vectors (each row is a training vector)
ids vector of class ids for each input point
pointIds vector containing the ids for each single feature point of the training set.
Returns:
true if successful, false otherwise. (if false you can check the error message with getStatusString())
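
A minimal sketch with explicit point ids, for instance when each training vector corresponds to an entry in an external table (the data values are made up for illustration):

 // train with user-defined ids for every single sample point
 double inData[] = {-1,-1,  -1,+1,  +1,+1,  +1,-1};
 lti::dmatrix input(4,2,inData);

 int classData[] = {0,0,1,1};         // class id for each row
 lti::ivector ids(4,classData);

 int pointData[] = {100,101,200,201}; // e.g. keys into an external table
 lti::ivector pointIds(4,pointData);

 lti::kNNClassifier knn;
 knn.train(input, ids, pointIds);
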
virtual bool lti::kNNClassifier::train ( const dmatrix & input, const ivector & ids ) [virtual]

Supervised training.

The vectors in the input matrix are arranged row-wise, i.e. each row contains one data vector. The ids vector contains the class label for each row.

This is an alternative method to trainObject(). You cannot add further objects after you have called train(), nor can you call train() after calling trainObject(), since all data provided with trainObject() would be removed. In other words, you must decide if you want to supply all objects separately or if you want to give them all simultaneously, but you cannot combine both methods.

As ids for each feature point the index of the corresponding matrix row will be used.

Parameters:
input the matrix with the input vectors (each row is a training vector)
ids vector of class ids for each input point
Returns:
true if successful, false otherwise. (if false you can check the error message with getStatusString())

Implements lti::supervisedInstanceClassifier.

virtual bool lti::kNNClassifier::trainObject ( const dmatrix & input, int & id, const ivector & pointIds ) [virtual]

Adds an object to this classifier.

The id is given automatically and returned in the parameter.

After you have trained several objects, you must call the build() method to finish the training process. If you don't do it, the classifier will ignore everything you have provided.

This is an alternative method to train(). You cannot add further objects after you have called train, nor can you call train() after calling this method, since all data provided with trainObject would be removed. In other words, you must decide if you want to supply all objects separately or if you want to give them all simultaneously, but you cannot combine both methods.

Note that the difference with the method trainObjectId() is that here you will receive as a returned argument the id assigned to the object, while in the method trainObjectId() you decide which id should be used for the given object.

Parameters:
input each row of this matrix represents a point in the feature space belonging to one single class.
id the class id automatically assigned to this object is returned here.
pointIds each point in the input matrix will have its own id given by the entries of this vector, which must have a size equal to the number of rows of input.
Returns:
true if successful, false otherwise.
virtual bool lti::kNNClassifier::trainObject ( const dmatrix & input, int & id ) [virtual]

Adds an object to this classifier.

The id is given automatically and returned in the parameter.

After you have trained several objects, you must call the build() method to finish the training process. If you don't do it, the classifier will ignore everything you have provided.

This is an alternative method to train(). You cannot add further objects after you have called train, nor can you call train() after calling this method, since all data provided with trainObject would be removed. In other words, you must decide if you want to supply all objects separately or if you want to give them all simultaneously, but you cannot combine both methods.

Note that the difference with the method trainObjectId() is that here you receive as returned argument the id assigned to the object, while in the method trainObjectId() you decide which id should be used for the given object.

As id for each point in the given matrix, the row index plus the number of points trained so far will be used, i.e. the consecutive numbering of the sample points is simply continued.

Parameters:
input each row of this matrix represents a point in the feature space belonging to one single class.
id the class id automatically assigned to this object is returned here.
Returns:
true if successful, false otherwise.
virtual bool lti::kNNClassifier::trainObjectId ( const dmatrix & input, const int id, const ivector & pointIds ) [virtual]

Adds an object to this classifier.

The object ID is given by the user.

After you have trained several objects, you must call the build() method to finish the training process. If you don't do it, the classifier will ignore everything you have provided.

This is an alternative method to train(). You cannot add further objects after you have called train, nor can you call train() after calling this method, since all data provided with trainObject would be removed. In other words, you must decide if you want to supply all objects separately or if you want to give them all simultaneously, but you cannot combine both methods.

Note that the difference with the method trainObject() is that here you directly provide the id to be used for the object, while the trainObject() method returns one id that is computed automatically.

Parameters:
input each row of this matrix represents a point in the feature space belonging to one single class.
id this id will be used for the class represented by the points in the input matrix.
pointIds each point in the input matrix will have its own ID, given by the entries in this vector, which must have the same size as the number of rows of input.
Returns:
true if successful, false otherwise.
virtual bool lti::kNNClassifier::trainObjectId ( const dmatrix & input, const int id ) [virtual]

Adds an object to this classifier.

The object ID is given by the user.

After you have trained several objects, you must call the build() method to finish the training process. If you don't do it, the classifier will ignore everything you have provided.

This is an alternative method to train(). You cannot add further objects after you have called train, nor can you call train() after calling this method, since all data provided with trainObject would be removed. In other words, you must decide if you want to supply all objects separately or if you want to give them all simultaneously, but you cannot combine both methods.

Note that the difference with the method trainObject() is that here you directly provide the id to be used for the object, while the trainObject() method returns one id that is computed automatically.

As id for each point in the given matrix, the row index plus the number of points trained so far will be used, i.e. the consecutive numbering of the sample points is simply continued.

Parameters:
input each row of this matrix represents a point in the feature space belonging to one single class.
id this id will be used for the class represented by the points in the input matrix.
Returns:
true if successful, false otherwise.
virtual bool lti::kNNClassifier::write ( ioHandler & handler, const bool complete = true ) const [virtual]

write the classifier in the given ioHandler

Parameters:
handler the ioHandler to be used
complete if true (the default) the enclosing begin/end will be also written, otherwise only the data block will be written.
Returns:
true if write was successful

Reimplemented from lti::classifier.


Member Data Documentation

std::vector<double> lti::kNNClassifier::classWeight [protected]

Optionally, a scalar weight for each class can be applied as an a-priori value.

The std::vector is used due to the push_back interface.

It is accessed with the internal id.

treeType lti::kNNClassifier::databaseTree [protected]

The database with accelerated nearest neighbor search.

The kdTree uses dvectors as n-dimensional points and as data requires a std::pair containing the class id and the point id.

idMap_type lti::kNNClassifier::idMap [protected]

Map from external id to internal id, used for training.

int lti::kNNClassifier::maxPointsPerClass [protected]

Maximum number of points per class.

This attribute is valid only after the complete training process.

int lti::kNNClassifier::minPointsPerClass [protected]

Minimum number of points per class.

This attribute is valid only after the complete training process.

int lti::kNNClassifier::nClasses [protected]

Number of classes currently in the classifier.

Referenced by getColumnId().

idMap_type lti::kNNClassifier::rIdMap [protected]

Map from internal to external id, used for training.


The documentation for this class was generated from the following file:

  • ltiKNNClassifier.h
