LTI-Lib latest version v1.9 - last update 10 Apr 2010

lti::DBScan< T > Class Template Reference

A class providing the DBScan for clustering. More...

#include <ltiDBScan.h>

Inheritance diagram for lti::DBScan< T >:
Inheritance graph
[legend]
Collaboration diagram for lti::DBScan< T >:
Collaboration graph
[legend]

List of all members.

Classes

class  parameters
 the parameters for the class DBScan More...

Public Member Functions

 DBScan ()
 DBScan (const DBScan &other)
virtual ~DBScan ()
virtual const char * getTypeName () const
DBScancopy (const DBScan &other)
DBScanoperator= (const DBScan &other)
virtual classifierclone () const
const parametersgetParameters () const
virtual bool train (const dmatrix &input, ivector &ids)
virtual bool train (const dmatrix &input)
virtual bool classify (const dvector &feature, outputVector &result) const

Protected Member Functions

virtual bool dbScan (dKdTree &setOfPoints)
void cluster (dNode *nPtr, dKdTree *setOfPoints)
bool expandCluster (dKdTree *setOfPoints, dElement *point, int clusterID)
bool allElementsClustered (dNode *nPtr)

Protected Attributes

double eps
int minPts
int clusterID
dKdTree setOfPoints
int buckedSize

Detailed Description

template<class T = l2SquareDistantor<vector<double> >>
class lti::DBScan< T >

A class providing the DBScan for clustering.

DBScan is a density-based clustering algorithm, which is designed to find clusters of abitrary shape.

Using one point the algorithm searches for minPts in the eps-neighborhood of this point. If not enough points are found, they are classified as "NOISE" (Value = 0) else it tries to expand this found cluster by using each found point for another search and assigns the current cluster ID to each point. If then not enough points are found an new cluster is created using an not jet classified point. The clustering finishes if all points are clussified, that means either having a cluster ID or marked as "NOISE".

The original algorithm uses a R*-Tree as internal data struckture, this implementation uses the lti::kdTree. The trees provide an efficient method for searching points in an eps-neighborhood.

DBScan requires three parameters:

The eps is the critical parameter of this algorithm, because the number of clusters depends on it. A good value would be the density of the least dense cluster. But it is very hard to get this information on advance. Normally one does not know the distribution of the points in the space. If no cluster is found, all points are marked as noise!

The type T describes the distance measure used in the internal data structure. The Default value is l2SquareDistance

The original DBScan algorithm was introduced in:

Martin Ester, Hans-Peter Kriegel, Joerg Sander, Xiaowei Xu A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96)

URL: http://www.dbs.informatik.uni-muenchen.de/Publikationen/Papers/ KDD-96.final.frame.ps


Constructor & Destructor Documentation

template<class T = l2SquareDistantor<vector<double> >>
lti::DBScan< T >::DBScan (  ) 

default constructor reads and sets the parameters

template<class T = l2SquareDistantor<vector<double> >>
lti::DBScan< T >::DBScan ( const DBScan< T > &  other  ) 

copy constructor

Parameters:
other the object to be copied
template<class T = l2SquareDistantor<vector<double> >>
virtual lti::DBScan< T >::~DBScan (  )  [virtual]

destructor


Member Function Documentation

template<class T = l2SquareDistantor<vector<double> >>
bool lti::DBScan< T >::allElementsClustered ( dNode *  nPtr  )  [protected]

check if elements of node have been clustered, if the node is a leaf returns true if all elements are clusterd, else false

template<class T = l2SquareDistantor<vector<double> >>
virtual bool lti::DBScan< T >::classify ( const dvector feature,
outputVector result 
) const [virtual]

classify

Parameters:
feature vector to be classified
result result as described above
Returns:
true if successful, false otherwise

Implements lti::unsupervisedClassifier.

template<class T = l2SquareDistantor<vector<double> >>
virtual classifier* lti::DBScan< T >::clone (  )  const [virtual]

returns a pointer to a clone of this clustering.

Implements lti::classifier.

template<class T = l2SquareDistantor<vector<double> >>
void lti::DBScan< T >::cluster ( dNode *  nPtr,
dKdTree setOfPoints 
) [protected]

The dbScan algorithm.

walks rekursively through the tree, if one point ist found, the clustering is started

template<class T = l2SquareDistantor<vector<double> >>
DBScan& lti::DBScan< T >::copy ( const DBScan< T > &  other  ) 

copy data of "other" clustering.

Parameters:
other the clustering to be copied
Returns:
a reference to this clustering object

Reimplemented from lti::clustering.

template<class T = l2SquareDistantor<vector<double> >>
virtual bool lti::DBScan< T >::dbScan ( dKdTree setOfPoints  )  [protected, virtual]

DBScan algorithm working with the kdTree as data struckture.

Parameters:
setOfPoints a kd-Tree containing the data to be clustered
template<class T = l2SquareDistantor<vector<double> >>
bool lti::DBScan< T >::expandCluster ( dKdTree setOfPoints,
dElement *  point,
int  clusterID 
) [protected]

expandCluster tries to expand the cluster found, with respect of eps, minPts, density-reachability and desity-connectivity

Parameters:
setOfPoints pointer to the used kd-Tree
point one point/element of the data stored in the kd-Tree
clusterID the ID of the current cluster returns true if the cluster could be expanded
template<class T = l2SquareDistantor<vector<double> >>
const parameters& lti::DBScan< T >::getParameters (  )  const

returns used parameters

Reimplemented from lti::clustering.

template<class T = l2SquareDistantor<vector<double> >>
virtual const char* lti::DBScan< T >::getTypeName (  )  const [virtual]

returns the name of this type ("DBScan")

Reimplemented from lti::clustering.

template<class T = l2SquareDistantor<vector<double> >>
DBScan& lti::DBScan< T >::operator= ( const DBScan< T > &  other  ) 

alias for copy member

Parameters:
other the clustering to be copied
Returns:
a reference to this clustering object

Reimplemented from lti::classifier.

template<class T = l2SquareDistantor<vector<double> >>
virtual bool lti::DBScan< T >::train ( const dmatrix input  )  [virtual]

Unsupervised training.

The vectors in the input matrix will be clustered using each specific method. Train will return false, if no cluster is found.

Parameters:
input the matrix with the input vectors (each row is a training vector)
Returns:
true if successful, false otherwise. (if false you can check the error message with getStatusString())

Implements lti::clustering.

template<class T = l2SquareDistantor<vector<double> >>
virtual bool lti::DBScan< T >::train ( const dmatrix input,
ivector ids 
) [virtual]

Unsupervised training.

The vectors in the input matrix will be put into groups according to the training algorithm. Additionally, an integer indicating the class each point belongs to is returned.

By default this method uses the other train method train(const dmatrix&) and then calls classify(const dvector&,outputVector&) to get the ids for each train vector. These ids are then returned. Train will return false, if no cluster is found and all points are marked as noise!

Parameters:
input the matrix with the input vectors (each row is a training vector)
ids vector of class ids for each input point
Returns:
true if successful, false otherwise. (if false you can check the error message with getStatusString())

Reimplemented from lti::clustering.


Member Data Documentation

template<class T = l2SquareDistantor<vector<double> >>
int lti::DBScan< T >::buckedSize [protected]

bucked-size of the kd-tree number of elements stored in one leaf default value: 2

template<class T = l2SquareDistantor<vector<double> >>
int lti::DBScan< T >::clusterID [protected]

clusterID

template<class T = l2SquareDistantor<vector<double> >>
double lti::DBScan< T >::eps [protected]

eps neighbourhood

template<class T = l2SquareDistantor<vector<double> >>
int lti::DBScan< T >::minPts [protected]

minPts

template<class T = l2SquareDistantor<vector<double> >>
dKdTree lti::DBScan< T >::setOfPoints [protected]

kd-tree used for effective search on elements internal used data strukture


The documentation for this class was generated from the following file:

Generated on Sat Apr 10 15:26:49 2010 for LTI-Lib by Doxygen 1.6.1