|
latest version v1.9 - last update 24 Nov 2005 |
|
#include <ltiDBScan.h>
Inheritance diagram for lti::DBScan< T >:


Public Member Functions | |
| DBScan () | |
| DBScan (const DBScan &other) | |
| virtual | ~DBScan () |
| virtual const char * | getTypeName () const |
| DBScan & | copy (const DBScan &other) |
| DBScan & | operator= (const DBScan &other) |
| virtual classifier * | clone () const |
| const parameters & | getParameters () const |
| virtual bool | train (const dmatrix &input, ivector &ids) |
| virtual bool | train (const dmatrix &input) |
| virtual bool | classify (const dvector &feature, outputVector &result) const |
Protected Member Functions | |
| virtual bool | dbScan (dKdTree &setOfPoints) |
| void | cluster (dNode *nPtr, dKdTree *setOfPoints) |
| bool | expandCluster (dKdTree *setOfPoints, dElement *point, int clusterID) |
| bool | allElementsClustered (dNode *nPtr) |
Protected Attributes | |
| double | eps |
| int | minPts |
| int | clusterID |
| dKdTree | setOfPoints |
| int | buckedSize |
Classes | |
| class | parameters |
| the parameters for the class DBScan More... | |
DBScan is a density-based clustering algorithm, which is designed to find clusters of abitrary shape.
Using one point the algorithm searches for minPts in the eps-neighborhood of this point. If not enough points are found, they are classified as "NOISE" (Value = 0) else it tries to expand this found cluster by using each found point for another search and assigns the current cluster ID to each point. If then not enough points are found an new cluster is created using an not jet classified point. The clustering finishes if all points are clussified, that means either having a cluster ID or marked as "NOISE".
The original algorithm uses a R*-Tree as internal data struckture, this implementation uses the lti::kdTree. The trees provide an efficient method for searching points in an eps-neighborhood.
DBScan requires three parameters:
The eps is the critical parameter of this algorithm, because the number of clusters depends on it. A good value would be the density of the least dense cluster. But it is very hard to get this information on advance. Normally one does not know the distribution of the points in the space. If no cluster is found, all points are marked as noise!
The type T describes the distance measure used in the internal data structure. The Default value is l2SquareDistance
The original DBScan algorithm was introduced in:
Martin Ester, Hans-Peter Kriegel, Joerg Sander, Xiaowei Xu A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96)
URL: http://www.dbs.informatik.uni-muenchen.de/Publikationen/Papers/ KDD-96.final.frame.ps
|
|||||||||
|
default constructor reads and sets the parameters
|
|
||||||||||
|
copy constructor
|
|
|||||||||
|
destructor
|
|
||||||||||
|
check if elements of node have been clustered, if the node is a leaf returns true if all elements are clusterd, else false
|
|
||||||||||||||||
|
classify
Implements lti::unsupervisedClassifier. |
|
|||||||||
|
returns a pointer to a clone of this clustering.
Implements lti::classifier. |
|
||||||||||||||||
|
The dbScan algorithm. walks rekursively through the tree, if one point ist found, the clustering is started |
|
||||||||||
|
copy data of "other" clustering.
|
|
||||||||||
|
DBScan algorithm working with the kdTree as data struckture.
|
|
||||||||||||||||||||
|
expandCluster tries to expand the cluster found, with respect of eps, minPts, density-reachability and desity-connectivity
|
|
|||||||||
|
returns used parameters
Reimplemented from lti::clustering. |
|
|||||||||
|
returns the name of this type ("DBScan")
Reimplemented from lti::clustering. |
|
||||||||||
|
alias for copy member
|
|
||||||||||
|
Unsupervised training.
The vectors in the
Implements lti::clustering. |
|
||||||||||||||||
|
Unsupervised training.
The vectors in the By default this method uses the other train method train(const dmatrix&) and then calls classify(const dvector&,outputVector&) to get the ids for each train vector. These ids are then returned. Train will return false, if no cluster is found and all points are marked as noise!
Reimplemented from lti::clustering. |
|
|||||
|
bucked-size of the kd-tree number of elements stored in one leaf default value: 2
|
|
|||||
|
clusterID
|
|
|||||
|
eps neighbourhood
|
|
|||||
|
minPts
|
|
|||||
|
kd-tree used for effective search on elements internal used data strukture
|