latest version v1.9 - last update 10 Apr 2010 |
The Earth Mover's Distance (EMD) is a robust distance measure between two distributions. More...
#include <ltiEarthMoversDistance.h>
Classes | |
struct | double_node |
type for a double linked list More... | |
class | parameters |
the parameters for the class earthMoversDistance More... | |
struct | single_node |
type for a single linked list More... | |
Public Types | |
typedef C::value_type | cluster_value_type |
Public Member Functions | |
earthMoversDistance () | |
earthMoversDistance (const parameters &par) | |
earthMoversDistance (const earthMoversDistance &other) | |
virtual | ~earthMoversDistance () |
virtual const char * | getTypeName () const |
earthMoversDistance & | copy (const earthMoversDistance &other) |
earthMoversDistance & | operator= (const earthMoversDistance &other) |
virtual functor * | clone () const |
const parameters & | getParameters () const |
virtual bool | apply (const vector< W > &v, W &norm) const |
virtual W | apply (const vector< W > &v) const |
virtual bool | apply (const matrix< W > &m, vector< W > &norms) const |
virtual bool | apply (const matrix< W > &m, W &norm) const |
virtual W | apply (const matrix< W > &m) const |
virtual bool | apply (const vector< W > &a, const vector< W > &b, W &dist) const |
virtual bool | apply (const vector< W > &a, const vector< W > &b, W &dist, matrix< W > &flow) const |
virtual W | apply (const vector< W > &a, const vector< W > &b) const |
virtual bool | apply (const matrix< W > &a, const matrix< W > &b, vector< W > &dists) const |
virtual bool | apply (const matrix< W > &m, const vector< W > &v, vector< W > &dest) const |
virtual bool | apply (const matrix< W > &a, const matrix< W > &b, W &dist) const |
virtual bool | apply (const matrix< W > &a, const matrix< W > &b, W &dist, matrix< W > &flow) const |
virtual W | apply (const matrix< W > &a, const matrix< W > &b) const |
virtual bool | apply (const std::vector< C > &clusters1, const std::vector< C > &clusters2, const vector< W > &weights1, const vector< W > &weights2, W &dist) const |
virtual bool | apply (const std::vector< C > &clusters1, const std::vector< C > &clusters2, const vector< W > &weights1, const vector< W > &weights2, W &dist, matrix< W > &flow) const |
virtual W | apply (const std::vector< C > &clusters1, const std::vector< C > &clusters2, const vector< W > &weights1, const vector< W > &weights2) const |
virtual bool | apply (const matrix< cluster_value_type > &clusters1, const matrix< cluster_value_type > &clusters2, const vector< W > &weights1, const vector< W > &weights2, W &dist) const |
virtual bool | apply (const matrix< cluster_value_type > &clusters1, const matrix< cluster_value_type > &clusters2, const vector< W > &weights1, const vector< W > &weights2, W &dist, matrix< W > &flow) const |
virtual W | apply (const matrix< cluster_value_type > &clusters1, const matrix< cluster_value_type > &clusters2, const vector< W > &weights1, const vector< W > &weights2) const |
Protected Types | |
typedef struct lti::earthMoversDistance::single_node | single_node |
typedef struct lti::earthMoversDistance::double_node | double_node |
Protected Member Functions | |
bool | init (vector< W > &weights1, vector< W > &weights2) const |
bool | calcCost (const ivector &supply, const ivector &demand) const |
bool | calcCost (const vector< point > &supply, const vector< point > &demand) const |
bool | calcCost (const std::vector< C > &supply, const std::vector< C > &demand) const |
bool | calcCost (const matrix< cluster_value_type > &supply, const matrix< cluster_value_type > &demand) const |
bool | emd (vector< W > &weights1, vector< W > &weights2, const int &slen, const int &dlen, W &totalFlow) const |
bool | russel (vector< W > &weights1, vector< W > &weights2) const |
void | addBasicVariable (int minI, int minJ, vector< W > &weights1, vector< W > &weights2, single_node *PrevUMinI, single_node *PrevVMinJ, single_node *UHead) const |
void | findBasicVariables (single_node *U, single_node *V) const |
int | isOptimal (single_node *U, single_node *V) const |
void | newSolution () const |
int | findLoop (double_node **Loop) const |
bool | findFlow (const int &slen, const int &dlen, matrix< W > &flow) const |
Protected Attributes | |
int | sz1 |
int | sz2 |
matrix< W > | costMatrix |
W | maxCost |
W | maxWeight |
W | minWeight |
double_node * | basicVars |
double_node ** | distr1Basic |
double_node ** | distr2Basic |
double_node * | endBasicVars |
double_node * | enterBasicVars |
genericMatrix< bool > | isBasicVar |
The Earth Mover's Distance (EMD) is a robust distance measure between two distributions.
Each distribution has a number of clusters n,m with corresponding weights. There is a ground distance between two clusters which specifies the cost of moving some weight from one cluster to the other. The EMD measures the relative total cost of transforming one distribution into the other by moving the weights around. This is done by solving the transportation problem.
Template type C specifies the type of the Cluster centers, e.g. dvector or rgbPixel. Type W is the type of vector<W> which is used for the weights of the clusters and template type D specifies the ground distance to be used for the clusters. It must be a Distantor<C> (see e.g. l2Distantor<T>. Note that instead of this calculation the cost matrix can be given in parameters::costMatrix.
There are four types of applys:
The first two options are formulated in this way to achieve compliance with the definition of distanceFunctor<T>. The cluster centers used for the calculation of the cost matrix are taken to be the indices of the vector/matrix. If this is not desired a cost matrix must be given in the parameters since the Distantor D is not used (due to possible type conflicts). Apply-types 3 and 4 use cluster centers and a weights vector. The cluster centers are defined in two ways for convenience. Type 3 is more practical for cluster centers that are C= vector<T> -like. Then the matrix<T> contains cluster centers in the rows. For C=rgbPixel type 4 is more convenient than first constructing a matrix<ubyte> for type 3.
Depending on the apply method used the flow between the two distributions is also returned. The first distribution is indexed in the rows the second in columns. Assuming to clusters for the first and three clusters for the second distribution a flow matrix could look like this:
0.3 0.1 0.4 0 0 0.2
From the first cluster of distribution1 there is a flow of 0.3 to the first cluster, 0.1 to the second and 0.4 to the third cluster of distribution2. From the second cluster of distribution1 there is a flow of 0.2 to the third cluster of distribution2. From the flow matrix we can see that the weight vectors were [0.8 0.2] and [0.3 0.1 0.6] for distributions one and two, respectively. The total cost is calculated by summing up the products between each flow and its cost.
Note that neither the number of clusters nor the sum of weights for the to distributions have to be equal. However, this gives probably unwanted results for distances between vectors and matrices.
Unlike other distanceFunctor<T> classes the EMD does not define a norm on a distribution. All corresponding apply methods throw an InvalidMethod exception.
Note: This functor is not multi-thread safe!
The core of this functor was implemented by Yossi Rubner, Computer Science Department, Stanford University, rubner@cs.stanford.edu, http://vision.stanford.edu/~rubner.
Encapsulation and template support originally designed by Peter Doerfler, LTI, RWTH-Aachnen.
typedef C::value_type lti::earthMoversDistance< W, C, D >::cluster_value_type |
value type of the cluster type C
typedef struct lti::earthMoversDistance::double_node lti::earthMoversDistance< W, C, D >::double_node [protected] |
type for a double linked list
typedef struct lti::earthMoversDistance::single_node lti::earthMoversDistance< W, C, D >::single_node [protected] |
type for a single linked list
lti::earthMoversDistance< W, C, D >::earthMoversDistance | ( | ) |
default constructor
lti::earthMoversDistance< W, C, D >::earthMoversDistance | ( | const parameters & | par | ) |
Construct a functor using the given parameters.
lti::earthMoversDistance< W, C, D >::earthMoversDistance | ( | const earthMoversDistance< W, C, D > & | other | ) |
copy constructor
other | the object to be copied |
virtual lti::earthMoversDistance< W, C, D >::~earthMoversDistance | ( | ) | [virtual] |
destructor
void lti::earthMoversDistance< W, C, D >::addBasicVariable | ( | int | minI, | |
int | minJ, | |||
vector< W > & | weights1, | |||
vector< W > & | weights2, | |||
single_node * | PrevUMinI, | |||
single_node * | PrevVMinJ, | |||
single_node * | UHead | |||
) | const [protected] |
Helper method for russel.
virtual W lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< cluster_value_type > & | clusters1, | |
const matrix< cluster_value_type > & | clusters2, | |||
const vector< W > & | weights1, | |||
const vector< W > & | weights2 | |||
) | const [inline, virtual] |
calculate the EMD between to distributions each defined by its clusters and the respective weights.
clusters1 | each row of the matrix contains a centroid of a cluster of distribution 1 | |
clusters2 | each row of the matrix contains a centroid of a cluster of distribution 2 | |
weights1 | the weights corresponding to clusters1 | |
weights2 | the weights corresponding to clusters2 |
References lti::earthMoversDistance< W, C, D >::apply().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< cluster_value_type > & | clusters1, | |
const matrix< cluster_value_type > & | clusters2, | |||
const vector< W > & | weights1, | |||
const vector< W > & | weights2, | |||
W & | dist, | |||
matrix< W > & | flow | |||
) | const [inline, virtual] |
calculate the EMD and the flow between to distributions each defined by its clusters and the respective weights.
clusters1 | each row of the matrix contains a centroid of a cluster of distribution 1 | |
clusters2 | each row of the matrix contains a centroid of a cluster of distribution 2 | |
weights1 | the weights corresponding to clusters1 | |
weights2 | the weights corresponding to clusters2 | |
dist | the EMD between the distributions | |
flow | the flow matrix for the two distributions |
References lti::earthMoversDistance< W, C, D >::apply(), lti::earthMoversDistance< W, C, D >::findFlow(), and lti::genericVector< T >::size().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< cluster_value_type > & | clusters1, | |
const matrix< cluster_value_type > & | clusters2, | |||
const vector< W > & | weights1, | |||
const vector< W > & | weights2, | |||
W & | dist | |||
) | const [virtual] |
calculate the EMD between to distributions each defined by its clusters and the respective weights.
clusters1 | each row of the matrix contains a centroid of a cluster of distribution 1 | |
clusters2 | each row of the matrix contains a centroid of a cluster of distribution 2 | |
weights1 | the weights corresponding to clusters1 | |
weights2 | the weights corresponding to clusters2 | |
dist | the EMD between the distributions |
virtual W lti::earthMoversDistance< W, C, D >::apply | ( | const std::vector< C > & | clusters1, | |
const std::vector< C > & | clusters2, | |||
const vector< W > & | weights1, | |||
const vector< W > & | weights2 | |||
) | const [inline, virtual] |
calculate the EMD between to distributions each defined by its clusters and the respective weights.
clusters1 | each element of the vector contains a centroid of a cluster of distribution 1 | |
clusters2 | each element of the vector contains a centroid of a cluster of distribution 2 | |
weights1 | the weights corresponding to clusters1 | |
weights2 | the weights corresponding to clusters2 |
References lti::earthMoversDistance< W, C, D >::apply().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const std::vector< C > & | clusters1, | |
const std::vector< C > & | clusters2, | |||
const vector< W > & | weights1, | |||
const vector< W > & | weights2, | |||
W & | dist, | |||
matrix< W > & | flow | |||
) | const [inline, virtual] |
calculate the EMD and the flow between to distributions each defined by its clusters and the respective weights.
clusters1 | each element of the vector contains a centroid of a cluster of distribution 1 | |
clusters2 | each element of the vector contains a centroid of a cluster of distribution 2 | |
weights1 | the weights corresponding to clusters1 | |
weights2 | the weights corresponding to clusters2 | |
dist | the EMD between the distributions | |
flow | the flow matrix for the two distributions |
References lti::earthMoversDistance< W, C, D >::apply(), lti::earthMoversDistance< W, C, D >::findFlow(), and lti::genericVector< T >::size().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const std::vector< C > & | clusters1, | |
const std::vector< C > & | clusters2, | |||
const vector< W > & | weights1, | |||
const vector< W > & | weights2, | |||
W & | dist | |||
) | const [virtual] |
calculate the EMD between to distributions each defined by its clusters and the respective weights.
clusters1 | each element of the vector contains a centroid of a cluster of distribution 1 | |
clusters2 | each element of the vector contains a centroid of a cluster of distribution 2 | |
weights1 | the weights corresponding to clusters1 | |
weights2 | the weights corresponding to clusters2 | |
dist | the EMD between the distributions |
virtual W lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | a, | |
const matrix< W > & | b | |||
) | const [inline, virtual] |
calculate something like the distance between the matrices a and b: both matrices are seen as vectors.
Implements lti::distanceFunctor< W >.
References lti::earthMoversDistance< W, C, D >::apply().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | a, | |
const matrix< W > & | b, | |||
W & | dist, | |||
matrix< W > & | flow | |||
) | const [inline, virtual] |
calculate something like the distance and the flow between the matrices a and b: both matrices are seen as vectors.
a | the first matrix<W> | |
b | the second matrix<W> | |
dist | the 'distance' between the matrices | |
flow | the flow matrix for a and b |
References lti::earthMoversDistance< W, C, D >::apply(), lti::genericMatrix< T >::columns(), lti::earthMoversDistance< W, C, D >::findFlow(), and lti::genericMatrix< T >::rows().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | a, | |
const matrix< W > & | b, | |||
W & | dist | |||
) | const [virtual] |
calculate something like the distance between the matrices a and b: both matrices are seen as vectors.
Implements lti::distanceFunctor< W >.
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | m, | |
const vector< W > & | v, | |||
vector< W > & | dest | |||
) | const [virtual] |
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | a, | |
const matrix< W > & | b, | |||
vector< W > & | dists | |||
) | const [virtual] |
calculate the distances between the rows or columns of the matrices a and b, determined by the parameters rowWise.
a | the first vector<W> | |
b | the second vector<W> | |
dists | the distances between the matrices |
Implements lti::distanceFunctor< W >.
virtual W lti::earthMoversDistance< W, C, D >::apply | ( | const vector< W > & | a, | |
const vector< W > & | b | |||
) | const [inline, virtual] |
calculate the distance between the vectors a and b
a | the first vector<W> | |
b | the second vector<W> |
Implements lti::distanceFunctor< W >.
References lti::earthMoversDistance< W, C, D >::apply().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const vector< W > & | a, | |
const vector< W > & | b, | |||
W & | dist, | |||
matrix< W > & | flow | |||
) | const [inline, virtual] |
calculate the distance and the flow between the vectors a and b.
a | the first vector<W> | |
b | the second vector<W> | |
dist | the distance between the vectors | |
flow | the flow matrix for a and b |
References lti::earthMoversDistance< W, C, D >::apply(), lti::earthMoversDistance< W, C, D >::findFlow(), and lti::genericVector< T >::size().
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const vector< W > & | a, | |
const vector< W > & | b, | |||
W & | dist | |||
) | const [virtual] |
calculate the distance between the vectors a and b
a | the first vector<W> | |
b | the second vector<W> | |
dist | the distance between the vectors |
Implements lti::distanceFunctor< W >.
virtual W lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | m | ) | const [inline, virtual] |
Calculation of norms not available for Earth Mover's Distance!
lti::functor::invalidMethodException |
Implements lti::distanceFunctor< W >.
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | m, | |
W & | norm | |||
) | const [inline, virtual] |
Calculation of norms not available for Earth Mover's Distance!
lti::functor::invalidMethodException |
Implements lti::distanceFunctor< W >.
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const matrix< W > & | m, | |
vector< W > & | norms | |||
) | const [inline, virtual] |
Calculation of norms not available for Earth Mover's Distance!
lti::functor::invalidMethodException |
Implements lti::distanceFunctor< W >.
virtual W lti::earthMoversDistance< W, C, D >::apply | ( | const vector< W > & | v | ) | const [inline, virtual] |
Calculation of norms not available for Earth Mover's Distance!
lti::functor::invalidMethodException |
Implements lti::distanceFunctor< W >.
virtual bool lti::earthMoversDistance< W, C, D >::apply | ( | const vector< W > & | v, | |
W & | norm | |||
) | const [inline, virtual] |
Calculation of norms not available for Earth Mover's Distance!
lti::functor::invalidMethodException |
Implements lti::distanceFunctor< W >.
Referenced by lti::earthMoversDistance< W, C, D >::apply().
bool lti::earthMoversDistance< W, C, D >::calcCost | ( | const matrix< cluster_value_type > & | supply, | |
const matrix< cluster_value_type > & | demand | |||
) | const [protected] |
Finds the costMatrix for the given supply and demand.
bool lti::earthMoversDistance< W, C, D >::calcCost | ( | const std::vector< C > & | supply, | |
const std::vector< C > & | demand | |||
) | const [protected] |
Finds the costMatrix for the given supply and demand.
bool lti::earthMoversDistance< W, C, D >::calcCost | ( | const vector< point > & | supply, | |
const vector< point > & | demand | |||
) | const [protected] |
Finds the costMatrix for the given supply and demand.
bool lti::earthMoversDistance< W, C, D >::calcCost | ( | const ivector & | supply, | |
const ivector & | demand | |||
) | const [protected] |
Finds the costMatrix for the given supply and demand.
virtual functor* lti::earthMoversDistance< W, C, D >::clone | ( | ) | const [virtual] |
returns a pointer to a clone of this functor.
Implements lti::distanceFunctor< W >.
earthMoversDistance& lti::earthMoversDistance< W, C, D >::copy | ( | const earthMoversDistance< W, C, D > & | other | ) |
copy data of "other" functor.
other | the functor to be copied |
Reimplemented from lti::distanceFunctor< W >.
bool lti::earthMoversDistance< W, C, D >::emd | ( | vector< W > & | weights1, | |
vector< W > & | weights2, | |||
const int & | slen, | |||
const int & | dlen, | |||
W & | totalFlow | |||
) | const [protected] |
Actual management of the calculation of the emd.
Parameters are the two weights vectors as well as their original lengths. The total flow is returned.
void lti::earthMoversDistance< W, C, D >::findBasicVariables | ( | single_node * | U, | |
single_node * | V | |||
) | const [protected] |
Extracts the basic variables from isBasisVar.
bool lti::earthMoversDistance< W, C, D >::findFlow | ( | const int & | slen, | |
const int & | dlen, | |||
matrix< W > & | flow | |||
) | const [protected] |
Constructs a matrix containing the flow from distribution1 to distribution2.
Where the first is in rows and the second in columns.
Referenced by lti::earthMoversDistance< W, C, D >::apply().
int lti::earthMoversDistance< W, C, D >::findLoop | ( | double_node ** | Loop | ) | const [protected] |
Tries to find a loop in the current solution.
const parameters& lti::earthMoversDistance< W, C, D >::getParameters | ( | ) | const |
returns used parameters
Reimplemented from lti::distanceFunctor< W >.
virtual const char* lti::earthMoversDistance< W, C, D >::getTypeName | ( | ) | const [virtual] |
returns the name of this type ("earthMoversDistance")
Reimplemented from lti::distanceFunctor< W >.
bool lti::earthMoversDistance< W, C, D >::init | ( | vector< W > & | weights1, | |
vector< W > & | weights2 | |||
) | const [protected] |
inits data structures.
returns true if the sum of weights (earth) is significantly different in the two distributions. If this is the case the weight vector with less weight gets an extra entry with the difference.
int lti::earthMoversDistance< W, C, D >::isOptimal | ( | single_node * | U, | |
single_node * | V | |||
) | const [protected] |
Checks if the current solution is optimal.
void lti::earthMoversDistance< W, C, D >::newSolution | ( | ) | const [protected] |
Find a new/better solution to the transportation problem.
earthMoversDistance& lti::earthMoversDistance< W, C, D >::operator= | ( | const earthMoversDistance< W, C, D > & | other | ) |
bool lti::earthMoversDistance< W, C, D >::russel | ( | vector< W > & | weights1, | |
vector< W > & | weights2 | |||
) | const [protected] |
Initial solution of the transportation problem on the given weights.
double_node* lti::earthMoversDistance< W, C, D >::basicVars [mutable, protected] |
Datastructure containing the basic variables.
matrix<W> lti::earthMoversDistance< W, C, D >::costMatrix [mutable, protected] |
The cost matrix.
If parameters::costMatrix is used, this matrix uses the same data, else the costMatrix is calculated using the ground distance D.
double_node** lti::earthMoversDistance< W, C, D >::distr1Basic [mutable, protected] |
Pointers from first distribution into the basicVars array.
double_node** lti::earthMoversDistance< W, C, D >::distr2Basic [mutable, protected] |
Pointers from second distribution into the basicVars array.
double_node* lti::earthMoversDistance< W, C, D >::endBasicVars [mutable, protected] |
pointer for efficient handling of basicVars
double_node* lti::earthMoversDistance< W, C, D >::enterBasicVars [mutable, protected] |
pointer for efficient handling of basicVars
genericMatrix<bool> lti::earthMoversDistance< W, C, D >::isBasicVar [mutable, protected] |
If true the two clusters i and j of the two distributions form a basic variable.
W lti::earthMoversDistance< W, C, D >::maxCost [mutable, protected] |
The maximum value in the costMatrix.
W lti::earthMoversDistance< W, C, D >::maxWeight [mutable, protected] |
The maximum of the total weights of supply and demand.
W lti::earthMoversDistance< W, C, D >::minWeight [mutable, protected] |
The minimum of the total weights of supply and demand.
int lti::earthMoversDistance< W, C, D >::sz1 [mutable, protected] |
the number of clusters in the first distribution
int lti::earthMoversDistance< W, C, D >::sz2 [mutable, protected] |
the number of clusters in the second distribution