LTI-Lib latest version v1.9 - last update 10 Apr 2010

lti::classifier::outputTemplate Class Reference

The outputTemplate stores the relation between the different positions (sometimes called internal ids) of a classification result and the ids. More...

#include <ltiClassifier.h>

Public Types

enum  eMultipleMode { Ignore = 0, Max, Uniform, ObjProb }

Public Member Functions

 outputTemplate ()
 outputTemplate (const outputTemplate &other)
 outputTemplate (const ivector &theIds)
 outputTemplate (const int &size)
outputTemplate & copy (const outputTemplate &other)
outputTemplate & operator= (const outputTemplate &other)
outputTemplate * clone () const
void setMultipleMode (const eMultipleMode &mode)
eMultipleMode getMultipleMode () const
bool setIds (const ivector &theIds)
const ivector & getIds () const
bool setProbs (const int &pos, const ivector &theIds, const dvector &theValues)
bool setProbs (const int &pos, const outputVector &outV)
const outputVector & getProbs (const int &pos) const
bool setData (const int &pos, const int &realId, const outputVector &outV)
int size () const
bool apply (const dvector &data, outputVector &result) const
bool write (ioHandler &handler, const bool complete=true) const
bool read (ioHandler &handler, const bool complete=true)

Protected Attributes

eMultipleMode multipleMode
std::vector< outputVector > probList
ivector defaultIds

Detailed Description

The outputTemplate stores the relation between the different positions (sometimes called internal ids) of a classification result and the ids.

Applying the outputTemplate to such a vector results in an outputVector which is not to be confused with the classification result.

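There are two data structures within the outputTemplate storing the relevant data:

defaultIds
A list containing one default id for each output unit.
probList
One outputVector for each output unit, holding the probabilities of the ids being correct when this unit is activated.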

The calculation of the outputVector using the apply method depends on the value of the attribute multipleMode, which is of type eMultipleMode. The following settings are available:

Ignore
If default ids have been stored in the outputTemplate via the constructor that receives an ivector, the method setIds, or the method setData, these ids are simply copied to the outputVector, i.e. no statistics about the actual classification performance of the classifier are used. If the default ids have not been set, the option Max is used instead and the apply method returns false.
Max
The probability lists are used. For each element of the classification result, the id with the highest probability is found and set to one while all other probabilities for that element are set to zero. This leads to an outputVector which is equal or similar to the one generated by Ignore. There will be differences, however, if a certain element of the classification result was trained for one class, but when building the probability distributions another class caused this element to have the highest value more frequently. This case can be seen for the second element in the example below.
Uniform
The probability lists are used. For each element of the classification result, the number of ids in its list is determined and their probabilities are set to be uniformly distributed. This method puts very little trust in the data used for generating the probabilities, i.e. it does not assume that this data represents the true distribution. On the other hand, it is very susceptible to noise in the data: one misclassified example can completely change the outcome of future classifications.
ObjProb
The probability lists are used with their complete information. This works similarly to a rule set: if element A is activated, then there is a probability of 0.3 for class 1 and 0.7 for class 5. This method works quite well when the training data represents the actual distributions well but the classifier is not able to build the correct models. A typical effect of using this approach rather than Ignore is that misclassified unknown data will have greater probability and thus a higher ranking. On the downside, data correctly classified by Ignore can sometimes be misclassified.
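The effect of Max, Uniform and ObjProb on a single probability list can be sketched in standalone C++ as follows. This is not LTI-Lib code: the names `ProbList`, `Mode` and `effectiveWeights` are hypothetical, and the list is modeled as a plain `std::map` from id to probability. Ignore is left out because it uses the default ids rather than the lists.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for one probability list: id -> P(id | element).
using ProbList = std::map<int, double>;

// Mirrors three of the four eMultipleMode values (Ignore uses default ids).
enum class Mode { Max, Uniform, ObjProb };

// Returns the weights that one element of the classification result
// would contribute to each id under the given mode.
ProbList effectiveWeights(const ProbList& list, Mode mode) {
  assert(!list.empty());
  ProbList out;
  switch (mode) {
    case Mode::Max: {
      // Winner takes all: the most probable id gets 1, the rest get 0.
      auto best = list.begin();
      for (auto it = list.begin(); it != list.end(); ++it) {
        if (it->second > best->second) best = it;
      }
      for (const auto& entry : list) out[entry.first] = 0.0;
      out[best->first] = 1.0;
      break;
    }
    case Mode::Uniform: {
      // Every listed id is treated as equally likely.
      const double w = 1.0 / static_cast<double>(list.size());
      for (const auto& entry : list) out[entry.first] = w;
      break;
    }
    case Mode::ObjProb:
      // The stored probabilities are used unchanged.
      out = list;
      break;
  }
  return out;
}
```

For the list {1: 0.3, 5: 0.7} from the ObjProb description, this sketch yields {1: 0, 5: 1} for Max, {1: 0.5, 5: 0.5} for Uniform, and the unchanged list for ObjProb.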

As mentioned above, for all cases but Ignore the outputTemplate contains a list of class probabilities for each element of the classification result. These are interpreted as conditional probabilities P(o|x), where o stands for the id and x for the position in the classification result. Each element of the classification result is also taken as a probability p(x). Thus the value for each id is calculated as $P(o)=\sum_x p(x)\cdot P(o|x)$.
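As a minimal sketch of this sum, written in standalone C++ rather than against LTI-Lib, the following function accumulates p(x) * P(o|x) over all positions. The name `combine` and the use of `std::vector` and `std::map` are assumptions for illustration, and the numbers in the note below are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for one probability list: id o -> P(o | x).
using ProbList = std::map<int, double>;

// Computes P(o) = sum_x p(x) * P(o|x).
// result[x] is p(x); lists[x] holds P(o|x) for position x.
std::map<int, double> combine(const std::vector<double>& result,
                              const std::vector<ProbList>& lists) {
  assert(result.size() == lists.size());
  std::map<int, double> out;
  for (std::size_t x = 0; x < result.size(); ++x) {
    for (const auto& entry : lists[x]) {
      out[entry.first] += result[x] * entry.second;  // p(x) * P(o|x)
    }
  }
  return out;
}
```

With p = (0.6, 0.4), P(o|0) = {1: 0.3, 5: 0.7} and P(o|1) = {1: 1.0}, the sum gives P(1) = 0.6 * 0.3 + 0.4 * 1.0 = 0.58 and P(5) = 0.6 * 0.7 = 0.42.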

Here is a short example of the behavior of an outputTemplate when applied to a classification result. The figure shows the classification result on the left-hand side, the default ids which are used with the option Ignore in the middle, and the probability lists which are used for Max, Uniform and ObjProb on the right-hand side.

outputTemplate.png

Depending on the value of multipleMode the following outputVector is generated by calling apply:

id         1      3      5      6      17     22     41
Ignore     0.15   0.50   ---    ---    0.03   0.30   0.02
Max        0.15   ---    ---    0.50   0.03   0.30   0.02
Uniform    0.15   0.35   0.10   0.25   0.04   0.10   0.01
ObjProb    0.15   0.33   0.05   0.27   0.04   0.15   0.01

If all four options are to be used, the constructor outputTemplate(int), which receives the number of output units, must be used. All data can then be set using the methods setIds, setProbs and/or setData. If the other constructors are used, no space is reserved for the lists of probabilities, since these take much space and some classifiers, especially unsupervised ones, do not need this information or have no means to gather it.
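The consequence of the constructor choice can be illustrated with a simplified stand-in class. This is a hypothetical model, not the LTI-Lib implementation: `TemplateSketch`, its members and the use of `std::vector` and `std::map` are assumptions, and the real class takes `ivector` and `outputVector` arguments. The sketch assumes that setting probabilities fails when no space was reserved for them.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, simplified model of the constructor contract described above.
class TemplateSketch {
 public:
  // Like outputTemplate(const ivector&): default ids only, no probability lists.
  explicit TemplateSketch(const std::vector<int>& ids) : defaultIds_(ids) {}

  // Like outputTemplate(const int&): reserves default ids and probability lists.
  explicit TemplateSketch(int size)
      : defaultIds_(static_cast<std::size_t>(size), 0),
        probList_(static_cast<std::size_t>(size)) {}

  // Stores the probability list for one position.
  // Returns false for an illegal pos or if no lists were reserved.
  bool setProbs(int pos, const std::map<int, double>& probs) {
    if (pos < 0 || static_cast<std::size_t>(pos) >= probList_.size()) {
      return false;
    }
    probList_[static_cast<std::size_t>(pos)] = probs;
    return true;
  }

  // Number of output units handled by this template.
  int size() const {
    // Either no lists were reserved, or there is one list per output unit.
    assert(probList_.empty() || probList_.size() == defaultIds_.size());
    return static_cast<int>(defaultIds_.size());
  }

 private:
  std::vector<int> defaultIds_;
  std::vector<std::map<int, double>> probList_;
};
```

In this model, an object built from a list of ids rejects every setProbs call, while an object built from a size accepts calls for positions 0 to size-1.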


Member Enumeration Documentation

enum lti::classifier::outputTemplate::eMultipleMode

This type specifies how the output element probability and the probabilities in the list should be combined.

See description of outputTemplate.

Enumerator:
Ignore
Ignore the object probability.
Max
Set the probability of the id with the maximum probability to 1, all others to zero.
Uniform
Assume that all objects in the list of one output element have the same probability (1/number of elements).
ObjProb
Consider the given object probabilities.


Constructor & Destructor Documentation

lti::classifier::outputTemplate::outputTemplate (  ) 

Default constructor.

multipleMode is ObjProb.

lti::classifier::outputTemplate::outputTemplate ( const outputTemplate &  other  ) 

Copy constructor.

lti::classifier::outputTemplate::outputTemplate ( const ivector &  theIds  ) 

Constructor.

Since a vector of ids is given, multipleMode is Ignore. The probability lists are not initialized and thus cannot be set later.

lti::classifier::outputTemplate::outputTemplate ( const int &  size  ) 

Constructor.

The number of output units is given. multipleMode is ObjProb. Default ids as well as lists of probabilities can be set.


Member Function Documentation

bool lti::classifier::outputTemplate::apply ( const dvector &  data,
outputVector &  result 
) const

Uses the information stored in the outputTemplate to generate an outputVector from a dvector.

See description of outputTemplate for details. The classification result should contain only positive values which are greater for better fit. The best interpretability is obtained if data is a probability distribution.
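If the classifier output is not already a probability distribution, it can be normalized before it is passed to apply. The following standalone helper is only a sketch: `normalizeToDistribution` is a hypothetical name, `std::vector<double>` stands in for `dvector`, and returning false for negative or all-zero input is an assumption of this example, not documented library behavior.

```cpp
#include <cassert>
#include <vector>

// Scales non-negative scores in place so that they sum to 1.
// Returns false, leaving the values untouched, if any score is negative
// or if all scores are zero.
bool normalizeToDistribution(std::vector<double>& values) {
  double sum = 0.0;
  for (double v : values) {
    if (v < 0.0) return false;
    sum += v;
  }
  if (sum <= 0.0) return false;
  assert(sum > 0.0);  // division below is safe
  for (double& v : values) v /= sum;
  return true;
}
```

For example, the scores (2, 6, 2) become (0.2, 0.6, 0.2), which keeps the ordering "greater is better" while making the result interpretable as p(x).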

Parameters:
data the classification result
result outputVector calculated using the outputTemplate.
Returns:
false on error (check getStatusString())

outputTemplate* lti::classifier::outputTemplate::clone (  )  const

Returns a pointer to a copy of this outputTemplate.

outputTemplate& lti::classifier::outputTemplate::copy ( const outputTemplate &  other  ) 

Copies the contents of other into this outputTemplate.

Reimplemented from lti::ioObject.

Referenced by operator=().

const ivector& lti::classifier::outputTemplate::getIds (  )  const

Returns a const reference to the id vector.

Referenced by lti::kNNClassifier::getColumnId().

eMultipleMode lti::classifier::outputTemplate::getMultipleMode (  )  const

Get the setting of multipleMode.

const outputVector& lti::classifier::outputTemplate::getProbs ( const int &  pos  )  const

Returns a const reference to the probability distribution at the given position of the template.

outputTemplate& lti::classifier::outputTemplate::operator= ( const outputTemplate &  other  )  [inline]

Assignment operator (alias for copy(other)).

Parameters:
other the outputTemplate to be copied
Returns:
a reference to the actual outputTemplate

Reimplemented from lti::ioObject.

References copy().

bool lti::classifier::outputTemplate::read ( ioHandler &  handler,
const bool  complete = true 
) [virtual]

Read the outputTemplate from the given ioHandler.

Parameters:
handler the ioHandler to be used
complete if true (the default) the enclosing begin/end will also be read, otherwise only the data block will be read.
Returns:
true if read was successful

Reimplemented from lti::ioObject.

bool lti::classifier::outputTemplate::setData ( const int &  pos,
const int &  realId,
const outputVector &  outV 
)

Set the probabilities and the default id of one unit.

This information must be set for all elements of the classification result. Then it can be used by the apply method for any value of multipleMode.

Parameters:
pos the position in the classification result this distribution is for.
realId the expected or desired id of this position of the classification result.
outV list of ids and corresponding probabilities of the classes that are possibly correct when this position has high probability.
Returns:
false on error, e.g. illegal pos

bool lti::classifier::outputTemplate::setIds ( const ivector &  theIds  ) 

Set the default id vector.

These are used when multipleMode is set to Ignore.

void lti::classifier::outputTemplate::setMultipleMode ( const eMultipleMode &  mode  ) 

Change the setting of how the object probabilities of each unit are taken into account when calculating the outputVector.

See description of outputTemplate.

bool lti::classifier::outputTemplate::setProbs ( const int &  pos,
const outputVector &  outV 
)

Set the probabilities of one unit.

This information must be set for all elements of the classification result. Then it can be used by the apply method when multipleMode is set to one of Max, Uniform or ObjProb.

Parameters:
pos the position in the classification result this distribution is for.
outV list of ids and corresponding probabilities of the classes that are possibly correct when this position has high probability.
Returns:
false on error, e.g. illegal pos

bool lti::classifier::outputTemplate::setProbs ( const int &  pos,
const ivector &  theIds,
const dvector &  theValues 
)

Set the probabilities of one unit.

This information must be set for all elements of the classification result. Then it can be used by the apply method when multipleMode is set to one of Max, Uniform or ObjProb.

Parameters:
pos the position in the classification result this distribution is for.
theIds list of ids of the classes that are possibly correct when this position has high probability.
theValues probabilities of each of these ids.
Returns:
false on error, e.g. illegal pos

int lti::classifier::outputTemplate::size (  )  const

returns the number of output units handled by this outputTemplate

bool lti::classifier::outputTemplate::write ( ioHandler &  handler,
const bool  complete = true 
) const [virtual]

Write the outputTemplate to the given ioHandler.

Parameters:
handler the ioHandler to be used
complete if true (the default) the enclosing begin/end will be also written, otherwise only the data block will be written.
Returns:
true if write was successful

Reimplemented from lti::ioObject.


Member Data Documentation

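ivector lti::classifier::outputTemplate::defaultIds [protected]

List of ids for each output unit.

eMultipleMode lti::classifier::outputTemplate::multipleMode [protected]

Determines what data is used for calculation of an outputVector from the classification result and the outputTemplate.

See eMultipleMode and outputTemplate for a detailed description.

std::vector<outputVector> lti::classifier::outputTemplate::probList [protected]

Contains one outputVector for each output unit.

These hold the probabilities for the ids being correct when this unit is activated.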


Generated on Sat Apr 10 15:26:46 2010 for LTI-Lib by Doxygen 1.6.1