GClasses
GClasses::GFuzzyKMeans Class Reference

Detailed Description

A K-means clustering algorithm where every point has partial membership in each cluster. This algorithm is specified in Li, D., Deogun, J., Spaulding, W., and Shuart, B., Towards missing data imputation: A study of fuzzy K-means clustering method, In Rough Sets and Current Trends in Computing, Springer, pages 573–579, 2004.

#include <GCluster.h>

Inheritance diagram for GClasses::GFuzzyKMeans:
GClasses::GClusterer GClasses::GTransform

Public Member Functions

 GFuzzyKMeans (size_t nClusters, GRand *pRand)
 
 ~GFuzzyKMeans ()
 
GMatrix * centroids ()
 Returns a k x d matrix, where each row is one of the k centroids.
 
virtual void cluster (const GMatrix *pData)
 Performs clustering.
 
void init (const GMatrix *pData)
 Selects random centroids and initializes internal data structures.
 
void recomputeCentroids (const GMatrix *pData)
 Computes new centroids for each cluster.
 
double recomputeWeights (const GMatrix *pData)
 Assigns each row a partial membership in each cluster, as measured with the dissimilarity metric. Returns the weighted-sum-distance between each row and the centroids.
 
void setFuzzifier (double d)
 Specifies how fuzzy the membership in each cluster should be. d should be greater than 1, and is typically about 1.3.
 
void setReps (size_t r)
 Specifies the number of times to cluster the data. The best clustering (as measured by the weighted-sum-distance between each point and the centroids) will be kept.
 
virtual size_t whichCluster (size_t nVector)
 Identifies the cluster of the specified row.
 
Public Member Functions inherited from GClasses::GClusterer
 GClusterer (size_t nClusterCount)
 
virtual ~GClusterer ()
 
size_t clusterCount ()
 Returns the number of clusters.
 
virtual GMatrix * reduce (const GMatrix &in)
 Clusters the input data and outputs a dataset with one column that specifies the cluster number for each row.
 
void setMetric (GDistanceMetric *pMetric, bool own)
 If own is true, then this object will delete pMetric when it is destroyed.
 
Public Member Functions inherited from GClasses::GTransform
 GTransform ()
 
 GTransform (const GDomNode *pNode)
 
virtual ~GTransform ()
 

Protected Member Functions

bool clusterAttempt (size_t nMaxIterations)
 
bool selectSeeds (const GMatrix *pSeeds)
 
Protected Member Functions inherited from GClasses::GTransform
virtual GDomNode * baseDomNode (GDom *pDoc, const char *szClassName) const
 Child classes should use this in their implementation of serialize.
 

Protected Attributes

double m_fuzzifier
 
GMatrix * m_pCentroids
 
GRand * m_pRand
 
GMatrix * m_pWeights
 
size_t m_reps
 
Protected Attributes inherited from GClasses::GClusterer
size_t m_clusterCount
 
bool m_ownMetric
 
GDistanceMetric * m_pMetric
 

Constructor & Destructor Documentation

GClasses::GFuzzyKMeans::GFuzzyKMeans ( size_t  nClusters,
GRand * pRand 
)

GClasses::GFuzzyKMeans::~GFuzzyKMeans ( )

Member Function Documentation

GMatrix* GClasses::GFuzzyKMeans::centroids ( )
inline

Returns a k x d matrix, where each row is one of the k centroids.

virtual void GClasses::GFuzzyKMeans::cluster ( const GMatrix * pData)
virtual

Performs clustering.

Implements GClasses::GClusterer.

bool GClasses::GFuzzyKMeans::clusterAttempt ( size_t  nMaxIterations)
protected

void GClasses::GFuzzyKMeans::init ( const GMatrix * pData)

Selects random centroids and initializes internal data structures.
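
The page does not say how the random centroids are chosen. One common approach, shown here only as a sketch, is to pick k distinct rows of the data as initial centroids. The helper name `pickSeedRows` is hypothetical, and `std::mt19937` stands in for `GRand`; neither is part of GClasses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of GClasses): returns k distinct row indexes
// in [0, rowCount) that could serve as initial centroids.
// Assumption: seeds are drawn uniformly from the rows of the data.
std::vector<std::size_t> pickSeedRows(std::size_t rowCount, std::size_t k, std::mt19937& rng)
{
    assert(k <= rowCount);
    std::vector<std::size_t> indexes(rowCount);
    std::iota(indexes.begin(), indexes.end(), std::size_t{0}); // 0, 1, ..., rowCount-1
    std::shuffle(indexes.begin(), indexes.end(), rng);         // random permutation
    indexes.resize(k);                                         // keep the first k
    return indexes;
}
```

Shuffling the full index list keeps the sketch short; for very large datasets a partial shuffle would avoid touching every index.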

void GClasses::GFuzzyKMeans::recomputeCentroids ( const GMatrix * pData)

Computes new centroids for each cluster.
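
For orientation, the standard fuzzy c-means centroid update sets each centroid to the mean of all rows, weighted by each row's membership raised to the fuzzifier. The sketch below illustrates that update on plain `std::vector` matrices. The name `fuzzyCentroids` and the `Matrix` alias are hypothetical, and the library's own implementation may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper (not part of GClasses): standard fuzzy c-means centroid update.
// data:    n rows x d columns
// weights: n rows x k columns, weights[i][j] = membership of row i in cluster j
// m:       fuzzifier (> 1)
// Returns a k x d matrix where row j is the centroid of cluster j.
Matrix fuzzyCentroids(const Matrix& data, const Matrix& weights, double m)
{
    assert(!data.empty() && data.size() == weights.size());
    const std::size_t d = data[0].size();
    const std::size_t k = weights[0].size();
    Matrix centroids(k, std::vector<double>(d, 0.0));
    for (std::size_t j = 0; j < k; ++j) {
        double total = 0.0;
        for (std::size_t i = 0; i < data.size(); ++i) {
            const double w = std::pow(weights[i][j], m); // membership^m
            total += w;
            for (std::size_t c = 0; c < d; ++c)
                centroids[j][c] += w * data[i][c];
        }
        if (total > 0.0) // leave an empty cluster at the origin rather than divide by zero
            for (std::size_t c = 0; c < d; ++c)
                centroids[j][c] /= total;
    }
    return centroids;
}
```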

double GClasses::GFuzzyKMeans::recomputeWeights ( const GMatrix * pData)

Assigns each row a partial membership in each cluster, as measured with the dissimilarity metric. Returns the weighted-sum-distance between each row and the centroids.
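
As an illustration of partial membership, the sketch below applies the standard fuzzy c-means weight update: the membership of a row in a cluster is proportional to its distance to that centroid raised to the power -2/(m-1), normalized so each row's memberships sum to 1. The names `fuzzyWeights` and `euclidean` are hypothetical. Two details are assumptions, not something this page confirms: the metric is Euclidean, and the returned value is the standard objective (membership^m times squared distance, summed over rows and clusters).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Euclidean distance. Assumption: used here in place of a pluggable metric.
double euclidean(const std::vector<double>& a, const std::vector<double>& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t c = 0; c < a.size(); ++c)
        sum += (a[c] - b[c]) * (a[c] - b[c]);
    return std::sqrt(sum);
}

// Hypothetical helper (not part of GClasses): fills weights (n x k) with
// memberships and returns the sum over i, j of weights[i][j]^m * dist(i, j)^2.
double fuzzyWeights(const Matrix& data, const Matrix& centroids, double m, Matrix& weights)
{
    assert(m > 1.0 && !centroids.empty());
    const std::size_t k = centroids.size();
    const double exponent = -2.0 / (m - 1.0);
    weights.assign(data.size(), std::vector<double>(k, 0.0));
    double objective = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        std::vector<double> dist(k);
        std::size_t exact = k; // k means "no centroid coincides with this row"
        for (std::size_t j = 0; j < k; ++j) {
            dist[j] = euclidean(data[i], centroids[j]);
            if (dist[j] == 0.0 && exact == k)
                exact = j;
        }
        if (exact < k) {
            weights[i][exact] = 1.0; // full membership, contributes zero cost
            continue;
        }
        double total = 0.0;
        for (std::size_t j = 0; j < k; ++j) {
            weights[i][j] = std::pow(dist[j], exponent);
            total += weights[i][j];
        }
        for (std::size_t j = 0; j < k; ++j) {
            weights[i][j] /= total; // normalize so the row sums to 1
            objective += std::pow(weights[i][j], m) * dist[j] * dist[j];
        }
    }
    return objective;
}
```

A row that sits exactly on a centroid is given full membership in that cluster, which avoids raising zero to a negative power.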

bool GClasses::GFuzzyKMeans::selectSeeds ( const GMatrix * pSeeds)
protected

void GClasses::GFuzzyKMeans::setFuzzifier ( double  d)
inline

Specifies how fuzzy the membership in each cluster should be. d should be greater than 1, and is typically about 1.3.
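
To get a feel for the fuzzifier, consider the two-cluster case of the standard fuzzy c-means membership formula, where the membership in cluster A reduces to 1 / (1 + (dA/dB)^(2/(m-1))). The sketch below evaluates that closed form. The name `membershipInA` is hypothetical, and whether the library uses exactly this formula is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of GClasses): membership in cluster A of a point
// at distance dA from centroid A and dB from centroid B, with fuzzifier m > 1.
// Two-cluster special case of the standard fuzzy c-means membership formula.
double membershipInA(double dA, double dB, double m)
{
    assert(m > 1.0 && dA > 0.0 && dB > 0.0);
    return 1.0 / (1.0 + std::pow(dA / dB, 2.0 / (m - 1.0)));
}
```

For a point twice as far from B as from A (dA = 1, dB = 2), this gives about 0.99 at m = 1.3, 0.8 at m = 2, and about 0.67 at m = 3. Values near 1 behave almost like hard K-means, while larger values spread membership more evenly.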

void GClasses::GFuzzyKMeans::setReps ( size_t  r)
inline

Specifies the number of times to cluster the data. The best clustering (as measured by the weighted-sum-distance between each point and the centroids) will be kept.
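
The keep-the-best idea can be sketched independently of the clustering itself: run an attempt several times, score each run, and remember the run with the lowest score. The `BestRun` struct and `bestOfReps` function below are hypothetical names for illustration, and the callback stands in for one full clustering attempt that returns its weighted-sum-distance.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>

// Hypothetical result type (not part of GClasses).
struct BestRun {
    std::size_t index; // which repetition produced the best score
    double score;      // the lowest score seen
};

// Hypothetical helper: calls attempt(r) for r = 0 .. reps-1 and keeps the
// repetition with the lowest score. Ties keep the earliest repetition.
BestRun bestOfReps(std::size_t reps, const std::function<double(std::size_t)>& attempt)
{
    assert(reps > 0);
    BestRun best{0, std::numeric_limits<double>::infinity()};
    for (std::size_t r = 0; r < reps; ++r) {
        const double score = attempt(r);
        if (score < best.score)
            best = BestRun{r, score};
    }
    return best;
}
```

In a real clusterer, the centroids and weights of the best run would be saved alongside the score; the sketch tracks only the index to stay short.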

virtual size_t GClasses::GFuzzyKMeans::whichCluster ( size_t  nVector)
virtual

Identifies the cluster of the specified row.

Implements GClasses::GClusterer.
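
Because every row has partial membership in every cluster, a single cluster index has to be derived from the weights. A natural choice, assumed here rather than confirmed by this page, is the cluster with the largest membership. The name `hardCluster` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of GClasses): given one row's memberships
// across all clusters, returns the index of the largest one.
// Ties resolve to the lowest cluster index.
std::size_t hardCluster(const std::vector<double>& memberships)
{
    assert(!memberships.empty());
    return static_cast<std::size_t>(std::distance(
        memberships.begin(),
        std::max_element(memberships.begin(), memberships.end())));
}
```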

Member Data Documentation

double GClasses::GFuzzyKMeans::m_fuzzifier
protected

GMatrix* GClasses::GFuzzyKMeans::m_pCentroids
protected

GRand* GClasses::GFuzzyKMeans::m_pRand
protected

GMatrix* GClasses::GFuzzyKMeans::m_pWeights
protected

size_t GClasses::GFuzzyKMeans::m_reps
protected