GClasses::GKMeans Class Reference

Detailed Description

An implementation of the K-means clustering algorithm.

#include <GCluster.h>

Inheritance diagram for GClasses::GKMeans:
GClasses::GKMeans inherits from GClasses::GClusterer, which inherits from GClasses::GTransform.

Public Member Functions

 GKMeans (size_t nClusters, GRand *pRand)
 
 ~GKMeans ()
 
double assignClusters (const GMatrix *pData)
 Assigns each row to the cluster of the nearest centroid, as measured by the dissimilarity metric. Returns the sum of squared distances between each row and its centroid.
 
GMatrix * centroids ()
 Returns a k x d matrix, where each row is one of the k centroids.
 
virtual void cluster (const GMatrix *pData)
 Performs clustering.
 
void init (const GMatrix *pData)
 Selects random centroids and initializes internal data structures.
 
void recomputeCentroids (const GMatrix *pData)
 Computes new centroids for each cluster.
 
void setReps (size_t r)
 Specifies the number of times to cluster the data. The best clustering (as measured by the sum of squared distances between each point and its cluster centroid) is kept.
 
virtual size_t whichCluster (size_t nVector)
 Identifies the cluster of the specified row.
 
- Public Member Functions inherited from GClasses::GClusterer
 GClusterer (size_t nClusterCount)
 
virtual ~GClusterer ()
 
size_t clusterCount ()
 Returns the number of clusters.
 
virtual GMatrix * reduce (const GMatrix &in)
 Clusters the input matrix (in) and outputs a dataset with one column that specifies the cluster number for each row.
 
void setMetric (GDistanceMetric *pMetric, bool own)
 If own is true, then this object will delete pMetric when it is destroyed.
 
- Public Member Functions inherited from GClasses::GTransform
 GTransform ()
 
 GTransform (const GDomNode *pNode)
 
virtual ~GTransform ()
 

Protected Member Functions

bool clusterAttempt (size_t nMaxIterations)
 
bool selectSeeds (const GMatrix *pSeeds)
 
- Protected Member Functions inherited from GClasses::GTransform
virtual GDomNode * baseDomNode (GDom *pDoc, const char *szClassName) const
 Child classes should use this in their implementation of serialize.
 

Protected Attributes

GMatrix * m_pCentroids
 
size_t * m_pClusters
 
GRand * m_pRand
 
size_t m_reps
 
- Protected Attributes inherited from GClasses::GClusterer
size_t m_clusterCount
 
bool m_ownMetric
 
GDistanceMetric * m_pMetric
 

Constructor & Destructor Documentation

GClasses::GKMeans::GKMeans ( size_t  nClusters,
GRand *  pRand 
)

GClasses::GKMeans::~GKMeans ( )

Member Function Documentation

double GClasses::GKMeans::assignClusters ( const GMatrix *  pData)

Assigns each row to the cluster of the nearest centroid, as measured by the dissimilarity metric. Returns the sum of squared distances between each row and its centroid.
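The assignment step described above can be illustrated with a minimal standalone sketch. It does not use GClasses and is not the library's implementation: the names Row, squaredDistance, and assignToNearest are hypothetical, and squared Euclidean distance is assumed in place of the configurable dissimilarity metric.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Row = std::vector<double>; // hypothetical stand-in for one row of a GMatrix

// Squared Euclidean distance (assumed here; the real class uses a configurable metric).
double squaredDistance(const Row& a, const Row& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); i++)
    {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Assigns each row to its nearest centroid, stores the chosen centroid index
// in clusters, and returns the sum of squared distances over all rows.
double assignToNearest(const std::vector<Row>& data,
                       const std::vector<Row>& centroids,
                       std::vector<std::size_t>& clusters)
{
    assert(!centroids.empty());
    clusters.assign(data.size(), 0);
    double total = 0.0;
    for (std::size_t r = 0; r < data.size(); r++)
    {
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < centroids.size(); c++)
        {
            double d = squaredDistance(data[r], centroids[c]);
            if (d < best)
            {
                best = d;
                clusters[r] = c;
            }
        }
        total += best;
    }
    return total;
}
```

A caller would typically alternate this step with a centroid update and stop once the returned sum no longer decreases.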

GMatrix* GClasses::GKMeans::centroids ( )
inline

Returns a k x d matrix, where each row is one of the k centroids.

virtual void GClasses::GKMeans::cluster ( const GMatrix *  pData)
virtual

Performs clustering.

Implements GClasses::GClusterer.

bool GClasses::GKMeans::clusterAttempt ( size_t  nMaxIterations)
protected

void GClasses::GKMeans::init ( const GMatrix *  pData)

Selects random centroids and initializes internal data structures.
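One common way to select random starting centroids is to sample k distinct rows of the data. The sketch below shows that approach as an assumption; the page does not say how init chooses its centroids, and the names Row and pickRandomCentroids are hypothetical. It uses std::mt19937 rather than GRand so that it stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

using Row = std::vector<double>; // hypothetical stand-in for one row of a GMatrix

// Picks k distinct rows of data, uniformly at random, as starting centroids.
// Assumption: seeding from data rows; the library may seed differently.
std::vector<Row> pickRandomCentroids(const std::vector<Row>& data,
                                     std::size_t k,
                                     std::mt19937& rng)
{
    assert(k <= data.size());

    // Shuffle the row indexes, then take the first k of them.
    std::vector<std::size_t> indexes(data.size());
    std::iota(indexes.begin(), indexes.end(), std::size_t{0});
    std::shuffle(indexes.begin(), indexes.end(), rng);

    std::vector<Row> centroids;
    centroids.reserve(k);
    for (std::size_t i = 0; i < k; i++)
        centroids.push_back(data[indexes[i]]);
    return centroids;
}
```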

void GClasses::GKMeans::recomputeCentroids ( const GMatrix *  pData)

Computes new centroids for each cluster.
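In standard K-means, the new centroid of a cluster is the mean of the rows assigned to it. The following standalone sketch shows that update under stated assumptions: Row and recomputeMeans are hypothetical names, and leaving an empty cluster at its previous centroid is a choice made for this example, not documented behavior of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<double>; // hypothetical stand-in for one row of a GMatrix

// Returns one centroid per cluster, each the mean of the rows assigned to it.
// Assumption: a cluster with no rows keeps its previous centroid.
std::vector<Row> recomputeMeans(const std::vector<Row>& data,
                                const std::vector<std::size_t>& clusters,
                                const std::vector<Row>& previous)
{
    assert(data.size() == clusters.size());
    const std::size_t k = previous.size();

    // Start every running sum at zero, with the same width as the old centroid.
    std::vector<Row> sums(k);
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t c = 0; c < k; c++)
        sums[c].assign(previous[c].size(), 0.0);

    // Accumulate each row into the sum of its assigned cluster.
    for (std::size_t r = 0; r < data.size(); r++)
    {
        const std::size_t c = clusters[r];
        assert(c < k);
        assert(data[r].size() == sums[c].size());
        for (std::size_t i = 0; i < data[r].size(); i++)
            sums[c][i] += data[r][i];
        counts[c]++;
    }

    // Divide by the row count to turn sums into means.
    for (std::size_t c = 0; c < k; c++)
    {
        if (counts[c] == 0)
            sums[c] = previous[c];
        else
            for (double& v : sums[c])
                v /= static_cast<double>(counts[c]);
    }
    return sums;
}
```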

bool GClasses::GKMeans::selectSeeds ( const GMatrix *  pSeeds)
protected

void GClasses::GKMeans::setReps ( size_t  r)
inline

Specifies the number of times to cluster the data. The best clustering (as measured by the sum of squared distances between each point and its cluster centroid) is kept.
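The idea of repeating the clustering and keeping the best result can be sketched independently of the library. In the example below, Clustering and bestOfReps are hypothetical names, and the attempt callback stands in for one complete clustering run that reports its own sum of squared distances.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical result of one clustering run.
struct Clustering
{
    std::vector<std::size_t> clusters; // cluster index for each row
    double sse;                        // sum of squared distances to centroids
};

// Runs attempt() reps times and returns the run with the lowest sse.
Clustering bestOfReps(std::size_t reps, const std::function<Clustering()>& attempt)
{
    assert(reps > 0);
    Clustering best = attempt();
    for (std::size_t i = 1; i < reps; i++)
    {
        Clustering candidate = attempt();
        if (candidate.sse < best.sse)
            best = candidate;
    }
    return best;
}
```

Because K-means depends on its random starting centroids, separate runs can settle on different clusterings, which is why keeping the lowest-error run is useful.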

virtual size_t GClasses::GKMeans::whichCluster ( size_t  nVector)
virtual

Identifies the cluster of the specified row.

Implements GClasses::GClusterer.

Member Data Documentation

GMatrix* GClasses::GKMeans::m_pCentroids
protected

size_t* GClasses::GKMeans::m_pClusters
protected

GRand* GClasses::GKMeans::m_pRand
protected

size_t GClasses::GKMeans::m_reps
protected