GClasses::GKMeans
An implementation of the K-means clustering algorithm.
#include <GCluster.h>
Public Member Functions | |
GKMeans (size_t nClusters, GRand *pRand) | |
~GKMeans () | |
double | assignClusters (const GMatrix *pData) |
Assigns each row to the cluster of the nearest centroid, as measured by the dissimilarity metric. Returns the sum of squared distances between each row and its centroid. | |
GMatrix * | centroids () |
Returns a k x d matrix, where each row is one of the k centroids. | |
virtual void | cluster (const GMatrix *pData) |
Performs clustering. | |
void | init (const GMatrix *pData) |
Selects random centroids and initializes internal data structures. | |
void | recomputeCentroids (const GMatrix *pData) |
Computes new centroids for each cluster. | |
void | setReps (size_t r) |
Specifies the number of times to cluster the data. The best clustering (the one with the lowest sum of squared differences between each point and its cluster centroid) is kept. | |
virtual size_t | whichCluster (size_t nVector) |
Identifies the cluster of the specified row. | |
Public Member Functions inherited from GClasses::GClusterer | |
GClusterer (size_t nClusterCount) | |
virtual | ~GClusterer () |
size_t | clusterCount () |
Returns the number of clusters. | |
virtual GMatrix * | reduce (const GMatrix &in) |
Clusters in and outputs a dataset with one column that specifies the cluster number for each row. | |
void | setMetric (GDistanceMetric *pMetric, bool own) |
If own is true, then this object will delete pMetric when it is destroyed. | |
Public Member Functions inherited from GClasses::GTransform | |
GTransform () | |
GTransform (const GDomNode *pNode) | |
virtual | ~GTransform () |
Protected Member Functions | |
bool | clusterAttempt (size_t nMaxIterations) |
bool | selectSeeds (const GMatrix *pSeeds) |
Protected Member Functions inherited from GClasses::GTransform | |
virtual GDomNode * | baseDomNode (GDom *pDoc, const char *szClassName) const |
Child classes should use this in their implementation of serialize. | |
Protected Attributes | |
GMatrix * | m_pCentroids |
size_t * | m_pClusters |
GRand * | m_pRand |
size_t | m_reps |
Protected Attributes inherited from GClasses::GClusterer | |
size_t | m_clusterCount |
bool | m_ownMetric |
GDistanceMetric * | m_pMetric |
GClasses::GKMeans::GKMeans | ( | size_t | nClusters, | GRand * | pRand | )
GClasses::GKMeans::~GKMeans | ( | ) |
double GClasses::GKMeans::assignClusters | ( | const GMatrix * | pData | ) |
Assigns each row to the cluster of the nearest centroid, as measured by the dissimilarity metric. Returns the sum of squared distances between each row and its centroid.
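The assignment step described above can be sketched in standard C++ without the library. This is a minimal illustration, not the GKMeans implementation: it assumes squared Euclidean distance as the dissimilarity metric (the real class uses whatever metric was supplied through setMetric), and the names Row, squaredDistance, and assignToNearest are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Row = std::vector<double>;  // hypothetical stand-in for one GMatrix row

// Squared Euclidean distance between two rows of equal length.
// Assumption: the real metric is configurable; this sketch fixes it.
double squaredDistance(const Row& a, const Row& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Assigns every row to its nearest centroid and returns the sum of
// squared distances. `clusters` is resized to data.size().
double assignToNearest(const std::vector<Row>& data,
                       const std::vector<Row>& centroids,
                       std::vector<std::size_t>& clusters) {
    assert(!centroids.empty());
    clusters.assign(data.size(), 0);
    double total = 0.0;
    for (std::size_t r = 0; r < data.size(); ++r) {
        double best = std::numeric_limits<double>::max();
        for (std::size_t c = 0; c < centroids.size(); ++c) {
            const double d = squaredDistance(data[r], centroids[c]);
            if (d < best) {  // ties keep the lowest-numbered centroid
                best = d;
                clusters[r] = c;
            }
        }
        total += best;
    }
    return total;
}
```

The returned total is the quantity that a caller would compare across clustering attempts; a lower value means the rows sit closer to their centroids.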
GMatrix * GClasses::GKMeans::centroids | ( | ) | inline
Returns a k x d matrix, where each row is one of the k centroids.
void GClasses::GKMeans::cluster | ( | const GMatrix * | pData | ) | virtual
Performs clustering.
Implements GClasses::GClusterer.
bool GClasses::GKMeans::clusterAttempt | ( | size_t | nMaxIterations | ) | protected
void GClasses::GKMeans::init | ( | const GMatrix * | pData | ) |
Selects random centroids and initializes internal data structures.
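One common way to select random centroids is to pick k distinct rows of the data at random. The sketch below shows that approach in standard C++; whether init uses exactly this strategy is an assumption, and pickRandomCentroids and Row are hypothetical names, with std::mt19937 standing in for GRand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

using Row = std::vector<double>;  // hypothetical stand-in for one GMatrix row

// Picks k distinct rows of `data` at random to serve as initial centroids.
// Assumption: sampling rows without replacement; the library may differ.
std::vector<Row> pickRandomCentroids(const std::vector<Row>& data,
                                     std::size_t k,
                                     std::mt19937& rng) {
    assert(k <= data.size());
    // Shuffle the row indices and keep the first k of them.
    std::vector<std::size_t> indices(data.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    std::shuffle(indices.begin(), indices.end(), rng);
    std::vector<Row> centroids;
    centroids.reserve(k);
    for (std::size_t i = 0; i < k; ++i)
        centroids.push_back(data[indices[i]]);
    return centroids;
}
```

Sampling without replacement guarantees that no two initial centroids come from the same row, although duplicate rows in the data can still produce identical centroids.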
void GClasses::GKMeans::recomputeCentroids | ( | const GMatrix * | pData | ) |
Computes new centroids for each cluster.
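In standard K-means, the new centroid of a cluster is the mean of the rows currently assigned to it. The following sketch illustrates that update in plain C++; it is not the library code, the names recomputeMeans and Row are hypothetical, and leaving an empty cluster's centroid unchanged is a choice made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<double>;  // hypothetical stand-in for one GMatrix row

// Replaces each centroid with the mean of the rows assigned to it.
// Assumption: a centroid with no assigned rows is left unchanged.
void recomputeMeans(const std::vector<Row>& data,
                    const std::vector<std::size_t>& clusters,
                    std::vector<Row>& centroids) {
    assert(data.size() == clusters.size());
    if (centroids.empty()) return;
    const std::size_t dims = centroids[0].size();
    std::vector<Row> sums(centroids.size(), Row(dims, 0.0));
    std::vector<std::size_t> counts(centroids.size(), 0);
    // Accumulate per-cluster column sums and row counts.
    for (std::size_t r = 0; r < data.size(); ++r) {
        const std::size_t c = clusters[r];
        assert(c < centroids.size() && data[r].size() == dims);
        for (std::size_t j = 0; j < dims; ++j)
            sums[c][j] += data[r][j];
        ++counts[c];
    }
    // Divide each sum by its count to obtain the mean.
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        if (counts[c] == 0) continue;
        for (std::size_t j = 0; j < dims; ++j)
            centroids[c][j] = sums[c][j] / static_cast<double>(counts[c]);
    }
}
```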
bool GClasses::GKMeans::selectSeeds | ( | const GMatrix * | pSeeds | ) | protected
void GClasses::GKMeans::setReps | ( | size_t | r | ) | inline
Specifies the number of times to cluster the data. The best clustering (the one with the lowest sum of squared differences between each point and its cluster centroid) is kept.
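The idea of repeating the clustering and keeping the best result can be sketched independently of the library. In the example below, Attempt and bestOfReps are hypothetical names, and runOnce stands in for one complete clustering run from a fresh random initialization; how GKMeans organizes its repetitions internally is not shown in this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Result of one clustering attempt: per-row cluster labels and the
// sum of squared distances between each row and its centroid.
struct Attempt {
    std::vector<std::size_t> labels;
    double sumSquaredDistance;
};

// Calls `runOnce` `reps` times and returns the attempt with the lowest
// sum of squared distances. Ties keep the earliest attempt.
Attempt bestOfReps(std::size_t reps, const std::function<Attempt()>& runOnce) {
    assert(reps > 0);
    Attempt best{{}, 0.0};
    for (std::size_t i = 0; i < reps; ++i) {
        Attempt current = runOnce();
        // The first attempt is always accepted; later ones must improve.
        if (i == 0 || current.sumSquaredDistance < best.sumSquaredDistance)
            best = std::move(current);
    }
    return best;
}
```

Because K-means converges to a local optimum that depends on the starting centroids, a larger number of repetitions costs proportionally more time but reduces the chance of keeping a poor clustering.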
size_t GClasses::GKMeans::whichCluster | ( | size_t | nVector | ) | virtual
Identifies the cluster of the specified row.
Implements GClasses::GClusterer.
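GMatrix * GClasses::GKMeans::m_pCentroids | protected
size_t * GClasses::GKMeans::m_pClusters | protected
GRand * GClasses::GKMeans::m_pRand | protected
size_t GClasses::GKMeans::m_reps | protected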