GClasses
Represents a matrix or a database table.
Elements can be discrete or continuous.
References a GRelation object, which stores the meta-information about each column.
#include <GMatrix.h>
Public Member Functions | |
GMatrix () | |
Makes an empty 0x0 matrix. More... | |
GMatrix (size_t rows, size_t cols) | |
Construct a rows x cols matrix with all elements of the matrix assumed to be continuous. More... | |
GMatrix (std::vector< size_t > &attrValues) | |
Construct a matrix with a mixed relation. That is, one with some continuous attributes (columns), and some nominal attributes (columns). More... | |
GMatrix (GRelation *pRelation) | |
Create an empty matrix whose attributes/column types are specified by pRelation. More... | |
GMatrix (const GMatrix &orig, size_t rowStart=0, size_t colStart=0, size_t rowCount=(size_t)-1, size_t colCount=(size_t)-1) | |
Copy-constructor. More... | |
GMatrix (const GDomNode *pNode) | |
Load from a DOM. More... | |
~GMatrix () | |
void | add (const GMatrix *pThat, bool transpose=false, double scalar=1.0) |
Matrix add. More... | |
GVec & | back (size_t reverse_index=0) |
Returns a reference to a row indexed from the back of the matrix. Index 0 (default) is the last row, index 1 is the second-to-last row, etc. More... | |
const GVec & | back (size_t reverse_index=0) const |
double | baselineValue (size_t nAttribute) const |
Returns the mean if the specified attribute is continuous, otherwise returns the most common nominal value in the attribute. More... | |
double | boundingSphere (GVec &outCenter, size_t *pIndexes, size_t indexCount, GDistanceMetric *pMetric) const |
Finds a sphere that tightly bounds all the points in the specified vector of row-indexes. More... | |
void | centerMeanAtOrigin () |
Shifts the data such that the mean occurs at the origin. Only continuous values are affected. Nominal values are left unchanged. More... | |
void | centroid (GVec &outCentroid, const double *pWeights=NULL) const |
Computes the arithmetic means of all attributes. If pWeights is non-NULL, then it is assumed to be a vector of weights, one for each row in this matrix. More... | |
GMatrix * | cholesky (bool tolerant=false) |
This computes the square root of this matrix. (If you take the matrix that this returns and multiply it by its transpose, you should get the original dataset again.) (Returns a lower-triangular matrix.) More... | |
void | clipColumn (size_t col, double dMin, double dMax) |
Clips the values in the specified column to fall between dMin and dMax (inclusive). More... | |
void | col (size_t index, double *pOutVector) |
Copies the specified column into pOutVector. More... | |
size_t | cols () const |
Returns the number of columns in the dataset. More... | |
double | columnMax (size_t nAttribute) const |
Returns the maximum value in the specified column (not counting UNKNOWN_REAL_VALUE). Returns -1e300 if there are no known values in the column. More... | |
double | columnMean (size_t nAttribute, const double *pWeights=NULL, bool throwIfEmpty=true) const |
Computes the arithmetic mean of the values in the specified column. If pWeights is NULL, then each row is given equal weight. If pWeights is non-NULL, then it is assumed to be a vector of weights, one for each row in this matrix. If there are no values in this column with any weight, then it will throw an exception if throwIfEmpty is true, or else return UNKNOWN_REAL_VALUE. More... | |
double | columnMedian (size_t nAttribute, bool throwIfEmpty=true) const |
Computes the median of the values in the specified column. If there are no values in this column, then it will throw an exception if throwIfEmpty is true, or else return UNKNOWN_REAL_VALUE. More... | |
double | columnMin (size_t nAttribute) const |
Returns the minimum value in the specified column (not counting UNKNOWN_REAL_VALUE). Returns 1e300 if there are no known values in the column. More... | |
double | columnSquaredMagnitude (size_t col) const |
Returns the squared magnitude of the vector in the specified column. More... | |
double | columnSum (size_t col) const |
Returns the sum of the values in the specified column. More... | |
double | columnSumSquaredDifference (const GMatrix &that, size_t col, double *pOutSAE=NULL) const |
Computes the sum-squared distance between the specified column of this and that. If the column is a nominal attribute, then Hamming distance is used. If pOutSAE is not NULL, the sum absolute error will be placed there. More... | |
double | columnVariance (size_t nAttr, double mean) const |
Computes the sample variance of a single attribute. More... | |
void | copy (const GMatrix &that, size_t rowStart=0, size_t colStart=0, size_t rowCount=(size_t)-1, size_t colCount=(size_t)-1) |
Copies (deep) all the data and metadata from that. More... | |
void | copyBlock (const GMatrix &source, size_t srcRow=0, size_t srcCol=0, size_t hgt=INVALID_INDEX, size_t wid=INVALID_INDEX, size_t destRow=0, size_t destCol=0, bool checkMetaData=true) |
Copies values from a rectangular region of the source matrix into this matrix. The wid and hgt values are clipped if they exceed the size of the source matrix. An exception is thrown if the destination is not big enough to hold the values at the specified location. If checkMetaData is true, then this will throw an exception if the data types are incompatible. More... | |
void | copyCols (const GMatrix &that, size_t firstCol, size_t colCount) |
Copies the specified range of columns (including meta-data) from that matrix into this matrix, replacing all data currently in this matrix. More... | |
size_t | countPrincipalComponents (double d, GRand *pRand) const |
Computes the minimum number of principal components necessary so that less than the specified portion of the deviation in the data is unaccounted for. More... | |
size_t | countUniqueValues (size_t col, size_t maxCount=(size_t)-1) const |
Counts the number of unique values in the specified column. If maxCount unique values are found, it immediately returns maxCount. More... | |
size_t | countValue (size_t attribute, double value) const |
Returns the number of occurrences of the specified value in the specified attribute. More... | |
double | covariance (size_t nAttr1, double dMean1, size_t nAttr2, double dMean2, const double *pWeights=NULL) const |
Computes the covariance between two attributes. If pWeights is NULL, each row is given a weight of 1. If pWeights is non-NULL, then it is assumed to be a vector of weights, one for each row in this matrix. More... | |
GMatrix * | covarianceMatrix () const |
Computes the covariance matrix of the data. More... | |
void | deleteColumns (size_t index, size_t count) |
Deletes some columns. This does not reallocate the rows, but it does shift the elements, which is a slow operation, especially if there are many columns that follow those being deleted. More... | |
void | deleteRow (size_t index) |
Swaps the specified row with the last row, and then deletes it. More... | |
void | deleteRowPreserveOrder (size_t index) |
Deletes the specified row and shifts everything after it up one slot. More... | |
double | determinant () |
Computes the determinant of this matrix. More... | |
double | dihedralCorrelation (const GMatrix *pThat, GRand *pRand) const |
Computes the cosine of the dihedral angle between this subspace and pThat subspace. More... | |
bool | doesHaveAnyMissingValues () const |
Returns true iff this matrix is missing any values. More... | |
void | dropValue (size_t attr, int val) |
Drops any occurrences of the specified value, and removes it as a possible value. More... | |
double | eigenValue (const GVec &eigenVector) |
Computes the eigenvalue that corresponds to the specified eigenvector of this matrix. More... | |
double | eigenValue (const double *pMean, const double *pEigenVector, GRand *pRand) const |
Computes the eigenvalue that corresponds to *pEigenVector. More... | |
void | eigenVector (double eigenvalue, GVec &outVector) |
Computes the eigenvector that corresponds to the specified eigenvalue of this matrix. Note that this method trashes this matrix, so make a copy first if you care. More... | |
GMatrix * | eigs (size_t nCount, GVec &eigenVals, GRand *pRand, bool mostSignificant) |
Computes nCount eigenvectors and the corresponding eigenvalues using the power method (which is only accurate if a small number of eigenvalues/vectors are needed.) More... | |
void | ensureDataHasNoMissingNominals () const |
Throws an exception if this data contains any missing values in a nominal attribute. More... | |
void | ensureDataHasNoMissingReals () const |
Throws an exception if this data contains any missing values in a continuous attribute. More... | |
double | entropy (size_t nColumn) const |
Measures the entropy of the specified attribute. More... | |
void | fill (double val, size_t colStart=0, size_t colCount=INVALID_INDEX) |
Fills all elements in the specified range of columns with the specified value. If no column ranges are specified, the default is to set all of them. More... | |
void | fillNormal (GRand &rand, double deviation=1.0) |
Fills all elements with random values from a Normal distribution. More... | |
void | fillUniform (GRand &rand, double min=0.0, double max=1.0) |
Fills all elements with random values from a uniform distribution. More... | |
void | fixNans () |
Replaces any occurrences of NAN in the matrix with the corresponding values from an identity matrix. More... | |
void | flush () |
Deletes all the rows in this matrix. More... | |
void | fromVector (const double *pVector, size_t nRows) |
Copies the data from pVector over this dataset. More... | |
GVec & | front () |
Returns a reference to the first row. More... | |
const GVec & | front () const |
bool | gaussianElimination (double *pVector) |
Computes y in the equation M*y=x (or y=M^(-1)x), where M is this dataset, which must be a square matrix, and x is pVector as passed in, and y is pVector after the call. More... | |
bool | isAttrHomogenous (size_t col) const |
Returns true iff the specified attribute contains homogenous values. (Unknowns are counted as homogenous with anything) More... | |
bool | isHomogenous () const |
Returns true iff each of the last labelDims columns in the data are homogenous. More... | |
bool | leastCorrelatedVector (GVec &out, const GMatrix *pThat, GRand *pRand) const |
Computes the vector in this subspace that has the greatest distance from its projection into pThat subspace. More... | |
double | linearCorrelationCoefficient (size_t attr1, double attr1Origin, size_t attr2, double attr2Origin) const |
Computes the linear coefficient between the two specified attributes. More... | |
void | load (const char *szFilename) |
Loads a file, automatically detecting whether it is in ARFF or raw (binary) format. More... | |
void | loadArff (const char *szFilename) |
Loads an ARFF file and replaces the contents of this matrix with it. More... | |
void | loadRaw (const char *szFilename) |
Loads a raw (binary) file and replaces the contents of this matrix with it. More... | |
void | LUDecomposition () |
Performs an in-place LU-decomposition, such that the lower triangle of this matrix (including the diagonal) specifies L, and the upper triangle of this matrix (not including the diagonal) specifies U, and all values of U along the diagonal are ones. (The upper triangle of L and the lower triangle of U are all zeros.) More... | |
void | makeIdentity () |
Sets this dataset to an identity matrix. (It doesn't change the number of columns or rows. It just stomps over existing values.) More... | |
double | measureInfo () const |
Computes the sum entropy of the data (or the sum variance for continuous attributes) More... | |
void | mergeVert (GMatrix *pData, bool ignoreMismatchingName=false) |
Steals all the rows from pData and adds them to this set. (You still have to delete pData.) Both datasets must have the same number of columns. More... | |
void | mirrorTriangle (bool upperToLower) |
Copies one of the triangular submatrices over the other, making a symmetric matrix. More... | |
void | multiply (double scalar) |
Multiplies every element in the dataset by scalar. Behavior is undefined for nominal columns. More... | |
void | multiply (const GVec &vectorIn, GVec &vectorOut, bool transpose=false) const |
Multiplies this matrix by the column vector vectorIn to get vectorOut. More... | |
void | newColumns (size_t n) |
Adds 'n' new columns to the matrix. (This resizes every row and copies all the existing data, which is rather inefficient.) The values in the new columns are not initialized. More... | |
GVec & | newRow () |
Adds a new row to the matrix. (The values in the row are not initialized.) Returns a reference to the new row. More... | |
void | newRows (size_t nRows) |
Adds "nRows" uninitialized rows to this matrix. More... | |
void | normalizeColumn (size_t col, double dInMin, double dInMax, double dOutMin=0.0, double dOutMax=1.0) |
Normalizes the specified column. More... | |
GMatrix & | operator= (const GMatrix &orig) |
Make *this into a copy of orig. More... | |
bool | operator== (const GMatrix &other) const |
Returns true iff all the entries in *this and other are identical and their relations are compatible, and they are the same size. More... | |
GVec & | operator[] (size_t index) |
Returns a reference to the specified row. More... | |
const GVec & | operator[] (size_t index) const |
Returns a const reference to the specified row. More... | |
void | pairedTTest (size_t *pOutV, double *pOutT, size_t attr1, size_t attr2, bool normalize) const |
Performs a paired T-Test with data from the two specified attributes. More... | |
void | parseArff (const char *szFile, size_t nLen) |
Parses an ARFF file and replaces the contents of this matrix with it. More... | |
void | parseArff (GArffTokenizer &tok) |
Parses an ARFF file and replaces the contents of this matrix with it. More... | |
void | principalComponent (GVec &outVector, const GVec &centroid, GRand *pRand) const |
This is an efficient algorithm for iteratively computing the principal component vector (the eigenvector of the covariance matrix) of the data. More... | |
void | principalComponentAboutOrigin (GVec &outVector, GRand *pRand) const |
Computes the first principal component assuming the mean is already subtracted out of the data. More... | |
void | principalComponentIgnoreUnknowns (GVec &outVector, const GVec &centroid, GRand *pRand) const |
Computes principal components, while ignoring missing values. More... | |
void | print (std::ostream &stream=std::cout, char separator= ',') const |
Prints this matrix in ARFF format to the specified stream. More... | |
GMatrix * | pseudoInverse () |
Computes the Moore-Penrose pseudoinverse of this matrix (using the SVD method). You are responsible to delete the matrix this returns. More... | |
const GRelation & | relation () const |
Returns a const reference to the relation object, which holds meta-data about the attributes (columns). More... | |
void | releaseAllRows () |
Abandons (leaks) all the rows in this matrix. More... | |
GVec * | releaseRow (size_t index) |
Swaps the specified row with the last row, and then releases it from the dataset. More... | |
GVec * | releaseRowPreserveOrder (size_t index) |
Releases the specified row from the dataset and shifts everything after it up one slot. More... | |
void | removeComponent (const GVec &centroid, const GVec &component) |
Removes the specified component from the data. (component should already be normalized.) More... | |
void | removeComponentAboutOrigin (const GVec &component) |
Removes the specified component assuming the mean is zero. More... | |
void | replaceMissingValuesRandomly (size_t nAttr, GRand *pRand) |
Replaces all missing values by copying a randomly selected non-missing value in the same attribute. More... | |
void | replaceMissingValuesWithBaseline (size_t nAttr) |
Replace missing values with the appropriate measure of central tendency. More... | |
void | reserve (size_t n) |
Allocates space for the specified number of patterns (to avoid superfluous resizing) More... | |
void | resize (size_t rows, size_t cols) |
Resizes this matrix. Assigns all columns to be continuous, and replaces all element values with garbage. More... | |
void | reverseRows () |
Reverses the row order. More... | |
GVec & | row (size_t index) |
Returns a reference to the specified row. More... | |
const GVec & | row (size_t index) const |
Returns a const reference to the specified row. More... | |
size_t | rows () const |
Returns the number of rows in the dataset. More... | |
void | saveArff (const char *szFilename) |
Saves the dataset to a file in ARFF format. More... | |
void | saveRaw (const char *szFilename) |
Saves the dataset to a file in raw (binary) format. More... | |
void | scaleColumn (size_t col, double scalar) |
Scales the column by the specified scalar. More... | |
GDomNode * | serialize (GDom *pDoc) const |
Marshalls this object to a DOM, which may be saved to a variety of serial formats. More... | |
void | setCol (size_t index, const double *pVector) |
Copies pVector over the specified column. More... | |
void | setRelation (GRelation *pRelation) |
Sets the relation for this dataset, which specifies the number of columns, and their data types. If there are one or more rows in this matrix, and the new relation does not have the same number of columns as the old relation, then this will throw an exception. Takes ownership of pRelation. That is, the destructor will delete it. More... | |
void | shuffle (GRand &rand, GMatrix *pExtension=NULL) |
Randomizes the order of the rows. More... | |
void | shuffle2 (GRand &rand, GMatrix &other) |
Shuffles the order of the rows. Also shuffles the rows in "other" in the same way, such that corresponding rows are preserved. More... | |
void | shuffleLikeCards () |
This is an inferior way to shuffle the data. More... | |
void | singularValueDecomposition (GMatrix **ppU, double **ppDiag, GMatrix **ppV, bool throwIfNoConverge=false, size_t maxIters=80) |
Performs SVD on A, where A is this m-by-n matrix. More... | |
void | sort (size_t nDimension) |
Sorts the data from smallest to largest in the specified dimension. More... | |
template<typename CompareFunc > | |
void | sort (CompareFunc &pComparator) |
Sorts rows according to the specified compare function. (Return true to indicate that the first row comes before the second row.) More... | |
void | sortPartial (size_t row, size_t col) |
This partially sorts the specified column, such that the specified row will contain the same row as if it were fully sorted, and previous rows will contain a value <= to it in that column, and later rows will contain a value >= to it in that column. Unlike sort, which has O(m*log(m)) complexity, this method has O(m) complexity. This might be useful, for example, for efficiently finding the row with a median value in some attribute, or for separating data by a threshold in some value. More... | |
void | splitByPivot (GMatrix *pGreaterOrEqual, size_t nAttribute, double dPivot, GMatrix *pExtensionA=NULL, GMatrix *pExtensionB=NULL) |
Splits this set of data into two sets. Values greater-than-or-equal-to dPivot stay in this data set. Values less than dPivot go into pLessThanPivot. More... | |
void | splitBySize (GMatrix &other, size_t nOtherRows) |
Removes the last nOtherRows rows from this data set and puts them in "other". (Order is preserved.) More... | |
void | splitCategoricalKeepIfEqual (GMatrix *pOtherValues, size_t nAttr, int nValue, GMatrix *pExtensionA=NULL, GMatrix *pExtensionB=NULL) |
Moves all rows with the specified value in the specified attribute into pOtherValues. More... | |
void | splitCategoricalKeepIfNotEqual (GMatrix *pSingleClass, size_t nAttr, int nValue, GMatrix *pExtensionA=NULL, GMatrix *pExtensionB=NULL) |
Moves all rows with the specified value in the specified attribute into pSingleClass. More... | |
void | subtract (const GMatrix *pThat, bool transpose) |
Matrix subtract. Subtracts the values in *pThat from *this. More... | |
double | sumSquaredDifference (const GMatrix &that, bool transpose=false) const |
Computes the squared distance between this and that. More... | |
double | sumSquaredDiffWithIdentity () |
Returns the sum squared difference between this matrix and an identity matrix. More... | |
double | sumSquaredDistance (const GVec &point) const |
Computes the sum-squared distance between point and all of the points in the dataset. More... | |
void | swapColumns (size_t nAttr1, size_t nAttr2) |
Swaps two columns. More... | |
GVec * | swapRow (size_t i, GVec *pNewRow) |
Swap pNewRow in for row i, and return row i. The caller is then responsible to delete the row that is returned. More... | |
void | swapRows (size_t a, size_t b) |
Swaps the two specified rows. More... | |
void | takeRow (GVec *pRow, size_t pos=(size_t)-1) |
Adds an already-allocated row to this dataset. If pos is specified, the new row will be inserted at the specified position. More... | |
size_t | toReducedRowEchelonForm () |
Converts the matrix to reduced row echelon form. More... | |
void | toVector (double *pVector) const |
Copies all the data from this dataset into pVector. More... | |
double | trace () |
Returns the sum of the diagonal elements. More... | |
GMatrix * | transpose () |
Returns a pointer to a new dataset that is this dataset transposed. (All columns in the returned dataset will be continuous.) More... | |
void | weightedPrincipalComponent (GVec &outVector, const GVec &centroid, const double *pWeights, GRand *pRand) const |
Computes the first principal component of the data with each row weighted according to the vector pWeights. (pWeights must have an element for each row.) More... | |
void | wilcoxonSignedRanksTest (size_t attr1, size_t attr2, double tolerance, int *pNum, double *pWMinus, double *pWPlus) const |
Performs the Wilcoxon signed ranks test from the two specified attributes. More... | |
Static Public Member Functions | |
static GMatrix * | align (GMatrix *pA, GMatrix *pB) |
This uses the Kabsch algorithm to rotate and translate pB in order to minimize RMS with pA. (pA and pB must have the same number of rows and columns.) More... | |
static GSimpleAssignment | bipartiteMatching (GMatrix &a, GMatrix &b, GDistanceMetric &metric) |
Performs a bipartite matching between the rows of a and b using the Linear Assignment Problem (LAP) optimizer. More... | |
static GMatrix * | kabsch (GMatrix *pA, GMatrix *pB) |
This computes K=kabsch(A,B), such that K is an n-by-n matrix, where n is pA->cols(). K is the optimal orthonormal rotation matrix to align A and B, such that A(K^T) minimizes sum-squared error with B, and BK minimizes sum-squared error with A. (This rotates around the origin, so typically you will want to subtract the centroid from both pA and pB before calling this.) More... | |
static GMatrix * | mergeHoriz (const GMatrix *pSetA, const GMatrix *pSetB) |
Merges two datasets side-by-side. The resulting dataset will contain the attributes of both datasets. Both pSetA and pSetB (and the resulting dataset) must have the same number of rows. More... | |
static GMatrix * | multiply (const GMatrix &a, const GMatrix &b, bool transposeA, bool transposeB) |
Matrix multiply. More... | |
static double | normalizeValue (double dVal, double dInMin, double dInMax, double dOutMin=0.0, double dOutMax=1.0) |
Normalize a value from the input min and max to the output min and max. More... | |
static void | test () |
Performs unit tests for this class. Throws an exception if there is a failure. More... | |
Protected Member Functions | |
double | determinantHelper (size_t nEndRow, size_t *pColumnList) |
void | inPlaceSquareTranspose () |
void | singularValueDecompositionHelper (GMatrix **ppU, double **ppDiag, GMatrix **ppV, bool throwIfNoConverge, size_t maxIters) |
Protected Attributes | |
GRelation * | m_pRelation |
std::vector< GVec * > | m_rows |
GClasses::GMatrix::GMatrix | ( | ) |
Makes an empty 0x0 matrix.
GClasses::GMatrix::GMatrix | ( | size_t | rows, |
size_t | cols | ||
) |
Construct a rows x cols matrix with all elements of the matrix assumed to be continuous.
It is okay to initially set rows to 0 and later call newRow to add each row. Adding columns later, however, is not very computationally efficient.
GClasses::GMatrix::GMatrix | ( | std::vector< size_t > & | attrValues | ) |
Construct a matrix with a mixed relation. That is, one with some continuous attributes (columns), and some nominal attributes (columns).
attrValues specifies the number of nominal values supported in each attribute (column), or 0 for a continuous attribute.
Initially, this matrix will have 0 rows, but you can add more rows by calling newRow or newRows.
GClasses::GMatrix::GMatrix | ( | GRelation * | pRelation | ) |
Create an empty matrix whose attributes/column types are specified by pRelation.
Takes ownership of pRelation. That is, the destructor will delete pRelation.
Initially, this matrix will have 0 rows, but you can add more rows by calling newRow or newRows.
GClasses::GMatrix::GMatrix | ( | const GMatrix & | orig, |
size_t | rowStart = 0 , |
||
size_t | colStart = 0 , |
||
size_t | rowCount = (size_t)-1 , |
||
size_t | colCount = (size_t)-1 |
||
) |
Copy-constructor.
Copies orig, making a new relation object and new storage for the rows (with the same content).
orig | the GMatrix object to copy |
GClasses::GMatrix::GMatrix | ( | const GDomNode * | pNode | ) |
Load from a DOM.
GClasses::GMatrix::~GMatrix | ( | ) |
void GClasses::GMatrix::add | ( | const GMatrix * | pThat, |
bool | transpose = false , |
||
double | scalar = 1.0 |
||
) |
Matrix add.
Adds scalar * pThat to this. (If transpose is true, adds scalar * the transpose of pThat to this.) Both datasets must have the same dimensions. Behavior is undefined for nominal columns.
GMatrix* GClasses::GMatrix::align | ( | GMatrix * | pA, |
GMatrix * | pB | ||
) |
static |
This uses the Kabsch algorithm to rotate and translate pB in order to minimize RMS with pA. (pA and pB must have the same number of rows and columns.)
GVec& GClasses::GMatrix::back | ( | size_t | reverse_index = 0 | ) |
inline |
Returns a reference to a row indexed from the back of the matrix. Index 0 (default) is the last row, index 1 is the second-to-last row, etc.
const GVec& GClasses::GMatrix::back | ( | size_t | reverse_index = 0 | ) | const |
inline |
double GClasses::GMatrix::baselineValue | ( | size_t | nAttribute | ) | const |
Returns the mean if the specified attribute is continuous, otherwise returns the most common nominal value in the attribute.
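The choice between mean and most common value can be sketched on a plain vector as follows. This is only an illustration under stated assumptions, not the library's implementation: baselineSketch is a hypothetical helper, valueCount == 0 is taken to mark a continuous column (following the attrValues convention of the mixed-relation constructor), missing values are not handled, and ties between nominal values are broken in favor of the smallest value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper, not part of GMatrix.
// valueCount == 0 marks a continuous column; otherwise the column holds
// nominal values stored as category indexes.
double baselineSketch(const std::vector<double>& column, std::size_t valueCount)
{
    assert(!column.empty());
    if (valueCount == 0) {
        // Continuous: arithmetic mean.
        double sum = 0.0;
        for (double v : column) sum += v;
        return sum / static_cast<double>(column.size());
    }
    // Nominal: most common value (smallest value wins a tie).
    std::map<double, std::size_t> counts;
    for (double v : column) ++counts[v];
    double best = column.front();
    std::size_t bestCount = 0;
    for (const auto& kv : counts) {
        if (kv.second > bestCount) {
            best = kv.first;
            bestCount = kv.second;
        }
    }
    return best;
}
```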
GSimpleAssignment GClasses::GMatrix::bipartiteMatching | ( | GMatrix & | a, |
GMatrix & | b, | ||
GDistanceMetric & | metric | ||
) |
static |
Performs a bipartite matching between the rows of a and b using the Linear Assignment Problem (LAP) optimizer.
Treats the rows of the matrices a and b as vectors and calculates the distances between these vectors using metric. Returns an optimal assignment from rows of a to rows of b that minimizes the sum of the costs of the assignments.
Each row is considered to be a vector in multidimensional space. The cost is the distance given by metric when called on each row of a and row of b in turn. The cost must not be for any pair of rows. Other than that, there are no limitations on the cost function.
Because of the limitations of GDistanceMetric, a and b must have the same number of columns.
If is then this routine requires memory and time.
a | the matrix containing the vectors of set a. Must have the same number of columns as the matrix containing the vectors of set b. Each row is considered to be a vector in multidimensional space. |
b | the matrix containing the vectors of set b. Must have the same number of columns as the matrix containing the vectors of set a. Each row is considered to be a vector in multidimensional space. |
metric | given a row of a and a row of b, returns the cost of assigning a to b. |
double GClasses::GMatrix::boundingSphere | ( | GVec & | outCenter, |
size_t * | pIndexes, | ||
size_t | indexCount, | ||
GDistanceMetric * | pMetric | ||
) | const |
Finds a sphere that tightly bounds all the points in the specified vector of row-indexes.
Returns the squared radius of the sphere, and stores its center in outCenter.
void GClasses::GMatrix::centerMeanAtOrigin | ( | ) |
Shifts the data such that the mean occurs at the origin. Only continuous values are affected. Nominal values are left unchanged.
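A minimal sketch of this shift, assuming the data is held as a vector of rows and that a separate flag vector marks nominal columns. Both the Rows alias and centerColumnsSketch are hypothetical names introduced for illustration; the library keeps column types in its GRelation object instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Rows = std::vector<std::vector<double>>;

// Hypothetical helper, not part of GMatrix.
// Subtracts each continuous column's mean in place; columns flagged as
// nominal are left unchanged.
void centerColumnsSketch(Rows& data, const std::vector<bool>& isNominal)
{
    if (data.empty()) return;
    const std::size_t cols = data.front().size();
    assert(isNominal.size() == cols);
    for (std::size_t c = 0; c < cols; ++c) {
        if (isNominal[c]) continue;
        double sum = 0.0;
        for (const auto& row : data) sum += row[c];
        const double mean = sum / static_cast<double>(data.size());
        for (auto& row : data) row[c] -= mean;
    }
}
```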
void GClasses::GMatrix::centroid | ( | GVec & | outCentroid, |
const double * | pWeights = NULL |
||
) | const |
Computes the arithmetic means of all attributes. If pWeights is non-NULL, then it is assumed to be a vector of weights, one for each row in this matrix.
GMatrix* GClasses::GMatrix::cholesky | ( | bool | tolerant = false | ) |
This computes the square root of this matrix. (If you take the matrix that this returns and multiply it by its transpose, you should get the original dataset again.) (Returns a lower-triangular matrix.)
Behavior is undefined if there are nominal attributes. If tolerant is true, it will return even if it cannot compute accurate results. If tolerant is false (the default) and this matrix is not positive definite, it will throw an exception.
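The relationship between the input and the returned lower-triangular factor can be illustrated with a textbook Cholesky factorization. The sketch below is an assumption-laden stand-in, not the library's code: Mat and choleskySketch are hypothetical names, the input is assumed to be symmetric, and the exception on a non-positive pivot only loosely mirrors the documented behavior when tolerant is false.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// Hypothetical helper, not part of GMatrix.
// Returns lower-triangular L such that L * L^T equals a, for a symmetric
// positive-definite matrix a.
Mat choleskySketch(const Mat& a)
{
    const std::size_t n = a.size();
    Mat l(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            // Remove the contribution of the columns already computed.
            double sum = a[i][j];
            for (std::size_t k = 0; k < j; ++k) sum -= l[i][k] * l[j][k];
            if (i == j) {
                if (sum <= 0.0)
                    throw std::runtime_error("matrix is not positive definite");
                l[i][i] = std::sqrt(sum);
            } else {
                l[i][j] = sum / l[j][j];
            }
        }
    }
    return l;
}
```

For the matrix with rows (4, 2) and (2, 5), the factor has rows (2, 0) and (1, 2), and multiplying it by its transpose gives back the original matrix.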
void GClasses::GMatrix::clipColumn | ( | size_t | col, |
double | dMin, | ||
double | dMax | ||
) |
Clips the values in the specified column to fall between dMin and dMax (inclusive).
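As a rough illustration of inclusive clipping, the sketch below clamps one column of a row-major table. ClipRows and clipColumnSketch are hypothetical names, and how the library treats missing values in the column is not covered here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using ClipRows = std::vector<std::vector<double>>;

// Hypothetical helper, not part of GMatrix.
// Values below dMin become dMin, values above dMax become dMax, and values
// already inside [dMin, dMax] are unchanged.
void clipColumnSketch(ClipRows& data, std::size_t col, double dMin, double dMax)
{
    assert(dMin <= dMax);
    for (auto& row : data)
        row[col] = std::max(dMin, std::min(dMax, row[col]));
}
```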
void GClasses::GMatrix::col | ( | size_t | index, |
double * | pOutVector | ||
) |
Copies the specified column into pOutVector.
size_t GClasses::GMatrix::cols | ( | ) | const |
inline |
Returns the number of columns in the dataset.
double GClasses::GMatrix::columnMax | ( | size_t | nAttribute | ) | const |
Returns the maximum value in the specified column (not counting UNKNOWN_REAL_VALUE). Returns -1e300 if there are no known values in the column.
double GClasses::GMatrix::columnMean | ( | size_t | nAttribute, |
const double * | pWeights = NULL , |
||
bool | throwIfEmpty = true |
||
) | const |
Computes the arithmetic mean of the values in the specified column. If pWeights is NULL, then each row is given equal weight. If pWeights is non-NULL, then it is assumed to be a vector of weights, one for each row in this matrix. If there are no values in this column with any weight, then it will throw an exception if throwIfEmpty is true, or else return UNKNOWN_REAL_VALUE.
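The interplay of pWeights and throwIfEmpty can be sketched as below. Several details are assumptions rather than documented facts: the sentinel kUnknownMeanSketch stands in for UNKNOWN_REAL_VALUE (whose actual value is defined by the library), missing values are assumed to be skipped, and weightedMeanSketch is a hypothetical helper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Assumed stand-in for UNKNOWN_REAL_VALUE; the library's constant may differ.
const double kUnknownMeanSketch = -1e308;

// Hypothetical helper, not part of GMatrix.
// Weighted mean of the known values; a null pWeights gives every row weight 1.
double weightedMeanSketch(const std::vector<double>& column,
                          const double* pWeights = nullptr,
                          bool throwIfEmpty = true)
{
    double sum = 0.0;
    double totalWeight = 0.0;
    for (std::size_t i = 0; i < column.size(); ++i) {
        if (column[i] == kUnknownMeanSketch) continue;  // assumed: skip missing
        const double w = pWeights ? pWeights[i] : 1.0;
        sum += w * column[i];
        totalWeight += w;
    }
    if (totalWeight == 0.0) {
        if (throwIfEmpty)
            throw std::runtime_error("no weighted values in column");
        return kUnknownMeanSketch;
    }
    return sum / totalWeight;
}
```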
double GClasses::GMatrix::columnMedian | ( | size_t | nAttribute, |
bool | throwIfEmpty = true |
||
) | const |
Computes the median of the values in the specified column. If there are no values in this column, then it will throw an exception if throwIfEmpty is true, or else return UNKNOWN_REAL_VALUE.
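A minimal median sketch follows, again on a plain vector. It assumes that missing values are dropped first and that an even number of values yields the average of the two middle values; the documentation does not state either point, so treat both as assumptions. kUnknownMedianSketch and medianSketch are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Assumed stand-in for UNKNOWN_REAL_VALUE; the library's constant may differ.
const double kUnknownMedianSketch = -1e308;

// Hypothetical helper, not part of GMatrix.
// Takes the column by value so that sorting does not disturb the caller.
double medianSketch(std::vector<double> values, bool throwIfEmpty = true)
{
    // Drop missing values (assumed behavior).
    values.erase(std::remove(values.begin(), values.end(), kUnknownMedianSketch),
                 values.end());
    if (values.empty()) {
        if (throwIfEmpty)
            throw std::runtime_error("no values in column");
        return kUnknownMedianSketch;
    }
    std::sort(values.begin(), values.end());
    const std::size_t mid = values.size() / 2;
    if (values.size() % 2 == 1) return values[mid];
    return 0.5 * (values[mid - 1] + values[mid]);  // assumed tie rule
}
```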
double GClasses::GMatrix::columnMin | ( | size_t | nAttribute | ) | const |
Returns the minimum value in the specified column (not counting UNKNOWN_REAL_VALUE). Returns 1e300 if there are no known values in the column.
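The sentinel returns of columnMin and columnMax (1e300 and -1e300 when no value is known) fall out naturally if the running result starts at those values. The sketch below shows this on a plain vector; kUnknownRangeSketch is an assumed stand-in for UNKNOWN_REAL_VALUE and both function names are hypothetical.

```cpp
#include <vector>

// Assumed stand-in for UNKNOWN_REAL_VALUE; the library's constant may differ.
const double kUnknownRangeSketch = -1e308;

// Hypothetical helper, not part of GMatrix.
// Smallest known value, or 1e300 if every value is missing.
double columnMinSketch(const std::vector<double>& column)
{
    double best = 1e300;
    for (double v : column)
        if (v != kUnknownRangeSketch && v < best) best = v;
    return best;
}

// Hypothetical helper, not part of GMatrix.
// Largest known value, or -1e300 if every value is missing.
double columnMaxSketch(const std::vector<double>& column)
{
    double best = -1e300;
    for (double v : column)
        if (v != kUnknownRangeSketch && v > best) best = v;
    return best;
}
```

A caller can therefore detect an all-missing column by checking whether the reported minimum is greater than the reported maximum.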
double GClasses::GMatrix::columnSquaredMagnitude | ( | size_t | col | ) | const |
Returns the squared magnitude of the vector in the specified column.
double GClasses::GMatrix::columnSum | ( | size_t | col | ) | const |
Returns the sum of the values in the specified column.
double GClasses::GMatrix::columnSumSquaredDifference | ( | const GMatrix & | that, |
size_t | col, | ||
double * | pOutSAE = NULL |
||
) | const |
Computes the sum-squared distance between the specified column of this and that. If the column is a nominal attribute, then Hamming distance is used. If pOutSAE is not NULL, the sum absolute error will be placed there.
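The following sketch shows one way to read this description: the per-row difference is the numeric difference for a continuous column and 0 or 1 (Hamming distance) for a nominal column. columnSsdSketch is a hypothetical helper operating on two plain vectors, with the column type passed as a flag rather than looked up in a relation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GMatrix.
// Returns the sum of squared differences between a and b. If pOutSAE is
// non-null, the sum of absolute differences is written to it.
double columnSsdSketch(const std::vector<double>& a,
                       const std::vector<double>& b,
                       bool nominal,
                       double* pOutSAE = nullptr)
{
    assert(a.size() == b.size());
    double sse = 0.0;
    double sae = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Hamming distance for nominal values, plain difference otherwise.
        const double d = nominal ? (a[i] == b[i] ? 0.0 : 1.0) : a[i] - b[i];
        sse += d * d;
        sae += std::fabs(d);
    }
    if (pOutSAE) *pOutSAE = sae;
    return sse;
}
```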
double GClasses::GMatrix::columnVariance | ( | size_t | nAttr, |
double | mean | ||
) | const |
Computes the sample variance of a single attribute.
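For reference, the usual sample variance divides the sum of squared deviations from the mean by n - 1. The sketch below assumes that definition and takes the mean as an argument, as the signature above does; sampleVarianceSketch is a hypothetical helper and does not account for missing values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GMatrix.
// Sample variance: sum((x - mean)^2) / (n - 1). Requires at least two values.
double sampleVarianceSketch(const std::vector<double>& column, double mean)
{
    assert(column.size() > 1);
    double sum = 0.0;
    for (double v : column) {
        const double d = v - mean;
        sum += d * d;
    }
    return sum / static_cast<double>(column.size() - 1);
}
```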
void GClasses::GMatrix::copy | ( | const GMatrix & | that, |
size_t | rowStart = 0 , |
||
size_t | colStart = 0 , |
||
size_t | rowCount = (size_t)-1 , |
||
size_t | colCount = (size_t)-1 |
||
) |
Copies (deep) all the data and metadata from that.
void GClasses::GMatrix::copyBlock | ( | const GMatrix & | source, |
size_t | srcRow = 0 , |
||
size_t | srcCol = 0 , |
||
size_t | hgt = INVALID_INDEX , |
||
size_t | wid = INVALID_INDEX , |
||
size_t | destRow = 0 , |
||
size_t | destCol = 0 , |
||
bool | checkMetaData = true |
||
) |
Copies values from a rectangular region of the source matrix into this matrix. The wid and hgt values are clipped if they exceed the size of the source matrix. An exception is thrown if the destination is not big enough to hold the values at the specified location. If checkMetaData is true, then this will throw an exception if the data types are incompatible.
void GClasses::GMatrix::copyCols | ( | const GMatrix & | that, |
size_t | firstCol, | ||
size_t | colCount | ||
) |
Copies the specified range of columns (including meta-data) from that matrix into this matrix, replacing all data currently in this matrix.
size_t GClasses::GMatrix::countPrincipalComponents | ( | double | d, |
GRand * | pRand | ||
) | const |
Computes the minimum number of principal components necessary so that less than the specified portion of the deviation in the data is unaccounted for.
For example, if the data projected onto the first 3 principal components contains 90 percent of the deviation that the original data contains, then if you pass the value 0.1 to this method, it will return 3.
size_t GClasses::GMatrix::countUniqueValues | ( | size_t | col, |
size_t | maxCount = (size_t)-1 |
||
) | const |
Counts the number of unique values in the specified column. If maxCount unique values are found, it immediately returns maxCount.
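The early-exit behavior can be illustrated with a small stand-alone sketch. It uses std::set for bookkeeping, which is an illustrative choice rather than a statement about how the library counts; countUnique is a hypothetical name.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Counts distinct values in a column, returning maxCount as soon as that many
// distinct values have been seen.
std::size_t countUnique(const std::vector<double>& column,
                        std::size_t maxCount = static_cast<std::size_t>(-1))
{
    std::set<double> seen;
    for (double v : column) {
        seen.insert(v);
        if (seen.size() >= maxCount)
            return maxCount; // early exit: no need to scan the rest
    }
    return seen.size();
}
```

A small maxCount is useful when the caller only needs to know whether a column has, say, more than two distinct values.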
size_t GClasses::GMatrix::countValue | ( | size_t | attribute, |
double | value | ||
) | const |
Returns the number of occurrences of the specified value in the specified attribute.
double GClasses::GMatrix::covariance | ( | size_t | nAttr1, |
double | dMean1, | ||
size_t | nAttr2, | ||
double | dMean2, | ||
const double * | pWeights = NULL |
||
) | const |
Computes the covariance between two attributes. If pWeights is NULL, each row is given a weight of 1. If pWeights is non-NULL, then it is assumed to be a vector of weights, one for each row in this matrix.
GMatrix* GClasses::GMatrix::covarianceMatrix | ( | ) | const |
Computes the covariance matrix of the data.
void GClasses::GMatrix::deleteColumns | ( | size_t | index, |
size_t | count | ||
) |
Deletes some columns. This does not reallocate the rows, but it does shift the elements, which is a slow operation, especially if there are many columns that follow those being deleted.
void GClasses::GMatrix::deleteRow | ( | size_t | index | ) |
Swaps the specified row with the last row, and then deletes it.
void GClasses::GMatrix::deleteRowPreserveOrder | ( | size_t | index | ) |
Deletes the specified row and shifts everything after it up one slot.
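The difference between the two deletion strategies is easiest to see on a plain container. The sketch below uses a vector of rows as a stand-in for the matrix; the names Rows, deleteRowSwap, and deleteRowKeepOrder are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Rows = std::vector<std::vector<double>>;

// Swap-with-last deletion: constant time, but the former last row
// ends up at position index.
void deleteRowSwap(Rows& rows, std::size_t index)
{
    std::swap(rows[index], rows.back());
    rows.pop_back();
}

// Order-preserving deletion: every later row shifts up one slot,
// so the cost grows with the number of rows after index.
void deleteRowKeepOrder(Rows& rows, std::size_t index)
{
    rows.erase(rows.begin() + static_cast<std::ptrdiff_t>(index));
}
```

Deleting row 1 of rows 0, 1, 2, 3 leaves 0, 3, 2 with the swap variant and 0, 2, 3 with the order-preserving variant.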
double GClasses::GMatrix::determinant | ( | ) |
Computes the determinant of this matrix.
|
protected |
Computes the cosine of the dihedral angle between this subspace and pThat subspace.
bool GClasses::GMatrix::doesHaveAnyMissingValues | ( | ) | const |
Returns true iff this matrix is missing any values.
void GClasses::GMatrix::dropValue | ( | size_t | attr, |
int | val | ||
) |
Drops any occurrences of the specified value, and removes it as a possible value.
double GClasses::GMatrix::eigenValue | ( | const GVec & | eigenVector | ) |
Computes the eigenvalue that corresponds to the specified eigenvector of this matrix.
double GClasses::GMatrix::eigenValue | ( | const double * | pMean, |
const double * | pEigenVector, | ||
GRand * | pRand | ||
) | const |
Computes the eigenvalue that corresponds to *pEigenVector.
After you compute the principal component, you can call this to obtain the eigenvalue that corresponds to that principal component vector (eigenvector).
void GClasses::GMatrix::eigenVector | ( | double | eigenvalue, |
GVec & | outVector | ||
) |
Computes the eigenvector that corresponds to the specified eigenvalue of this matrix. Note that this method trashes this matrix, so make a copy first if you care.
GMatrix* GClasses::GMatrix::eigs | ( | size_t | nCount, |
GVec & | eigenVals, | ||
GRand * | pRand, | ||
bool | mostSignificant | ||
) |
Computes nCount eigenvectors and the corresponding eigenvalues using the power method (which is only accurate if a small number of eigenvalues/vectors are needed.)
If mostSignificant is true, the largest eigenvalues are found. If mostSignificant is false, the smallest eigenvalues are found.
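As background, a basic power iteration for the single dominant eigenpair can be sketched as follows. This is a generic textbook version, not the eigs implementation: it takes a caller-supplied start vector instead of a GRand, finds only one eigenpair, and does not cover the smallest-eigenvalue case. The names Matrix, matVec, and powerIteration are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Multiplies the square matrix a by the vector v.
static std::vector<double> matVec(const Matrix& a, const std::vector<double>& v)
{
    std::vector<double> out(a.size(), 0.0);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j)
            out[i] += a[i][j] * v[j];
    return out;
}

// Repeatedly applies a to v and renormalizes. On return, v approximates the
// unit eigenvector of the largest-magnitude eigenvalue, and the return value
// is the Rayleigh quotient estimate of that eigenvalue.
double powerIteration(const Matrix& a, std::vector<double>& v, int iterations)
{
    double eigenvalue = 0.0;
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> w = matVec(a, v);
        double norm = 0.0;
        for (double x : w)
            norm += x * x;
        norm = std::sqrt(norm);
        if (norm == 0.0)
            break; // v was mapped to zero; no direction to follow
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] = w[i] / norm;
        const std::vector<double> av = matVec(a, v);
        eigenvalue = 0.0;
        for (std::size_t i = 0; i < v.size(); ++i)
            eigenvalue += v[i] * av[i]; // v has unit length here
    }
    return eigenvalue;
}
```

Convergence slows as the two largest eigenvalues approach each other in magnitude, which is one reason the method suits cases where only a few eigenpairs are needed.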
void GClasses::GMatrix::ensureDataHasNoMissingNominals | ( | ) | const |
Throws an exception if this data contains any missing values in a nominal attribute.
void GClasses::GMatrix::ensureDataHasNoMissingReals | ( | ) | const |
Throws an exception if this data contains any missing values in a continuous attribute.
double GClasses::GMatrix::entropy | ( | size_t | nColumn | ) | const |
Measures the entropy of the specified attribute.
void GClasses::GMatrix::fill | ( | double | val, |
size_t | colStart = 0 , |
||
size_t | colCount = INVALID_INDEX |
||
) |
Fills all elements in the specified range of columns with the specified value. If no column ranges are specified, the default is to set all of them.
void GClasses::GMatrix::fillNormal | ( | GRand & | rand, |
double | deviation = 1.0 |
||
) |
Fills all elements with random values from a Normal distribution.
void GClasses::GMatrix::fillUniform | ( | GRand & | rand, |
double | min = 0.0 , |
||
double | max = 1.0 |
||
) |
Fills all elements with random values from a uniform distribution.
void GClasses::GMatrix::fixNans | ( | ) |
Replaces any occurrences of NAN in the matrix with the corresponding values from an identity matrix.
void GClasses::GMatrix::flush | ( | ) |
Deletes all the rows in this matrix.
void GClasses::GMatrix::fromVector | ( | const double * | pVector, |
size_t | nRows | ||
) |
Copies the data from pVector over this dataset.
nRows specifies the number of rows of data in pVector.
|
inline |
Returns a pointer to the first row.
bool GClasses::GMatrix::gaussianElimination | ( | double * | pVector | ) |
Computes y in the equation M*y=x (or y=M^(-1)x), where M is this dataset, which must be a square matrix, and x is pVector as passed in, and y is pVector after the call.
If there are multiple solutions, it finds the one for which all the variables in the null-space have a value of 1. If there are no solutions, it returns false. Note that this method trashes this dataset (so make a copy first if you care).
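A simplified version of the idea, solving M*y=x in place, might look like the sketch below. It uses partial pivoting and, unlike the documented method, simply returns false for any singular matrix rather than picking a solution from the null space. Like the documented method, it overwrites both the matrix and the vector. The names Matrix and solveInPlace and the 1e-12 pivot threshold are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solves m*y = x for square m. On success, x holds y and m is overwritten.
// Returns false when a pivot is (nearly) zero, i.e. m is singular.
bool solveInPlace(Matrix& m, std::vector<double>& x)
{
    const std::size_t n = m.size();
    for (std::size_t c = 0; c < n; ++c) {
        // Partial pivoting: pick the row with the largest entry in column c.
        std::size_t p = c;
        for (std::size_t r = c + 1; r < n; ++r)
            if (std::fabs(m[r][c]) > std::fabs(m[p][c]))
                p = r;
        if (std::fabs(m[p][c]) < 1e-12)
            return false;
        std::swap(m[c], m[p]);
        std::swap(x[c], x[p]);
        // Eliminate column c from all rows below the pivot.
        for (std::size_t r = c + 1; r < n; ++r) {
            const double f = m[r][c] / m[c][c];
            for (std::size_t k = c; k < n; ++k)
                m[r][k] -= f * m[c][k];
            x[r] -= f * x[c];
        }
    }
    // Back substitution on the upper-triangular system.
    for (std::size_t i = n; i-- > 0;) {
        for (std::size_t k = i + 1; k < n; ++k)
            x[i] -= m[i][k] * x[k];
        x[i] /= m[i][i];
    }
    return true;
}
```

Callers that still need the original matrix should pass a copy, for the same reason given in the note above.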
bool GClasses::GMatrix::isAttrHomogenous | ( | size_t | col | ) | const |
Returns true iff the specified attribute contains homogeneous values. (Unknowns are counted as homogeneous with anything.)
bool GClasses::GMatrix::isHomogenous | ( | ) | const |
Returns true iff each of the last labelDims columns in the data is homogeneous.
This computes K=kabsch(A,B), such that K is an n-by-n matrix, where n is pA->cols(). K is the optimal orthonormal rotation matrix to align A and B, such that A(K^T) minimizes sum-squared error with B, and BK minimizes sum-squared error with A. (This rotates around the origin, so typically you will want to subtract the centroid from both pA and pB before calling this.)
bool GClasses::GMatrix::leastCorrelatedVector | ( | GVec & | out, |
const GMatrix * | pThat, | ||
GRand * | pRand | ||
) | const |
Computes the vector in this subspace that has the greatest distance from its projection into pThat subspace.
Returns true if the results are computed.
Returns false if the subspaces are so nearly parallel that out cannot be computed with accuracy.
double GClasses::GMatrix::linearCorrelationCoefficient | ( | size_t | attr1, |
double | attr1Origin, | ||
size_t | attr2, | ||
double | attr2Origin | ||
) | const |
Computes the linear correlation coefficient between the two specified attributes.
Usually you will want to pass the mean values for attr1Origin and attr2Origin.
void GClasses::GMatrix::load | ( | const char * | szFilename | ) |
Loads a file and automatically detects ARFF or raw (binary).
void GClasses::GMatrix::loadArff | ( | const char * | szFilename | ) |
Loads an ARFF file and replaces the contents of this matrix with it.
void GClasses::GMatrix::loadRaw | ( | const char * | szFilename | ) |
Loads a raw (binary) file and replaces the contents of this matrix with it.
void GClasses::GMatrix::LUDecomposition | ( | ) |
Performs an in-place LU-decomposition, such that the lower triangle of this matrix (including the diagonal) specifies L, and the upper triangle of this matrix (not including the diagonal) specifies U, and all values of U along the diagonal are ones. (The upper triangle of L and the lower triangle of U are all zeros.)
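The storage layout described above (L with its diagonal in the lower triangle, unit-diagonal U in the strict upper triangle) matches the Crout form of LU decomposition. The following sketch shows that form on a plain square matrix. It does no pivoting, and whether the library pivots is not stated here; the names Matrix and luInPlace are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// In-place Crout decomposition of a square matrix, without pivoting.
// Afterwards a[i][j] holds L for i >= j and U for i < j (U's unit diagonal
// is implied, not stored).
void luInPlace(Matrix& a)
{
    const std::size_t n = a.size();
    for (std::size_t j = 0; j < n; ++j) {
        // Column j of L, rows j..n-1.
        for (std::size_t i = j; i < n; ++i) {
            double sum = 0.0;
            for (std::size_t k = 0; k < j; ++k)
                sum += a[i][k] * a[k][j];
            a[i][j] -= sum;
        }
        if (a[j][j] == 0.0)
            throw std::runtime_error("zero pivot; this sketch does not pivot");
        // Row j of U, columns j+1..n-1.
        for (std::size_t i = j + 1; i < n; ++i) {
            double sum = 0.0;
            for (std::size_t k = 0; k < j; ++k)
                sum += a[j][k] * a[k][i];
            a[j][i] = (a[j][i] - sum) / a[j][j];
        }
    }
}
```

For the matrix with rows (4, 3) and (6, 3), this yields L with rows (4, 0) and (6, -1.5) and U with rows (1, 0.75) and (0, 1), whose product is the original matrix.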
void GClasses::GMatrix::makeIdentity | ( | ) |
Sets this dataset to an identity matrix. (It doesn't change the number of columns or rows. It just stomps over existing values.)
double GClasses::GMatrix::measureInfo | ( | ) | const |
Computes the sum entropy of the data (or the sum variance for continuous attributes).
|
static |
Merges two datasets side-by-side. The resulting dataset will contain the attributes of both datasets. Both pSetA and pSetB (and the resulting dataset) must have the same number of rows.
void GClasses::GMatrix::mergeVert | ( | GMatrix * | pData, |
bool | ignoreMismatchingName = false |
||
) |
Steals all the rows from pData and adds them to this set. (You still have to delete pData.) Both datasets must have the same number of columns.
void GClasses::GMatrix::mirrorTriangle | ( | bool | upperToLower | ) |
Copies one of the triangular submatrices over the other, making a symmetric matrix.
upperToLower | If true, copies the upper triangle of this matrix over the lower triangle. Otherwise, copies the lower triangle of this matrix over the upper triangle. |
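A minimal sketch of the mirroring operation on a plain square matrix follows. The container type and the free-function form are assumptions for illustration only.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Makes a square matrix symmetric by copying one triangle over the other.
// The diagonal is left untouched.
void mirrorTriangle(Matrix& m, bool upperToLower)
{
    const std::size_t n = m.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            if (upperToLower)
                m[j][i] = m[i][j]; // upper element overwrites its mirror
            else
                m[i][j] = m[j][i]; // lower element overwrites its mirror
        }
    }
}
```

This is handy after an algorithm that fills only one triangle of a symmetric result, such as a covariance computation.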
void GClasses::GMatrix::multiply | ( | double | scalar | ) |
Multiplies every element in the dataset by scalar. Behavior is undefined for nominal columns.
void GClasses::GMatrix::multiply | ( | const GVec & | vectorIn, |
GVec & | vectorOut, | ||
bool | transpose = false |
||
) | const |
Multiplies this matrix by the column vector vectorIn to get vectorOut.
(If transpose is true, then it multiplies the transpose of this matrix by vectorIn to get vectorOut.)
vectorIn should have the same number of elements as columns (or rows, if transpose is true).
vectorOut should have the same number of elements as rows (or columns, if transpose is true).
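The size rules above follow from the shape of the product, as the stand-alone sketch below shows. It returns the output vector instead of filling a GVec, and the names Matrix and multiplyVec are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Computes m*in, or (m^T)*in when transpose is true, without ever
// materializing the transposed matrix.
std::vector<double> multiplyVec(const Matrix& m, const std::vector<double>& in,
                                bool transpose = false)
{
    const std::size_t rows = m.size();
    const std::size_t cols = rows ? m[0].size() : 0;
    std::vector<double> out(transpose ? cols : rows, 0.0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            if (transpose)
                out[c] += m[r][c] * in[r]; // in has one element per row
            else
                out[r] += m[r][c] * in[c]; // in has one element per column
        }
    }
    return out;
}
```

For a 2-by-3 matrix, the plain product takes 3 inputs and yields 2 outputs, while the transposed product takes 2 inputs and yields 3.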
|
static |
Matrix multiply.
For convenience, you can also specify that neither, one, or both of the inputs are virtually transposed prior to the multiplication. (If you want the results to come out transposed, you can use the equality (AB)^T=(B^T)(A^T) to figure out how to specify the parameters.)
void GClasses::GMatrix::newColumns | ( | size_t | n | ) |
Adds 'n' new columns to the matrix. (This resizes every row and copies all the existing data, which is rather inefficient.) The values in the new columns are not initialized.
GVec& GClasses::GMatrix::newRow | ( | ) |
Adds a new row to the matrix. (The values in the row are not initialized.) Returns a reference to the new row.
void GClasses::GMatrix::newRows | ( | size_t | nRows | ) |
Adds "nRows" uninitialized rows to this matrix.
void GClasses::GMatrix::normalizeColumn | ( | size_t | col, |
double | dInMin, | ||
double | dInMax, | ||
double | dOutMin = 0.0 , |
||
double | dOutMax = 1.0 |
||
) |
Normalizes the specified column.
|
static |
Normalize a value from the input min and max to the output min and max.
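Mapping a value from one range to another is a linear rescale. The sketch below assumes the same parameter order and defaults as normalizeColumn, since the static method's signature is not shown here; normalizeValue is a hypothetical name.

```cpp
// Linearly maps v from [inMin, inMax] to [outMin, outMax].
// Assumption: inMin != inMax; no clamping is applied to out-of-range inputs.
double normalizeValue(double v, double inMin, double inMax,
                      double outMin = 0.0, double outMax = 1.0)
{
    return (v - inMin) / (inMax - inMin) * (outMax - outMin) + outMin;
}
```

For example, 5 in the range 0 to 10 maps to 0.5 with the default output range, and to 0 when the output range is -1 to 1.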
bool GClasses::GMatrix::operator== | ( | const GMatrix & | other | ) | const |
Returns true iff all the entries in *this and other are identical and their relations are compatible, and they are the same size.
|
inline |
Returns a pointer to the specified row.
|
inline |
Returns a const pointer to the specified row.
void GClasses::GMatrix::pairedTTest | ( | size_t * | pOutV, |
double * | pOutT, | ||
size_t | attr1, | ||
size_t | attr2, | ||
bool | normalize | ||
) | const |
Performs a paired T-Test with data from the two specified attributes.
pOutV will hold the degrees of freedom. pOutT will hold the T-value. You can use GMath::tTestAlphaValue to convert these to a P-value.
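For orientation, the textbook paired t-test computes the two outputs as sketched below. This is the standard formula, not necessarily what the library does; in particular, the effect of the normalize parameter is not described here and is omitted. The names TTestResult and pairedT are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct TTestResult {
    std::size_t v; // degrees of freedom
    double t;      // t-value
};

// Textbook paired t-test on two equal-length samples.
// If all differences are identical, the standard deviation is zero and
// t is not finite.
TTestResult pairedT(const std::vector<double>& a, const std::vector<double>& b)
{
    const std::size_t n = a.size();
    if (n < 2 || b.size() != n)
        throw std::invalid_argument("need two equal-length samples of size >= 2");
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        mean += a[i] - b[i];
    mean /= static_cast<double>(n);
    double sumSquares = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = a[i] - b[i] - mean;
        sumSquares += d * d;
    }
    const double sd = std::sqrt(sumSquares / static_cast<double>(n - 1));
    return {n - 1, mean / (sd / std::sqrt(static_cast<double>(n)))};
}
```

The pair (v, t) is what a routine such as GMath::tTestAlphaValue would then turn into a P-value.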
void GClasses::GMatrix::parseArff | ( | const char * | szFile, |
size_t | nLen | ||
) |
Parses an ARFF file and replaces the contents of this matrix with it.
void GClasses::GMatrix::parseArff | ( | GArffTokenizer & | tok | ) |
Parses an ARFF file and replaces the contents of this matrix with it.
void GClasses::GMatrix::principalComponent | ( | GVec & | outVector, |
const GVec & | centroid, | ||
GRand * | pRand | ||
) | const |
This is an efficient algorithm for iteratively computing the principal component vector (the eigenvector of the covariance matrix) of the data.
See "EM Algorithms for PCA and SPCA" by Sam Roweis, 1998 NIPS.
The size of outVector will be the number of columns in this matrix. (To compute the next principal component, call RemoveComponent, then call this method again.)
Computes the first principal component assuming the mean is already subtracted out of the data.
void GClasses::GMatrix::principalComponentIgnoreUnknowns | ( | GVec & | outVector, |
const GVec & | centroid, | ||
GRand * | pRand | ||
) | const |
Computes principal components, while ignoring missing values.
void GClasses::GMatrix::print | ( | std::ostream & | stream = std::cout , |
char | separator = ',' |
||
) | const |
Prints this matrix in ARFF format to the specified stream.
GMatrix* GClasses::GMatrix::pseudoInverse | ( | ) |
Computes the Moore-Penrose pseudoinverse of this matrix (using the SVD method). You are responsible for deleting the matrix this returns.
|
inline |
Returns a const pointer to the relation object, which holds meta-data about the attributes (columns).
void GClasses::GMatrix::releaseAllRows | ( | ) |
Abandons (leaks) all the rows in this matrix.
GVec* GClasses::GMatrix::releaseRow | ( | size_t | index | ) |
Swaps the specified row with the last row, and then releases it from the dataset.
The caller is responsible to delete the row (array of doubles) this method returns.
GVec* GClasses::GMatrix::releaseRowPreserveOrder | ( | size_t | index | ) |
Releases the specified row from the dataset and shifts everything after it up one slot.
The caller is responsible to delete the row this method returns.
Removes the component specified by pComponent from the data. (pComponent should already be normalized.)
This might be useful, for example, to remove the first principal component from the data so you can then proceed to compute the second principal component, and so forth.
void GClasses::GMatrix::removeComponentAboutOrigin | ( | const GVec & | component | ) |
Removes the specified component assuming the mean is zero.
void GClasses::GMatrix::replaceMissingValuesRandomly | ( | size_t | nAttr, |
GRand * | pRand | ||
) |
Replaces all missing values by copying a randomly selected non-missing value in the same attribute.
void GClasses::GMatrix::replaceMissingValuesWithBaseline | ( | size_t | nAttr | ) |
Replace missing values with the appropriate measure of central tendency.
If the specified attribute is continuous, replaces all missing values in that attribute with the mean. If the specified attribute is nominal, replaces all missing values in that attribute with the most common value.
|
inline |
Allocates space for the specified number of patterns (to avoid superfluous resizing).
void GClasses::GMatrix::resize | ( | size_t | rows, |
size_t | cols | ||
) |
Resizes this matrix. Assigns all columns to be continuous, and replaces all element values with garbage.
void GClasses::GMatrix::reverseRows | ( | ) |
Reverses the row order.
|
inline |
Returns a pointer to the specified row.
|
inline |
Returns a const pointer to the specified row.
|
inline |
Returns the number of rows in the dataset.
void GClasses::GMatrix::saveArff | ( | const char * | szFilename | ) |
Saves the dataset to a file in ARFF format.
void GClasses::GMatrix::saveRaw | ( | const char * | szFilename | ) |
Saves the dataset to a file in raw (binary) format.
void GClasses::GMatrix::scaleColumn | ( | size_t | col, |
double | scalar | ||
) |
Scales the column by the specified scalar.
Marshalls this object to a DOM, which may be saved to a variety of serial formats.
void GClasses::GMatrix::setCol | ( | size_t | index, |
const double * | pVector | ||
) |
Copies pVector over the specified column.
void GClasses::GMatrix::setRelation | ( | GRelation * | pRelation | ) |
Sets the relation for this dataset, which specifies the number of columns, and their data types. If there are one or more rows in this matrix, and the new relation does not have the same number of columns as the old relation, then this will throw an exception. Takes ownership of pRelation. That is, the destructor will delete it.
Randomizes the order of the rows.
If pExtension is non-NULL, then it will also be shuffled such that corresponding rows are preserved.
Shuffles the order of the rows. Also shuffles the rows in "other" in the same way, such that corresponding rows are preserved.
void GClasses::GMatrix::shuffleLikeCards | ( | ) |
This is an inferior way to shuffle the data.
void GClasses::GMatrix::singularValueDecomposition | ( | GMatrix ** | ppU, |
double ** | ppDiag, | ||
GMatrix ** | ppV, | ||
bool | throwIfNoConverge = false , |
||
size_t | maxIters = 80 |
||
) |
Performs SVD on A, where A is this m-by-n matrix.
You are responsible to delete(*ppU), delete(*ppV), and delete[] *ppDiag.
ppU | *ppU will be set to an m-by-m matrix where the columns are the eigenvectors of A(A^T). |
ppDiag | *ppDiag will be set to an array of n doubles holding the square roots of the corresponding eigenvalues. |
ppV | *ppV will be set to an n-by-n matrix where the rows are the eigenvectors of (A^T)A. |
throwIfNoConverge | If true, throws an exception if the SVD solver does not converge. Does nothing otherwise. |
maxIters | the maximum number of iterations to perform in the SVD solver |
void GClasses::GMatrix::sort | ( | size_t | nDimension | ) |
Sorts the data from smallest to largest in the specified dimension.
|
inline |
Sorts rows according to the specified compare function. (Return true to indicate that the first row comes before the second row.)
void GClasses::GMatrix::sortPartial | ( | size_t | row, |
size_t | col | ||
) |
This partially sorts the specified column, such that the specified row will contain the same row as if it were fully sorted, and previous rows will contain a value <= to it in that column, and later rows will contain a value >= to it in that column. Unlike sort, which has O(m*log(m)) complexity, this method has O(m) complexity. This might be useful, for example, for efficiently finding the row with a median value in some attribute, or for separating data by a threshold in some value.
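The postcondition described above is the same one that std::nth_element guarantees, so a stand-alone illustration can lean on the standard library. This is a substitute technique for demonstration, not a claim about how sortPartial is implemented; the names Rows and partialSortRows are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Rows = std::vector<std::vector<double>>;

// Rearranges rows so that position row holds the row a full sort by column
// col would place there, with smaller-or-equal values before it and
// greater-or-equal values after it. Average cost is linear in the row count.
void partialSortRows(Rows& rows, std::size_t row, std::size_t col)
{
    std::nth_element(
        rows.begin(), rows.begin() + static_cast<std::ptrdiff_t>(row), rows.end(),
        [col](const std::vector<double>& a, const std::vector<double>& b) {
            return a[col] < b[col];
        });
}
```

Calling it with row set to half the row count places a median row at that position, which is the use case mentioned above.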
void GClasses::GMatrix::splitByPivot | ( | GMatrix * | pGreaterOrEqual, |
size_t | nAttribute, | ||
double | dPivot, | ||
GMatrix * | pExtensionA = NULL , |
||
GMatrix * | pExtensionB = NULL |
||
) |
Splits this set of data into two sets. Values greater-than-or-equal-to dPivot stay in this data set. Values less than dPivot go into pLessThanPivot.
If pExtensionA is non-NULL, then it will also split pExtensionA such that corresponding rows are preserved.
void GClasses::GMatrix::splitBySize | ( | GMatrix & | other, |
size_t | nOtherRows | ||
) |
Removes the last nOtherRows rows from this data set and puts them in "other". (Order is preserved.)
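A sketch of the same split on a plain vector of rows is shown below. It assumes the moved rows are appended to whatever "other" already holds, which this page does not specify; the names Rows and splitTail are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <vector>

using Rows = std::vector<std::vector<double>>;

// Moves the last n rows of self to the end of other, keeping their order.
void splitTail(Rows& self, Rows& other, std::size_t n)
{
    if (n > self.size())
        throw std::out_of_range("not enough rows to split off");
    auto first = self.end() - static_cast<std::ptrdiff_t>(n);
    other.insert(other.end(), std::make_move_iterator(first),
                 std::make_move_iterator(self.end()));
    self.erase(first, self.end());
}
```

A typical use is carving a held-out test set off the end of a shuffled training set.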
void GClasses::GMatrix::splitCategoricalKeepIfEqual | ( | GMatrix * | pOtherValues, |
size_t | nAttr, | ||
int | nValue, | ||
GMatrix * | pExtensionA = NULL , |
||
GMatrix * | pExtensionB = NULL |
||
) |
Keeps the rows that have the specified value in the specified attribute, and moves all other rows into pOtherValues.
If pExtensionA is non-NULL, then it will also split pExtensionA such that corresponding rows are preserved.
void GClasses::GMatrix::splitCategoricalKeepIfNotEqual | ( | GMatrix * | pSingleClass, |
size_t | nAttr, | ||
int | nValue, | ||
GMatrix * | pExtensionA = NULL , |
||
GMatrix * | pExtensionB = NULL |
||
) |
Moves all rows with the specified value in the specified attribute into pSingleClass.
If pExtensionA is non-NULL, then it will also split pExtensionA such that corresponding rows are preserved.
void GClasses::GMatrix::subtract | ( | const GMatrix * | pThat, |
bool | transpose | ||
) |
Matrix subtract. Subtracts the values in *pThat from *this.
(If transpose is true, subtracts the transpose of *pThat from this.) Both datasets must have the same dimensions. Behavior is undefined for nominal columns.
pThat | pointer to the matrix to subtract from *this |
transpose | If true, the transpose of *pThat is subtracted. If false, *pThat is subtracted |
double GClasses::GMatrix::sumSquaredDifference | ( | const GMatrix & | that, |
bool | transpose = false |
||
) | const |
Computes the squared distance between this and that.
If transpose is true, computes the difference between this and the transpose of that.
double GClasses::GMatrix::sumSquaredDiffWithIdentity | ( | ) |
Returns the sum squared difference between this matrix and an identity matrix.
double GClasses::GMatrix::sumSquaredDistance | ( | const GVec & | point | ) | const |
Computes the sum-squared distance between pPoint and all of the points in the dataset.
If pPoint is NULL, it computes the sum-squared distance with the origin.
void GClasses::GMatrix::swapColumns | ( | size_t | nAttr1, |
size_t | nAttr2 | ||
) |
Swaps two columns.
Swap pNewRow in for row i, and return row i. The caller is then responsible to delete the row that is returned.
void GClasses::GMatrix::swapRows | ( | size_t | a, |
size_t | b | ||
) |
Swaps the two specified rows.
void GClasses::GMatrix::takeRow | ( | GVec * | pRow, |
size_t | pos = (size_t)-1 |
||
) |
Adds an already-allocated row to this dataset. If pos is specified, the new row will be inserted at the specified position.
|
static |
Performs unit tests for this class. Throws an exception if there is a failure.
size_t GClasses::GMatrix::toReducedRowEchelonForm | ( | ) |
Converts the matrix to reduced row echelon form.
void GClasses::GMatrix::toVector | ( | double * | pVector | ) | const |
double GClasses::GMatrix::trace | ( | ) |
Returns the sum of the diagonal elements.
GMatrix* GClasses::GMatrix::transpose | ( | ) |
Returns a pointer to a new dataset that is this dataset transposed. (All columns in the returned dataset will be continuous.)
The returned matrix must be deleted by the caller.
void GClasses::GMatrix::weightedPrincipalComponent | ( | GVec & | outVector, |
const GVec & | centroid, | ||
const double * | pWeights, | ||
GRand * | pRand | ||
) | const |
Computes the first principal component of the data with each row weighted according to the vector pWeights. (pWeights must have an element for each row.)
void GClasses::GMatrix::wilcoxonSignedRanksTest | ( | size_t | attr1, |
size_t | attr2, | ||
double | tolerance, | ||
int * | pNum, | ||
double * | pWMinus, | ||
double * | pWPlus | ||
) | const |
Performs the Wilcoxon signed ranks test from the two specified attributes.
If two values are closer than tolerance, they are considered to be equal.