GClasses
GClasses::GContentBasedFilter Class Reference

#include <GRecommender.h>

Inherits GClasses::GCollaborativeFilter.

Public Member Functions

 GContentBasedFilter (GArgReader copy)
 General-purpose constructor.
 
virtual ~GContentBasedFilter ()
 Destructor.
 
void clear ()
 Delete all of the learners.
 
std::map< size_t, size_t > getItemMap ()
 
std::map< size_t, size_t > getUserMap ()
 
std::multimap< size_t, size_t > getUserRatings ()
 
virtual void impute (GVec &vec, size_t dims)
 See the comment for GCollaborativeFilter::impute.
 
virtual double predict (size_t user, size_t item)
 Returns a prediction of how the specified user will rate the specified item. The model must be trained before this method is called. Some ratings for that user and that item should have been included in the training set; otherwise this method has no basis for a good prediction.
 
virtual GDomNode * serialize (GDom *pDoc) const
 See the comment for GCollaborativeFilter::serialize.
 
void setItemAttributes (GMatrix &itemAttrs)
 
virtual void train (GMatrix &data)
 Trains this recommender system. Let R be an m-by-n sparse matrix of known ratings from m users of n items. data should contain 3 columns and one row for each known element in R. Column 0 specifies the user index from 0 to m-1, column 1 specifies the item index from 0 to n-1, and column 2 specifies the rating for that user-item pair. All attributes in data should be continuous.
 
- Public Member Functions inherited from GClasses::GCollaborativeFilter
 GCollaborativeFilter ()
 
 GCollaborativeFilter (const GDomNode *pNode, GLearnerLoader &ll)
 
virtual ~GCollaborativeFilter ()
 
void basicTest (double minMSE)
 Performs a basic unit test on this collaborative filter.
 
double crossValidate (GMatrix &data, size_t folds, double *pOutMAE=NULL)
 This randomly assigns each rating to one of the folds. Then, for each fold, it calls train with a dataset that contains everything except for the ratings in that fold. It predicts values for the items in the fold, and returns the mean-squared difference between the predictions and the actual ratings. If pOutMAE is non-NULL, it will be set to the mean-absolute error.
 
GMatrix * precisionRecall (GMatrix &data, bool ideal=false)
 This divides the data into two equal-size parts. It trains on one part, and then measures the precision/recall using the other part. It returns a three-column data set with recall scores in column 0 and corresponding precision scores in column 1. The false-positive rate is in column 2. (So, if you want a precision-recall plot, just drop column 2. If you want an ROC curve, drop column 1 and swap the remaining two columns.) This method assumes the ratings range from 0 to 1, so be sure to scale the ratings to fit that range before calling this method. If ideal is true, then it will ignore your model and report the ideal results as if your model always predicted the correct rating. (This is useful because it shows the best possible results.)
 
GRand & rand ()
 Returns a reference to the pseudo-random number generator associated with this object.
 
double trainAndTest (GMatrix &train, GMatrix &test, double *pOutMAE=NULL)
 This trains on the training set, and then tests on the test set. Returns the mean-squared difference between the predictions and the actual ratings in the test set.
 
void trainDenseMatrix (const GMatrix &data, const GMatrix *pLabels=NULL)
 Train from an m-by-n dense matrix, where m is the number of users and n is the number of items. All attributes must be continuous. Missing values are indicated with UNKNOWN_REAL_VALUE. If pLabels is non-NULL, then the labels will be appended as additional items.
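
The evaluation methods listed above (crossValidate, trainAndTest) report a mean-squared error and, optionally, a mean-absolute error. The following is a minimal, self-contained sketch of the fold procedure that the crossValidate comment describes. It does not use the GClasses API: Rating, MeanPredictor, Errors, and crossValidateSketch are hypothetical names introduced for illustration, and the predictor is a trivial stand-in that always returns the mean training rating.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// One known rating: (user index, item index, rating value).
struct Rating {
    std::size_t user;
    std::size_t item;
    double value;
};

// Hypothetical stand-in for a recommender: predicts the mean training rating.
struct MeanPredictor {
    double mean = 0.0;

    void train(const std::vector<Rating>& data) {
        double sum = 0.0;
        for (const Rating& r : data) sum += r.value;
        mean = data.empty() ? 0.0 : sum / static_cast<double>(data.size());
    }

    double predict(std::size_t /*user*/, std::size_t /*item*/) const {
        return mean;
    }
};

struct Errors {
    double mse;  // mean-squared error over all held-out ratings
    double mae;  // mean-absolute error over all held-out ratings
};

// Randomly assigns each rating to a fold, trains on everything outside the
// fold, and accumulates the prediction error on the ratings inside it.
Errors crossValidateSketch(const std::vector<Rating>& data,
                           std::size_t folds, unsigned seed) {
    assert(folds >= 2 && !data.empty());
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, folds - 1);
    std::vector<std::size_t> foldOf(data.size());
    for (std::size_t& f : foldOf) f = pick(rng);

    double sse = 0.0;  // sum of squared errors
    double sae = 0.0;  // sum of absolute errors
    for (std::size_t f = 0; f < folds; ++f) {
        std::vector<Rating> trainSet;
        for (std::size_t i = 0; i < data.size(); ++i)
            if (foldOf[i] != f) trainSet.push_back(data[i]);

        MeanPredictor model;
        model.train(trainSet);

        for (std::size_t i = 0; i < data.size(); ++i) {
            if (foldOf[i] != f) continue;
            double err = model.predict(data[i].user, data[i].item) - data[i].value;
            sse += err * err;
            sae += std::fabs(err);
        }
    }
    // Every rating is held out exactly once, so divide by the total count.
    double n = static_cast<double>(data.size());
    return {sse / n, sae / n};
}
```

Because each rating belongs to exactly one fold, both errors are averaged over the full set of ratings rather than per fold. Whether the library averages in the same way is not stated on this page.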
 

Protected Attributes

GArgReader m_args
 
int m_init_pos
 
GMatrix * m_itemAttrs
 
std::map< size_t, size_t > m_itemMap
 
size_t m_items
 
std::vector< GSupervisedLearner * > m_learners
 
std::map< size_t, size_t > m_userMap
 
std::multimap< size_t, size_t > m_userRatings
 
size_t m_users
 
- Protected Attributes inherited from GClasses::GCollaborativeFilter
GRand m_rand
 

Additional Inherited Members

- Static Public Member Functions inherited from GClasses::GCollaborativeFilter
static double areaUnderCurve (GMatrix &data)
 Pass in the data returned by the precisionRecall function (unmodified), and this will compute the area under the ROC curve.
 
- Protected Member Functions inherited from GClasses::GCollaborativeFilter
GDomNode * baseDomNode (GDom *pDoc, const char *szClassName) const
 Child classes should use this in their implementation of serialize.
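
The column layout described for precisionRecall (recall in column 0, precision in column 1, false-positive rate in column 2) is enough to form an ROC curve, with the false-positive rate on the x-axis and recall on the y-axis. The sketch below integrates such a curve with the trapezoidal rule. It is an illustration only, not the implementation of areaUnderCurve: PrRow and rocAreaSketch are hypothetical names, and anchoring the curve at (0, 0) and (1, 1) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One row of the three-column layout described for precisionRecall.
struct PrRow {
    double recall;             // column 0
    double precision;          // column 1 (unused for the ROC curve)
    double falsePositiveRate;  // column 2
};

// Trapezoidal area under the ROC curve (x = false-positive rate, y = recall).
// Assumption: the curve is anchored at (0, 0) and closed at (1, 1).
double rocAreaSketch(std::vector<PrRow> rows) {
    for (const PrRow& r : rows) {
        assert(r.recall >= 0.0 && r.recall <= 1.0);
        assert(r.falsePositiveRate >= 0.0 && r.falsePositiveRate <= 1.0);
    }
    // Order the points left to right, breaking ties by recall.
    std::sort(rows.begin(), rows.end(), [](const PrRow& a, const PrRow& b) {
        if (a.falsePositiveRate != b.falsePositiveRate)
            return a.falsePositiveRate < b.falsePositiveRate;
        return a.recall < b.recall;
    });

    double area = 0.0;
    double prevX = 0.0;
    double prevY = 0.0;
    for (const PrRow& r : rows) {
        area += (r.falsePositiveRate - prevX) * (r.recall + prevY) * 0.5;
        prevX = r.falsePositiveRate;
        prevY = r.recall;
    }
    // Close the curve at the top-right corner.
    area += (1.0 - prevX) * (1.0 + prevY) * 0.5;
    return area;
}
```

Under these assumptions, a single point with recall 1 at false-positive rate 0 yields an area of 1, and a single point on the diagonal yields 0.5.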
 

Constructor & Destructor Documentation

GClasses::GContentBasedFilter::GContentBasedFilter ( GArgReader  copy)
inline

General-purpose constructor.

virtual GClasses::GContentBasedFilter::~GContentBasedFilter ( )
virtual

Destructor.

Member Function Documentation

void GClasses::GContentBasedFilter::clear ( )

Delete all of the learners.

std::map<size_t, size_t> GClasses::GContentBasedFilter::getItemMap ( )
inline

std::map<size_t, size_t> GClasses::GContentBasedFilter::getUserMap ( )
inline

std::multimap<size_t, size_t> GClasses::GContentBasedFilter::getUserRatings ( )
inline

virtual void GClasses::GContentBasedFilter::impute ( GVec &  vec,
size_t  dims 
)
virtual

See the comment for GCollaborativeFilter::impute.

virtual double GClasses::GContentBasedFilter::predict ( size_t  user,
size_t  item 
)
virtual

Returns a prediction of how the specified user will rate the specified item. The model must be trained before this method is called. Some ratings for that user and that item should have been included in the training set; otherwise this method has no basis for a good prediction.

Implements GClasses::GCollaborativeFilter.

virtual GDomNode* GClasses::GContentBasedFilter::serialize ( GDom *  pDoc) const
virtual

See the comment for GCollaborativeFilter::serialize.

void GClasses::GContentBasedFilter::setItemAttributes ( GMatrix &  itemAttrs)

virtual void GClasses::GContentBasedFilter::train ( GMatrix &  data)
virtual

Trains this recommender system. Let R be an m-by-n sparse matrix of known ratings from m users of n items. data should contain 3 columns and one row for each known element in R. Column 0 specifies the user index from 0 to m-1, column 1 specifies the item index from 0 to n-1, and column 2 specifies the rating for that user-item pair. All attributes in data should be continuous.

Implements GClasses::GCollaborativeFilter.
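
The three-column layout that train expects can be illustrated by flattening a small ratings matrix into (user, item, rating) rows. The sketch below is self-contained and does not use GMatrix. The name toTriples is hypothetical, and NaN is used here as a stand-in for a missing rating; the library itself refers to UNKNOWN_REAL_VALUE for missing entries in dense matrices.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: NaN marks a missing rating in this sketch only.
const double kMissing = std::numeric_limits<double>::quiet_NaN();

// Converts an m-by-n ratings matrix R into one row per known rating:
// column 0 = user index, column 1 = item index, column 2 = rating.
std::vector<std::array<double, 3>> toTriples(
        const std::vector<std::vector<double>>& R) {
    std::vector<std::array<double, 3>> rows;
    for (std::size_t user = 0; user < R.size(); ++user) {
        for (std::size_t item = 0; item < R[user].size(); ++item) {
            double rating = R[user][item];
            if (std::isnan(rating)) continue;  // skip unknown elements of R
            std::array<double, 3> row = {static_cast<double>(user),
                                         static_cast<double>(item),
                                         rating};
            rows.push_back(row);
        }
    }
    return rows;
}
```

For a 2-by-2 matrix with known ratings only on the diagonal, the result has two rows, one per known element, which matches the "one row for each known element in R" requirement.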

Member Data Documentation

GArgReader GClasses::GContentBasedFilter::m_args
protected

int GClasses::GContentBasedFilter::m_init_pos
protected

GMatrix* GClasses::GContentBasedFilter::m_itemAttrs
protected

std::map<size_t, size_t> GClasses::GContentBasedFilter::m_itemMap
protected

size_t GClasses::GContentBasedFilter::m_items
protected

std::vector<GSupervisedLearner*> GClasses::GContentBasedFilter::m_learners
protected

std::map<size_t, size_t> GClasses::GContentBasedFilter::m_userMap
protected

std::multimap<size_t, size_t> GClasses::GContentBasedFilter::m_userRatings
protected

size_t GClasses::GContentBasedFilter::m_users
protected