Detailed Description

This is the base class of supervised learning algorithms, which may or may not have an internal model that lets them generalize to rows that were not available at training time. Note that the literature typically refers to supervised learning algorithms that can't generalize (because they lack an internal hypothesis model) as "semi-supervised" or transductive. You cannot generalize with a semi-supervised algorithm; you have to train again with the new rows.

#include <GLearner.h>

Public Member Functions
	GTransducer ()
	General-purpose constructor.

	GTransducer (const GTransducer &that)
	Copy-constructor. Throws an exception to prevent models from being copied by value.

virtual	~GTransducer ()

virtual bool	canGeneralize ()
	Returns false because semi-supervised learners have no internal model, so they can't evaluate previously unseen rows.

virtual bool	canImplicitlyHandleContinuousFeatures ()
	Returns true iff this algorithm can implicitly handle continuous features. If it cannot, then the GDiscretize transform will be used to convert continuous features to nominal values before passing them to it.

virtual bool	canImplicitlyHandleContinuousLabels ()
	Returns true iff this algorithm can implicitly handle continuous labels (a.k.a. regression). If it cannot, then the GDiscretize transform will be used during training to convert continuous labels to nominal values, and to convert nominal predictions back to continuous labels.

virtual bool	canImplicitlyHandleMissingFeatures ()
	Returns true iff this algorithm supports missing feature values. If it cannot, then an imputation filter will be used to predict missing values before any feature-vectors are passed to the algorithm.

virtual bool	canImplicitlyHandleNominalFeatures ()
	Returns true iff this algorithm can implicitly handle nominal features. If it cannot, then the GNominalToCat transform will be used to convert nominal features to continuous values before passing them to it.

virtual bool	canImplicitlyHandleNominalLabels ()
	Returns true iff this algorithm can implicitly handle nominal labels (a.k.a. classification). If it cannot, then the GNominalToCat transform will be used during training to convert nominal labels to continuous values, and to convert categorical predictions back to nominal labels.

virtual bool	canTrainIncrementally ()
	Returns false because semi-supervised learners cannot be trained incrementally.

double	crossValidate (const GMatrix &features, const GMatrix &labels, size_t nFolds, double *pOutSAE=NULL, RepValidateCallback pCB=NULL, size_t nRep=0, void *pThis=NULL)
	Performs n-fold cross-validation on the given features and labels, using trainAndTest for each fold, and returns the sum-squared error. pCB is an optional callback for reporting intermediate stats; pass NULL if you don't want intermediate reporting. nRep is the rep number that will be passed to the callback. pThis is a pointer that will be passed to the callback for you to use however you want; it doesn't affect this method. If pOutSAE is not NULL, the sum absolute error will be placed there.

GTransducer &	operator= (const GTransducer &other)
	Throws an exception to prevent models from being copied by value.

GRand &	rand ()
	Returns a reference to the random number generator associated with this object. For example, you could use it to change the random seed, to make this algorithm behave differently. This might be important, for example, in an ensemble of learners.

double	repValidate (const GMatrix &features, const GMatrix &labels, size_t reps, size_t nFolds, double *pOutSAE=NULL, RepValidateCallback pCB=NULL, void *pThis=NULL)
	Performs cross-validation "reps" times and returns the average score. pCB is an optional callback for reporting intermediate stats; pass NULL if you don't want intermediate reporting. pThis is a pointer that will be passed to the callback for you to use however you want; it doesn't affect this method. If pOutSAE is not NULL, the sum absolute error will be placed there.

virtual bool	supportedFeatureRange (double *pOutMin, double *pOutMax)
	Returns true if this algorithm supports any feature value, or if it does not implicitly handle continuous features. If a limited range of continuous values is supported, returns false and sets pOutMin and pOutMax to specify the range.

virtual bool	supportedLabelRange (double *pOutMin, double *pOutMax)
	Returns true if this algorithm supports any label value, or if it does not implicitly handle continuous labels. If a limited range of continuous values is supported, returns false and sets pOutMin and pOutMax to specify the range.

virtual double	trainAndTest (const GMatrix &trainFeatures, const GMatrix &trainLabels, const GMatrix &testFeatures, const GMatrix &testLabels, double *pOutSAE=NULL)
	Trains and tests this learner. Returns the sum-squared error. If pOutSAE is not NULL, the sum absolute error will be placed there.

std::unique_ptr< GMatrix >	transduce (const GMatrix &features1, const GMatrix &labels1, const GMatrix &features2)
	Predicts a set of labels to correspond with features2, such that these labels will be consistent with the patterns exhibited by features1 and labels1.

void	transductiveConfusionMatrix (const GMatrix &trainFeatures, const GMatrix &trainLabels, const GMatrix &testFeatures, const GMatrix &testLabels, std::vector< GMatrix * > &stats)
	Makes a confusion matrix for a transduction algorithm.

Protected Member Functions
virtual std::unique_ptr< GMatrix >	transduceInner (const GMatrix &features1, const GMatrix &labels1, const GMatrix &features2)=0
	This is the algorithm's implementation of transduction. (It is called by the transduce method.)

Protected Attributes
GRand	m_rand

Constructor & Destructor Documentation

GClasses::GTransducer::GTransducer ( )

General-purpose constructor.

GClasses::GTransducer::GTransducer ( const GTransducer & that )

inline

Copy-constructor. Throws an exception to prevent models from being copied by value.

virtual GClasses::GTransducer::~GTransducer ( )

virtual

Member Function Documentation

virtual bool GClasses::GTransducer::canGeneralize ( )

inline virtual

Returns false because semi-supervised learners have no internal model, so they can't evaluate previously unseen rows.

Reimplemented in GClasses::GSupervisedLearner.

virtual bool GClasses::GTransducer::canImplicitlyHandleContinuousFeatures ( )

inline virtual

Returns true iff this algorithm can implicitly handle continuous features. If it cannot, then the GDiscretize transform will be used to convert continuous features to nominal values before passing them to it.

Reimplemented in GClasses::GNaiveBayes.

virtual bool GClasses::GTransducer::canImplicitlyHandleContinuousLabels ( )

inline virtual

Returns true iff this algorithm can implicitly handle continuous labels (a.k.a. regression). If it cannot, then the GDiscretize transform will be used during training to convert continuous labels to nominal values, and to convert nominal predictions back to continuous labels.

Reimplemented in GClasses::GResamplingAdaBoost, GClasses::GGraphCutTransducer, GClasses::GBayesianModelCombination, GClasses::GBayesianModelAveraging, GClasses::GNeighborTransducer, GClasses::GBomb, GClasses::GAgglomerativeTransducer, and GClasses::GNaiveBayes.

virtual bool GClasses::GTransducer::canImplicitlyHandleMissingFeatures ( )

inline virtual

Returns true iff this algorithm supports missing feature values. If it cannot, then an imputation filter will be used to predict missing values before any feature-vectors are passed to the algorithm.

Reimplemented in GClasses::GReservoirNet, GClasses::GNeuralNetLearner, GClasses::GKNN, GClasses::GLinearDistribution, GClasses::GGaussianProcess, and GClasses::GLinearRegressor.

virtual bool GClasses::GTransducer::canImplicitlyHandleNominalFeatures ( )

inline virtual

Returns true iff this algorithm can implicitly handle nominal features. If it cannot, then the GNominalToCat transform will be used to convert nominal features to continuous values before passing them to it.

Reimplemented in GClasses::GReservoirNet, GClasses::GNeuralNetLearner, GClasses::GWag, GClasses::GSparseInstance, GClasses::GInstanceTable, GClasses::GNeighborTransducer, GClasses::GMeanMarginsTree, GClasses::GLinearDistribution, GClasses::GGaussianProcess, GClasses::GNaiveInstance, GClasses::GLinearRegressor, and GClasses::GPolynomial.
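
The feature-related capability queries described above can be read as a small decision table: each capability the learner lacks calls for one preprocessing step. The following is a minimal standalone sketch of that reading, not the library's own logic. The FeatureCapabilities and FeatureTraits structs and the requiredFeatureFilters function are hypothetical names, and the order in which the steps are listed is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical summary of the three feature-related capability queries.
struct FeatureCapabilities
{
	bool continuous;  // canImplicitlyHandleContinuousFeatures()
	bool nominal;     // canImplicitlyHandleNominalFeatures()
	bool missing;     // canImplicitlyHandleMissingFeatures()
};

// Hypothetical summary of what the feature columns actually contain.
struct FeatureTraits
{
	bool hasContinuous;
	bool hasNominal;
	bool hasMissing;
};

// Lists the preprocessing steps that the descriptions call for.
// The order of the steps is an assumption of this sketch.
std::vector<std::string> requiredFeatureFilters(const FeatureCapabilities& can,
                                                const FeatureTraits& data)
{
	// A learner that handles neither kind of feature could never be fed.
	assert(can.continuous || can.nominal);
	std::vector<std::string> filters;
	if(data.hasMissing && !can.missing)
		filters.push_back("imputation");
	if(data.hasContinuous && !can.continuous)
		filters.push_back("GDiscretize");
	if(data.hasNominal && !can.nominal)
		filters.push_back("GNominalToCat");
	return filters;
}
```

For a learner that handles only continuous features, data with nominal columns and missing values would yield the imputation step followed by GNominalToCat.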

virtual bool GClasses::GTransducer::canImplicitlyHandleNominalLabels ( )

inline virtual

Returns true iff this algorithm can implicitly handle nominal labels (a.k.a. classification). If it cannot, then the GNominalToCat transform will be used during training to convert nominal labels to continuous values, and to convert categorical predictions back to nominal labels.

Reimplemented in GClasses::GReservoirNet, GClasses::GNeuralNetLearner, GClasses::GWag, GClasses::GSparseInstance, GClasses::GMeanMarginsTree, GClasses::GLinearDistribution, GClasses::GGaussianProcess, GClasses::GNaiveInstance, GClasses::GLinearRegressor, and GClasses::GPolynomial.

virtual bool GClasses::GTransducer::canTrainIncrementally ( )

inline virtual

Returns false because semi-supervised learners cannot be trained incrementally.

Reimplemented in GClasses::GFilter, and GClasses::GIncrementalLearner.

double GClasses::GTransducer::crossValidate ( const GMatrix & features, const GMatrix & labels, size_t nFolds, double * pOutSAE = NULL, RepValidateCallback pCB = NULL, size_t nRep = 0, void * pThis = NULL )

Performs n-fold cross-validation on the given features and labels, using trainAndTest for each fold, and returns the sum-squared error. pCB is an optional callback for reporting intermediate stats; pass NULL if you don't want intermediate reporting. nRep is the rep number that will be passed to the callback. pThis is a pointer that will be passed to the callback for you to use however you want; it doesn't affect this method. If pOutSAE is not NULL, the sum absolute error will be placed there.
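
To make the fold loop concrete, here is a minimal standalone sketch of n-fold cross-validation that sums the per-fold error returned by a train-and-test step. It does not use GMatrix or the callback parameters. Rows, TrainAndTest, and crossValidateSketch are hypothetical names, and the rule that row i is held out in fold i % nFolds is an assumption; the library may split the rows differently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for a feature or label table: one value per row (hypothetical).
using Rows = std::vector<double>;

// Stand-in for trainAndTest: trains on the first pair, tests on the second,
// and returns the sum-squared error over the test rows.
using TrainAndTest = std::function<double(const Rows& trainX, const Rows& trainY,
                                          const Rows& testX, const Rows& testY)>;

// Sums the test error over nFolds folds. Row i is held out in fold
// i % nFolds; this assignment is an assumption made for the sketch.
double crossValidateSketch(const Rows& x, const Rows& y, std::size_t nFolds,
                           const TrainAndTest& trainAndTest)
{
	assert(x.size() == y.size());
	assert(nFolds >= 2 && nFolds <= x.size());
	double sse = 0.0;
	for(std::size_t fold = 0; fold < nFolds; fold++)
	{
		Rows trainX, trainY, testX, testY;
		for(std::size_t i = 0; i < x.size(); i++)
		{
			bool held = (i % nFolds == fold);
			(held ? testX : trainX).push_back(x[i]);
			(held ? testY : trainY).push_back(y[i]);
		}
		sse += trainAndTest(trainX, trainY, testX, testY);
	}
	return sse;
}
```

Every row is tested exactly once, so the returned total is comparable across different values of nFolds on the same data.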

GTransducer& GClasses::GTransducer::operator= ( const GTransducer & other )

inline

Throws an exception to prevent models from being copied by value.

GRand& GClasses::GTransducer::rand ( )

inline

Returns a reference to the random number generator associated with this object. For example, you could use it to change the random seed, to make this algorithm behave differently. This might be important, for example, in an ensemble of learners.
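
The ensemble case mentioned above might look like the following standalone sketch, in which std::mt19937 stands in for GRand. SeededLearnerSketch and seedEnsemble are hypothetical names, and the seeding scheme (base seed plus member index) is only one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical learner that owns its generator, with std::mt19937 standing
// in for GRand. rand() exposes it by reference so callers can reseed it.
class SeededLearnerSketch
{
public:
	std::mt19937& rand() { return m_rand; }
private:
	std::mt19937 m_rand;
};

// Gives each ensemble member its own seed, so that members which draw
// random numbers during training do not all behave identically.
void seedEnsemble(std::vector<SeededLearnerSketch>& members, unsigned int baseSeed)
{
	assert(!members.empty());
	for(std::size_t i = 0; i < members.size(); i++)
		members[i].rand().seed(baseSeed + static_cast<unsigned int>(i));
}
```

Because the accessor returns a reference, reseeding changes the generator that the learner itself uses, not a copy.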

double GClasses::GTransducer::repValidate ( const GMatrix & features, const GMatrix & labels, size_t reps, size_t nFolds, double * pOutSAE = NULL, RepValidateCallback pCB = NULL, void * pThis = NULL )

Performs cross-validation "reps" times and returns the average score. pCB is an optional callback for reporting intermediate stats; pass NULL if you don't want intermediate reporting. pThis is a pointer that will be passed to the callback for you to use however you want; it doesn't affect this method. If pOutSAE is not NULL, the sum absolute error will be placed there.
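
The averaging step can be sketched in isolation as follows. This is a minimal illustration, not the library's code: repValidateSketch is a hypothetical name, and runOnce is a hypothetical stand-in for one complete cross-validation run that receives the repetition index.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Averages the score of `reps` cross-validation runs. `runOnce` is a
// hypothetical stand-in for one full n-fold run; it receives the
// repetition index, for example to vary how the rows are ordered.
double repValidateSketch(std::size_t reps,
                         const std::function<double(std::size_t rep)>& runOnce)
{
	assert(reps > 0);
	double total = 0.0;
	for(std::size_t rep = 0; rep < reps; rep++)
		total += runOnce(rep);
	return total / static_cast<double>(reps);
}
```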

virtual bool GClasses::GTransducer::supportedFeatureRange ( double * pOutMin, double * pOutMax )

inline virtual

Returns true if this algorithm supports any feature value, or if it does not implicitly handle continuous features. If a limited range of continuous values is supported, returns false and sets pOutMin and pOutMax to specify the range.
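
The return-value convention is easy to misread, so here is a minimal standalone sketch of a caller that honors it. BoundedLearnerSketch is a hypothetical learner limited to the range [-1, 1], and fitToSupportedRange is a hypothetical helper; the linear rescaling is an assumption about what a caller might do with the reported range.

```cpp
#include <cassert>

// Hypothetical learner that only accepts continuous feature values in
// [-1, 1]. It mirrors the documented convention: return true when any value
// is supported, otherwise return false and report the range via out-pointers.
struct BoundedLearnerSketch
{
	bool supportedFeatureRange(double* pOutMin, double* pOutMax) const
	{
		assert(pOutMin && pOutMax);
		*pOutMin = -1.0;
		*pOutMax = 1.0;
		return false;
	}
};

// Linearly rescales v from [srcMin, srcMax] into the range the learner
// reports, or returns it unchanged when the learner accepts any value.
double fitToSupportedRange(const BoundedLearnerSketch& learner, double v,
                           double srcMin, double srcMax)
{
	assert(srcMax > srcMin);
	double lo = 0.0, hi = 0.0;
	if(learner.supportedFeatureRange(&lo, &hi))
		return v;
	return lo + (v - srcMin) * (hi - lo) / (srcMax - srcMin);
}
```

Note that a false return value means "restricted", so the out-parameters are only meaningful in that case.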

Reimplemented in GClasses::GReservoirNet, and GClasses::GNeuralNetLearner.

virtual bool GClasses::GTransducer::supportedLabelRange ( double * pOutMin, double * pOutMax )

inline virtual

Returns true if this algorithm supports any label value, or if it does not implicitly handle continuous labels. If a limited range of continuous values is supported, returns false and sets pOutMin and pOutMax to specify the range.

Reimplemented in GClasses::GReservoirNet, and GClasses::GNeuralNetLearner.

virtual double GClasses::GTransducer::trainAndTest ( const GMatrix & trainFeatures, const GMatrix & trainLabels, const GMatrix & testFeatures, const GMatrix & testLabels, double * pOutSAE = NULL )

virtual

Trains and tests this learner. Returns the sum-squared error. If pOutSAE is not NULL, the sum absolute error will be placed there.
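
The two error measures named above can be computed in a single pass. The following standalone sketch shows the arithmetic and the optional out-pointer convention on plain vectors; sumSquaredError is a hypothetical helper and not part of GClasses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of GClasses: accumulates the sum-squared
// error between predictions and targets, and optionally the sum absolute
// error through an out-pointer that may be null.
double sumSquaredError(const std::vector<double>& predicted,
                       const std::vector<double>& target,
                       double* pOutSAE = nullptr)
{
	assert(predicted.size() == target.size());
	double sse = 0.0;
	double sae = 0.0;
	for(std::size_t i = 0; i < predicted.size(); i++)
	{
		double err = target[i] - predicted[i];
		sse += err * err;
		sae += std::fabs(err);
	}
	if(pOutSAE)
		*pOutSAE = sae;
	return sse;
}
```

With predictions {1, 2, 4} and targets {1, 4, 3}, the per-row errors are 0, 2, and -1, giving a sum-squared error of 5 and a sum absolute error of 3.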

Reimplemented in GClasses::GSupervisedLearner.

std::unique_ptr<GMatrix> GClasses::GTransducer::transduce ( const GMatrix & features1, const GMatrix & labels1, const GMatrix & features2 )

Predicts a set of labels to correspond with features2, such that these labels will be consistent with the patterns exhibited by features1 and labels1.
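
As an illustration of what transduction means in practice, the standalone sketch below labels each row of features2 with the label of its nearest row in features1. It uses plain vectors with one-dimensional features instead of GMatrix, and transduceNearest is a hypothetical name. Nearest-neighbor labeling is chosen here only because it is short; it is not claimed to be how any particular GClasses transducer works.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Labels each value in features2 with the label of its nearest neighbor in
// features1. One-dimensional features keep the sketch short. No model is
// kept afterwards, so new rows require calling the function again.
std::vector<int> transduceNearest(const std::vector<double>& features1,
                                  const std::vector<int>& labels1,
                                  const std::vector<double>& features2)
{
	assert(!features1.empty());
	assert(features1.size() == labels1.size());
	std::vector<int> labels2;
	labels2.reserve(features2.size());
	for(double q : features2)
	{
		std::size_t best = 0;
		for(std::size_t i = 1; i < features1.size(); i++)
		{
			if(std::fabs(features1[i] - q) < std::fabs(features1[best] - q))
				best = i;
		}
		labels2.push_back(labels1[best]);
	}
	return labels2;
}
```

The function returns one label per row of features2 and keeps no state, which is why labeling further rows means running it again with the labeled data.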

virtual std::unique_ptr<GMatrix> GClasses::GTransducer::transduceInner ( const GMatrix & features1, const GMatrix & labels1, const GMatrix & features2 )

protected pure virtual

This is the algorithm's implementation of transduction. (It is called by the transduce method.)

Implemented in GClasses::GGraphCutTransducer, GClasses::GSupervisedLearner, GClasses::GNeighborTransducer, and GClasses::GAgglomerativeTransducer.
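
The split between a public transduce and a protected, pure virtual transduceInner follows a common pattern in which the base class owns the entry point and subclasses supply only the algorithm. The standalone sketch below illustrates that pattern. Table, TransducerSketch, and FirstLabelTransducer are hypothetical names, and the row-count check in the public method is an assumption about what such a wrapper might do, not a description of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for GMatrix: a vector of rows.
using Table = std::vector<std::vector<double>>;

class TransducerSketch
{
public:
	virtual ~TransducerSketch() {}

	// Public entry point: validates the inputs, then delegates.
	std::unique_ptr<Table> transduce(const Table& features1, const Table& labels1,
	                                 const Table& features2)
	{
		if(features1.size() != labels1.size())
			throw std::invalid_argument("expected one label row per feature row");
		return transduceInner(features1, labels1, features2);
	}

protected:
	// Each algorithm supplies only this step.
	virtual std::unique_ptr<Table> transduceInner(const Table& features1,
	                                              const Table& labels1,
	                                              const Table& features2) = 0;
};

// Trivial algorithm: predicts the first training label for every row.
class FirstLabelTransducer : public TransducerSketch
{
protected:
	std::unique_ptr<Table> transduceInner(const Table&, const Table& labels1,
	                                      const Table& features2) override
	{
		assert(!labels1.empty());
		return std::unique_ptr<Table>(new Table(features2.size(), labels1[0]));
	}
};
```

Callers only ever see transduce, so shared checks live in one place and cannot be bypassed by a subclass.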

void GClasses::GTransducer::transductiveConfusionMatrix ( const GMatrix & trainFeatures, const GMatrix & trainLabels, const GMatrix & testFeatures, const GMatrix & testLabels, std::vector< GMatrix * > & stats )

Makes a confusion matrix for a transduction algorithm.
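
For readers unfamiliar with the term, a confusion matrix counts how often each true class was predicted as each class. The standalone sketch below builds one for a single nominal label from plain vectors. The confusionMatrix function is a hypothetical name, and the orientation (rows are true classes, columns are predicted classes) is an assumption; the layout of the matrices placed in stats is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a classCount x classCount table where entry [t][p] counts the test
// rows whose true class is t and whose predicted class is p. The row/column
// orientation is an assumption of this sketch.
std::vector<std::vector<std::size_t>> confusionMatrix(
	const std::vector<std::size_t>& actual,
	const std::vector<std::size_t>& predicted,
	std::size_t classCount)
{
	assert(actual.size() == predicted.size());
	std::vector<std::vector<std::size_t>> counts(
		classCount, std::vector<std::size_t>(classCount, 0));
	for(std::size_t i = 0; i < actual.size(); i++)
	{
		assert(actual[i] < classCount && predicted[i] < classCount);
		counts[actual[i]][predicted[i]]++;
	}
	return counts;
}
```

Correct predictions accumulate on the diagonal, so off-diagonal entries show which classes are being mistaken for which.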

Member Data Documentation

GRand GClasses::GTransducer::m_rand

protected
