GClasses
GClasses::GStemmer Class Reference

Detailed Description

This class just wraps the Porter Stemmer. It finds the stems of words. Examples: "cats"->"cat" "dogs"->"dog" "fries"->"fri" "fishes"->"fish" "pies"->"pi" "lovingly"->"lovingli" "candy"->"candi" "babies"->"babi" "bus"->"bu" "busses"->"buss" "women"->"women" "hasty"->"hasti" "hastily"->"hastili" "fly"->"fly" "kisses"->"kiss" "goes"->"goe" "brought"->"brought" As you can see the stems aren't always real words, but that's okay as long as it produces the same stem for words that have the same etymological roots. Even then it still isn't perfect (notice it got "bus" wrong), but it should still improve analysis somewhat in many cases.

#include <GStemmer.h>

Public Member Functions

 GStemmer ()
 
 ~GStemmer ()
 
const char * getStem (const char *szWord, size_t nLen)
 Pass in a word (you don't need to lowercase or null-terminate it) and this will return its stem. The buffer it returns is only valid until the next time you call GetStem. More...
 

Protected Attributes

struct stemmer * m_pPorterStemmer
 
char m_szBuf [GSTEMMER_MAX_WORD_SIZE]
 

Constructor & Destructor Documentation

GClasses::GStemmer::GStemmer ( )
GClasses::GStemmer::~GStemmer ( )

Member Function Documentation

const char* GClasses::GStemmer::getStem ( const char *  szWord,
size_t  nLen 
)

Pass in a word (you don't need to lowercase or null-terminate it) and this will return its stem. The buffer it returns is only valid until the next time you call GetStem.

Member Data Documentation

struct stemmer* GClasses::GStemmer::m_pPorterStemmer
protected
char GClasses::GStemmer::m_szBuf[GSTEMMER_MAX_WORD_SIZE]
protected