GClasses::GHtml Class Reference

Detailed Description

This class parses HTML documents and is designed to be very simple. It reports what it finds through virtual callback methods, so it might be useful, for example, for building a web crawler or for extracting readable text from a web page.

#include <GHtml.h>

Public Member Functions

 GHtml (const char *pDoc, size_t nSize)
 
virtual ~GHtml ()
 
virtual void onComment (const char *pComment, size_t len)
 This method is called when an HTML comment (<!-- ... -->) is found.
 
virtual void onTag (const char *pTagName, size_t tagNameLen)
 This method is called whenever a new tag is found.
 
virtual void onTagParam (const char *pTagName, size_t tagNameLen, const char *pParamName, size_t paramNameLen, const char *pValue, size_t valueLen)
 This method is called for each parameter in the tag.
 
virtual void onTextChunk (const char *pChunk, size_t chunkSize)
 This method will be called whenever the parser finds a section of display text.
 
bool parseSomeMore ()
 You should call this method in a loop until it returns false. It parses a little bit more of the document each time you call it. It returns false if there was nothing more to parse. The various virtual methods are called whenever it finds something interesting.
 

Protected Member Functions

void parseTag ()
 

Protected Attributes

size_t m_nPos
 
size_t m_nSize
 
const char * m_pDoc
 

Constructor & Destructor Documentation

GClasses::GHtml::GHtml ( const char *  pDoc,
size_t  nSize 
)
virtual GClasses::GHtml::~GHtml ( )
virtual

Member Function Documentation

virtual void GClasses::GHtml::onComment ( const char *  pComment,
size_t  len 
)
inline, virtual

This method is called when an HTML comment (<!-- ... -->) is found.

virtual void GClasses::GHtml::onTag ( const char *  pTagName,
size_t  tagNameLen 
)
inline, virtual

This method is called whenever a new tag is found.

virtual void GClasses::GHtml::onTagParam ( const char *  pTagName,
size_t  tagNameLen,
const char *  pParamName,
size_t  paramNameLen,
const char *  pValue,
size_t  valueLen 
)
inline, virtual

This method is called for each parameter in the tag.

virtual void GClasses::GHtml::onTextChunk ( const char *  pChunk,
size_t  chunkSize 
)
inline, virtual

This method will be called whenever the parser finds a section of display text.

bool GClasses::GHtml::parseSomeMore ( )

You should call this method in a loop until it returns false. It parses a little bit more of the document each time you call it. It returns false if there was nothing more to parse. The various virtual methods are called whenever it finds something interesting.

void GClasses::GHtml::parseTag ( )
protected

Member Data Documentation

size_t GClasses::GHtml::m_nPos
protected
size_t GClasses::GHtml::m_nSize
protected
const char* GClasses::GHtml::m_pDoc
protected