Stemming

An important pre-processing step before indexing input documents for text mining is the stemming of words. Stemming is the reduction of words to their roots (stems) so that different grammatical forms of a word, such as the conjugated forms of a verb, are identified and indexed (counted) as the same word. For example, stemming ensures that "travel" and "traveled" are recognized by the program as the same word.
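
The idea can be illustrated with a minimal Python sketch. It is not the algorithm used by any particular text mining program; the suffix list, the function names simple_stem and count_stems, and the min_stem_len parameter are all assumptions made for illustration. Production stemmers (for example, the Porter algorithm) apply much richer, language-specific rules.

```python
from collections import Counter

# Hypothetical suffix list, checked in order; chosen only for illustration.
SUFFIXES = ("ing", "ed", "s")


def simple_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def count_stems(tokens):
    """Index (count) tokens by their stem rather than by their surface form."""
    return Counter(simple_stem(token) for token in tokens)


tokens = ["travel", "traveled", "traveling", "travels", "Travel"]
stem_counts = count_stems(tokens)
print(stem_counts)
```

With this naive rule set, all five tokens are counted under the single stem "travel". Such simple suffix stripping will mis-stem many words (irregular verbs, words that merely end in "s"), which is why real systems rely on more elaborate algorithms.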

For more information, see Manning and Schütze (2002) and Miner, Elder, Hill, Nisbet, Delen, and Fast (2012); see also the STATISTICA Text Mining and Document Retrieval Introductory Overview.