Challenge 26
🌴
Word Stemmer
For data analysis, indexing, searching and other tasks, we often want to 'stem' words, that is, reduce them to their base form. In search, for example, this lets a user type different variations of a word (tense, plurality, etc.) and still find valid matches. Your job is to create a function that takes a word and stems it according to the following simple rules:
- Remove any "ed", "er", "ing", "ly" suffixes
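- De-pluralise by removing the final "s" from words ending with a single "s" (so "glass" is left unchanged)
- Replace "ies" suffixes with "y" (check this before the single "s" rule, otherwise "candies" would become "candie")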
Examples:
stem("walk") returns "walk"
stem("walked") returns "walk"
stem("walker") returns "walk"
stem("walking") returns "walk"
stem("quickly") returns "quick"
stem("apples") returns "apple"
stem("candies") returns "candy"
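One possible way to implement the rules above is sketched here in Python. It is not an official solution. It assumes lowercase input, applies at most one rule per call, and checks "ies" before the plain "s" rule; none of these choices are spelled out in the challenge itself.

```python
def stem(word: str) -> str:
    """Stem a lowercase word using the three simple rules of the challenge."""
    # Assumption: "ies" is tested first, so "candies" -> "candy" and not "candie".
    if word.endswith("ies"):
        return word[:-3] + "y"

    # Remove one of the listed suffixes if present (only the first match is removed).
    for suffix in ("ing", "ed", "er", "ly"):
        if word.endswith(suffix):
            return word[: -len(suffix)]

    # De-pluralise: drop a final "s" unless the word ends in a double "s".
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]

    return word


for w in ["walk", "walked", "walker", "walking", "quickly", "apples", "candies"]:
    print(w, "->", stem(w))
```

Because there is no minimum length check, short words such as "red" would also lose their suffix; handling cases like that is part of what the fuller stemming rules in the extension are for.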
Extension:
These are some simple stemming rules. Search online for more stemming rules and implement them. The original Porter stemmer paper is available as a .txt file (https://tartarus.org/martin/PorterStemmer/def.txt); the rules are in Steps 1a to 5b, from around the middle of the document onwards.
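As a starting point for the extension, here is a rough Python sketch of Step 1a of the Porter stemmer only, which handles plural endings (SSES -> SS, IES -> I, SS -> SS, S -> nothing). The function name `porter_step_1a` is made up for this illustration, and lowercase input is assumed. The remaining steps depend on the "measure" of the stem and are not covered here.

```python
def porter_step_1a(word: str) -> str:
    """Apply Step 1a of the Porter stemmer to a lowercase word."""
    # The longest matching suffix wins, so the checks go from longest to shortest.
    if word.endswith("sses"):
        return word[:-2]  # SSES -> SS
    if word.endswith("ies"):
        return word[:-2]  # IES -> I
    if word.endswith("ss"):
        return word       # SS -> SS (unchanged)
    if word.endswith("s"):
        return word[:-1]  # S -> (removed)
    return word


for w in ["caresses", "ponies", "ties", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

Note that Porter maps "ies" to "i" rather than "y", so "ponies" becomes "poni". The result is a stem used for matching, not necessarily a dictionary word.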