Stemming Information Retrieval in Bahasa Indonesia

A Study of Stemming Effects on Information
Retrieval in Bahasa Indonesia
Fadillah Z Tala
0086975
Master of Logic Project
Institute for Logic, Language and Computation
Universiteit van Amsterdam
The Netherlands

Contents
1 Introduction
2 A Purely Rule-based Stemmer for Bahasa Indonesia
2.1 Morphological Structure of Bahasa Indonesia Words
2.2 The Porter Stemming Algorithm
2.3 Porter Stemmer for Bahasa Indonesia
2.3.1 Implementation
3 Evaluation of the Stemming Algorithm
3.1 Stemmer Quality Evaluation
3.1.1 The Paice Evaluation Method
3.1.2 The Paice Experimental Results
3.2 Error Analysis …