Project By: Deepak Paudel
Supervisor: Dr. Aman Shakya
The Nepali Lemmatizer project converts Nepali words to their base (dictionary) forms, a process known as lemmatization. This matters for natural language processing tasks because it lets inflected variants of a word be treated as the same underlying word. The project uses two primary methods:
- TRIE-Based Approach: This method uses a TRIE data structure to efficiently store and retrieve word forms, facilitating quick lookup of base forms.
- Hybrid Approach: Combining the TRIE structure with additional algorithms, this approach aims to improve lemmatization accuracy by handling exceptions and irregular word forms.
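The two approaches above can be illustrated with a minimal Python sketch. It is not the repository's implementation: the class and function names (`LemmaTrie`, `lemmatize_trie`, `lemmatize_hybrid`), the longest-prefix matching strategy, the exception table, and the tiny word list are all assumptions chosen for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # maps one character to the next node
        self.is_lemma = False   # True if the path to this node spells a base form


class LemmaTrie:
    """Hypothetical trie holding known base forms (lemmas)."""

    def __init__(self, lemmas=()):
        self.root = TrieNode()
        for lemma in lemmas:
            self.insert(lemma)

    def insert(self, lemma):
        node = self.root
        for ch in lemma:
            node = node.children.setdefault(ch, TrieNode())
        node.is_lemma = True

    def longest_prefix(self, word):
        """Return the longest stored lemma that is a prefix of word, or None."""
        node = self.root
        best = None
        for i, ch in enumerate(word):
            node = node.children.get(ch)
            if node is None:
                break
            if node.is_lemma:
                best = word[: i + 1]
        return best


def lemmatize_trie(word, trie):
    # Assumed strategy: treat the longest known prefix as the base form,
    # and return the word unchanged when no prefix is known.
    return trie.longest_prefix(word) or word


def lemmatize_hybrid(word, trie, exceptions):
    # Assumed strategy: consult a table of irregular forms first,
    # then fall back to the trie lookup.
    if word in exceptions:
        return exceptions[word]
    return lemmatize_trie(word, trie)


# Illustrative data only; a real lemmatizer would load a full lexicon.
trie = LemmaTrie(["घर", "केटा"])
exceptions = {"गयो": "जानु"}

print(lemmatize_trie("घरहरू", trie))              # plural of "घर"
print(lemmatize_hybrid("गयो", trie, exceptions))  # irregular past form
```

In this sketch the trie alone leaves an irregular form such as "गयो" unchanged because no stored lemma is a prefix of it, which is the kind of case the exception table in the hybrid function is meant to cover.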
The project’s repository includes scripts that allow users to input Nepali text and choose between the TRIE and hybrid methods for lemmatization. These scripts can be useful for developers and researchers working on Nepali language processing.
GitHub Repository: https://github.com/dpakpdl/NepaliLemmatizer