Development and Enhancement of a Stemmer for the Greek Language
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2016 (English). In: Proceedings of the 20th Pan-Hellenic Conference on Informatics, Association for Computing Machinery (ACM), 2016, 3. Conference paper, Published paper (Refereed)
Abstract [en]

Although there are three stemmers published for the Greek language, only the one presented in this paper and called Ntais’ stemmer is freely open and available, together with its enhancements and extensions according to Saroukos’ algorithm. The primary algorithm (Ntais’ algorithm) uses only capital letters and works with better performance than other past stemming algorithms for the Greek language, giving 92.1 percent correct results. Further extensions of the proposed stemming system (e.g. from capital to small letters) and more evaluation methods are presented according to a new and improved algorithm, Saroukos’ algorithm. Stemmer performance metrics are further used for evaluating the existing stemming system and algorithm and show how its accuracy and completeness are enhanced. The improvements were possible by providing an alternative implementation in the programming language PHP, which offers more syntactical rules and exceptions. The two versions of the stemming algorithm are tested and compared.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2016. 3
Keyword [en]
Stemming algorithm, stemmer metrics, Greek language, performance evaluation metrics, Natural Language Processing (NLP), Information Retrieval (IR)
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-136584
DOI: 10.1145/3003733.3003775
ISBN: 978-1-4503-4789-1 (electronic)
OAI: oai:DiVA.org:su-136584
DiVA: diva2:1055448
Conference
20th Pan-Hellenic Conference on Informatics, Patra, Greece, November 10-12, 2016
Available from: 2016-12-12 Created: 2016-12-12 Last updated: 2017-08-17 Bibliographically approved

By author/editor
Dalianis, Hercules
