Maintenance

Stemming and analyzers for eXist

Posted by mholmes on 18 Nov 2010 in R & D, Activity log, Documentation

As part of the Moses project search engine, I've been looking at the possibility of using a more sophisticated Analyzer for Lucene, to enable stemming in indexing and searching. Lucene has an analyzer called the SnowballAnalyzer, which does stemming for many languages, but unfortunately it can't be used directly in eXist: its constructor requires parameters that differ from those of the default Analyzer constructor eXist expects. This post suggests a way around the problem by creating a wrapper class, but that seems a bit complicated, especially in contexts where we expect to rebuild eXist regularly.
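To make the wrapper idea concrete, here is a minimal, self-contained sketch of the pattern in Java. It deliberately does not use Lucene or eXist: SketchAnalyzer, SketchStemmingAnalyzer, EnglishStemmingAnalyzer and WrapperDemo are all hypothetical stand-ins, the "stemmer" is a toy, and the language name passed to the constructor is only illustrative. The point is just that a subclass with a no-argument constructor can hide the parameterized one, so a host that instantiates classes through their default constructor can load it.

```java
// Stand-in for an analyzer base type; NOT the real Lucene Analyzer class.
abstract class SketchAnalyzer {
    abstract String analyze(String token);
}

// Stand-in for an analyzer whose only constructor needs an argument
// (here, a stemmer language name), so it has no default constructor.
class SketchStemmingAnalyzer extends SketchAnalyzer {
    private final String language;

    SketchStemmingAnalyzer(String language) {
        this.language = language;
    }

    String getLanguage() {
        return language;
    }

    @Override
    String analyze(String token) {
        // Toy "stemming": lowercase and strip one trailing "s".
        // A real stemmer does far more than this.
        String lower = token.toLowerCase();
        if (lower.length() > 1 && lower.endsWith("s")) {
            return lower.substring(0, lower.length() - 1);
        }
        return lower;
    }
}

// The wrapper: a public no-argument constructor that fixes the parameter.
class EnglishStemmingAnalyzer extends SketchStemmingAnalyzer {
    public EnglishStemmingAnalyzer() {
        super("English"); // assumed, illustrative language name
    }
}

class WrapperDemo {
    // Mimics a host application that is handed a class and can only
    // call its default (no-argument) constructor via reflection.
    static SketchAnalyzer instantiate(Class<? extends SketchAnalyzer> cls) {
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("No usable default constructor on " + cls.getName(), e);
        }
    }
}
```

In this sketch, WrapperDemo.instantiate(EnglishStemmingAnalyzer.class) succeeds, whereas passing SketchStemmingAnalyzer.class fails because that class has no no-argument constructor. The cost noted above still applies: a wrapper like this is extra code that has to be compiled and redeployed whenever eXist is rebuilt.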

However, this post refers to a package containing stemming-capable analyzers that will work with eXist as it is. Unfortunately, none of them appears to handle English. So for the moment we're a bit stuck, though it may be that someone patches eXist so that it can use the Snowball analyzer.

This entry was posted by Martin and filed under R & D, Activity log, Documentation.
