Approach to variant spellings
I've been looking at how we might do variant-spelling searches on the fly, using our plethora of name attributes, and this is as far as I've got. The code below searches for the search term in name elements (ref elements, in this code), and for each element where it finds a hit, it gets the @target attribute. It then gets the text of every ref element whose @target matches one of those values, tokenizes it, and returns a list of the distinct tokens (and, in the code below, the matching elements themselves):
    xquery version "1.0";
    declare default element namespace "http://www.tei-c.org/ns/1.0";
    declare namespace tei="http://www.tei-c.org/ns/1.0";
    declare namespace ft="http://exist-db.org/xquery/lucene";
    import module namespace kwic="http://exist-db.org/xquery/kwic";
    declare namespace util="http://exist-db.org/xquery/util";
    declare namespace exist="http://exist.sourceforge.net/NS/exist";
    declare namespace xh="http://www.w3.org/1999/xhtml";

    (: Elements whose text matches the search term in the Lucene index :)
    let $hits := //ref[ft:query(., 'abchurch')],
        (: The distinct @target values of those hits :)
        $targs := distinct-values($hits/@target),
        (: Every ref pointing at one of the same targets. Comparing against
           the sequence avoids the false positives of a substring test. :)
        $refs := //ref[@target = $targs],
        (: Whitespace-separated tokens from the full text of each ref;
           normalize-space() also copes with refs that have child elements :)
        $strings := (for $r in $refs return tokenize(normalize-space($r), '\s+')),
        $dist := distinct-values($strings)
    return ($dist, $refs)
This might be further refined by using a fuzzy-match between the search term and each of the distinct values, so that you could reject any that are completely different (e.g. "Mary"), and end up with a list of those which are similar (such as "Vpchurch").
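One way that fuzzy-match step might look is sketched below in plain XQuery 1.0, using a case-insensitive Levenshtein (edit) distance. This is only an illustration under assumptions: the function names (local:next-row, local:rows, local:levenshtein), the hard-coded candidate list, and the cut-off of 2 edits are all hypothetical, and a different similarity measure might suit our data better.

```xquery
xquery version "1.0";

(: Fill in one row of the edit-distance table, one cell per recursive call.
   $prev is the previous row; $row is the part of the current row built so far. :)
declare function local:next-row(
    $ch as xs:integer,
    $t as xs:integer*,
    $prev as xs:integer*,
    $row as xs:integer*
) as xs:integer* {
    let $j := count($row)  (: position in $t of the next cell to fill :)
    return
        if ($j gt count($t)) then $row
        else
            let $cost := if ($t[$j] = $ch) then 0 else 1
            let $cell := min((
                $prev[$j + 1] + 1,   (: deletion :)
                $row[$j] + 1,        (: insertion :)
                $prev[$j] + $cost    (: substitution or match :)
            ))
            return local:next-row($ch, $t, $prev, ($row, $cell))
};

(: Walk through the first string one character at a time, carrying the last row. :)
declare function local:rows(
    $s as xs:integer*,
    $t as xs:integer*,
    $prev as xs:integer*,
    $i as xs:integer
) as xs:integer* {
    if ($i gt count($s)) then $prev
    else local:rows($s, $t, local:next-row($s[$i], $t, $prev, $i), $i + 1)
};

(: Case-insensitive edit distance between two strings. :)
declare function local:levenshtein($a as xs:string, $b as xs:string) as xs:integer {
    let $s := string-to-codepoints(lower-case($a))
    let $t := string-to-codepoints(lower-case($b))
    return local:rows($s, $t, (0 to count($t)), 1)[last()]
};

(: Hypothetical inputs: in practice $candidates would be the $dist
   sequence from the query above, and the cut-off would need tuning. :)
let $term := 'abchurch'
let $max-distance := 2
let $candidates := ('Abchurch', 'Vpchurch', 'Mary', 'Lane')
for $c in $candidates
let $d := local:levenshtein($term, $c)
where $d le $max-distance
return $c
```

With these sample values the query keeps "Abchurch" (distance 0) and "Vpchurch" (distance 2) and rejects "Mary" and "Lane", which are each at least four edits away. A fixed cut-off is crude for short names, so scaling it to the length of the search term may work better.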