Many of our projects would benefit from a way to call a Java library that computes a universal similarity metric (USM) between two strings. I've started working from this documentation to create a class and the wrappers needed to make this work. I'm still resolving some dependencies, but I think this will be practical, and we'll be able to use the USM module from within oXygen (where we're allowed to use Saxon EE). The testbed will be matching ContentDM records against our TEI metadata for maps.
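To give a rough idea of what such a class might look like, here is a minimal sketch. It is not the library mentioned above. It assumes the metric is approximated by the normalized compression distance (NCD), a common practical stand-in for the USM, and it uses the JDK's own `Deflater` as the compressor. The names `NcdSketch`, `compressedLength` and `distance` are hypothetical, chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical illustration class; not the API of any existing USM library.
class NcdSketch {

    // Number of bytes the input occupies after DEFLATE compression.
    static int compressedLength(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    // where C is the compressed length. Lower values mean "more similar".
    static double distance(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);

        // Concatenate the two inputs so the compressor can reuse shared material.
        byte[] xy = new byte[x.length + y.length];
        System.arraycopy(x, 0, xy, 0, x.length);
        System.arraycopy(y, 0, xy, x.length, y.length);

        int cx = compressedLength(x);
        int cy = compressedLength(y);
        int cxy = compressedLength(xy);
        return (double) (cxy - Math.min(cx, cy)) / Math.max(cx, cy);
    }

    public static void main(String[] args) {
        // Made-up sample titles, only to show how the method is called.
        String first = "Map of the colonies of Vancouver Island and British Columbia";
        String second = "Map of Vancouver Island and British Columbia";
        System.out.println(distance(first, second));
    }
}
```

Because `distance` is a public-style static method taking two strings, it should be reachable from XSLT through Saxon's reflexive Java extension mechanism (available in the PE and EE editions), although the exact wiring inside oXygen is something I have not confirmed. Note also that on very short strings the compressor's fixed overhead dominates, so the scores will be noisy for brief titles.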
The Colonial Despatches is an XML database project which is creating a digital archive containing the original correspondence between the British Colonial Office and the colonies of Vancouver Island and British Columbia. The project lives at http://bcgenesis.uvic.ca, and the web application runs on the Pear dev Tomcat. The XML data is managed in SVN at http://revision.tapor.uvic.ca/svn/coldesp/.