Graves basically finished
I've got the new search working, and it's a lot cleaner and more straightforward. I've also added keyword highlighting when you go from the search to a document containing a hit (which wasn't there in the original site).
As far as I can see, the Graves project is pretty much ready for Endings now. We have:
- A complete static build, created fresh after every change here on Jenkins. (No functional search in the static build of course, but the browse options work fine.)
- An eXist XAR app which can be deployed in a standalone eXist instance, also built on Jenkins.
- A diagnostics report which shows only four minor issues (all instances of the same thing) remaining to be fixed in the data.
We're now ready to do some testing with the Heritrix crawler. I know the current version of Graves is useless for the crawler because everything is a db query rather than a URL; the new one is entirely URL-based, so I'd like to set up a test crawl asap and compare the results with what's in archive.org at the moment. I'll enlist CD's help with that.
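For the comparison step, something along these lines might do as a first pass. It is only a sketch in Python, and it assumes both the test crawl and the archive.org capture have already been reduced to plain lists of URLs; the function names and the example URLs are made up for illustration, and since the old site's addresses are query-based, some mapping between old and new URLs would probably be needed before the comparison means much.

```python
from urllib.parse import urldefrag


def normalise(url: str) -> str:
    """Trim whitespace and drop any #fragment so equivalent URLs compare equal."""
    return urldefrag(url.strip()).url


def compare_crawls(new_urls, archived_urls):
    """Return the URLs found only in the new crawl, only in the archive, and in both."""
    new_set = {normalise(u) for u in new_urls}
    old_set = {normalise(u) for u in archived_urls}
    return {
        "only_new": sorted(new_set - old_set),
        "only_archived": sorted(old_set - new_set),
        "both": sorted(new_set & old_set),
    }


if __name__ == "__main__":
    # Hypothetical URLs, purely for illustration.
    report = compare_crawls(
        ["http://example.org/a.html#top", "http://example.org/b.html"],
        ["http://example.org/a.html", "http://example.org/c.html"],
    )
    for group, urls in report.items():
        print(group, urls)
```

The "only_archived" list is the interesting one: anything the old capture has that the new crawl misses is a candidate for a broken or unreachable page.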
The other thing I'd like to start working on is creating a site configuration for Solr. I think we have a decision to make at this point, though: do we go ahead and deploy the new site on a fresh eXist instance and then point the graves.uvic.ca domain name at it, or should we be a little cautious and wait till the Heritrix testing is done before trying that?
One more thing to think about:
The page-images all currently live in a folder on our Tomcat server, served by the old web application, and the static build is simply pointing at them. Obviously we should eventually pull them into SVN and build them into the eXist webapp and the static build. That will add about 160MB to the size of the SVN repo (which is no problem) and twice that much to the products on Jenkins (which is a problem, because the Jenkins server has very little disk space free at this point). What I propose to do is to parameterize the build so that it can be run with a switch that says either "point to remote images" (the default for now) or "assume local images" (which I won't use till we're ready). Then I need to get sysadmin to give me more disk space on Jenkins, and we can go ahead and switch to a local build.
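To make the switch idea concrete, here is a rough sketch in Python. It is not the actual build code: the mode names, the function names, the placeholder remote URL and the local folder name are all assumptions for illustration, and the real switch would live in whatever build tool the project uses.

```python
REMOTE = "remote"  # point to images on the existing server (default for now)
LOCAL = "local"    # assume images are bundled with the build output

# Hypothetical locations; the real values would come from the build configuration.
REMOTE_BASE = "https://example.org/graves/images/"
LOCAL_BASE = "images/"


def image_base(mode: str = REMOTE) -> str:
    """Return the prefix to put in front of every page-image filename."""
    if mode == REMOTE:
        return REMOTE_BASE
    if mode == LOCAL:
        return LOCAL_BASE
    raise ValueError(f"unknown image mode: {mode!r}")


def image_href(filename: str, mode: str = REMOTE) -> str:
    """Build the link written into a generated page for one image."""
    return image_base(mode) + filename


if __name__ == "__main__":
    print(image_href("page001.jpg"))         # remote link, the default
    print(image_href("page001.jpg", LOCAL))  # link relative to the build output
```

Keeping the choice in one place means that switching to local images later is a one-parameter change rather than an edit to every page template.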