I've implemented search result caching in Mariage, and done a bunch more work to bring it up to speed with what I learned in the Graves project. However, I'm now faced with a problem in search design which also afflicts MoEML, summarized here:
Imagine you want to find "amour" in your documents. You search for "amour".
It finds (say) thirty documents which contain "amour". It returns the first ten (it's paging in sets of ten results), and it sets about giving you all the keyword-in-context display results for each document.
Now, the first document has 200 instances of "amour". So the search code has to do a kwic expand operation on all 200 of those results in order to give you 200 keyword-in-context fragments for that document. These operations take a long time, so it takes ages for your results to come back.
If your results page contains ten documents, each of which has 200 hits, you're now processing 2,000 hits to give a single page of results.
In the Graves project, this isn't an issue, because all the documents are tiny (one diary entry). But in Mariage and MoEML, we have a combination of very small (one poem, one little article) and very large (Satire Menipée, Stow) documents.
One option is that instead of returning all the hits for a document, you just return (say) the first five, with a note "195 more", and the option to search only that document. If you take that option, you see hits only from that document, but paged in sets of ten.
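A rough sketch of this first option, in Python for illustration rather than the project's actual search code. The names `expand_kwic`, `summarize`, and `MAX_HITS_SHOWN` are hypothetical, and `expand_kwic` is only a stand-in for the real (slow) keyword-in-context expansion. The point is that the expensive step runs on at most five hits per document, however many hits there are.

```python
from dataclasses import dataclass
from typing import List

MAX_HITS_SHOWN = 5  # assumed cap per document; "the first five" in the text


@dataclass
class DocSummary:
    doc_id: str
    shown: List[str]   # expanded keyword-in-context fragments
    remaining: int     # hits not expanded, for the "N more" note


def expand_kwic(hit: str) -> str:
    # Stand-in for the expensive keyword-in-context expansion.
    return f"...{hit}..."


def summarize(doc_id: str, hits: List[str], cap: int = MAX_HITS_SHOWN) -> DocSummary:
    # Only the first `cap` hits are expanded; the rest are merely counted.
    shown = [expand_kwic(h) for h in hits[:cap]]
    return DocSummary(doc_id, shown, max(0, len(hits) - cap))


# A document with 200 hits costs five expansions instead of 200.
summary = summarize("large-doc", [f"amour#{i}" for i in range(200)])
note = f"{summary.remaining} more"
```

With ten such documents on a page, that would be 50 expansions rather than 2,000.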
Another option is to treat the search as a search of the collection itself, so that every hit is a separate "result"; in that case, in our imaginary scenario, the first 200 hits (i.e. the first 20 pages of results) come from the first large document, and you have to get to page 21 before you see anything from the next document.
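To make the paging consequence concrete, here is a small sketch (again in Python, with hypothetical names `flat_hits` and `page`) that treats every hit as its own result and pages through them in sets of ten. The document list is invented to match the imaginary scenario: one large document with 200 hits followed by a small one with three.

```python
from itertools import islice
from typing import Iterator, List, Tuple

PAGE_SIZE = 10  # paging in sets of ten, as in the text


def flat_hits(docs: List[Tuple[str, int]]) -> Iterator[Tuple[str, int]]:
    # docs is a list of (doc_id, hit_count); yield one (doc_id, hit_index)
    # pair per hit, in document order.
    for doc_id, count in docs:
        for i in range(count):
            yield (doc_id, i)


def page(docs: List[Tuple[str, int]], page_number: int,
         page_size: int = PAGE_SIZE) -> List[Tuple[str, int]]:
    # Pages are numbered from 1; only the hits on the requested page
    # are materialized.
    start = (page_number - 1) * page_size
    return list(islice(flat_hits(docs), start, start + page_size))


docs = [("large", 200), ("small", 3)]
page_20 = page(docs, 20)  # still entirely from the large document
page_21 = page(docs, 21)  # first page showing the small document
```

On the plus side, each page then needs at most ten expansions, whatever the size of the documents involved.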
Another option is to search at the granularity of smaller fragments rather than full-scale documents (Stow chapters, etc.). The problem with that can be seen in this example, where search results from the same play are scattered around because each scene is searched as if it were a separate document.
I have a vague notion that you might let users search "FOR DOCUMENTS" (in which case they'd get summaries with the first one or two hits, with documents ordered by hit-count) or "IN DOCUMENTS" (in which case each individual hit in a document would be a separate "result" on the page). But I'm not sure how easy that would be for users to understand.
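The difference between the two modes might look something like this. It is only a sketch under my own assumptions: the function names `search_for_documents` and `search_in_documents`, the `preview` parameter, and the sample data are all invented for illustration, and hits are represented as plain strings.

```python
from typing import Dict, List, Tuple


def search_for_documents(hits_by_doc: Dict[str, List[str]],
                         preview: int = 2) -> List[Tuple[str, int, List[str]]]:
    # One result per document: (doc_id, total hit count, first few hits),
    # with documents ordered by hit count, highest first.
    ranked = sorted(hits_by_doc.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [(doc_id, len(hits), hits[:preview]) for doc_id, hits in ranked]


def search_in_documents(hits_by_doc: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    # One result per hit: (doc_id, hit), in document order.
    return [(doc_id, hit) for doc_id, hits in hits_by_doc.items() for hit in hits]


sample = {"poem": ["a"], "satire": ["b", "c", "d"]}
for_docs = search_for_documents(sample)
in_docs = search_in_documents(sample)
```

In the first mode only the preview hits would need expanding for each document; in the second, only the hits on the current page.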