18/04/17

Permalink 05:45:18 pm, by mholmes, 62 words, 13 views   English (CA)
Categories: Activity log; Mins. worked: 240

Fixed bugs

Fixed a number of things today:

  • Footnotes are now working even when embedded in split lists. I've added an earlier step which puts their number in @n.
  • The "Tous" TOC now includes gravures, which it didn't before.
  • The dates in TOCs correctly reflect the certainty and ranges of unknown dates.
  • XML files are now displaying.
  • The X-Frame-Options header is set to DENY.
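
The note-numbering step in the first item can be illustrated roughly as follows. This is only a sketch in Python, not the project's own code (which is not shown in this log); the function name number_notes and the sample document are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def number_notes(root):
    """First pass: stamp each TEI note with its document-order position in @n."""
    # iter() walks the tree in document order, so the counter matches
    # what counting preceding notes would have produced.
    for position, note in enumerate(root.iter(f"{{{TEI_NS}}}note"), start=1):
        note.set("n", str(position))
    return root

# Hypothetical sample: one note in a paragraph, two inside list items.
sample = ET.fromstring(
    f'<text xmlns="{TEI_NS}">'
    "<p>One<note>first</note></p>"
    "<list><item>Two<note>second</note></item>"
    "<item>Three<note>third</note></item></list>"
    "</text>"
)
number_notes(sample)
numbers = [n.get("n") for n in sample.iter(f"{{{TEI_NS}}}note")]
```

Because the number travels with the element, a later pass can process a note inside a constructed fragment without needing its original context.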

12/04/17

Permalink 05:02:54 pm, by mholmes, 200 words, 8 views   English (CA)
Categories: Activity log; Mins. worked: 180

Half-done on a bug in list handling

There is a notorious problem in converting TEI lists to HTML: if a list contains embedded things (such as formeworks or page breaks) that will produce element content, you have to split it into separate component lists; otherwise non-list element content ends up inside the HTML list, which is invalid. I had some code that was supposed to be doing this with for-each-group, but it wasn't working. I debugged and fixed it, so lists are now coming out OK, but there's now a problem with notes embedded in these structures: because they're being processed as part of a constructed fragment, they lose their context, end up being numbered "1", and don't generate a popup correctly. This is exemplified in the TOC for Forest Nuptiale. Possible solutions:

  • Do a first pass to pre-process notes to give them an @n attribute, then do the note processing based on that attribute rather than on counting preceding notes.
  • Do a first pass to split out the lists in XML, so that the problem doesn't arise.
  • Instead of using the very limited ol/ul/li elements in XHTML5, use simple divs with CSS list styling (display: list-item on the items).

I'm still thinking on this.
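
The list splitting described in this entry can be sketched as below. This is a Python illustration of the grouping idea only, not the XSLT itself: itertools.groupby stands in for for-each-group with adjacent grouping, the element names are un-namespaced for brevity, and rendering a page break as a span is purely an assumption.

```python
from itertools import groupby
import xml.etree.ElementTree as ET

def split_list(tei_list):
    """Split a list's children into runs: ('list', [items]) or ('other', [elements])."""
    runs = []
    # groupby only groups *adjacent* children with the same key, so each
    # uninterrupted run of items becomes its own component list.
    for is_item, group in groupby(tei_list, key=lambda child: child.tag == "item"):
        runs.append(("list" if is_item else "other", list(group)))
    return runs

def to_html(tei_list):
    """Emit one <ul> per run of items, with non-item content between them."""
    parts = []
    for kind, children in split_list(tei_list):
        if kind == "list":
            items = "".join(f"<li>{c.text or ''}</li>" for c in children)
            parts.append(f"<ul>{items}</ul>")
        else:
            # Hypothetical rendering for embedded non-item content.
            parts.extend(f'<span class="{c.tag}">{c.text or ""}</span>' for c in children)
    return "".join(parts)

sample = ET.fromstring("<list><item>a</item><item>b</item><pb/><item>c</item></list>")
html = to_html(sample)
```

The page break ends up between two valid lists rather than inside one, which is the point of the split.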

31/03/17

Permalink 02:33:14 pm, by mholmes, 114 words, 12 views   English (CA)
Categories: Academic; Mins. worked: 120

Work on normalized texts and title pages

CC pointed out a number of flaws in the way both primary source and normalized versions are being rendered. The previous site had an assumption that title page contents were centred; we want to make that explicit in texts, but then handle it, so I've added a handler for the titlePage element. Where possible, flow content in paras in normalized texts should be justified, so I've made that happen by adding a class on the root div which enables us to apply override styles for normalization display. I fixed some encoding errors in a couple of texts, and I've also tweaked a bunch of the CSS. We're getting closer to a publishable version now.

10/02/17

Permalink 03:40:59 pm, by mholmes, 70 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 180

Refactoring for eXist 3.0

eXist 3.0 has a bug with namespaces which can be worked around by refactoring a bit; since the refactoring actually produces better code, I've done it for several projects including Mariage. I've also reworked the search functionality so that it handles the problem case of a large document with hundreds of hits. I've also fixed other layout and style bugs, and added a couple of obvious things to the stopword list.

22/12/16

Permalink 03:44:59 pm, by mholmes, 82 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 120

Generic search functionality work

Made a number of tweaks to the way the search currently works, but principally worked on generic code in the hcmc/xquery/xq-utils.xqm library to convert user-friendly search-box input into the XML syntax that eXist can use to talk to Lucene. This seems to be working well, although I haven't yet found a way to put it into practice because we're still using a string-construct-and-eval approach to filtered queries. It may be just a case of using the XQuery serialize() function.
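
To show the kind of conversion involved, here is a rough Python sketch. It is not the code in xq-utils.xqm (which is XQuery and not reproduced here); the function name and the +/- input conventions are assumptions, and the output follows the query/bool/term structure with an occur attribute as I understand it from the eXist Lucene documentation.

```python
import xml.etree.ElementTree as ET

def to_lucene_xml(search_box_input):
    """Turn input like '+amour -mariage femm*' into eXist-style Lucene query XML."""
    query = ET.Element("query")
    bool_el = ET.SubElement(query, "bool")
    for token in search_box_input.split():
        if token.startswith("+"):
            occur, word = "must", token[1:]
        elif token.startswith("-"):
            occur, word = "not", token[1:]
        else:
            occur, word = "should", token
        if not word:
            continue  # a bare '+' or '-' carries no term
        # Terms containing wildcard characters use <wildcard> instead of <term>.
        name = "wildcard" if ("*" in word or "?" in word) else "term"
        el = ET.SubElement(bool_el, name, occur=occur)
        el.text = word
    return ET.tostring(query, encoding="unicode")

xml_query = to_lucene_xml("+amour -mariage femm*")
```

Building the query as a tree and serializing it at the end avoids hand-assembling markup from strings.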

21/12/16

Permalink 04:35:19 pm, by mholmes, 235 words, 58 views   English (CA)
Categories: Activity log; Mins. worked: 360

More work on search and nuances of search links

As I hack away at search testing, I'm discovering more and more little tweaks that are more than nice-to-have. Today I fixed a bunch of bugs in processing of ambitious search strings (quoted phrases are not supported yet, although I have half-a-plan for that). I also decided that search-string highlighting in a document that you have found is better done using a much simpler search string than the one you used to find documents in the collection (for instance, you don't want minused terms in the document highlighter because it causes eXist to return nothing, for some reason). So I now have a clever conversion of the original search string that is appended to the URL of the document link in the initial search results.
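
A minimal Python sketch of that conversion, assuming a +/- prefix syntax for required and excluded terms; the parameter name highlight and the document name are made up for the example and are not the site's real ones.

```python
from urllib.parse import urlencode

def highlight_string(search_string):
    """Reduce a collection-level search string to plain terms for highlighting."""
    kept = []
    for token in search_string.split():
        if token.startswith("-"):
            continue              # excluded terms must not reach the highlighter
        word = token.lstrip("+")  # required terms are highlighted like any other
        if word:
            kept.append(word)
    return " ".join(kept)

def document_link(doc_url, search_string):
    """Append the simplified string to a result link ('highlight' is hypothetical)."""
    simplified = highlight_string(search_string)
    if not simplified:
        return doc_url
    return f"{doc_url}?{urlencode({'highlight': simplified})}"

link = document_link("doc.xml", "+amour -mariage femme")
```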

I've also fixed the display of the gravures so that a search result link will pop up the containing annotation, and also so that a link to the id of an element which is not an annotation itself, but is inside one, will cause the annotation to be shown.

We're clearly down to minor tweaks at this stage, so we're close. PS is still working on a couple of cosmetic issues. I'm thinking that there should be some more sophisticated diagnostics to catch broken links; I don't think that check is currently finding links that point to an element in a document which is not one of the ref docs.
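
One possible shape for such a diagnostic, sketched in Python with invented document names and a made-up data structure (a map from each document to the ids it contains); the existing check is not shown in this log, so this is only an illustration of testing links into every document rather than just the ref docs.

```python
def find_broken_links(links, ids_by_document):
    """Return links whose target document or fragment id does not exist.

    links: iterable of 'document#id' or 'document' strings.
    ids_by_document: dict mapping document name -> set of ids it contains.
    """
    broken = []
    for link in links:
        document, _, fragment = link.partition("#")
        if document not in ids_by_document:
            broken.append(link)          # target document is missing
        elif fragment and fragment not in ids_by_document[document]:
            broken.append(link)          # document exists, element does not
    return broken

# Hypothetical data: one ref doc and one text document.
ids_by_document = {"refs.xml": {"auth1"}, "text1.xml": {"p1", "note3"}}
links = ["refs.xml#auth1", "text1.xml#note3", "text1.xml#p99", "gone.xml#x"]
broken = find_broken_links(links, ids_by_document)
```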

20/12/16

Permalink 04:12:59 pm, by mholmes, 68 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 180

Last tweaks to eXist search and some display bugs

PS is working on the styling of the results page, and fixing a bug in scrolling of marginal page-numbers in normalized documents; I've fixed some other bits and pieces related to search, parameterized the build process so that I can easily build a full eXist XAR (1.4GB) locally without making Jenkins do it, and tested the big XAR on a local eXist (it works well). We're getting closer.

16/12/16

Permalink 02:48:23 pm, by mholmes, 43 words, 66 views   English (CA)
Categories: Activity log; Mins. worked: 180

Search with image fragments now working

I think this was the last piece of the puzzle for the Mariage eXist app. I haven't yet tested building the complete webapp; I'll do that soon. Meanwhile, there's one issue regarding the display of the gravures that I'm working with PS on.

09/11/16

Permalink 03:27:37 pm, by mholmes, 429 words, 64 views   English (CA)
Categories: Activity log; Mins. worked: 120

Working on search

I've implemented search result caching in Mariage, and done a bunch more work to bring it up to speed with what I learned in the Graves project. However, I'm now faced with a problem in search design which also afflicts MoEML, summarized here:

Imagine you want to find "amour" in your documents. You search for "amour".

It finds (say) thirty documents which contain "amour". It returns the first ten (it's paging in sets of ten results), and it sets about giving you all the keyword-in-context display results for each document.

Now, the first document has 200 instances of "amour". So the search code has to do a kwic expand operation on all 200 of those results in order to give you 200 keyword-in-context fragments for that document. These operations take a long time, so it takes ages for your results to come back.

If your results page contains ten documents, each of which has 200 hits, you're now processing 2,000 hits to give a single page of results.

In the Graves project, this isn't an issue, because all the documents are tiny (one diary entry). But in Mariage and MoEML, we have a combination of very small (one poem, one little article) and very large (Satire Menipée, Stow) documents.

One option is that instead of returning all the hits for a document, you just return (say) the first five, with a note "195 more", and the option to search only that document. If you take that option, you see hits only from that document, but paged in sets of ten.
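
The saving from this option is easy to put in numbers. The Python sketch below uses invented function names and simply counts how many kwic expansions one results page would need with and without a cap of five per document.

```python
def summarize_hits(hits, limit=5):
    """Return the hits to expand into KWIC fragments and the count left over."""
    shown = hits[:limit]
    remaining = max(len(hits) - limit, 0)
    return shown, remaining

def expansions_for_page(hit_counts, limit=None):
    """Count the KWIC expansions needed for one results page."""
    if limit is None:
        return sum(hit_counts)
    return sum(min(count, limit) for count in hit_counts)

page = [200] * 10                       # ten documents with 200 hits each
uncapped = expansions_for_page(page)    # every hit expanded
capped = expansions_for_page(page, 5)   # first five per document only
shown, remaining = summarize_hits(list(range(200)))
```

For the scenario above, that is 2,000 expansions without the cap and 50 with it, and the "195 more" note falls out of the leftover count.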

Another option is to treat the search as a search of the collection itself, so that every hit is a separate "result"; in that case, in our imaginary scenario, the first 200 hits (i.e. the first 20 pages of results) come from the first large document, and you have to get to page 21 before you see anything from the next document.

Another option is to search at the granularity of smaller fragments rather than full-scale documents (Stow chapters, etc.). The problem with that can be seen in this example, where search results from the same play are scattered around because each scene is searched as if it were a separate document.

I have a vague notion that you might let users search "FOR DOCUMENTS" (in which case they'd get summaries with the first one or two hits, with documents ordered by hit-count) or "IN DOCUMENTS" (in which case each individual hit in a document would be a separate "result" on the page). But I'm not sure how easy that would be for users to understand.
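
The two modes might look something like this in outline. It is a Python sketch over an in-memory dict with invented names, just to make the difference in result shape concrete; nothing here reflects how the eXist code is actually organized.

```python
def search_for_documents(hits_by_doc, preview=2):
    """One result per document, ordered by hit count, with a short preview."""
    ranked = sorted(hits_by_doc.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [(doc, len(hits), hits[:preview]) for doc, hits in ranked]

def search_in_documents(hits_by_doc, page, page_size=10):
    """Every hit is its own result; return one page of (doc, hit) pairs."""
    flat = [(doc, hit) for doc, hits in hits_by_doc.items() for hit in hits]
    start = (page - 1) * page_size
    return flat[start:start + page_size]

# Hypothetical collection: one large document and one small one.
hits_by_doc = {
    "big": [f"big-{i}" for i in range(200)],
    "small": ["small-0", "small-1", "small-2"],
}
by_document = search_for_documents(hits_by_doc)
page_20 = search_in_documents(hits_by_doc, page=20)
page_21 = search_in_documents(hits_by_doc, page=21)
```

In the second mode the small document only appears on page 21, which matches the drawback of treating every hit as a separate result.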

08/11/16

Permalink 04:38:00 pm, by mholmes, 26 words, 68 views   English (CA)
Categories: Activity log; Mins. worked: 60

Fix for repeating title bug; addition of XAR icon

Fixed the bug where the docTitle was repeated at the beginning of introductory fragment documents. Also created an application icon for use in the XAR file.


Mariage

Should one marry? Panurge's question proves unavoidable in the West, especially from the Counter-Reformation onward. From the opening of the Council of Trent in 1545 to the end of the reign of Louis XIV, the attempt to renew marriage ran up against the growing intervention of the French monarchy in an institution previously dominated by the Church. The encounter between these two authorities was tumultuous, but it favoured the proliferation of the documents that are the subject of this site: the "nuptial imaginary" is made up of various textual genres, each with its own character, but all dealing with the fears, desires and fantasies that became increasingly visible in Ancien Régime society thanks to the debates raised by the new question of conjugal union. The focus for the moment is on the misogamous texts and images that form part of a revival of the Querelle des femmes during the first 25 years of the seventeenth century.
