Noticed some display oddities on the references page on the site, and found that there was an empty <quote> element in the entry for Egypt; this was causing a self-closing <q> tag in the output. Put a comment into the original file as a quick fix, then put in a trap for this in the XSLT, which now generates a comment where a <quote> tag is empty.
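To catch this kind of problem before it reaches the site, a quick scan of the reference file is one option. The following is only a sketch: it assumes the entries are un-namespaced <item> elements carrying an xml:id (as the XPath elsewhere in these notes suggests), and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def find_empty_quotes(xml_text):
    """Return the xml:id of every <item> that contains an empty <quote>.

    A <quote> counts as empty when it has no child elements and no
    non-whitespace text. Element names are assumed to be un-namespaced.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for item in root.iter("item"):
        for quote in item.iter("quote"):
            has_text = any(t.strip() for t in quote.itertext())
            if len(quote) == 0 and not has_text:
                hits.append(item.get(XML_ID))
    return hits


if __name__ == "__main__":
    sample = (
        '<list>'
        '<item xml:id="egypte"><head>Egypte</head><quote/></item>'
        '<item xml:id="autelSaint"><quote>Some text</quote></item>'
        '</list>'
    )
    print(find_empty_quotes(sample))
```

If the real file is in the TEI namespace, the tag names passed to iter() would need the namespace prefix added.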
Category: "Activity log"
Some guidelines for creating the xml:id attributes for reference items:
- Use camelback notation (the first word starts with a lower-case letter, but any subsequent words have upper-case first letters): oneTwoThree.
- Don't include any accents at all.
- Create an xml:id that will cause the item to be sorted in the right place in alphabetical order on the References page. For instance, if your entry is about "Saint Autel", you want the item to sort under "Autel", not "Saint", so the xml:id should begin with "autel".
- In the case where there are groups of similar entries (such as saints, of which there are several), it's helpful to include something at the end of the xml:id that is common to all of them; so for instance we use "autelSaint", "pierreSaint" and so on. With this system, we can easily find all the entries for saints using XPath like this:
//item[ends-with(@xml:id, 'Saint')]
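The guidelines above can be sketched as a small helper that proposes an xml:id from an entry title. This is an illustration, not the project's tooling: make_xml_id, strip_accents and ids_in_group are hypothetical names, and the rule of moving a leading "Saint" to the end is my reading of the "autelSaint" convention.

```python
import re
import unicodedata


def strip_accents(text):
    """Drop combining marks, so that 'Alcmène' becomes 'Alcmene'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_xml_id(title, group_words=("Saint",)):
    """Propose a camelback xml:id that sorts under the main name."""
    words = re.findall(r"[A-Za-z0-9]+", strip_accents(title))
    if not words:
        raise ValueError("title contains no usable characters")
    # Move a leading group word ("Saint") to the end: "Saint Autel" -> Autel, Saint
    if len(words) > 1 and words[0] in group_words:
        words = words[1:] + words[:1]
    first, rest = words[0], words[1:]
    return first[0].lower() + first[1:] + "".join(w[0].upper() + w[1:] for w in rest)


def ids_in_group(ids, suffix="Saint"):
    """Plain-Python counterpart of //item[ends-with(@xml:id, 'Saint')]."""
    return [i for i in ids if i.endswith(suffix)]


if __name__ == "__main__":
    print(make_xml_id("Saint Autel"))  # autelSaint
    print(ids_in_group(["autelSaint", "pierreSaint", "suetone"]))
```

Ligatures such as "œ" have no Unicode decomposition, so they would need handling by hand; the sketch only deals with ordinary accented letters.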
A couple of changes based on feedback from CC, and some things I noticed:
- Changed the xml:id "historienTranquillus" to "suetone" to match the title. Made the corresponding change to the <ref> tag in varin.xml.
- Changed the entry for "Pierre" so that its <head> tag reads "Saint Pierre" and its xml:id is "pierreSaint", for consistency with other saintly references. Made a corresponding change in complaintes.xml.
- Eliminated a duplicate reference: "Alcmène" was in both the Sonnet and main reference files, and the Sonnet version was shorter and was a pure quote, so I've deleted that one.
- Corrected the title of a Wikipedia link in the entry for Mont Hélicon, which was incorrectly titled "Mont Les Héroïdes". I think this resulted from a copy/paste error.
- Corrected the URL in the entry for Actéon to point to the actual Wikipedia Actéon entry, rather than the entry for Mont Hélicon (another copy/paste error, I think).
- Deleted an irrelevant second <bibl> item in the entry for Hypermnestre, which was pointing at the Petit Robert entry for "Égyptos en gr. Aiguptos".
- Found several entries (an example is "Lazare") where a TEI <quote> tag was enclosing a series of block-level elements (meaning that the quotation was a blockquote containing multiple list items, paragraphs, etc.). The normal handling for quotes doesn't deal with this properly; it causes invalid XHTML. Rewrote the XSLT to detect this situation and avoid spitting out an XHTML <q> tag in this context; to make the situation detectable, I had to add type="block" to the <quote> tag. I'm now emitting hard-coded guillemets at the beginning and end of a blockquote, which is not what we want, but it's a reasonable stop-gap. I need to consult the team on this.
- Changed some <list type="numbered"> instances to <list type="ordered">, which is our convention.
- Fixed an XSLT rendering issue that was putting multiple bibliographical references onto a single line; they're now rendered as a list, each on a new line.
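Since the block-quote handling depends on type="block" being present, it may help to list the <quote> elements that still need it. A rough sketch follows; the set of tags treated as block-level is an assumption on my part, the elements are assumed to be un-namespaced, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed set of block-level TEI children; adjust to match the real data.
BLOCK_CHILDREN = {"p", "list", "lg", "table"}


def quotes_needing_block_type(xml_text):
    """Return <quote> elements that wrap block-level children
    but do not yet carry type="block"."""
    root = ET.fromstring(xml_text)
    flagged = []
    for quote in root.iter("quote"):
        has_block_child = any(child.tag in BLOCK_CHILDREN for child in quote)
        if has_block_child and quote.get("type") != "block":
            flagged.append(quote)
    return flagged


if __name__ == "__main__":
    sample = (
        "<div>"
        "<quote><p>one</p><p>two</p></quote>"
        "<quote>inline only</quote>"
        '<quote type="block"><list><item>x</item></list></quote>'
        "</div>"
    )
    for q in quotes_needing_block_type(sample):
        print("needs type=block:", [child.tag for child in q])
```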
Found the reference pointing at "libie" in the 1621 Sonnet, and changed both it and the reference file entry to "libye". Uploaded all the latest changes, and checked that the link works. Had to get Greg to restart the server and re-index the references after an eXist stumble. This is getting tiresome...
Greg followed my instructions below, and also figured out how to get the JNLP file working; we've now revised those instructions to include more steps. The result works fine for Mariage, with the single exception of the missing <exist:match> tags, which are needed for the search to work properly. I'm still working on that, but at the very least I can work around it using other methods which are a bit slower, but which could be compensated for by adding lots of handy indexes. We're getting somewhere at last!
I've now got the main bibliography page rendering appropriately as part of the site; still waiting for an OK from CC to add it to the menu. In the process, I decided to disentangle some potential sources of confusion in the file names and site structure: we had one set of documents and pipelines labelled "biblio" which were actually related to the TOC for the anthology, and another set which were actually the bibliography (list of works referenced). I've now renamed files and pipeline triggers to use "toc" for anything related to the TOC, and "works_list" for anything related to the list of works referenced.
Next is markup of the first of the articles.
My current attempt, based on some back-and-forth with WM, goes like this:
- Replace the build.xml, exist-gump.xsl and exist-jars.xsl with the ones from here: http://exist.svn.sourceforge.net/viewvc/exist?view=rev&revision=10205
- Download a fresh Cocoon 2.1.11.
- Patch the Cocoon blocks.properties with a local.blocks.properties which disables Lucene and querybean, per WM's instructions.
- Check the build/scripts/cocoon/build.properties file in $EXIST_HOME to make sure it points to your local $COCOON_HOME instead of WM's
- Sign the eXist jars: in $EXIST_HOME, run ./build.sh -f build/scripts/jarsigner.xml
- Build eXist: in $EXIST_HOME, run ./build.sh -f build/scripts/cocoon/build.xml
- Clean up the Cocoon tree: in $COCOON_HOME, run ./build.sh clean
- Rebuild the Cocoon webapp: in $COCOON_HOME, run ./build.sh
- Copy ws-commons-util-1.0.2.jar and sunxacml-1.2.jar from EXIST_HOME/lib/core into the generated COCOON/build/webapp/WEB-INF/lib
- Copy my own TitleSortComparator.jar and xqSearchUtils.jar into the same folder (optional for the moment)
- Copy the generated webapp folder to [Tomcat]/webapps, and rename it cocoon.
- Start Tomcat.
- Confirm that Cocoon and eXist are running.
- Try the JNLP: it should start up OK.
- Connect using the client: in $EXIST_HOME/bin, run ./client.sh, then connect to xmldb:exist://localhost:8080/cocoon/xmlrpc
- Using either the client or the JNLP, upload documents, then create a backup to make the process simpler in future.
- Shut down Tomcat.
- Add saxon9he.jar to [cocoon]/WEB-INF/lib.
- Change web.xml to set all encodings to UTF-8 (some are still 8859-1 in the default setup). This assumes that the Tomcat process is itself being launched with a UTF-8 flag in the VM, as part of the Java launch command. I do this using a startup_as_utf8.sh file in [Tomcat]/bin:
#!/bin/bash
./startup.sh -Dfile.encoding="UTF-8"
- Configure Cocoon so that Saxon can be called. First, open cocoon/WEB-INF/cocoon.xconf, and find the bit that refers to Saxon XSLT, which is commented out by default. Uncomment the code and change it according to the instructions in the file, so that it enables Saxon 9:
<component logger="core.xslt"
    role="org.apache.excalibur.xml.xslt.XSLTProcessor/saxon"
    class="org.apache.cocoon.components.xslt.TraxProcessor">
  <parameter name="use-store" value="true"/>
  <parameter name="transformer-factory" value="net.sf.saxon.TransformerFactoryImpl"/>
</component>
- Now we need to edit cocoon/sitemap.xmap to enable the Saxon transformer. In the <map:transformers> section, add this below the other XSLT transformers:
<map:transformer name="saxon" pool-grow="2" pool-max="32" pool-min="8"
    src="org.apache.cocoon.transformation.TraxTransformer">
  <use-request-parameters>false</use-request-parameters>
  <use-browser-capabilities-db>false</use-browser-capabilities-db>
  <xslt-processor-role>saxon</xslt-processor-role>
</map:transformer>
- Add the following in the <map:serializers> section, to enable a couple more useful output formats:
<!-- Customization: compatibility setting for IE6 -->
<map:serializer logger="sitemap.serializer.xhtml" mime-type="text/html"
    name="xhtml11_compat" pool-grow="2" pool-max="64" pool-min="2"
    src="org.apache.cocoon.serialization.XMLSerializer">
  <doctype-public>-//W3C//DTD XHTML 1.1//EN</doctype-public>
  <doctype-system>http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd</doctype-system>
  <encoding>UTF-8</encoding>
</map:serializer>
<!-- Customization: set text output to UTF-8 -->
<map:serializer logger="sitemap.serializer.text" mime-type="text/plain"
    name="text" src="org.apache.cocoon.serialization.TextSerializer">
  <encoding>UTF-8</encoding>
</map:serializer>
- In eXist's configuration (WEB-INF/conf.xml), change the indexing settings from this:
<indexer caseSensitive="yes" index-depth="5" preserve-whitespace-mixed-content="no" stemming="no" suppress-whitespace="none" tokenizer="org.exist.storage.analysis.SimpleTokenizer" track-term-freq="yes">
to this:
<indexer caseSensitive="no" index-depth="8" preserve-whitespace-mixed-content="yes" stemming="no" suppress-whitespace="none" tokenizer="org.exist.storage.analysis.SimpleTokenizer" track-term-freq="yes">
- In eXist's conf.xml, change this:
<serializer add-exist-id="none" compress-output="no" enable-xinclude="yes" enable-xsl="no" indent="yes" match-tagging-attributes="no" match-tagging-elements="no">
to this:
<serializer add-exist-id="none" compress-output="no" enable-xinclude="yes" enable-xsl="no" indent="yes" match-tagging-attributes="no" match-tagging-elements="yes">
(just the final attribute changes) to turn on exist:match tagging.
- In eXist's conf.xml, change this:
<index>
  <fulltext attributes="false" default="none">
    <exclude path="/auth"/>
  </fulltext>
</index>
to this:
<index>
  <fulltext attributes="false" default="all">
    <exclude path="/auth"/>
  </fulltext>
</index>
to turn on fulltext indexing. You can also do this by collection in the admin client.
- Restart Tomcat. Check that things work.
I've now made all those changes in the actual files themselves, so the right @target attributes should be pointing at the right reference items. These are things that still need to be cleaned up:
The references for Pyramus and Thisbe are now separate, but they're duplicated. LCC will need to write separate content for each of them.
There are multiple "Humeur" entries (humeurSpermatique, humeurSanguine etc.). These will need to be consolidated so that the single "Humeur" entry covers all the varieties (it might already do that), and the individual sub-entries get deleted. The references in the Sonnet 1621 are already all pointing to "#humeur", the main entry.
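Before consolidating or deleting sub-entries, it could be worth checking which targets in a document no longer resolve to a reference item. A minimal sketch, assuming un-namespaced <item xml:id="..."> entries in the reference file and <ref target="#id"> pointers in the documents (the "#humeur" form mentioned above); dangling_targets is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def dangling_targets(references_xml, document_xml):
    """Return the sorted ids that <ref> elements point at but that
    have no matching <item xml:id="..."> in the reference file."""
    known = {
        item.get(XML_ID)
        for item in ET.fromstring(references_xml).iter("item")
    }
    missing = set()
    for ref in ET.fromstring(document_xml).iter("ref"):
        target = ref.get("target", "")
        if "#" not in target:
            continue  # external link or other non-fragment target
        ref_id = target.rsplit("#", 1)[-1]
        if ref_id not in known:
            missing.add(ref_id)
    return sorted(missing)


if __name__ == "__main__":
    refs = '<list><item xml:id="humeur"/><item xml:id="pyrame"/></list>'
    doc = (
        "<text>"
        '<ref target="#humeur">a</ref>'
        '<ref target="#humeurSanguine">b</ref>'
        "</text>"
    )
    print(dangling_targets(refs, doc))
```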
Biblio markup is now completed; I just need to add the mechanism for sorting it by various parameters on the actual page.
These are changes I've made to xml:ids and <head> elements:
- Removed all initial and final spaces from <head> tags.
- Normalized all saints' names in head tags to: Saint Paul, Saint Autel, etc.
- Changed the following xml:ids:
aecus              -> eaque
aeolus             -> eole
aeneas             -> enee
aegysthus          -> egisthe
aeta               -> aetes
aeneides           -> eneide
iapetus            -> japet
iason              -> jason
ieanneDeFrance     -> jeanneDeFranceSaint
isleDeCornouaille  -> cornouaille
liureDeJuges       -> juges
laCiteDeDieu       -> citeDeDieu
royCharlesVIII     -> charlesVIII
sainctAutel        -> autelSaint
saintPrix          -> prixSaint
saintAngeRaphael   -> raphaelAnge
sainctPaul         -> paulSaint
timothee           -> timotheeSaint
sainctIeanBaptiste -> jeanBaptisteSaint
augustin           -> augustinSaint
saintCene          -> ceneSaint
cocus              -> martial
libra              -> balance
- Removed "(Cocus)" from the Martial entry -- seems very odd that it was there in the first place.
- Fixed some oddities in <head> contents.
- Deleted the "example reference".
- Found the blank entries, which turned out to be more "Humeur" entries (there are now several of these) which had no <head> contents. I've added the appropriate <head> contents so that we can now see what they are; ultimately, Lauren or Leanna will need to check that the overall "humeur" entry covers all these subtopics, then delete the subtopics, and fix any ref tags pointing at them.
- Duplicated the Pyrame entry to create a separate Thisbe entry next to it; someone will need to write the two entries to be complementary, and make sure any ref tags point at the appropriate one.
- Deleted the "cocu" entry. Any ref tags pointing to it will also need to be deleted.
- Deleted the "moliere" entry. Any ref tags will have to be removed in the documents.
- Fixed the entry for Tartare, which had no title but all of its content in the <head> tag.
I've uploaded these changes to references.xml, but I haven't fixed any of the documents which might be pointing to old xml:ids yet. I'll try that tomorrow. In the meantime, I notice there are some titles which have periods at the end, which looks odd...
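One possible way to update the documents is a text-level rewrite of target attributes, driven by the old-to-new xml:id mapping. The sketch below is an assumption about how that might be done, not what was actually run: it works on the raw text so that the rest of the file is left byte-for-byte unchanged, RENAMES holds only a few of the pairs listed above, and retarget is a hypothetical name.

```python
import re

# A few of the old -> new xml:id pairs; the full mapping would go here.
RENAMES = {
    "aecus": "eaque",
    "sainctPaul": "paulSaint",
    "sainctAutel": "autelSaint",
    "libra": "balance",
}

# Matches target="#id" and target="file.xml#id", capturing the id separately.
TARGET_RE = re.compile(r'(\btarget="[^"#]*#)([^"]+)(")')


def retarget(document_text, renames):
    """Rewrite target attributes that point at a renamed xml:id."""
    def swap(match):
        old_id = match.group(2)
        return match.group(1) + renames.get(old_id, old_id) + match.group(3)
    return TARGET_RE.sub(swap, document_text)


if __name__ == "__main__":
    before = '<l>Comme <ref target="#sainctPaul">Paul</ref> le dit</l>'
    print(retarget(before, RENAMES))
```

Targets pointing at deleted entries (such as "cocu" or "moliere") are not touched by this; those refs still have to be removed by hand.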