MJ provided me with a preview of the new version of the ACH site, which combines the old web.uvic.ca front-end with the eXist-based abstract/program pages, in a single Cocoon project. This is working well, with only one or two little glitches to solve. Took a look through the code, and sent back a couple of suggestions for solving an outstanding issue with abstract 203.
Category: "Activity log"
I have enough hardware to replace all of the noisy old TAPoR machines, including a groovy Acer Revo nettop (which looks suitable for use in our context, and is really cheap). Three of the machines are built, and I have some RAM and graphics cards coming to spruce them up for dual-monitor use. This *should* allow us to make the most of what we have for a couple of years in B045. I haven't tracked my time on this as it's kind of pedestrian stuff, not really blog-worthy, but very time-consuming.
I've also been working with CS on a login/auth scheme that will allow us to configure each machine to use LDAP to allow Netlink holders access to the box. The PAM stack config has been done, and we're now looking at the skel/profile config so that new users will have a homedir etc. when they log in for the first time.
MJ provided a download of the Graves webapp, which I grabbed and tested locally. It worked fine. Pushed it up to the server, and had a couple of failures; I then discovered the tree was full of shadow files beginning with dot+underscore, which I think came from MJ's Mac. Deleted all of those, and then uploaded to the main Tomcat. To make it work, we then had to get the Data folder chowned to apachsrv by sysadmin -- reminder to self and Greg, to follow up on this, because when we met with RE, he was going to look into the possibility of giving us access to a script we could run to do this.
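For next time, here is a minimal Python sketch of the shadow-file cleanup. It assumes the unwanted files are exactly those whose names begin with dot+underscore; the function names are my own, and it defaults to a dry run so you can inspect the list before anything is deleted.

```python
import os


def find_shadow_files(root):
    """Return sorted paths of all files under root whose names start with '._'."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("._"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def remove_shadow_files(root, dry_run=True):
    """List the shadow files under root; delete them only when dry_run is False."""
    found = find_shadow_files(root)
    if not dry_run:
        for path in found:
            os.remove(path)
    return found
```

Run it once with the default dry_run=True against a copy of the webapp tree, check the returned list, and only then call it again with dry_run=False.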
One remaining problem is the character encoding in the submission form, which is suffering from the issue with server.xml listed here. We're working on that. Once that's fixed, there's a possibility it will break existing forms in e.g. Scancan, but that's probably going to be easy to fix; and in any case, Scancan is another candidate for a port to eXist 1.4.
Had a meeting with MJ to look at the last few remaining problems. It looks as though the character encoding issue in the search form may just be a question of starting up Tomcat with UTF-8. Other display issues look like they may be related to extra whitespace in the XML, caused either by the expansion of entities, or by errors or laxness in the original markup. In any case, they may be fixable by judicious use of <xsl:strip-space> combined with <xsl:preserve-space>. Hoping to get this port finished by the end of the week.
MDH posting some edited highlights of MJ's work porting the Graves project from eXist 0.9 to 1.4, as a guide to future porting of old projects, and then the subsequent port of the ACH abstracts project. This will be enhanced and updated over time.
- Moved the Graves application code into a Cocoon subfolder called site. This is our current preferred method of doing things; it allows us to retain ownership of the files containing the application logic, while the WEB-INF directory can be owned by the user under which Tomcat is running, so it can work with the database.
- Moved XQuery files from the project root into an /xq/ subfolder. This is common sense, and is our current practice in all our projects.
- Added the session namespace to XQuery files:
  declare namespace session="http://exist-db.org/xquery/session";
- Changed request:request-parameter to request:get-parameter everywhere.
- Changed util:eval to util:eval-with-context. [Note: I'm not clear on why this was needed, or what the differences are; we should revisit this. I use util:eval in many projects.]
- In addition to the changes from util:eval() to util:eval-with-context(), in some cases I was able to simply remove the erroneous second parameter in the function call:
  let $index_query := f:build-index-query("/db/graves/index", $year, $month),
  OLD: $index := util:eval($index_query, "")
  NEW: $index := util:eval($index_query)
- Forced some variables to be cast as xs:int, as eXist is more strict about type checking. This was required throughout the original XQuery.
  OLD: $prev_day := if(($month eq 3) and ($year eq 1935)) then '22' else '01',
  NEW: $prev_day := if(($month cast as xs:int eq 3) and ($year cast as xs:int eq 1935)) then '22' else '01',
- Updated calls to request:encode-url, which now requires anyURI type parameters.
  OLD: $doc_uri_request := request:encode-url("xrequest.xq")
  NEW: $doc_uri_request := session:encode-url(request:request-uri())
- Properly escaped ampersands in URL-like places.
  OLD: <prev href="{$doc_uri_request}?action=search&collection=...
  NEW: <prev href="{$doc_uri_request}?action=search&amp;collection=...
- Updated calls to request:set-session-attribute() to session:set-attribute().
  OLD: request:set-session-attribute("query_stored", "true")
  NEW: session:set-attribute("query_stored", "true")
- Changes either to the definition or the functionality of substring mean that "bad" code that used to function OK now needs to be fixed; an initial start-position parameter of 0 should have been 1, and although the code used to work (presumably where an invalid value was encountered, eXist substituted 1), now the zero is used as given, so one fewer character is returned. Check any start-position params in substring to make sure they're 1-based.
- Image width and height attributes have been fixed (960 vs 906px).
- Added @summary to all tables, with an empty value if it's a layout table. The attribute is required for validation.
- Fixed retrieval of multiple instances of reference items when only one was required (and since they have unique ids, the result was invalidity due to duplicate ids):
  OLD: <xsl:apply-templates select="//rs" mode="annotation" />
  NEW: <xsl:for-each-group select="//rs" group-by="@key">
         <xsl:apply-templates select="." mode="annotation" />
       </xsl:for-each-group>
- Where older processors expecting an atomic value, when passed a sequence, would use the first element of the sequence, Saxon seems to be more picky. You have to make sure you're passing an atomic value rather than a sequence if that's what the processor expects:
  OLD: <xsl:sort select="author/name/@reg" />
  NEW: <xsl:value-of select="author[1]/name[1]/@reg" />
- All <xsl:element> and <xsl:attribute> constructors were replaced by simple inline XHTML code, except where they're necessary. This makes for simplicity and readability.
- The Cocoon XInclude transformer appears to be rather unintelligent when it comes to the output of xmlns attributes; it leaves all sorts of unnecessary ones in the output code. MJ came up with a much better solution: a template in the XSLT (2.0) that matches <xi:include>, and does this:
  <xsl:copy-of select="fn:doc('../includes/header.inc')" />
  The details are still being worked out; one possible issue is the requirement to know the relative location of the included document during the XSLT process, and another might be when the included document is in the eXist db, not on the file system.
- For the ACH, in which the project includes merging a static website (hosted originally on web.uvic.ca) with the Cocoon webapp, there was also a need to change the directory structure slightly, so that all documents are served from URLs in the project root (cocoon/site). Previously some were served out of subfolders. This standardization is a good idea generally, and makes it much easier to map the inclusion of images and so on. All my recent projects already work this way.
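The substring item in the list above is easy to get wrong, so here is a small Python sketch that emulates the standard XPath rule for integer arguments: positions are 1-based, and the characters returned are those at positions from start up to (but not including) start + length. The function name is my own, and this is only an illustration of the rule, not of eXist's implementation.

```python
def xpath_substring(s, start, length=None):
    """Emulate XPath substring() for integer arguments (1-based positions)."""
    if length is None:
        # Two-argument form: everything from position `start` onwards.
        return "".join(ch for pos, ch in enumerate(s, start=1) if pos >= start)
    end = start + length  # first position NOT included
    return "".join(ch for pos, ch in enumerate(s, start=1) if start <= pos < end)


print(xpath_substring("Graves", 1, 3))  # Gra
print(xpath_substring("Graves", 0, 3))  # Gr
```

With a start of 0, the window covers positions 0, 1 and 2, and since there is no character at position 0, only two characters come back. That is the "one fewer character" effect described above.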
These are errors relating to the XML of the Graves project itself, not likely to be encountered in other projects:
- <div type="Abstract"> should be <div type="abstract">.
- @TEIform attributes are unnecessary and have been deleted; these were, we think, a byproduct of the DTD expansion process that also expanded the entities (which we wanted).
And errors specific to the ACH:
- Missing "http://" at the beginning of a URL in abstract 154.
- Failure of FO for abstract 203 to load (still under investigation).
- Layout issue with the panelist list in abstract 147 (still under investigation).
Created a download archive of the project, along with the XML files, and posted those for MJ; also trawled through any docs I could find, and sent those, along with three initial points to look at:
- One namespace is different:
  OLD: declare namespace f="http://exist-db.org/xquery/local-functions";
  NEW: declare namespace f="http://exist-db.org/xquery/f-functions";
- One function name has changed:
  OLD: request:request-parameter
  NEW: request:get-parameter
- The need to expand external entities from the DTD in the XML files before uploading; they're trivial entities that map to perfectly good Unicode chars. The files should be free of external entities and encoded in UTF-8.
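A rough pre-upload check for that last point might look like the Python sketch below. It is an assumption-laden heuristic rather than a real XML parse: it treats any named entity reference other than the five predefined XML ones as "unexpanded", and it will also flag matches inside comments or CDATA sections. The function and constant names are mine.

```python
import re

# The five entities every XML parser knows without a DTD.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Named entity references such as &eacute; (numeric references are not matched).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def check_xml_bytes(data):
    """Return a list of problems found in the raw bytes of an XML file."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8: {err}"]
    problems = []
    named = sorted({m.group(1) for m in ENTITY_REF.finditer(text)} - PREDEFINED)
    if named:
        problems.append("unexpanded entities: " + ", ".join(named))
    return problems
```

Read each file in binary mode (so the encoding check sees the raw bytes) and pass the contents to check_xml_bytes; an empty list means neither problem was detected.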
MJ, working on the Coldesp dollar to relieve me, is going to be looking at the old Graves and Abstracts sites and migrating them to a modern Cocoon/eXist, each in its own separate instance, for deployment on Lettuce. Today, for the first time since the stable 1.4 forked, I tried to build eXist/Cocoon again, and found a pile more problems that I had to solve -- mainly jars.xml issues, which I was able to fix by creating my own patch XSLT file that's run after the one packaged in eXist. After a couple of hours I got it working, and posted a working Cocoon/eXist on lancenrd for MJ to download. It's down to 59MB.
The eXist on Mustard isn't responding to anything except XQuery; no client seems to be available, or JNLP. I guess we'll have to recreate the db in the new setup.
As a test, I set up a local tomcat with two webapps directories.
In server.xml, at the very end, is a stanza like this:
<Host name="localhost" appBase="webapps" unpackWARs="true" autoDeploy="true" xmlValidation="false" xmlNamespaceAware="false"> </Host>
If you add another one, with a different name (e.g. name="something.else.com") you can deploy webapps from all over the place. This could be useful when you want to set up a complete Cocoon in a homedir, and not on the server's local disk, as we're doing with mariage v4. I successfully added a webapps directory that was in a folder in my SAN homedir that had been mounted over samba.
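If we end up doing this on several machines, the edit could be scripted. The Python sketch below adds a second Host element next to the existing one; it assumes the usual Server/Service/Engine/Host nesting, the function name and the example path are hypothetical, and ElementTree drops comments and the XML declaration when it re-serializes, so treat the output as something to diff against the original rather than a drop-in replacement.

```python
import xml.etree.ElementTree as ET


def add_host(server_xml, name, app_base):
    """Return server.xml text with an extra <Host> appended to the Engine."""
    root = ET.fromstring(server_xml)
    engine = root.find("./Service/Engine")
    if engine is None:
        raise ValueError("no Service/Engine element found")
    if any(host.get("name") == name for host in engine.findall("Host")):
        raise ValueError(f"a Host named {name!r} already exists")
    # Mirror the attributes of the stock localhost stanza shown above.
    ET.SubElement(engine, "Host", {
        "name": name,
        "appBase": app_base,
        "unpackWARs": "true",
        "autoDeploy": "true",
    })
    return ET.tostring(root, encoding="unicode")
```

For example, add_host(text, "something.else.com", "/home/someuser/webapps") would return the configuration with the second stanza added, where the appBase path is a made-up stand-in for a mounted homedir folder.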
Hopefully this means that we can do the same on the web servers. The only issue that might come up is those pesky apachsrv permissions. Not sure how best to deal with those...
Removed the exist-samples directory, and confirmed that it's not needed; both Mariage and the eXist client work, and so do the admin interface features of the web interface. I've added that deletion into our script. I've also revised the script so that it saves a backup of the core eXist folder from SVN, and allows you to choose to use that if you'd prefer it, rather than a completely fresh checkout. That'll save time when the build process is being run many times, and might get aborted because of problems further down in the script.
Then went on to remove all the Cocoon samples which aren't needed, but restoring the file structure, the sitemaps, and the eXist block structure. That gives us the usual eXist admin pages, which are sometimes handy to have. The result is a total size of 100MB before a project is deployed in it. That's pretty darned good, I think. There may be other places I can trim, but the size of the project code is going to be a more significant factor than this in many cases, so I think this is a good basis to start with. Tested the new build script, and it works fine. Now I'm going to look at providing a test case for the XQuery Generator bug.
While there is still an annoying bug in the eXist trunk (no <exist:match> tags when running a query through the XQuery Generator in Cocoon, while we do get them when running the same query through the admin client), the main eXist 1.1.1 install on the production Tomcat has become an ongoing disaster, and is now so flaky that I absolutely must move forward with a deployment of Mariage to a new stack. I noticed that there's a limited amount of space on the tomcat-dev file space, so I now have to trim down Cocoon as far as I can. I've started by trying to build a really lean Cocoon, by working through my local.blocks.properties and disabling as many blocks as I think I can get away with.
This is rather a trial-and-error thing, because although the file does provide some documentation on what blocks are dependent on others, the comments say explicitly that it's partial and you can't rely on it. My first attempt failed to build. Judging by the error messages, there may be a requirement for the forms block, so I've now gone back and enabled the ajax, forms and template blocks (which are interdependent). That didn't work, so I went back again and decided I needed to enable javaflow and ojb. Tried again.
That got me a complete, working Cocoon, with a working eXist, to which I restored the Mariage project -- but the site failed to come up because of a missing text generator! I figure this might be in the chaperon block, so I tried enabling that and...
SUCCESS! A working stack, into which I was able to restore the Mariage site and confirm all is working well. This stack comes in at only 495MB, compared with the 1.2GB we had before, so it's a huge step forward. It might still be possible to remove a lot of cruft and get the size of the thing down a bit, but 356MB of that is the Mariage site itself, which is heavy on hi-res images. So the Cocoon/eXist block comes in at about 140MB, which is not a great deal. Now it becomes practical to deploy leaner projects. We're go for tomorrow on the dev Tomcat, I think...