Old scanner died.
Bought a new one (Canon CanoScan 8800F).
SANE website says it's fully supported by the PIXMA library.
Experience: under Lucid (libsane v1.0.20) I get nothing. I found a reference to a hack which requires editing a udev rule if it doesn't include the scanner info, but it apparently relies on a newer version of SANE (v1.0.21) - of course.
I tried it in Maverick, which includes v1.0.21 and it works.
I've updated my builder scripts for Maverick (Ubuntu 10.10). So far everything seems OK, but I've yet to test in a real-world environment (just in a VM, which has never been a problem).
It includes a backup builder that constructs a customized script for users to use for backing up data.
I'll build an ISO later this week and try it out on Radish.
find . \( -name '.DS_Store' -o -name '._*' -o -name '*~' \) -delete
I have a really solid build script working now. At the beginning, it checks whether Tomcat is running, and if so, shuts it down; then it does the build, does all the patching (with a placeholder for the Analyzer patch if we get a working one), and links in the testing material; then it restarts Tomcat, and sends Firefox to the test site. That's pretty much perfect. I tried turning on the webapp.samples and webapp.test-suite in the Cocoon local.build.properties, but the results were mixed. The Samples page works, but not the "Blocks with samples" page; the eXist page comes up, but all the links are broken and you can't get the server status. Doubtless there are ways to fix this, and if I can do that, then we can really release our build alongside the main builds on the site, because it would be functionally complete. I think it might be worth putting time into that. Alternatively, I could just fix the links manually in a static build and post that, but that would make updates a bit annoying. We should figure out whether all that's wrong is weblinks, or whether there's something else that needs fixing.
This morning I learned how to do patching as part of an Ant build, and integrated the patch for the Snowball Analyzer into the script. However, the patch process fails, so I think the patch must have been generated against an earlier version of the target file. Waiting for confirmation from AR about this. Meanwhile, I'm having trouble making a specified target run or not based on a property in the properties file, for some reason.
Progress today:
I think I'm almost done with this. Remaining bits and pieces:
I've been looking into the differences between the Lucene syntax and the old index system, and realized that I'll have to rewrite some of my backend code for e.g. Mariage to get the advantage of Lucene; and also that Lucene itself requires the use of its XML syntax in order to get support for wildcards. I think the best solution to this is to rewrite my xqSearchUtils Java library so that it can spit out Lucene XML instead of the code aimed at the old eXist. This shouldn't be too hard to do, and it'll make creating good Lucene search interfaces easier.
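As a rough sketch of the kind of output such a rewrite might produce (in Python rather than Java, for brevity): this assumes eXist's XML query format for ft:query, with `<query>`, `<bool>`, `<term>` and `<wildcard>` elements carrying an `occur` attribute. The function name and the "every word is required" policy are invented for illustration; they are not what xqSearchUtils actually does.

```python
import xml.etree.ElementTree as ET


def to_lucene_xml(search):
    """Turn a whitespace-separated search string into a Lucene XML query.

    Words containing * or ? become <wildcard> elements; all others become
    <term> elements. Every word is marked occur="must" (an assumption made
    for this sketch, not a rule taken from the real library).
    """
    query = ET.Element("query")
    bool_el = ET.SubElement(query, "bool")
    for word in search.split():
        tag = "wildcard" if any(c in word for c in "*?") else "term"
        el = ET.SubElement(bool_el, tag, occur="must")
        el.text = word
    # encoding="unicode" returns a str with no XML declaration.
    return ET.tostring(query, encoding="unicode")


if __name__ == "__main__":
    print(to_lucene_xml("mari* femme"))
```

The resulting string could then be passed as the query argument of an ft:query call, if the assumed element names match the eXist version in use.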
RE and co tried following the procedures outlined in my previous post yesterday, to enable Apache to talk to Tomcat over UTF-8, but the final stage in the Apache config screwed up all our virtual host mappings, so that's no good. I went back and looked at the problem again, and determined that the only issue remaining was the quicksearch feature on the Mariage site, and this was because it used a straightforward form submission. I ended up coding a workaround: instead of a straight form submission, a JavaScript function constructs a URL in which the search string is encoded with encodeURIComponent (this percent-encodes the UTF-8 octets); then, as the search page loads, more JS parses the search string out of the URL and decodes it back to UTF-8, which it can then submit to the regular search.
This seems to be working, so we'll stick with it for the moment. If any more problems show up with sites proxied through Apache, we might have to revisit this, but I'm not inclined to take risks with the virtual host sites; they're our most public and important, in some cases.
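The round trip behind the workaround can be illustrated outside the browser. This is a minimal Python sketch, not the site's JavaScript: `quote` with `safe=""` behaves much like encodeURIComponent (it is slightly stricter, since it also escapes characters such as `!` and `'`), and `unquote` plays the role of the decoding step. The page name `search.html` and the parameter name `q` are placeholders, not the names used on the Mariage site.

```python
from urllib.parse import quote, unquote, urlsplit


def build_search_url(base_url, term, param="q"):
    """Build a search URL whose query value is percent-encoded UTF-8."""
    # safe="" also encodes reserved characters such as "/" and "&".
    return f"{base_url}?{param}={quote(term, safe='', encoding='utf-8')}"


def read_search_term(url, param="q"):
    """Pull the raw query value out of the URL and decode it as UTF-8."""
    for pair in urlsplit(url).query.split("&"):
        name, _, value = pair.partition("=")
        if name == param:
            return unquote(value, encoding="utf-8")
    return ""


if __name__ == "__main__":
    url = build_search_url("search.html", "mariée")
    print(url)
    print(read_search_term(url))
```

Because the URL contains only ASCII once it is encoded, nothing between the browser and the search page has a chance to reinterpret the accented characters.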
Today I expanded the test site so that it covers tests of Unicode characters and ft:query searches with match-tagging. These work fine. Then I started working on the FOP configuration, to make fonts available to FOP through a portable configuration in the sitemap, following my own instructions here. However, various things have changed. First of all, to generate the font-metrics files, you need slightly different jar file names. Change to the cocoon/build/webapp/WEB-INF/lib directory, and then issue something like this:
java -cp fop.jar:xercesImpl-2.9.1.jar:xml-apis-1.3.04.jar org.apache.fop.fonts.apps.TTFReader /home/mholmes/cocoon_with_exist/testsite/fop-fonts/GentiumPlus-R.ttf /home/mholmes/cocoon_with_exist/testsite/fop-fonts/GentiumPlus-R.ttf.xml
where the first argument is the path to the TTF, and the second is the path to the metrics file you're creating. I did this for the two Gentium Plus fonts and the four base DejaVu Sans fonts.
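Running that command six times by hand is tedious, so here is a hedged sketch of a small Python helper that builds the same command for each font. The classpath and class name are copied from the command above, and the font names are the ones that appear in the FOP config; the function names and the dry-run behaviour are my own invention. By default it only prints the commands.

```python
import subprocess
from pathlib import Path

# Jar names as used in the command above; run from WEB-INF/lib.
CLASSPATH = "fop.jar:xercesImpl-2.9.1.jar:xml-apis-1.3.04.jar"
FONTS = [
    "GentiumPlus-R", "GentiumPlus-I",
    "DejaVuSans", "DejaVuSans-Bold",
    "DejaVuSans-Oblique", "DejaVuSans-BoldOblique",
]


def ttfreader_command(font_dir, font_name):
    """Return the argument list for one TTFReader run."""
    ttf = Path(font_dir) / f"{font_name}.ttf"
    return ["java", "-cp", CLASSPATH,
            "org.apache.fop.fonts.apps.TTFReader",
            str(ttf), f"{ttf}.xml"]


def build_metrics(font_dir, lib_dir, dry_run=True):
    """Print (or, with dry_run=False, execute) the command for every font."""
    for name in FONTS:
        cmd = ttfreader_command(font_dir, name)
        if dry_run:
            print(" ".join(cmd))
        else:
            # cwd must be the directory that holds the jars on the classpath.
            subprocess.run(cmd, cwd=lib_dir, check=True)


if __name__ == "__main__":
    build_metrics("/home/mholmes/cocoon_with_exist/testsite/fop-fonts",
                  "cocoon/build/webapp/WEB-INF/lib")
```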
Next, I created this customized fo2pdf serializer definition in the sitemap:
<map:components>
  <map:serializers>
    <map:serializer logger="sitemap.serializer.fo2pdf" mime-type="application/pdf" name="fo2pdf_custom" src="org.apache.cocoon.serialization.FOPSerializer">
      <user-config>cocoon://testsite/fop-config.xml</user-config>
    </map:serializer>
  </map:serializers>
</map:components>
This tells FOP to use a specific config file, which is delivered from a pipeline elsewhere in the sitemap. That pipeline looks like this:
<map:match pattern="fop-config.xml">
  <map:generate src="fop-fonts/fop-config-src.xml" />
  <map:transform type="saxon" src="fop-config.xsl">
    <map:parameter name="fontPath" value="{realpath:/}testsite/" />
  </map:transform>
  <map:serialize type="xml"/>
</map:match>
It gets its source from fop-config-src.xml, which looks like this:
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <font metrics-url="fop-fonts/GentiumPlus-I.ttf.xml"
              kerning="yes" embed-url="fop-fonts/GentiumPlus-I.ttf">
          <font-triplet name="GentiumPlus" style="italic" weight="normal"/>
        </font>
        <font metrics-url="fop-fonts/GentiumPlus-R.ttf.xml"
              kerning="yes" embed-url="fop-fonts/GentiumPlus-R.ttf">
          <font-triplet name="GentiumPlus" style="normal" weight="normal"/>
        </font>
        <font metrics-url="fop-fonts/DejaVuSans.ttf.xml"
              kerning="yes" embed-url="fop-fonts/DejaVuSans.ttf">
          <font-triplet name="DejaVuSans" style="normal" weight="normal"/>
        </font>
        <font metrics-url="fop-fonts/DejaVuSans-Bold.ttf.xml"
              kerning="yes" embed-url="fop-fonts/DejaVuSans-Bold.ttf">
          <font-triplet name="DejaVuSans" style="normal" weight="bold"/>
        </font>
        <font metrics-url="fop-fonts/DejaVuSans-Oblique.ttf.xml"
              kerning="yes" embed-url="fop-fonts/DejaVuSans-Oblique.ttf">
          <font-triplet name="DejaVuSans" style="italic" weight="normal"/>
        </font>
        <font metrics-url="fop-fonts/DejaVuSans-BoldOblique.ttf.xml"
              kerning="yes" embed-url="fop-fonts/DejaVuSans-BoldOblique.ttf">
          <font-triplet name="DejaVuSans" style="italic" weight="bold"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
That source is transformed into the actual config file using this XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
  <!-- Only declared prefixes may be listed in exclude-result-prefixes. -->
  <xsl:param name="fontPath" />
  <!-- On Windows, Cocoon's {realpath:/} omits the trailing slash. Since we
       know where to expect it, we should be able to fix this. -->
  <xsl:variable name="fixedFontPath" select="if (contains($fontPath, '\')) then replace(replace($fontPath, 'testsite', '/testsite'), '/', '\\') else $fontPath" />
  <!-- XSLT Template to copy anything, priority="-1" -->
  <xsl:template match="@*|node()|text()|comment()|processing-instruction()" priority="-1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()|text()|comment()|processing-instruction()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Massage the path attributes. -->
  <xsl:template match="@metrics-url">
    <xsl:attribute name="metrics-url"><xsl:value-of select="$fixedFontPath"/><xsl:value-of select="."/></xsl:attribute>
  </xsl:template>
  <xsl:template match="@embed-url">
    <xsl:attribute name="embed-url"><xsl:value-of select="$fixedFontPath"/><xsl:value-of select="."/></xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
Note: the structure of this file has changed considerably for the new FOP 1.0. The reason we use this setup instead of a hard-coded fop-config.xml is that we want our projects to be entirely portable without configuration changes; this system fills in the required hard-coded paths to the fonts directory on the fly, using Cocoon's realpath module, so it effectively makes those paths relative in the source XML file.
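To make the effect of that transformation concrete, here is a small Python sketch that applies the same rule to a cut-down copy of the source file: prefix each metrics-url and embed-url with the font path, after the same Windows fix-up the stylesheet performs. It only illustrates what the pipeline produces; it is not part of the build, and the install paths in the example are invented.

```python
import xml.etree.ElementTree as ET

# One font entry, cut down from fop-config-src.xml.
SOURCE = """<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <font metrics-url="fop-fonts/GentiumPlus-R.ttf.xml"
              kerning="yes" embed-url="fop-fonts/GentiumPlus-R.ttf">
          <font-triplet name="GentiumPlus" style="normal" weight="normal"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>"""


def fix_font_path(font_path):
    """Mirror the stylesheet's fixedFontPath variable."""
    if "\\" in font_path:
        # Windows: restore the missing separator, then use backslashes.
        return font_path.replace("testsite", "/testsite").replace("/", "\\")
    return font_path


def absolutize(config_xml, font_path):
    """Return the parsed config with both URL attributes prefixed."""
    root = ET.fromstring(config_xml)
    prefix = fix_font_path(font_path)
    for font in root.iter("font"):
        for attr in ("metrics-url", "embed-url"):
            font.set(attr, prefix + font.get(attr))
    return root


if __name__ == "__main__":
    # Hypothetical install location, for illustration only.
    result = absolutize(SOURCE, "/opt/tomcat/webapps/cocoon/testsite/")
    print(ET.tostring(result, encoding="unicode"))
```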
This actually works, in that I can use the GentiumPlus font family to render Russian characters (which will not render with the default fonts). One gotcha: I didn't realize how important it is to restart the servlet (Cocoon, or better, Tomcat); just making changes to the sitemap and other files seems to have little effect without this, so perhaps Cocoon reads and caches the config file for FOP on startup (or FOP gets started up when Cocoon is started, and reads it then).
I've also set up my test setup so that:
Next steps:
I've refined the build a bit to add a couple of things we need, and modified some of RVDB's preferences to match our own; there will probably be some more of this work to do. I've also written a tiny test site that we can use to quickly confirm that everything is working. And it is working! We now have what looks like a reliable build script that can be run any time. We should run it regularly (once a month?) and archive the builds, so that we have the potential to roll things back if a future build goes bad. Next steps:
To access taporshare on Mac OS X:
In the Finder, press command-k to open the Connect to Server dialogue
URI - smb://taporshare.tapor.uvic.ca/<sharename>
Username - uvic\<netlinkId>
If you forget the "uvic\" in the username field the authentication will fail.
Investigation and testing related to the container-encoding setting in the new Cocoon build process led me to discover a bug that's currently affecting sites on Pear's Tomcat-dev when accessed through Apache. Here's an illustration of the problem:
If you go to the Mariage site search page on Pear, accessed on its Tomcat port, and search for "mariée", you'll get correct results. However, if you access the site through Apache and the virtual domain and do the same search, you'll get garbled results.
The problem seems to be this:
We build our recent Cocoon stacks as all-UTF-8, and set up Tomcat as well to use UTF-8, but it appears that the last stage in the process, when Apache talks to Tomcat, is not working in UTF-8. We've done a bit of research, and based on this page:
http://confluence.atlassian.com/display/DOC/Using+Apache+with+mod_jk
Two things may need to be changed. First, the AJP connector in Tomcat's server.xml:
<!-- Define an AJP 1.3 Connector on port 8009 -->
<Connector port="8019" protocol="AJP/1.3" redirectPort="8081" />
changed to:
<!-- Define an AJP 1.3 Connector on port 8009 -->
<Connector port="8019" protocol="AJP/1.3" redirectPort="8081" URIEncoding="UTF-8" />
Second, in the Apache mod_jk configuration:
JkOptions +ForwardURICompatUnparsed
For the moment, this only applies to Tomcat-dev; Pear's Tomcat-stable is running legacy projects which operate in 8859-1 encoding, and they're working fine.
Wrote to sysadmin to request that they look at this and see if it makes sense.
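For reference, the kind of garbling described above is what you would expect if UTF-8 bytes are read as ISO-8859-1 somewhere along the way. The following Python sketch reproduces the symptom with the same search term. It does not prove which layer (Apache, mod_jk or Tomcat) is doing the mis-decoding; it only shows that the symptom is consistent with that hypothesis. The function names are mine.

```python
def misdecode(text):
    """Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled):
    """Undo misdecode(): recover the original bytes and decode as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    garbled = misdecode("mariée")
    print(garbled)          # the two-byte é shows up as two characters
    print(repair(garbled))  # round-trips back to the original term
```

The single character é is two octets in UTF-8 (C3 A9), and each octet maps to its own character in ISO-8859-1, which is why one accented letter turns into two odd-looking ones.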
I've made the following changes to RVDB's Ant script:
In the directory where build.xml is located, created a fop subdirectory, and inside that, a jars_replaced directory, where I'm going to store Cocoon jars that have been superseded by newer versions from FOP.
Created a fop-jars.xsl file inside the fop directory, to process and update Cocoon's jars.xml file (modelled on exist-jars.xsl, which does the same thing).
Added an upgrade_fop target, and included it in the depends list of the prepare target:
<target description="Replace FOP 0.95 with FOP 1.0" name="upgrade_fop">
  <echo message="-----------------------------------------------"/>
  <echo message="Upgrading FOP from 0.95 to 1.0 "/>
  <echo message="-----------------------------------------------"/>
  <echo message="-----------------------------------------------"/>
  <echo message="Patching jars.xml "/>
  <echo message="-----------------------------------------------"/>
  <xslt-saxon in="${cocoon.home}/lib/jars.xml" style="fop/fop-jars.xsl" out="${cocoon.home}/lib/jars.xml.patched">
  </xslt-saxon>
  <move file="${cocoon.home}/lib/jars.xml.patched" tofile="${cocoon.home}/lib/jars.xml"/>
  <echo message="-----------------------------------------------"/>
  <echo message="Removing jars to be replaced "/>
  <echo message="-----------------------------------------------"/>
  <move failonerror="false" file="${cocoon.home}/lib/optional/batik-all-1.6.jar" tofile="fop/jars_replaced/optional/batik-all-1.6.jar"/>
  <move failonerror="false" file="${cocoon.home}/lib/optional/fop-0.95.jar" tofile="fop/jars_replaced/optional/fop-0.95.jar"/>
  <move failonerror="false" file="${cocoon.home}/lib/optional/xmlgraphics-commons-1.3.1.jar" tofile="fop/jars_replaced/optional/xmlgraphics-commons-1.3.1.jar"/>
  <echo message="-----------------------------------------------"/>
  <echo message="Adding replacement jars and new required jars "/>
  <echo message="-----------------------------------------------"/>
  <copy todir="${cocoon.home}/lib/optional" overwrite="true">
    <fileset dir="${fop.home}/lib">
      <include name="batik-all-1.7.jar"/>
      <include name="xmlgraphics-commons-1.4.jar"/>
      <include name="serializer-2.7.0.jar"/>
    </fileset>
  </copy>
  <copy todir="${cocoon.home}/lib/optional" overwrite="true">
    <fileset dir="${fop.home}/build">
      <include name="fop.jar"/>
    </fileset>
  </copy>
  <copy todir="${cocoon.home}/lib/endorsed" overwrite="true">
    <fileset dir="${fop.home}/lib">
      <include name="xml-apis-ext-1.3.04.jar"/>
    </fileset>
  </copy>
</target>
<target description="Prepare the cocoon and eXist directories"
        name="prepare" depends="update_exist, copy_exist, prepare_cocoon, upgrade_fop, build_cocoon, copy_additional_jars, patch_cocoon">
</target>
What I don't know yet is whether anything else works properly. This is what I'll have to do next:
patch_web_xml.xsl:
<!-- change default encodings to UTF-8 -->
<!-- except for servlet[servlet-name='cocoon']/init-param/param-name[. = 'container-encoding']: should stay ISO-8859-1, see
 -http://markmail.org/message/nm6bnvqztbee4s5o
 -http://markmail.org/message/jt256gl3magir6g4
 -http://wiki.apache.org/cocoon/RequestParameterEncoding#A3._Decoding_incoming_requests:_Servlet_Container
-->
We had form submission working fine with both GET and POST, using all-UTF-8 settings; if possible, I'd prefer this, but it may be that something has changed in Cocoon 2.1.12x which screws this up.
I discovered that the problem I was having with RVDB's Ant script, which I've modified, was actually with Saxon 9.3, which for some reason was failing to do transformations properly. I've rolled back to Saxon 9.2, and now the build goes OK and the resulting webapp works fine. Next I need to integrate the FOP instructions into the Ant build.
RVDB sent a solution to my Ant build problem, which basically works -- the eXist build.sh is now found -- so I've integrated that, and I've also found a better solution to using Saxon for XSLT, that doesn't depend on its being on the CLASSPATH (which I want to avoid, because its version numbers are likely to change over time). I found the solution in one of the comments on this page, and it's basically this:
<macrodef name="xslt-saxon">
  <attribute name="in"/>
  <attribute name="out"/>
  <attribute name="style"/>
  <sequential>
    <echo level="info">XSLT Generating @{out}</echo>
    <java classname="net.sf.saxon.Transform"
          classpath="${saxon.home}/saxon9he.jar"
          logError="true"
          output="@{out}"
          fork="true">
      <arg value="@{in}"/>
      <arg value="@{style}"/>
    </java>
  </sequential>
</macrodef>
<xslt-saxon in="${cocoon.home}/build/webapp/WEB-INF/cocoon.xconf"
            style="cocoon/patch_cocoon_xconf.xsl"
            out="${cocoon.home}/build/webapp/WEB-INF/cocoon.xconf.patched">
</xslt-saxon>
Now I have RVDB's whole build process working OK, but the resulting webapp is broken; XSLT transformations don't seem to work. I suspect this is something to do with the way Saxon is being set up in the sitemap, because I'm using a more recent version of Saxon (HE, 9.3, as opposed to the three-jar version 9). I'll work on this issue tomorrow.
This blog is the location for all work involving software and hardware maintenance, updates, installs, etc., both routine and urgent, in the server room, the labs and the R&D rooms.