Archives for: December 2010

23/12/10

Permalink 08:49:05 am, by Greg, 73 words, 127 views   English (CA)
Categories: Labs, Activity log; Mins. worked: 15

New scanner

Old scanner died.
Bought a new one (Canon CanoScan 8800F).
SANE website says it's fully supported by the PIXMA library.

Experience: under Lucid (libsane v1.0.20) I get nothing. I found a reference to a hack that involves editing a udev rule when the rule doesn't include the scanner info, but it apparently relies on a newer version of SANE (v1.0.21), of course.

I tried it in Maverick, which includes v1.0.21, and it works.

21/12/10

Permalink 04:31:56 pm, by Greg, 63 words, 486 views   English (CA)
Categories: Labs, Activity log, Announcements; Mins. worked: 400

Lab machines builder scripts

I've updated my builder scripts for Maverick (Ubuntu 10.10). So far everything seems OK, but I've only tested in a VM (which has never been a problem), not yet in a real-world environment.

It includes a backup builder that generates a customized script users can run to back up their data.

I'll build an ISO later this week and try it out on Radish.

17/12/10

Permalink 10:57:41 am, by mholmes, 14 words, 64 views   English (CA)
Categories: R & D, Documentation; Mins. worked: 0

Deleting cursed .DS_Store files

find . \( -name '.DS_Store' -o -name '._*' -o -name '*~' \) -delete
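
Because -delete is irreversible, it can be worth listing the matches before removing them. The following is a minimal sketch along those lines; the function name clean_junk and its "delete" argument are hypothetical, and the find expression is the same as the one above.

```shell
# Hypothetical helper: list the junk files by default, and delete them only
# when the second argument is "delete".
clean_junk() {
  dir="${1:-.}"
  mode="${2:-list}"
  if [ "$mode" = "delete" ]; then
    find "$dir" \( -name '.DS_Store' -o -name '._*' -o -name '*~' \) -delete
  else
    find "$dir" \( -name '.DS_Store' -o -name '._*' -o -name '*~' \) -print
  fi
}
```

Running clean_junk . shows what would go; clean_junk . delete then removes it.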

10/12/10

Permalink 01:56:30 pm, by mholmes, 201 words, 74 views   English (CA)
Categories: R & D, Activity log, Documentation; Mins. worked: 180

Build script is finally there for Cocoon+eXist+FOP

I have a really solid build script working now. At the beginning, it checks whether Tomcat is running, and if so, shuts it down; then it does the build, does all the patching (with a placeholder for the Analyzer patch if we get a working one), and links in the testing material; then it restarts Tomcat and sends Firefox to the test site. That's pretty much perfect.

I tried turning on webapp.samples and webapp.test-suite in the Cocoon local.build.properties, but the results were mixed. The Samples page works, but the "Blocks with samples" page does not; the eXist page comes up, but all its links are broken and you can't get the server status. Doubtless there are ways to fix this, and if I can, then we can release our build alongside the main builds on the site, because it would be functionally complete. I think it might be worth putting time into that. Alternatively, I could just fix the links manually in a static build and post that, but that would make updates a bit annoying. We should figure out whether broken links are the only problem, or whether something else needs fixing.

Permalink 11:11:44 am, by mholmes, 76 words, 80 views   English (CA)
Categories: R & D, Activity log; Mins. worked: 45

eXist patch for Snowball Analyzer not working yet

This morning I learned how to do patching as part of an Ant build, and integrated the patch for the Snowball Analyzer into the script. However, the patch process fails, so I think the patch must have been generated against an earlier version of the target file. Waiting for confirmation from AR about this. Meanwhile, I'm having trouble making a specified target run or not based on a property in the properties file, for some reason.

09/12/10

Permalink 04:41:17 pm, by mholmes, 362 words, 91 views   English (CA)
Categories: R & D, Activity log, Documentation; Mins. worked: 240

More fonts for FOP, and an improved build script for Cocoon/eXist/FOP

Progress today:

  • Integrated one CJK font (TakaoMincho) into the FOP config and set up the test code to use it for Japanese text. We limited this to one font, because adding it seems to increase the base memory footprint of Cocoon by about 80MB; that's significant, and there's no need to overload the test environment by adding more.
  • Added a series of new targets to the Ant build file to first remove the symbolic link to the test environment inside the build tree before doing anything else (because otherwise the build clean process deletes the target directory's contents!), then recreate the link at the end of the process. It also checks to see if there's a link in the Tomcat webapps directory on my local machine, and if not, creates one. This means the test site will work out of the box once you start up Tomcat at the end of the build. This involved learning more Ant stuff, which is useful. I'm getting the hang of it now.
  • Tested the new build with the Moses code, and everything works just great.
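
The link handling in the second bullet is done with Ant targets in the real build. As an illustration only, the same ordering (drop the link, run the clean step, put the link back) can be sketched in shell. The function name with_link_removed and its arguments are hypothetical.

```shell
# Hypothetical helper: with_link_removed LINK TARGET COMMAND [ARGS...]
# Removes the symbolic link LINK, runs COMMAND, then recreates LINK -> TARGET.
with_link_removed() {
  link="$1"
  target="$2"
  shift 2
  # Remove only the link itself (no trailing slash), so TARGET is untouched.
  if [ -L "$link" ]; then
    rm "$link"
  fi
  "$@"                      # the clean/build step
  status=$?
  # The clean step may have removed the parent directory, so recreate it.
  mkdir -p "$(dirname "$link")"
  ln -s "$target" "$link"
  return "$status"
}
```

A call might look like with_link_removed build/webapp/testsite /path/to/testsite ant clean, where both paths are placeholders.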

I think I'm almost done with this. Remaining bits and pieces:

  • Have the build check whether Tomcat is running at the beginning, and if so, shut it down. I found some example code for that here.
  • Have the build restart Tomcat at the end of the process, then start Firefox pointing to the test site. This will make it a single operation to update from SVN, build the app, and get the test site up and running.

I've been looking into the differences between the Lucene syntax and the old index system, and realized that I'll have to rewrite some of my backend code for e.g. Mariage to get the advantage of Lucene; and also that Lucene itself requires the use of its XML syntax in order to get support for wildcards. I think the best solution to this is to rewrite my xqSearchUtils Java library so that it can spit out Lucene XML instead of the code aimed at the old eXist. This shouldn't be too hard to do, and it'll make creating good Lucene search interfaces easier.

08/12/10

Permalink 05:06:32 pm, by mholmes, 191 words, 85 views   English (CA)
Categories: Servers, R & D, Activity log, Documentation; Mins. worked: 90

Character encoding issues on Tomcat-dev through Apache: some solutions

RE and co tried following the procedures outlined in yesterday's post to enable Apache to talk to Tomcat over UTF-8, but the final stage in the Apache config screwed up all our virtual host mappings, so that's no good. I went back and looked at the problem again, and determined that the only remaining issue was the quicksearch feature on the Mariage site, because it used a straightforward form submission. I ended up coding a workaround: a JavaScript function replaces the straight form submission. The JS constructs a URL whose search string is encoded with encodeURIComponent (this percent-encodes the UTF-8 octets); then, as the search page loads, more JS parses the search string, decodes it back to UTF-8, and submits it to the regular search.

This seems to be working, so we'll stick with it for the moment. If any more problems show up with sites proxied through Apache, we might have to revisit this, but I'm not inclined to take risks with the virtual host sites; they're our most public and important, in some cases.
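
To show what the encoding step produces, here is a minimal shell sketch of percent-encoding UTF-8 octets. It is not the site's code (that is JavaScript using encodeURIComponent); the function name percent_encode is hypothetical, and unlike the JavaScript function it also escapes the characters ! * ' ( ).

```shell
# Hypothetical helper: percent-encode every byte outside a small safe set.
# The body runs in a subshell so the LC_ALL change does not leak out.
percent_encode() (
  LC_ALL=C
  out=""
  # od prints one two-digit hex value per input byte; -v stops it from
  # collapsing repeated lines into "*".
  for hex in $(printf '%s' "$1" | od -An -v -tx1); do
    # Turn the hex value back into the byte it represents.
    char=$(printf "\\$(printf '%03o' "0x$hex")")
    case "$char" in
      [A-Za-z0-9._~-]) out="$out$char" ;;
      *) out="$out%$(printf '%s' "$hex" | tr 'abcdef' 'ABCDEF')" ;;
    esac
  done
  printf '%s\n' "$out"
)

# "mariée" written with octal escapes so the script itself stays ASCII.
percent_encode "$(printf 'mari\303\251e')"   # prints mari%C3%A9e
```

The encoded string is pure ASCII, which is why it can pass through the Apache-to-Tomcat hop without being affected by the encoding mismatch.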

Permalink 04:14:29 pm, by mholmes, 1032 words, 246 views   English (CA)
Categories: R & D, Activity log, Documentation; Mins. worked: 280

More work on Cocoon + eXist + FOP

Today I expanded the test site so that it covers tests of Unicode characters and ft:query searches with match-tagging. These work fine. Then I started working on the FOP configuration, to make fonts available to FOP through a portable configuration in the sitemap, following my own instructions here. However, various things have changed. First of all, to generate the font-metrics files, you need slightly different jar file names. Change to the cocoon/build/webapp/WEB-INF/lib directory, and then issue something like this:

java -cp fop.jar:xercesImpl-2.9.1.jar:xml-apis-1.3.04.jar org.apache.fop.fonts.apps.TTFReader /home/mholmes/cocoon_with_exist/testsite/fop-fonts/GentiumPlus-R.ttf /home/mholmes/cocoon_with_exist/testsite/fop-fonts/GentiumPlus-R.ttf.xml

where the first is the path to the TTF, and the second the path to the metric file you're creating. I did this for the two Gentium Plus fonts, and four base DejaVu Sans fonts.
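
With six fonts, the same command has to be repeated for each file. A loop along these lines might save some typing; it is only a sketch, assuming it is run from the WEB-INF/lib directory with the jar names given above. The function name make_font_metrics and the DRY_RUN variable are hypothetical.

```shell
# Hypothetical helper: generate a metrics file next to every .ttf in a
# directory. Set DRY_RUN=1 to print the commands instead of running them.
make_font_metrics() {
  fonts_dir="$1"
  classpath="fop.jar:xercesImpl-2.9.1.jar:xml-apis-1.3.04.jar"
  for ttf in "$fonts_dir"/*.ttf; do
    [ -e "$ttf" ] || continue   # the glob matched nothing
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo java -cp "$classpath" org.apache.fop.fonts.apps.TTFReader "$ttf" "$ttf.xml"
    else
      java -cp "$classpath" org.apache.fop.fonts.apps.TTFReader "$ttf" "$ttf.xml"
    fi
  done
}
```

Setting DRY_RUN=1 before calling make_font_metrics on the fop-fonts directory prints the commands first, which is a cheap way to check the paths.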

Next, I created this customized fo2pdf serializer definition in the sitemap:

<map:components>
    <map:serializers>
      <map:serializer logger="sitemap.serializer.fo2pdf" mime-type="application/pdf" name="fo2pdf_custom" src="org.apache.cocoon.serialization.FOPSerializer">
        <user-config>cocoon://testsite/fop-config.xml</user-config>
      </map:serializer>
    </map:serializers>
</map:components>

This tells FOP to use a specific config file, which is delivered from a pipeline elsewhere in the sitemap. That pipeline looks like this:

    <map:match pattern="fop-config.xml">
      <map:generate src="fop-fonts/fop-config-src.xml" />
      <map:transform type="saxon" src="fop-config.xsl">
        <map:parameter name="fontPath" value="{realpath:/}testsite/" />
      </map:transform>
      <map:serialize type="xml"/>	
    </map:match>

It gets its source from fop-config-src.xml, which looks like this:

<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>

        <font metrics-url="fop-fonts/GentiumPlus-I.ttf.xml"
          kerning="yes" embed-url="fop-fonts/GentiumPlus-I.ttf">
          <font-triplet name="GentiumPlus" style="italic" weight="normal"/>
        </font>
        <font metrics-url="fop-fonts/GentiumPlus-R.ttf.xml"
          kerning="yes" embed-url="fop-fonts/GentiumPlus-R.ttf">
          <font-triplet name="GentiumPlus" style="normal" weight="normal"/>
        </font>

        <font metrics-url="fop-fonts/DejaVuSans.ttf.xml"
          kerning="yes" embed-url="fop-fonts/DejaVuSans.ttf">
          <font-triplet name="DejaVuSans" style="normal" weight="normal"/>
        </font>

        <font metrics-url="fop-fonts/DejaVuSans-Bold.ttf.xml"
          kerning="yes" embed-url="fop-fonts/DejaVuSans-Bold.ttf">
          <font-triplet name="DejaVuSans" style="normal" weight="bold"/>
        </font>

        <font metrics-url="fop-fonts/DejaVuSans-Oblique.ttf.xml"
          kerning="yes" embed-url="fop-fonts/DejaVuSans-Oblique.ttf">
          <font-triplet name="DejaVuSans" style="italic" weight="normal"/>
        </font>

        <font metrics-url="fop-fonts/DejaVuSans-BoldOblique.ttf.xml"
          kerning="yes" embed-url="fop-fonts/DejaVuSans-BoldOblique.ttf">
          <font-triplet name="DejaVuSans" style="italic" weight="bold"/>
        </font>

      </fonts>
    </renderer>
  </renderers>
</fop>

That source is transformed into the actual config file using this XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0">
  
  <xsl:param name="fontPath" />
  
  <!-- On Windows, Cocoon's {realpath:/} omits the trailing slash. Since we 
    know where to expect it, we should be able to fix this. -->
  <xsl:variable name="fixedFontPath" select="if (contains($fontPath, '\')) then replace(replace($fontPath, 'testsite', '/testsite'), '/', '\\') else $fontPath" />
  
  <!-- XSLT Template to copy anything, priority="-1" -->
  
  <xsl:template match="@*|node()|text()|comment()|processing-instruction()" priority="-1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()|text()|comment()|processing-instruction()"/>
    </xsl:copy>
  </xsl:template>
  
  <!-- Massage the path attributes. -->
  <xsl:template match="@metrics-url">
    <xsl:attribute name="metrics-url"><xsl:value-of select="$fixedFontPath"/><xsl:value-of select="."/></xsl:attribute>
  </xsl:template>    
  <xsl:template match="@embed-url">
    <xsl:attribute name="embed-url"><xsl:value-of select="$fixedFontPath"/><xsl:value-of select="."/></xsl:attribute>
  </xsl:template>  
  
</xsl:stylesheet>

Note: the structure of this file has changed considerably for the new FOP 1.0. The reason we use this setup instead of a hard-coded fop-config.xml is that we want our projects to be entirely portable without configuration changes; this system fills in the required hard-coded paths to the fonts directory on the fly, using Cocoon's realpath module, so it effectively makes those paths relative in the source XML file.

This actually works, in that I can use the GentiumPlus font family to render Russian characters (which will not render with the default fonts). One gotcha: I didn't realize how important it is to restart the servlet (Cocoon, or better, Tomcat); just making changes to the sitemap and other files seems to have little effect without this, so perhaps Cocoon reads and caches the config file for FOP on startup (or FOP gets started up when Cocoon is started, and reads it then).

I've also set up my test setup so that:

  • The built webapp is symbolic-linked from the Tomcat webapps directory, so I actually don't need to copy it into Tomcat.
  • The testsite code is symbolic-linked from inside the built webapp, so I don't need to copy it into the webapp. Obviously I'll have to have the Ant build recreate this link at the end of the build. I haven't done that yet.
  • To make the tests work, you still have to manually upload some stuff into the database through the client. This is OK, really, as it's a test of the client, but it would be cool if this could be done somehow from the Ant build as well.

Next steps:

  • Create a collection of fonts and font-metrics files that cover all the ranges we care about (we need something for CJK, and perhaps also Aboriginal Sans). We need a licence-free redistributable font array that will cover all our needs. For any given project, it would be trimmed down, of course.
  • Enhance the Ant build script as above.
  • Run another test build and archive the results.
  • Start testing real sites inside the new build.

07/12/10

Permalink 03:26:10 pm, by mholmes, 187 words, 119 views   English (CA)
Categories: R & D, Activity log, Documentation; Mins. worked: 240

Working Cocoon+eXist+FOP 1.0 build script

I've refined the build a bit to add a couple of things we need, and modified some of RVDB's preferences to match our own; there will probably be some more of this work to do. I've also written a tiny test site that we can use to quickly confirm that everything is working. And it is working! We now have what looks like a reliable build script that can be run any time. We should run it regularly (once a month?) and archive the builds, so that we have the potential to roll things back if a future build goes bad. Next steps:

  • Archive this build.
  • Add steps to the script for copying the build webapp to my local Tomcat, copying the test site into it, and running Tomcat.
  • Add tests to the test site for accented characters (display, submission from forms, and submission through GET), to ensure that UTF-8 encoding is working.
  • Test the new build with full working sites (Mariage and Moses). The latter is a good test of indexing and Lucene search with Unicode.
  • Look at the Analyzer patch, and see if it works.

Permalink 10:58:42 am, by Greg, 40 words, 210 views   English (CA)
Categories: Documentation; Mins. worked: 0

Access to SAN on MacOS

To access taporshare on MacOS:
In the Finder, press Command-K to open the Connect to Server dialogue.
URI - smb://taporshare.tapor.uvic.ca/<sharename>
Username - uvic\<netlinkId>
If you forget the "uvic\" prefix in the username field, authentication will fail.

Permalink 10:47:06 am, by mholmes, 263 words, 213 views   English (CA)
Categories: Servers, R & D, Activity log, Documentation; Mins. worked: 60

Character encoding issues on Tomcat-dev through Apache

Investigation and testing related to the container-encoding setting in the new Cocoon build process led me to discover a bug that's currently affecting sites on Pear's Tomcat-dev when accessed through Apache. Here's an illustration of the problem:

If you go to the Mariage site search page on Pear, accessed on its Tomcat port, and search for "mariée", you'll get correct results. However, if you access the site through Apache and the virtual domain and do the same search, you'll get garbled results.

The problem seems to be this:

We build our recent Cocoon stacks as all-UTF-8, and set up Tomcat to use UTF-8 as well, but it appears that the last stage in the process, when Apache talks to Tomcat, is not working in UTF-8. We've done a bit of research, mainly on this page:

http://confluence.atlassian.com/display/DOC/Using+Apache+with+mod_jk

Based on that, two things may need to be changed:

  • The AJP connector in Tomcat's conf/server.xml file may need to be tweaked to add a URIEncoding="UTF-8" parameter:
        <!-- Define an AJP 1.3 Connector on port 8019 -->
        <Connector port="8019" protocol="AJP/1.3" redirectPort="8081" />
    
    changed to:
        <!-- Define an AJP 1.3 Connector on port 8019 -->
        <Connector port="8019" protocol="AJP/1.3" redirectPort="8081" URIEncoding="UTF-8" />
  • This needs to be added to the Apache configuration:
    JkOptions +ForwardURICompatUnparsed

For the moment, this only applies to Tomcat-dev; Pear's Tomcat-stable is running legacy projects which operate in 8859-1 encoding, and they're working fine.

Wrote to sysadmin to request that they look at this and see if it makes sense.
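
The post doesn't record exactly what the garbled results looked like. Assuming it is the usual case of UTF-8 bytes being read as ISO-8859-1 somewhere along the way, the effect can be reproduced on the command line. This is a sketch only; the helper names are hypothetical, and it relies on the standard iconv and od utilities.

```shell
# Assumption: the garbling is UTF-8 bytes being decoded as ISO-8859-1.
# Returns the text you get when UTF-8 input is wrongly treated as
# ISO-8859-1 and then re-encoded as UTF-8 for display.
misdecode_as_latin1() {
  printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# Hex dump helper so the byte-level difference is visible.
hex_bytes() {
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
  echo
}

word=$(printf 'mari\303\251e')              # "mariée" as UTF-8 bytes
hex_bytes "$word"                           # 6d617269c3a965
hex_bytes "$(misdecode_as_latin1 "$word")"  # 6d617269c383c2a965 ("mariÃ©e")
```

The two bytes of "é" (c3 a9) become four bytes after the wrong decoding, so the search term no longer matches what is in the index.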

06/12/10

Permalink 04:14:32 pm, by mholmes, 660 words, 92 views   English (CA)
Categories: R & D, Activity log; Mins. worked: 180

Cocoon+eXist+FOP: all three now building

I've made the following changes to RVDB's Ant script:

  • Inside the build directory (where the root build.xml is located), created a fop subdirectory, and inside that, a jars_replaced directory, where I'm going to store Cocoon jars that have been superseded by newer versions from FOP.
  • Created a fop-jars.xsl file inside the fop directory, to process and update Cocoon's jars.xml file (modelled on exist-jars.xsl, which does the same thing).
  • Created a new target in the Ant file:
    <target description="Replace FOP 0.95 with FOP 1.0" name="upgrade_fop">
          <echo message="-----------------------------------------------"/>
          <echo message="Upgrading FOP from 0.95 to 1.0             "/>
          <echo message="-----------------------------------------------"/>
    
          <echo message="-----------------------------------------------"/>
          <echo message="Patching jars.xml            "/>
          <echo message="-----------------------------------------------"/>
          
          <xslt-saxon in="${cocoon.home}/lib/jars.xml" style="fop/fop-jars.xsl" out="${cocoon.home}/lib/jars.xml.patched">
          </xslt-saxon>
          <move file="${cocoon.home}/lib/jars.xml.patched" tofile="${cocoon.home}/lib/jars.xml"/>
          
          <echo message="-----------------------------------------------"/>
          <echo message="Removing jars to be replaced        "/>
          <echo message="-----------------------------------------------"/>
          
          <move failonerror="false" file="${cocoon.home}/lib/optional/batik-all-1.6.jar" tofile="fop/jars_replaced/optional/batik-all-1.6.jar"/>
          <move failonerror="false" file="${cocoon.home}/lib/optional/fop-0.95.jar" tofile="fop/jars_replaced/optional/fop-0.95.jar"/>
          <move failonerror="false" file="${cocoon.home}/lib/optional/xmlgraphics-commons-1.3.1.jar" tofile="fop/jars_replaced/optional/xmlgraphics-commons-1.3.1.jar"/>
          
          <echo message="-----------------------------------------------"/>
          <echo message="Adding replacement jars and new required jars        "/>
          <echo message="-----------------------------------------------"/>
          
          <copy todir="${cocoon.home}/lib/optional" overwrite="true">     
            <fileset dir="${fop.home}/lib">
              <include name="batik-all-1.7.jar"/>
              <include name="xmlgraphics-commons-1.4.jar"/>
              <include name="serializer-2.7.0.jar"/>
            </fileset>
          </copy>
          
          <copy todir="${cocoon.home}/lib/optional" overwrite="true">     
            <fileset dir="${fop.home}/build">
              <include name="fop.jar"/>
            </fileset>
          </copy>
          
          <copy todir="${cocoon.home}/lib/endorsed" overwrite="true">     
            <fileset dir="${fop.home}/lib">
              <include name="xml-apis-ext-1.3.04.jar"/>
            </fileset>
          </copy>
          
        </target>
  • Added that target to the main "prepare" target:
    <target description="Prepare the cocoon and eXist directories"
          name="prepare" depends="update_exist, copy_exist, prepare_cocoon, upgrade_fop, build_cocoon, copy_additional_jars, patch_cocoon">
        </target>
    
  • Ran a build, which seemed to work fine, and confirmed that Cocoon and the eXist Webstart client still work. They do.

What I don't know yet is whether anything else works properly. This is what I'll have to do next:

  • Check through the sitemap changes made in RVDB's transformations, and alter anything that doesn't match our existing projects. For instance, he changes the XHTML doctype to transitional; that's something we definitely don't want. There may also be changes in our original transformation missing from RVDB's.
  • Identify, or create, a simple project that can be used to test all the functionality. The natural candidate is the IALLT Journal project, which is quite small, and uses FOP. However, it's not simple. It might be worth simply creating something straightforward for easier testing of future builds. We want to confirm that XQuery Generators, Saxon XSLT transformations, and FOP transformations all work.
  • Test character encoding issues. This is particularly needed because of this comment at the head of patch_web_xml.xsl:
    <!-- change default encodings to UTF-8 -->
    <!-- except for servlet[servlet-name='cocoon']/init-param/param-name[. = 'container-encoding']:
      should stay ISO-8859-1, see 
      	-http://markmail.org/message/nm6bnvqztbee4s5o
      	-http://markmail.org/message/jt256gl3magir6g4
      	-http://wiki.apache.org/cocoon/RequestParameterEncoding#A3._Decoding_incoming_requests:_Servlet_Container
    -->  
    
    We had form submission working fine with both GET and POST, using all-UTF-8 settings; if possible, I'd prefer this, but it may be that something has changed in Cocoon 2.1.12x which screws this up.

Permalink 10:52:13 am, by mholmes, 58 words, 80 views   English (CA)
Categories: R & D, Activity log; Mins. worked: 60

Cocoon+eXist+FOP: First two working

I discovered that the problem I was having with RVDB's Ant script, which I've modified, was actually with Saxon 9.3, which for some reason was failing to do transformations properly. I've rolled back to Saxon 9.2, and now the build goes OK and the resulting webapp works fine. Next I need to integrate the FOP instructions into the Ant build.

01/12/10

Permalink 02:53:41 pm, by mholmes, 265 words, 69 views   English (CA)
Categories: R & D, Activity log, Documentation; Mins. worked: 120

Cocoon+eXist+FOP: back to Ant...

RVDB sent a solution to my Ant build problem, which basically works -- the eXist build.sh is now found -- so I've integrated that, and I've also found a better solution to using Saxon for XSLT, that doesn't depend on its being on the CLASSPATH (which I want to avoid, because its version numbers are likely to change over time). I found the solution in one of the comments on this page, and it's basically this:

  1. Define a macro for Saxon, specifying a way to call it with the parameters we need to use:
      <macrodef name="xslt-saxon">
        <attribute name="in"/>
        <attribute name="out"/>
        <attribute name="style"/>
        <sequential>
          <echo level="info">XSLT Generating @{out}</echo>
          <java classname="net.sf.saxon.Transform"
            classpath="${saxon.home}/saxon9he.jar"
            logError="true"
            output="@{out}"
            fork="true">
            <arg value="@{in}"/>
            <arg value="@{style}"/>
          </java>
        </sequential>
      </macrodef>
    
  2. Call the macro when required, like this:
    <xslt-saxon in="${cocoon.home}/build/webapp/WEB-INF/cocoon.xconf" 
                style="cocoon/patch_cocoon_xconf.xsl" 
                out="${cocoon.home}/build/webapp/WEB-INF/cocoon.xconf.patched">
    </xslt-saxon>
    

Now I have RVDB's whole build process working OK, but the resulting webapp is broken; XSLT transformations don't seem to work. I suspect this is something to do with the way Saxon is being set up in the sitemap, because I'm using a more recent version of Saxon (HE, 9.3, as opposed to the three-jar version 9). I'll work on this issue tomorrow.

Maintenance

This blog is the location for all work involving software and hardware maintenance, updates, installs, etc., both routine and urgent, in the server room, the labs and the R&D rooms.
