Put Tomcat and teiJournal on Radicchio and tested it -- working fine. Now I can hammer it without killing tomcat-dev on Lettuce.
Greg generated font-metrics files for all the DejaVu and Gentium fonts -- he's blogged that process elsewhere. It only worked on Lettuce; even with the same Cocoon jars, it fails on OSX and Windows.
Then I was able to create a fop-config.xml file, which I'll reproduce at the end of this message. That file was placed in [cocoon]/WEB-INF. Then I modified the root sitemap, which is where the fo2pdf serializer is defined, adding the <user-config> element here:
<map:serializer logger="sitemap.serializer.fo2pdf" mime-type="application/pdf" name="fo2pdf" src="org.apache.cocoon.serialization.FOPSerializer">
  <user-config>context:/WEB-INF/fop-config.xml</user-config>
</map:serializer>
We know this works, because FOP would fail when we got it slightly wrong, then succeed when it was able to find the file. The key here is that the context:/ path is resolved relative to the webapp rather than to the file system, so the Cocoon instance remains portable.
Next, I put the fonts themselves, along with the font-metrics files, in a subfolder of WEB-INF, [cocoon]/WEB-INF/fop-fonts.
But here's the big problem: FOP was not able to find and use the fonts unless the paths to them were absolute. We tried a number of variants of relative paths, including file:fop-fonts/..., file://fop-fonts/..., ./fop-fonts/... and so on. Every attempt required a restart of Cocoon, and every second attempt required a restart of Tomcat (Tomcat dies on the second attempt to restart one of its webapps). This is not a workable way to proceed, so I'll have to set up a working Tomcat stack on a local machine to play with this. It's possible that setting the <base> or <base-font> elements correctly will do it; and it's also possible that we're incorrectly assuming these paths need to be relative to the fop-config.xml file, when actually they should be relative to something else (such as the lib directory where the FOP jar file is).
For the record, here's our fop-config.xml file, with ./ relative paths that don't work.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<fonts>
<font metrics-file="./fop-fonts/DejaVuSans.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSans.ttf">
<font-triplet name="DejaVu Sans" style="normal" weight="normal"/>
<font-triplet name="DejaVuSans" style="normal" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSans-Bold.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSans-Bold.ttf">
<font-triplet name="DejaVu Sans" style="normal" weight="bold"/>
<font-triplet name="DejaVuSans" style="normal" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSans-BoldOblique.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSans-BoldOblique.ttf">
<font-triplet name="DejaVu Sans" style="italic" weight="bold"/>
<font-triplet name="DejaVuSans" style="italic" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSans-Oblique.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSans-Oblique.ttf">
<font-triplet name="DejaVu Sans" style="italic" weight="normal"/>
<font-triplet name="DejaVuSans" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansCondensed.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansCondensed.ttf">
<font-triplet name="DejaVu Sans Condensed" style="normal" weight="normal"/>
<font-triplet name="DejaVuSansCondensed" style="normal" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansCondensed-Bold.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansCondensed-Bold.ttf">
<font-triplet name="DejaVu Sans Condensed" style="normal" weight="bold"/>
<font-triplet name="DejaVuSansCondensed" style="normal" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansCondensed-BoldOblique.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansCondensed-BoldOblique.ttf">
<font-triplet name="DejaVu Sans Condensed" style="italic" weight="bold"/>
<font-triplet name="DejaVuSansCondensed" style="italic" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansCondensed-Oblique.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansCondensed-Oblique.ttf">
<font-triplet name="DejaVu Sans Condensed" style="italic" weight="normal"/>
<font-triplet name="DejaVuSansCondensed" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansMono.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansMono.ttf">
<font-triplet name="DejaVu Sans Mono" style="normal" weight="normal"/>
<font-triplet name="DejaVuSansMono" style="normal" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansMono-Bold.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansMono-Bold.ttf">
<font-triplet name="DejaVu Sans Mono" style="normal" weight="bold"/>
<font-triplet name="DejaVuSansMono" style="normal" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansMono-BoldOblique.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansMono-BoldOblique.ttf">
<font-triplet name="DejaVu Sans Mono" style="italic" weight="bold"/>
<font-triplet name="DejaVuSansMono" style="italic" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSansMono-Oblique.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSansMono-Oblique.ttf">
<font-triplet name="DejaVu Sans Mono" style="italic" weight="normal"/>
<font-triplet name="DejaVuSansMono" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSans-ExtraLight.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSans-ExtraLight.ttf">
<font-triplet name="DejaVu Sans ExtraLight" style="normal" weight="400"/>
<font-triplet name="DejaVuSansExtraLight" style="normal" weight="400"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerif.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerif.ttf">
<font-triplet name="DejaVu Serif" style="normal" weight="normal"/>
<font-triplet name="DejaVuSerif" style="normal" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerif-Bold.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerif-Bold.ttf">
<font-triplet name="DejaVu Serif" style="normal" weight="bold"/>
<font-triplet name="DejaVuSerif" style="normal" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerif-BoldItalic.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerif-BoldItalic.ttf">
<font-triplet name="DejaVu Serif" style="italic" weight="bold"/>
<font-triplet name="DejaVuSerif" style="italic" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerif-Italic.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerif-Italic.ttf">
<font-triplet name="DejaVu Serif" style="italic" weight="normal"/>
<font-triplet name="DejaVuSerif" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerifCondensed.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerifCondensed.ttf">
<font-triplet name="DejaVu Serif Condensed" style="normal" weight="normal"/>
<font-triplet name="DejaVuSerifCondensed" style="normal" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerifCondensed-Bold.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerifCondensed-Bold.ttf">
<font-triplet name="DejaVu Serif Condensed" style="normal" weight="bold"/>
<font-triplet name="DejaVuSerifCondensed" style="normal" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerifCondensed-BoldItalic.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerifCondensed-BoldItalic.ttf">
<font-triplet name="DejaVu Serif Condensed" style="italic" weight="bold"/>
<font-triplet name="DejaVuSerifCondensed" style="italic" weight="bold"/>
</font>
<font metrics-file="./fop-fonts/DejaVuSerifCondensed-Italic.xml"
kerning="yes" embed-file="./fop-fonts/DejaVuSerifCondensed-Italic.ttf">
<font-triplet name="DejaVu Serif Condensed" style="italic" weight="normal"/>
<font-triplet name="DejaVuSerifCondensed" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/GenI102-Italic.xml"
kerning="yes" embed-file="./fop-fonts/GenI102.ttf">
<font-triplet name="Gentium" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/GenR102.xml"
kerning="yes" embed-file="./fop-fonts/GenR102.ttf">
<font-triplet name="Gentium" style="normal" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/GenAI102-Italic.xml"
kerning="yes" embed-file="./fop-fonts/GenAI102.ttf">
<font-triplet name="Gentium Alt" style="italic" weight="normal"/>
<font-triplet name="GentiumAlt" style="italic" weight="normal"/>
</font>
<font metrics-file="./fop-fonts/GenAR102.xml"
kerning="yes" embed-file="./fop-fonts/GenAR102.ttf">
<font-triplet name="Gentium Alt" style="normal" weight="normal"/>
<font-triplet name="GentiumAlt" style="normal" weight="normal"/>
</font>
</fonts>
</configuration>
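For comparison, here is a minimal sketch of a single entry in the absolute-path form, which is the only form FOP accepted in our tests. The /opt/tomcat/webapps/cocoon prefix is a hypothetical install location, not our real one, and whether a bare path or a file: URL is needed may depend on the platform; treat this as an illustration rather than a tested file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <fonts>
    <!-- ASSUMPTION: /opt/tomcat/webapps/cocoon is a made-up location;
         substitute the real absolute path of the deployed webapp. -->
    <font metrics-file="/opt/tomcat/webapps/cocoon/WEB-INF/fop-fonts/DejaVuSans.xml"
          kerning="yes"
          embed-file="/opt/tomcat/webapps/cocoon/WEB-INF/fop-fonts/DejaVuSans.ttf">
      <font-triplet name="DejaVu Sans" style="normal" weight="normal"/>
    </font>
  </fonts>
</configuration>
```

The obvious drawback is that every entry hard-codes the deployment location, so the file has to be rewritten for each install.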
Actually, that's a misleading title; FOP is already working. The issue is how to configure the font settings, and how, then, we might deploy the system with working font settings. Here are the basics:
- FOP can handle all the Base 14 fonts (the fonts which are guaranteed to be available in every PDF reader) out of the box; it seems to know about those font metrics without any configuration. That means I can start developing my code using those fonts (in practice Helvetica, Times and Courier) without addressing any of the problems below, but that's very limiting.
- To use other fonts, you first need to generate font-metrics files for them. These are XML files encoding the metrics of the fonts, which FOP uses to calculate layout etc. The instructions on the FOP site for generating these files don't work for us; the TTFReader class is not found.
- Other documentation on the FOP site suggests that the latest version (0.95, which we think corresponds to the fop-0.20.5.jar in our Cocoon 2.11 installations) can work out the metrics for itself if you tell it where to find the fonts, using the fop.xconf file. However, we don't know where to put such a fop.xconf file in Cocoon, or how to tell FOP where to find it.
- Paths to fonts and font-metrics files need to be full paths for FOP to use them, as far as we can tell. That makes deployment difficult. Up to now, teiJournal is completely config-free, in that you can dump it into any Tomcat and it will just work. If these paths have to be hard-coded, then some kind of configuration script will have to be run when the app is first started, or by the administrator when deploying it. That's a disappointing step backwards.
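As a rough illustration of the fop.xconf approach mentioned above, this is the kind of file the FOP 0.95 documentation describes, where FOP scans a directory and works out the metrics itself. It is only a sketch: the directory path is a hypothetical placeholder, and we have not confirmed that the FOP jar bundled with our Cocoon accepts this format at all.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- ASSUMPTION: placeholder path; FOP would register every font
             it finds in this directory without separate metrics files. -->
        <directory>/path/to/cocoon/WEB-INF/fop-fonts</directory>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

Note that the directory is still given as a full path here, so even if this format turns out to be usable it would not by itself solve the portability problem.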
We're still working on this. We'd very much like to distribute teiJournal with all the DejaVu fonts and their metrics files, using them by default in the PDF generation; that's allowed under their licenses, and they're more attractive than the Base 14 fonts.
I'm just beginning the PDF file generation code, so I've written to the IALLT Journal editors for some feedback on the design. As far as I remember, we've never addressed the issue of page size and layout for the PDFs. These are things we need to think about:
- What page size are we producing? Letter (8.5 x 11 inches) is probably the best choice, because most people will be printing the PDF on a regular printer, if they print it at all, but we can choose any size we like.
- Do we want to go with separate layouts for recto and verso? When you're producing a print volume, you normally have slightly different margins for recto and verso pages, as well as a different running header, and the page numbers will typically be located in different places. However, when the target audience is likely to print the document off on their inkjet or laserjet printer, this is a bit pointless, because the pages are not bound in book form. On the other hand, more and more network printers have duplexing capabilities, so people may well be able to create a little booklet for themselves, or may put the pages into a binder, so perhaps we should allow for that.
- Bearing in mind the factors above, where should the page numbers go -- top or bottom? Left (verso) and right (recto), or centred?
- Again, bearing in mind the above, what should we use for the running title(s)? Each article already has a custom running title based on its title, but if we have separate recto and verso designs, we can have a second running title; that might be the author name(s), or the journal name/vol/issue, bearing in mind that a printout will be out of context of the site, so the journal name needs to be there somewhere. Alternatively, we could put the journal name in the gutter.
These decisions don't have to be set in stone; they'll just be encoded in an XSLT file stored in the database, and can be modified easily, but it would be easier to start with a firm plan even if we change it later. My instincts say that we should design for a situation where people would print off the document in duplex and insert it into a binder, so:
- Letter (8.5 x 11 inch) paper, with a larger margin on the binding edge: the left margin on the recto and the right margin on the verso
- page numbers on the top left (verso) and top right (recto)
- running article title at the top of the verso, and author name(s) at the top of the recto
- journal name in the gutter of the verso, and volume/issue in the gutter of the recto
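To make the proposal concrete, here is a minimal XSL-FO sketch of alternating recto and verso page masters. It assumes Letter paper and a wider margin on the binding edge; the master names, region names, margin values and placeholder header text are all invented for illustration, and the gutter text is left out because it would need rotated side regions. Nothing here has been agreed with the editors.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Recto (odd, right-hand page): wider margin on the left, the binding edge. -->
    <fo:simple-page-master master-name="recto" page-width="8.5in" page-height="11in"
        margin-top="0.5in" margin-bottom="0.5in" margin-left="1.25in" margin-right="0.75in">
      <fo:region-body margin-top="0.75in" margin-bottom="0.5in"/>
      <fo:region-before region-name="recto-header" extent="0.5in"/>
    </fo:simple-page-master>
    <!-- Verso (even, left-hand page): wider margin on the right, the binding edge. -->
    <fo:simple-page-master master-name="verso" page-width="8.5in" page-height="11in"
        margin-top="0.5in" margin-bottom="0.5in" margin-left="0.75in" margin-right="1.25in">
      <fo:region-body margin-top="0.75in" margin-bottom="0.5in"/>
      <fo:region-before region-name="verso-header" extent="0.5in"/>
    </fo:simple-page-master>
    <!-- Alternate the two masters according to page parity. -->
    <fo:page-sequence-master master-name="article">
      <fo:repeatable-page-master-alternatives>
        <fo:conditional-page-master-reference master-reference="recto" odd-or-even="odd"/>
        <fo:conditional-page-master-reference master-reference="verso" odd-or-even="even"/>
      </fo:repeatable-page-master-alternatives>
    </fo:page-sequence-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="article">
    <!-- Recto header: author name(s), page number at top right. -->
    <fo:static-content flow-name="recto-header">
      <fo:block text-align="end">Author name(s) <fo:page-number/></fo:block>
    </fo:static-content>
    <!-- Verso header: page number at top left, then the running article title. -->
    <fo:static-content flow-name="verso-header">
      <fo:block text-align="start"><fo:page-number/> Running article title</fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Article text goes here.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

If we decided against separate recto and verso designs, the two page masters would collapse into one with equal side margins and a centred page number.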
I'm waiting for feedback from the editors on this. Meanwhile, Greg and I have been looking at how to get FOP working (see next post).
I'm about a third of the way into the paper, which is long and full of references. There are five images to deal with, but I haven't got to them yet. So far so good -- I have a couple of queries in with the author about puzzling bits.
Finished marking up the bibliography of the last article. Found a couple of typos, and also tweaked a lot of the XSLT to display presentation and online journal data. Dates are very variable with online content, and where full dates exist, APA requires that they be in long form: 2008, September 15, but the code has to be aware that sometimes the day is missing, and sometimes the month. This is coming along nicely, though, and I'll be ready to start on the PDF code soon.
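The date handling can be sketched roughly as follows in XSLT 1.0. This is not the actual teiJournal code: the template name apaLongDate and its when parameter are hypothetical, and it assumes dates are stored in ISO form (2008-09-15, 2008-09 or 2008), so that a missing month always implies a missing day.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- HYPOTHETICAL template: formats an ISO-style date as "year, Month day",
       dropping whichever trailing parts are absent. -->
  <xsl:template name="apaLongDate">
    <xsl:param name="when"/>
    <xsl:variable name="year" select="substring($when, 1, 4)"/>
    <!-- number('') is NaN, so the range tests below fail when a part is missing. -->
    <xsl:variable name="month" select="number(substring($when, 6, 2))"/>
    <xsl:variable name="day" select="number(substring($when, 9, 2))"/>
    <xsl:value-of select="$year"/>
    <xsl:if test="$month &gt;= 1 and $month &lt;= 12">
      <xsl:text>, </xsl:text>
      <xsl:choose>
        <xsl:when test="$month = 1">January</xsl:when>
        <xsl:when test="$month = 2">February</xsl:when>
        <xsl:when test="$month = 3">March</xsl:when>
        <xsl:when test="$month = 4">April</xsl:when>
        <xsl:when test="$month = 5">May</xsl:when>
        <xsl:when test="$month = 6">June</xsl:when>
        <xsl:when test="$month = 7">July</xsl:when>
        <xsl:when test="$month = 8">August</xsl:when>
        <xsl:when test="$month = 9">September</xsl:when>
        <xsl:when test="$month = 10">October</xsl:when>
        <xsl:when test="$month = 11">November</xsl:when>
        <xsl:otherwise>December</xsl:otherwise>
      </xsl:choose>
      <xsl:if test="$day &gt;= 1">
        <xsl:text> </xsl:text>
        <xsl:value-of select="$day"/>
      </xsl:if>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Called with 2008-09-15 this yields "2008, September 15"; with 2008-09 it yields "2008, September"; and with 2008 alone it yields just "2008".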
Started marking up the bibliography of the latest paper, and adding new handlers to the XSLT code as needed. We now have handlers for online letters to the editor, online reports, and blog postings. I'm about half-way through the bibliography. Only one more item (online transcript of forum presentation) looks problematic.
The third, and probably final, article for volume 40 has arrived, and it's a big one, with lots of new types of reference item to handle. I started by moving all the footnote references into a list of alphabetically-ordered commented items in the bibliography area, ready to be marked up, and looking around for some additional info for some of the items. Many are of types which are not clearly or directly handled by APA guidelines (for instance, online transcription of presentation, or letter to the editor of an online journal in response to a previous article). This article will take quite a while to mark up, but it'll help add more handlers to the reference system, before I start work on the PDF export.
The journal system allows for many different contribution types, specified through the @rend attribute on the <TEI> tag, and constrained by a customization of the schema. I've never actually used these before, but the latest contribution being a "Lab notes" item, HM asked that this fact be made noticeable on the page somewhere, so I implemented a system which supplies an absolutely-positioned contribution type label, based on XSLT string variables. Four document types are handled in the XSLT, although there are string variables for all the variants. The default "Note" is overridden in the user stylesheet in the db, and reads "Lab notes" as per the IALLT Journal requirements.
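The label mechanism works along these lines. This is a simplified sketch rather than the real stylesheet: the variable names, the @rend values, the mode name and the inline CSS are all hypothetical, and it assumes the documents are in the TEI P5 namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">
  <!-- HYPOTHETICAL default strings. A user stylesheet that imports this one
       can redeclare a variable (for example strNote as "Lab notes") and its
       value wins through import precedence. -->
  <xsl:variable name="strNote">Note</xsl:variable>
  <xsl:variable name="strReview">Review</xsl:variable>

  <!-- Emit a label only for contribution types that have one. -->
  <xsl:template match="tei:TEI" mode="contributionType">
    <xsl:variable name="label">
      <xsl:choose>
        <xsl:when test="@rend = 'note'"><xsl:value-of select="$strNote"/></xsl:when>
        <xsl:when test="@rend = 'review'"><xsl:value-of select="$strReview"/></xsl:when>
      </xsl:choose>
    </xsl:variable>
    <xsl:if test="string($label) != ''">
      <div class="contributionType" style="position: absolute; top: 0; right: 0;">
        <xsl:value-of select="$label"/>
      </div>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Keeping the strings in top-level variables means a journal can change the wording without touching the templates themselves.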
In the process, I fixed an annoying IE JavaScript error, resulting from the fact that IE doesn't support hasAttribute.
The second new article is short and simple, with no bibliography, so it was quick to mark up.