SMK suggested CharisSIL, which I've downloaded and tested with XEP; it looks good. I'll now try integrating that into FOP. I've also updated the Collator after discovering that we had not included barred lambda without a following apostrophe, so that one (actually erroneous) example of it was sorted to the beginning, which was mystifying for a while. I've made a little more progress with the rendering of entries, but I don't have a model to work from yet so it's just exploratory.
Up to now, we've been using the excellent Gentium Plus fonts for our dictionary website, and they're working well; they're attractive and cover all the glyphs we need.
However, Gentium Plus comes only in regular and italic flavours; there's no bold version of the font. When we use "bold" on the website, the browser renderer automatically fattens up the regular font to give the impression of bold; it's not particularly pretty, but it works. Unfortunately, that can't be done when generating the PDF. Neither Apache FOP nor the commercial XEP generator we use can automatically generate a bold version of a font from a regular version (and they would presumably argue that it shouldn't be done, because it's ugly, and would be especially noticeable in print). So we're faced with three choices in the print dictionary:
- Eschew bold text altogether. This is not ideal, because dictionary entries depend for their readability on the textual differentiation of lots of information bits which appear next to each other.
- Choose a different font, one which has both bold and italic variants, for the whole dictionary.
- Choose a different font only for bits of the text which need to be bold.
We'll have to think carefully about this. I've written to the team for thoughts and suggestions.
Just reminding myself: I need the MosesCollation.jar file to provide the sort collation for the dictionary, and in eXist this is found automatically as long as it's in the WEB-INF/lib directory. However, locally, I have to add it manually to the transformation scenario -- on the first tab, click on Extensions, then Add, and find the jar file.
I've finished the first draft of my bits -- over to ECH...
Redrafting the third part of the article encouraged me to do even more refinement of the schema; I've now nailed down the use of @type and @subtype more thoroughly, and tweaked the encoding of <gloss> as a result. It uses @subtype="i" instead of @type, for consistency with <seg> and <phr>, even though it makes no use of @type (at the moment). The article is coming along, and I hope to get the first draft of my bits finished tomorrow.
Built a fresh checkout of eXist trunk, and reworked the app to run in it. Had to change relative paths to XSL files in XQL files to full paths from /db for some reason. Also fixed a bug in the personography rendering, which after much confusion turned out to be caused by my having moved the schemas around in the db. Only thing left to do: add password protection.
And better tested and documented. I think it's finished now, but we've also learned enough to be able to update it easily. I also learned that if I compile it targeted at Java 1.7, it will cause errors on Pear, so I compiled it for 1.5.
It's remarkably fiddly to get all this stuff right.
This took way longer than I expected, and I ended up resorting to tables to keep the layout under control. Not ideal, but nothing else would work.
Sarah says:
I think we decided on Monday (talking with Ewa) that you should go ahead and use the full def:segs instead of the glosses in the "get related words" and "other entries containing this morpheme" lists. The glosses are only intended as headwords for the English-Nx word list.
Our example was cìqqnúnn, with the def:seg "I accidentally dug up something". The gloss tags are "dig", "dug", and "accidentally". We want the Nx word to appear (with its def) under all three headwords in the English-Nx word list, but we don't want a learner to look at the "get related words" list and think the word can be used to mean just "accidentally".
I've moved the NetBeans project for the RuleBasedCollator class MosesCollation into our SVN tree, and then I rewrote it to include orthographic characters and English (upper and lower case) so that we can sort all three types of string using the same collation.
I had to do this twice because the current version of NetBeans from the Precise repo, which was 7.0.1, has the most disastrous bug imaginable: no file changes are saved to disk. This is undetectable while you're running it, because file buffers are changed, and jar files are built from the buffers. I lost all my work to this bug, and had to repeat it after removing the repo NetBeans and installing 7.3 from the download installer.
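For future reference, here is a minimal sketch of the general idea behind a single collation covering several kinds of string, using java.text.RuleBasedCollator. The rule string, the class name CollationSketch and the method sortWords are all made up for illustration; the real MosesCollation rules are much longer and are not reproduced here. The sketch only shows a non-English letter (barred lambda, U+019B) being slotted into the same ordering as English letters, with lower case sorting before upper case.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CollationSketch {

    // Hypothetical rules, NOT the project's real ones.
    // "<" marks a primary (letter-level) difference; "," marks a
    // tertiary (case-level) difference, so "k" sorts just before "K".
    // Barred lambda (U+019B) is placed between l and m.
    static final String RULES = "< a, A < k, K < l, L < \u019B < m, M";

    // Returns a sorted copy of the input, using the rules above.
    static List<String> sortWords(List<String> words) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(RULES);
            List<String> sorted = new ArrayList<String>(words);
            // Collator implements Comparator<Object>, so it can sort strings directly.
            Collections.sort(sorted, collator);
            return sorted;
        } catch (ParseException e) {
            // Thrown only if the rule string itself is malformed.
            throw new IllegalStateException("Bad collation rules", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sortWords(Arrays.asList("ma", "\u019Ba", "la", "ka")));
    }
}
```

Any character left out of the rule string falls outside the tailored order, which is the kind of omission that caused the stray barred lambda entry to sort in the wrong place.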