Map Of London

April 20, 2016

Variant spellings: everything except the gazetteer working

Posted by on 20 Apr 2016 in Activity log

I now have the variant spellings being automatically harvested into the HTML output for location documents, and the result functions correctly with and without JavaScript. I'm now rewriting the code that generates the gazetteer pages themselves so that, instead of working out its own set of variants, it simply harvests them from the spelling_variants.xml file, which has already been created by that point. Once that's working, the existing appendix code should take care of pulling in the variant spellings when generating the HTML.

April 18, 2016

Variant spellings: more progress

Posted by on 18 Apr 2016 in Activity log

I've enhanced the variant spellings code a bit and I'm now generating AJAX fragments from it; these will serve as content for the gazetteer pages and for the location page files.

April 15, 2016

Stow chapter segments: decisions

Posted by on 15 Apr 2016 in Activity log

After much discussion, we've decided to go with a system based on that outlined two posts ago, with some additional clarification about the status of bibliographic codes appearing between sections. If a new chapter begins at the head of the page, the formeworks etc. appearing above it should not be part of its div, but they should be harvested into its single-chapter TEI file, providing the bibliographic context for the conceptual div that follows them. The same applies to bibliographic elements appearing after the end of a chapter div. This means that such elements must first be moved out of their containing divs, making them div-liminal, and the chapter-splitting code must then take account of them. In addition, any page break preceding a chapter div which starts in the middle of a page must be harvested and copied (with a special @type attribute value yet to be decided) to the head of the chapter file being created. The presence of this copied page break will enable us not only to link to the page image, but also to determine that the chapter does not start at the top of the page, and to signal this in some way in the rendered XHTML.

With the exception of the pb/@type attribute change, I've made all the changes necessary to the schema to accommodate these decisions, and we've added Schematron to stay on top of how they're supposed to function.
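
As a rough illustration of the harvesting rule described in this post, here is a minimal Python sketch. It is not the project's own code, and it rests on several assumptions of mine: elements are shown without the TEI namespace for brevity, the div-liminal elements are direct siblings of the chapter divs, a <pb> among those siblings marks where one chapter's trailing context ends and the next chapter's leading context begins, and PLACEHOLDER_TYPE stands in for the @type value that has yet to be decided.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in: the post leaves the real pb/@type value undecided.
PLACEHOLDER_TYPE = "TYPE_TO_BE_DECIDED"

def harvest_context(body, div):
    """Return (leading, trailing) lists of copied context elements for one chapter div."""
    siblings = list(body)
    i = siblings.index(div)

    # Walk backwards over div-liminal siblings, stopping once a <pb> is reached.
    leading, found_pb = [], False
    j = i - 1
    while j >= 0 and siblings[j].tag != "div":
        leading.insert(0, copy.deepcopy(siblings[j]))
        if siblings[j].tag == "pb":
            found_pb = True
            break
        j -= 1

    if not found_pb:
        # No page break directly above: the chapter starts mid-page, so copy the
        # last <pb> that precedes it in document order and flag the copy.
        leading, last_pb = [], None
        for el in body.iter():
            if el is div:
                break
            if el.tag == "pb":
                last_pb = el
        if last_pb is not None:
            marker = copy.deepcopy(last_pb)
            marker.set("type", PLACEHOLDER_TYPE)
            leading = [marker]

    # Walk forwards over div-liminal siblings, stopping before the next <pb> or div.
    trailing = []
    k = i + 1
    while k < len(siblings) and siblings[k].tag not in ("div", "pb"):
        trailing.append(copy.deepcopy(siblings[k]))
        k += 1
    return leading, trailing

# Invented sample: ch2 starts at the head of page 3, ch3 starts mid-page.
SAMPLE_BODY = """<body>
  <div xml:id="ch1"><pb n="1"/><p>one</p><pb n="2"/><p>still one</p></div>
  <fw type="catch">Of</fw>
  <pb n="3"/>
  <fw type="header">Head</fw>
  <div xml:id="ch2"><p>two</p></div>
  <div xml:id="ch3"><p>three</p></div>
</body>"""

body = ET.fromstring(SAMPLE_BODY)
ch1, ch2, ch3 = body.findall("div")
```

Calling harvest_context(body, ch3) on this sample returns a flagged copy of the page-3 break as the only leading element, which is the signal a renderer could use to show that the chapter does not begin at the top of a page.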

Variant spellings stage 2

Posted by on 15 Apr 2016 in Activity log

I decided to generate not just the linkGrp structures but a fully-expanded list of lists of variants, each with a link to the actual document from which it comes, constructed from its title, as the body of the variant spelling output. This saves additional steps in downstream processing. That's now working and valid.

April 13, 2016

Rewrote the variant spelling generation code

Posted by on 13 Apr 2016 in Activity log

I needed a much cleaner and more robust set of variant spelling data, not only for the gazetteer, but also for the provision of variant spellings with links in the location files themselves. I've now completely rewritten that code and extracted it from the xml_create_generated_master.xsl file, so that it can run separately first and its output can be used by the latter process. I had to do some schema-tweaking to make the output valid, because all our usage of <linkGrp> and <link> up to now has been for a single, highly constrained purpose, whereas this is more generic.

April 12, 2016

Handling Stow chapter segments

Posted by on 12 Apr 2016 in Activity log

JT and I have worked out a plan for handling the Stow 1598 in such a way that we can track the publication status of separate chapters, and create standalone versions of those chapters in XML and XHTML. This will enable us to handle peer-review and publication on a chapter-by-chapter basis. Here's how it works:

Each identified chapter will constitute a distinct <div> with an @xml:id.
In the <teiHeader> of the Stow document, the <revisionDesc> element will contain a <listChange> element; before the existing change elements, its first child will be a <listChange> element with @xml:id="stow_1598_chapter_status".

Inside this listChange will be a single change element corresponding to each chapter:

<listChange xml:id="stow_1598_chapter_status">
  <change xml:id="stow_1598_cripplegate_ward_status" when="2016-04-12" who="mol:DUNC3" status="draft"/>
  <change xml:id="stow_1598_breadstreet_ward_status" when="2016-04-10" who="mol:LAND2" status="final"/>
[...]
</listChange>

Each individual chapter <div> has a @change attribute pointing to its corresponding <change> element:

  <div xml:id="stow_1598_cripplegate_ward" change="#stow_1598_cripplegate_ward_status">
  [...]
  </div>

When we process the overall file to create each of the individual chapter files, we take the @status attribute from the corresponding <change> element in the header and assign its value to the @status attribute of the <revisionDesc> in the header of the chapter file.
The <listChange> element in the header can be processed as a key to split out the individual chapters, and also to generate the modern table of contents page.
When an encoder or editor determines that a chapter has reached a new stage, that person updates the corresponding <change> element in the header to specify the new status and date.

I've updated the ODD and regenerated the schema to allow for this.
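
The splitting step in this plan can be sketched in a few lines of Python. This is only an illustration using the standard library, not the project's actual processing code: the function names are my own, the id of the second chapter div is assumed from the naming pattern, and the generated chapter files are bare skeletons rather than valid TEI. The @change value is accepted with or without a leading "#".

```python
import copy
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def chapter_statuses(root, list_id="stow_1598_chapter_status"):
    """Map each <change> xml:id in the status listChange to its @status value."""
    for list_change in root.iter(TEI + "listChange"):
        if list_change.get(XML_ID) == list_id:
            return {c.get(XML_ID): c.get("status")
                    for c in list_change.findall(TEI + "change")}
    return {}

def split_chapters(root):
    """Return {chapter xml:id: skeleton TEI element} for every tracked chapter div."""
    statuses = chapter_statuses(root)
    chapters = {}
    for div in root.iter(TEI + "div"):
        target = (div.get("change") or "").lstrip("#")
        if target in statuses:
            tei = ET.Element(TEI + "TEI")
            header = ET.SubElement(tei, TEI + "teiHeader")
            # Carry the chapter's status over to revisionDesc/@status.
            ET.SubElement(header, TEI + "revisionDesc", {"status": statuses[target]})
            body = ET.SubElement(ET.SubElement(tei, TEI + "text"), TEI + "body")
            body.append(copy.deepcopy(div))
            chapters[div.get(XML_ID)] = tei
    return chapters

# Cut-down sample document; the breadstreet div id is an assumed example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <revisionDesc>
      <listChange>
        <listChange xml:id="stow_1598_chapter_status">
          <change xml:id="stow_1598_cripplegate_ward_status" when="2016-04-12" who="mol:DUNC3" status="draft"/>
          <change xml:id="stow_1598_breadstreet_ward_status" when="2016-04-10" who="mol:LAND2" status="final"/>
        </listChange>
      </listChange>
    </revisionDesc>
  </teiHeader>
  <text><body>
    <div xml:id="stow_1598_cripplegate_ward" change="#stow_1598_cripplegate_ward_status"><p>(chapter text)</p></div>
    <div xml:id="stow_1598_breadstreet_ward" change="#stow_1598_breadstreet_ward_status"><p>(chapter text)</p></div>
  </body></text>
</TEI>"""

chapters = split_chapters(ET.fromstring(SAMPLE))
```

Iterating over the <change> elements in the same listChange would also give the ordered list needed for a table of contents page.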

April 7, 2016

Bugfix for static XHTML output

Posted by on 07 Apr 2016 in Activity log

Fixes for problems with items duplicated in the XHTML output, due to being in the body as well as in the header, mostly in XIncluded documents. Output is currently still invalid due to namespace issues arising from date code, which JT is fixing.

April 5, 2016

mdt files now building in original XML collection

Posted by on 05 Apr 2016 in Activity log

I've moved the generation of mdt category files back to the original XML collection, and tweaked accordingly; I've also made a number of changes to the schema linking in all of those files (incorporating the rng file as a Schematron source for additional checks), and now all the original XML stuff is validating correctly, as is the standalone which is generated from it.

April 4, 2016

Static build again: new approach to "original" XML

Posted by on 04 Apr 2016 in Activity log

I decided that it would benefit us to process our source encoding into something cleaner for the "original XML" folder, expanding XIncludes and standardizing relative paths to schemas, so I've done that; the standalone is now based off the original folder. Having done that, I'm now back to building the gazetteer XML content, and working on the index of variants to document lists which will be harvested for the variant set at the top of each location page. It's basically working, but I'm still wrestling a bit with problems of whitespace affecting validity. I think I've solved (for the moment at least) the problem of creating a unique id for each variant spelling, based on the spelling itself.
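
The post does not say how the ids are derived from the spellings, so the following Python sketch shows just one possible approach and should not be read as the method actually used. It assumes whitespace is collapsed first, builds a readable slug that is safe to use in an xml:id, and appends a short hash so that spellings which reduce to the same slug still receive different ids. The function name and the "variant_" prefix are hypothetical.

```python
import hashlib
import re
import unicodedata

def variant_id(spelling, prefix="variant_"):
    """Build an xml:id-safe identifier from a variant spelling (illustrative only)."""
    # Collapse runs of whitespace and normalize Unicode so equivalent strings match.
    normalized = unicodedata.normalize("NFC", " ".join(spelling.split()))
    # Readable part: keep ASCII letters and digits, replace everything else.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", normalized).strip("_")
    # Disambiguating part: short hash of the normalized spelling itself.
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:8]
    return f"{prefix}{slug}_{digest}"

example_ids = [variant_id(s) for s in ("Cheapside", "Cheap side", "Cheape-side")]
```

Because the hash is computed from the normalized spelling, the same spelling always yields the same id across builds, while "Cheap side" and "Cheap-side" stay distinct despite sharing a slug.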

March 31, 2016

Static build and gazetteer

Posted by on 31 Mar 2016 in Activity log

After some thought, I've decided on the following strategy for handling the gazetteer, and implemented some of it:

We generate TEI files for each of the "letters" of the gazetteer alphabet, giving us a source file for each of those pages. That's working, along with generation of the JSON. These files require specific markup practices which the schema now allows for, and use two new private URI schemes.
These files are put into the site/xml/original folder.
They are validated there along with all the others.
Validation will require some tweaking to the copied originals and these new ones, to do a couple of things: expand the XIncludes, and tweak the schema PIs so that they point to the correct location.
From these files, we generate another file which contains a list of all the locations, and for each location, a list of the variant spellings, and for each variant, a list of links to the documents that use it. This will be used as the basis for providing the variant spelling collection at the head of each location file, along with appendix items which can be turned into popups showing links to all the documents.
Gazetteer XHTML5 pages will be generated in the normal way from the "original" source files, once those have been converted to standalones. This will require addition of specific templates to match the tabular encoding and particular URI schemes (molagas: and molvariant:) which I've introduced as part of this construction.
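
The nested structure described in this strategy (locations, each with its variant spellings, each variant with the documents that use it) can be pictured with a small Python sketch. This is only a model of the data shape, not the project's implementation: the input tuples, the location ids and the function name are all invented for illustration.

```python
from collections import defaultdict

def build_variant_index(occurrences):
    """Group (location, spelling, document) tuples into location -> variant -> documents."""
    index = defaultdict(lambda: defaultdict(set))
    for location, spelling, document in occurrences:
        # Collapse whitespace so the same spelling is not counted twice.
        index[location][" ".join(spelling.split())].add(document)
    # Convert to plain, sorted structures so the output is stable between builds.
    return {
        location: {spelling: sorted(docs) for spelling, docs in sorted(variants.items())}
        for location, variants in sorted(index.items())
    }

# Hypothetical occurrences: (location id, spelling as found, document id).
OCCURRENCES = [
    ("LOC_A", "Cheapside", "doc_one"),
    ("LOC_A", "Cheape side", "doc_two"),
    ("LOC_A", "Cheape  side", "doc_one"),
    ("LOC_B", "Bredstreet", "doc_two"),
]

variant_index = build_variant_index(OCCURRENCES)
```

Each inner list of documents corresponds to the set of links that a popup for one variant spelling would show.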