Category: Activity log

16/01/19

Permalink 04:20:18 pm, by mholmes, 172 words, 6 views   English (CA)
Categories: Activity log; Mins. worked: 180

Wrote XSLT to amend faulty rhyme labels

KF asked me today for a fix for rhyme-labelling errors. The typical case: you get through an entire labelling sequence, then discover that the rhyme you labelled "m" is actually the same as "c", so you need to change "m" to "c" and move every label after "m" down one slot. This is trickier than it sounds. A quickfix can't easily terminate gracefully with a useful message when you ask it to do something that doesn't make sense, so rather than squash the fix into one, I've written it as an XSLT transform with a framing transformation scenario. That scenario is not the default for XML documents, but you can run it on them using "Transform with". I've also written a fairly extensive XSpec test suite for it. In the process of developing and testing, I fixed some old encoding from back in the day. That's going to be a long, steady process.
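The relabelling logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the XSLT itself: the function name `merge_rhyme_label` is hypothetical, and it assumes labels are single lowercase letters held in a flat list.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: labels are single letters a..z


def merge_rhyme_label(labels, wrong, right):
    """Return a new label list with `wrong` merged into `right`.

    Every label that sorts after `wrong` moves down one slot so the
    sequence stays gap-free. Nonsensical requests raise ValueError
    with a message, rather than silently doing nothing.
    """
    for label in (wrong, right):
        if len(label) != 1 or label not in ALPHABET:
            raise ValueError(f"not a single-letter label: {label!r}")
    if right >= wrong:
        raise ValueError("the replacement label must come before the faulty one")
    if wrong not in labels:
        raise ValueError(f"label {wrong!r} is not used in this poem")

    fixed = []
    for label in labels:
        if label == wrong:
            fixed.append(right)
        elif len(label) == 1 and label in ALPHABET and label > wrong:
            # Shift later labels down one slot (e.g. "e" becomes "d").
            fixed.append(ALPHABET[ALPHABET.index(label) - 1])
        else:
            # Earlier labels, and anything unrecognised, stay as they are.
            fixed.append(label)
    return fixed


print(merge_rhyme_label(["a", "b", "a", "b", "c", "d", "c", "e"], "d", "c"))
```

The up-front checks are the point of the exercise: each way the request can fail to make sense gets its own message.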

11/01/19

Permalink 04:06:49 pm, by mholmes, 22 words, 12 views   English (CA)
Categories: Activity log; Mins. worked: 140

Planning meeting for 2019

Met with AC and drafted the plan for 2019; that's in the repo now, along with a todo list for both of us.

21/12/18

Permalink 04:42:58 pm, by mholmes, 55 words, 9 views   English (CA)
Categories: Activity log; Mins. worked: 240

CSS and other progress

CSS selector conversion now properly written, tested and working. XSpec file now includes test for that function. Extra poem encoded for testing. Fixes to a couple of other poems done.

I've also now finished re-creating the original site banner using higher-res sources to get something that will actually scale. Results look pretty good in isolation.

19/12/18

Permalink 05:09:53 pm, by mholmes, 24 words, 10 views   English (CA)
Categories: Activity log; Mins. worked: 90

XML and XSL updates to match db

SQL-to-TEI conversion now updated to take account of the changes to the db; ditto with schema and documentation; and finally the poem rendering XSLT.

Permalink 11:43:08 am, by mholmes, 621 words, 15 views   English (CA)
Categories: Activity log; Mins. worked: 200

Database updates

Did the updates on the dev db first, then on the live db. The main problem was that extending field lengths in the poems table hit MySQL limits, in particular the row size limit of 65,535 bytes. Converting some columns from VARCHAR to TEXT solves the problem, although of course there's a performance penalty. I also had to drop some indexes which were hitting limits. Below is the process in half-code-half-comments. Now I have to update my TEI generation code to take account of the changes.
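As a rough aside on why switching to TEXT helps, here is a back-of-envelope calculation in Python. It is a simplification that ignores per-row overhead. It assumes the poems table uses the 3-byte utf8 charset, and the column widths are hypothetical, not the real table definition. A VARCHAR counts its full worst-case byte length plus a 1- or 2-byte length prefix toward the row size, whereas a TEXT column counts for at most 12 bytes because its data is stored off-row.

```python
ROW_LIMIT = 65_535   # MySQL's maximum row size, in bytes
BYTES_PER_CHAR = 3   # assumption: utf8 (utf8mb3) charset
TEXT_POINTER = 12    # upper bound on a TEXT column's contribution to the row


def varchar_bytes(chars):
    """Worst-case bytes a VARCHAR(chars) column adds to the row."""
    data = chars * BYTES_PER_CHAR
    return data + (1 if data <= 255 else 2)  # plus the length prefix


def row_bytes(columns):
    """Sum worst-case sizes for (kind, chars) pairs; kind is 'varchar' or 'text'."""
    total = 0
    for kind, chars in columns:
        if kind == "varchar":
            total += varchar_bytes(chars)
        elif kind == "text":
            total += TEXT_POINTER
        else:
            raise ValueError(f"unknown column kind: {kind!r}")
    return total


# Hypothetical column mix, for illustration only.
as_varchar = [("varchar", 300), ("varchar", 4096), ("varchar", 18000)]
as_text = [("varchar", 300), ("text", 4096), ("text", 18000)]

print(row_bytes(as_varchar), row_bytes(as_varchar) > ROW_LIMIT)
print(row_bytes(as_text), row_bytes(as_text) > ROW_LIMIT)
```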

/* This file is the working SQL file for changes to the db made in December 2018, per 
 * instructions from AC. 
 * 
 * Make these changes step by step and confirm/check/test/backup before continuing. */
 
/* FIRST THE SIMPLE THINGS: MAKING TEXT FIELDS LONGER. */

/* Pseudonym field needs to be three times its current length in characters. */
/* First we have to drop some indexes this field is involved in. */
ALTER TABLE `poems` DROP INDEX `idx_po_general`;
ALTER TABLE `poems` DROP INDEX `idx_po_pseudonym`;

/* Now set the length. */
ALTER TABLE `poems` MODIFY `po_pseudonym` VARCHAR(300);
 
/* Display name ditto. */
ALTER TABLE `persons` MODIFY `prs_displayName` VARCHAR(300);

/* Images field needs to handle up to 70 images. This involves changing its type to TEXT. */
ALTER TABLE `poems` MODIFY `po_images` TEXT(4096);

/* Add a new allonym text field. */
ALTER TABLE `poems` ADD COLUMN `po_allonym` VARCHAR(300) AFTER `po_allonymous`;

/* Add new hashtag field. */
ALTER TABLE `poems` ADD COLUMN `po_hashtags` TEXT(1024) AFTER `po_links`;

/* SERIES FIELD FOR POEMS. */
ALTER TABLE `poems` ADD COLUMN `po_series` int(11) default NULL AFTER `po_organ`;

/* NOW UPDATE local_classes.php and test. */
/* local_classes.php:
 * set prs_displayName to 300 length.
 * set po_pseudonym to 300 length.
 * set po_images to 4096 length.
 * add new allonym field.
 * add new hashtags field.
 * add new series field.
 * 
 * */

/* NOW THE HARD STUFF: TURN THE TRANSLATOR FIELD INTO A ONE-TO-MANY LINK TO PERSONS. */

/* First create the linking table. */
DROP TABLE IF EXISTS `poems_to_translators`;
CREATE TABLE `poems_to_translators` (
  `ptt_id` int(11) NOT NULL auto_increment,
  `ptt_po_id` int(11) default NULL,
  `ptt_tr_id` int(11) default NULL,
  PRIMARY KEY  (`ptt_id`),
  KEY `fk_ptt_translator` (`ptt_tr_id`),
  KEY `fk_ptt_poem` (`ptt_po_id`),
  CONSTRAINT `fk_ptt_translator` FOREIGN KEY (`ptt_tr_id`) REFERENCES `persons` (`prs_id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `fk_ptt_poem` FOREIGN KEY (`ptt_po_id`) REFERENCES `poems` (`po_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

/* NOW UPDATE local_classes.php and test. */

/* Next, we try to discover candidates for translators in the persons table. */
/* This is the XQuery to generate the SQL:  */

---------------------------------------

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method "text"; 

let $poemsWithTranslators := //table_data[@name='poems']/row[string-length(field[@name='po_translator']) gt 0],

$links := for $p in $poemsWithTranslators
let $transName := normalize-space($p/field[@name='po_translator']/text()),
$candidates := //table_data[@name='persons']/row[field[@name='prs_displayName'] = $transName]

return

if (count($candidates) = 1) then
let $poId := $p/field[@name='po_id'],
$prsId := $candidates[1]/field[@name='prs_id']
return concat(
'INSERT INTO `poems_to_translators` (`ptt_po_id`, `ptt_tr_id`) VALUES ("', $poId, '", "', $prsId, '");', '
')

else (concat('/* No unique match found for ', $transName, '. */
'))

return $links

---------------------------------------


/* Run the resulting SQL against the db to insert
 * the new records.*/
 
/* Change the local_classes.php file to show "OLD Translator field". */

/* Download fresh versions of the db and commit. */

/* Run XPath against the db to get comma-separated lists of poem ids where:
 *  a) a new record has been inserted linking to the poem table, and 
 *  b) no match was found so a record will have to be manually created. */
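
The final checking step above could equally be done outside the XML editor. A minimal Python sketch, assuming the fresh dump is in the same mysqldump XML layout that the XQuery reads (`table_data`/`row`/`field`); the function name `translator_link_report` is hypothetical:

```python
import xml.etree.ElementTree as ET


def field(row, name):
    """Return the trimmed text of a named field in a row, or '' if absent or empty."""
    el = row.find(f"field[@name='{name}']")
    return (el.text or "").strip() if el is not None else ""


def translator_link_report(dump_xml):
    """Return two comma-separated lists of poem ids: (linked, still to do)."""
    root = ET.fromstring(dump_xml)

    # Poem ids that now have at least one row in the linking table.
    linked = {
        field(row, "ptt_po_id")
        for row in root.iterfind(".//table_data[@name='poems_to_translators']/row")
    }
    linked.discard("")

    # Poems whose old free-text translator field is not empty.
    with_translator = [
        field(row, "po_id")
        for row in root.iterfind(".//table_data[@name='poems']/row")
        if field(row, "po_translator")
    ]

    matched = [po_id for po_id in with_translator if po_id in linked]
    unmatched = [po_id for po_id in with_translator if po_id not in linked]
    return ", ".join(matched), ", ".join(unmatched)
```

The second list is the one that matters: those are the poems whose translator records will have to be created by hand.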

18/12/18

Permalink 03:55:21 pm, by mholmes, 98 words, 8 views   English (CA)
Categories: Activity log; Mins. worked: 180

Preparation and code for db updates tomorrow

Tomorrow is update day for the db, so I've built a full plan with SQL and XQuery code ready to execute. It looks like we can link about 730 of the 1530 or so translators directly to existing person records, so although those will need to be checked, that's a lot faster than doing them all manually. The remaining ones will have to be handled manually, though. I've also pulled the banner from the old VPN site ready to make a rough home page for the site, and done some more thinking about a more sophisticated CSS parser in XSLT.

17/12/18

Permalink 04:26:10 pm, by mholmes, 156 words, 8 views   English (CA)
Categories: Activity log; Mins. worked: 150

First pass at using rendition/@selector

It's very common to find the same pattern of indents throughout the stanzas of a poem. Right now, people are encoding these mechanically and repetitively, which is OK but clutters the XML and takes time. A better option would be to use the TEI rendition element with the @selector attribute, like this:

            <rendition selector="lg">
               margin-left: 6rem;
            </rendition>
            <rendition selector="lg>l:nth-child(2), lg>l:nth-child(4), lg>l:nth-child(7), lg>l:nth-child(10)">
               margin-left: 1.1em;
            </rendition>

to specify that all stanzas have a left margin of 6rem, and lines 2, 4, 7, and 10 of each stanza are additionally indented.

This is easy to code but hard to process. I've had a first shot at figuring out how to do it, and so far so good, although as the selectors get more gnarly the code will have to be revisited. It's good enough for testing purposes at any rate.
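
To give a feel for the processing problem, here is a small Python sketch that turns the kind of selector shown above into an XPath expression. It is not the project's XSLT, and it only covers what the example uses: element names, the child combinator `>`, `:nth-child(n)` with a plain number, and comma-separated alternatives. The function name `selector_to_xpath` is hypothetical, and namespace prefixes (which real TEI would need) are left out.

```python
import re

# One step of a selector: an element name, optionally followed by :nth-child(n).
STEP = re.compile(r"^([A-Za-z][\w.-]*)(?::nth-child\((\d+)\))?$")


def selector_to_xpath(selector):
    """Convert a very limited CSS selector into an equivalent XPath union."""
    paths = []
    for alternative in selector.split(","):
        steps = []
        for raw in alternative.split(">"):
            match = STEP.match(raw.strip())
            if match is None:
                raise ValueError(f"unsupported selector step: {raw.strip()!r}")
            name, n = match.groups()
            if n is None:
                steps.append(name)
            else:
                # nth-child counts all child elements, then checks the name.
                steps.append(f"*[{n}][self::{name}]")
        paths.append("//" + "/".join(steps))
    return " | ".join(paths)


print(selector_to_xpath("lg"))
print(selector_to_xpath("lg>l:nth-child(2), lg>l:nth-child(4)"))
```

Note that `l:nth-child(2)` becomes `*[2][self::l]` rather than `l[2]`: the CSS pseudo-class counts every child element, not just the `l` children. Anything outside the supported subset raises an error, which is where the gnarlier selectors would need more work.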

14/12/18

Permalink 04:54:14 pm, by mholmes, 44 words, 9 views   English (CA)
Categories: Activity log; Mins. worked: 60

Fixed and enhanced statistics/progress tracking

The progress tracking output was borked in a couple of ways, one cosmetic (the chart display had hundreds of stacked labels on the X axis) and one arithmetical (I was miscalculating the projected duration based on current progress). I've fixed both of those issues.
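
The entry above doesn't record how the projection is calculated. One simple approach, assumed here purely for illustration, is linear extrapolation: if `done` items took `elapsed` days, the remaining items should take `elapsed * (total - done) / done` more days. The function name and parameters in this Python sketch are hypothetical.

```python
from datetime import date, timedelta


def projected_finish(start, today, done, total):
    """Estimate a completion date by extrapolating the average rate so far."""
    if not 0 < done <= total:
        raise ValueError("done must be between 1 and total")
    elapsed = (today - start).days
    if elapsed < 0:
        raise ValueError("today must not be earlier than start")
    remaining_days = elapsed * (total - done) / done
    return today + timedelta(days=round(remaining_days))


# 100 of 400 items done in the first 100 days, so roughly 300 more days to go.
print(projected_finish(date(2018, 1, 1), date(2018, 4, 11), 100, 400))
```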

Permalink 04:52:21 pm, by mholmes, 65 words, 9 views   English (CA)
Categories: Activity log; Mins. worked: 150

Poem encoding and Quick Fixes

Worked on my poem-encoding Quick Fix so that it can now tag a whole poem in one go. As part of developing and testing, I also encoded a couple of poems myself, and did some tweaks to rendering and processing. I also ran the OCR task against the 1840 poems to give myself a bit more choice in picking test poems. Updated the documentation as well.

12/12/18

Permalink 05:17:08 pm, by mholmes, 84 words, 7 views   English (CA)
Categories: Activity log; Mins. worked: 360

Refrains, anaphora, epistrophe and horizontal lines

Met with KF and AC and discussed a number of issues. As a result, I've added schema support, processing support and documentation for handling refrains, and eliminated the hack that was used to handle them before. That let me tighten up the rhyme label attribute constraints, and I've replaced the old encoding approach in the data. I've done the same for ornamental horizontal lines. In the process I encoded a couple of poems myself, and fixed some bugs in rendering that were annoying me.


Digital Victorian Poetry Project

This is a blog to track work on the DVPP project. Prior to this blog being created, posts were made in the Depts blog.
