Edition 3.6 of Dr. Kim Blank's widely admired site Mapping Keats's Progress has been released.
Category: "Activity log"
Dr. Kim Blank's site Mapping Keats's Progress: a Critical Chronology, which "has received plaudits and unsolicited praise from respected colleagues" (Suzie Grogan), has progressed to edition 3.5. This project is a collaboration with HCMC and is a fully Endings-compliant project.
Dr. Kim Blank's widely admired site Mapping Keats's Progress: a Critical Chronology has reached another milestone, with edition 3.4. This project is a collaboration with HCMC and is a fully Endings-compliant project.
HT wants Blüh to sort as Blue, Blüh, Blum, Blz rather than the default Blue, Blum, Blz, Blüh.
Here's the original code to sort the JSON by name using just the raw Unicode:
thisCatFeatures.sort(function(a, b) {
  var aName = a.getProperties().name.toUpperCase();
  var bName = b.getProperties().name.toUpperCase();
  if (aName < bName) { return -1; }
  if (aName > bName) { return 1; }
  return 0;
});
Here's a modification I made which transforms each accented character to its base character (ä to a) by splitting the base from the combining diacritic and then stripping the diacritic. This addresses the immediate problem, but does not conform with e.g. the German convention which requires that ä actually sort as ae.
thisCatFeatures.sort(function(a, b) {
  var aName = a.getProperties().name.toUpperCase().normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  var bName = b.getProperties().name.toUpperCase().normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  if (aName < bName) { return -1; }
  if (aName > bName) { return 1; }
  return 0;
});
Here's a variation using the Intl.Collator object with an argument telling it to use German (de) sorting conventions. The { sensitivity: 'base' } argument, in combination with the German localization, might be redundant or might produce slightly different results than omitting it; I haven't tested that, but including it makes the intent explicit.
thisCatFeatures.sort(function(a, b) {
  var aName = a.getProperties().name.toUpperCase();
  var bName = b.getProperties().name.toUpperCase();
  return new Intl.Collator('de', { sensitivity: 'base' }).compare(aName, bName);
});
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Collator
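One further tweak worth considering (untested here): since the comparator runs for every pair, the Collator can be created once outside the sort rather than on each comparison, and with sensitivity: 'base' the toUpperCase() calls become unnecessary because case differences are ignored anyway. A minimal variation:
var germanCollator = new Intl.Collator('de', { sensitivity: 'base' });
thisCatFeatures.sort(function(a, b) {
  return germanCollator.compare(a.getProperties().name, b.getProperties().name);
});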
Before, alphabetical sorting distinguished umlauted characters from non-umlauted ones, so it would sort like this:
Baa
Bad
Bag
Bäa
Bäd
Bäg
Whereas HT wants:
Baa
Bäa
Bad
Bäd
Bag
Bäg
I created a person_name_normal field and copied the person_last_name field into it, then went through and replaced all instances of ä with a, ö with o, and ü with u, so the normalized field no longer has the umlauted characters.
I now have to revise the code for output so that it includes the person_name_normal field and sorts on that field rather than on person_last_name.
I also have to revise the code that populates the database so that it properly includes the normalized version of the person_last_name automatically.
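For reference, the substitution itself is just the three umlaut-to-base mappings; a minimal sketch in JavaScript (the real population code lives in the database scripts, and the function name here is made up):
function normalizeUmlauts(lastName) {
  // ä -> a, ö -> o, ü -> u (and their capitals), as used for person_name_normal
  return lastName.replace(/ä/g, 'a').replace(/ö/g, 'o').replace(/ü/g, 'u')
                 .replace(/Ä/g, 'A').replace(/Ö/g, 'O').replace(/Ü/g, 'U');
}
// e.g. normalizeUmlauts('Müller') gives 'Muller' (invented example).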
Seven years to the day since the last update, Hot Potatoes version 7.0 has finally been released. It's also well over 20 years since the first versions came out. Still going strong! Thanks to all the people in the Hot Potatoes user community for all their input and help over the last year, as I worked on the new version.
This post describes the current work that's just starting to create a new Hot Potatoes exercise site for the Alpha to Omega textbook.
- The student working on this project (as MN's workstudy) is RH.
- An svn repo has been set up (svn/greek) containing the old site/textbook material (svn/greek/athena) and a framework for the new one (svn/greek/alpha_to_omega).
- RH has permissions to write to the repo, and has been introduced to svn. He has used git before.
- The work will begin with chapter 14, and go through to 17 as soon as possible, so that current students will have something to work on before their exams.
- RH is booked for two sessions a week on Spade, and I've installed Visual Studio Code on there so that he can use it to develop his own stuff before dumping it into Hot Potatoes. He's planning to use CSS grid layout instead of tables, and has his own custom source file in the repo (css/ross.css) which can be included in HotPot exercises on the fly, to handle the grid layout stuff. He also knows about the HCMC clipart and will use it where appropriate. He is new to Hot Potatoes itself, so will need a little help as he gets started.
180 minutes over the last few days.
This is the first release that incorporates a search built from the Endings staticSearch codebase. All seems to be working well. Had to tweak the rendering of the documentation to make it valid HTML5, as well as move the old search code out of the way. The build goes a bit faster now. 60 minutes.
Met with SR and MS on issues to resolve in creating a work plan for version 2 of the Mountain Legacy Project software.
Confirmed that phase 1 will be:
- any modifications to db structure necessary to support known feature requests (whether those features are implemented in phase 1 or later)
- static build process to create JSON indexes etc. to support searching and other features in existing product
- provide comparable features to existing Explorer and MEAT
- may include copying of media files from UVic and third-party servers to Compute Canada filespace (7 TB, to expand to 15 TB)
MS identified these issues:
- use hierarchical navigation scheme in MEAT to replace current filtering feature in Explorer
- allow user to select area (rectangle) on map and retrieve all map points in that area (allow grouping of those by station, etc. hierarchy)
- provide list of prefab areas with details as above (e.g. national parks, watersheds)
Likely we will go with a "thick client", JavaScript-based single-page application: one infrastructure, with Explorer as a subset of MEAT capabilities.
SR to:
- contact Eric Higgs to meet and get the big picture
- enquire about contacting Chris Gat for technical details on Ruby and the DB
- then contact Mike on use of JavaScript and integration / reuse of stuff Mike has already done.
There's a problem with connections from the Siberian site to the MySQL database hanging around forever.
Searching through the code, I see it's using PDO to connect to the database. A variable called $dbhCon is the PDO connection object, and a variable called $dbhQuery is the query that is executed. The only instance I can find of $dbhQuery->execute() is line 16 of /include/search-results.php. The results of that query execution wind up in a PHP array called $rows. I've added a $dbhCon = NULL; line immediately following that assignment in an effort to force the connection to the MySQL database to terminate, as it sometimes seems to hang around for weeks.
The History department is doing a talk which is to be live-streamed by AV. AV gave them this block of code to insert into their Cascade site. Presumably some of the connection arguments are specific to the stream and will vary with each instance.
When we put that code in and published the page, we got the player with an "error cross domain request denied" message. In consultation with Scott Thorpe, we were told this is normal behaviour and the code will just work when the stream is enabled (about 15 minutes before the event actually starts).
There is no way to test this in advance.
<div style="width: 100%; max-width: 512px;">
<div style="border: 1px solid #000; position: relative; width: 100%; padding: 0;" id="VidPlayerPlaceholder_1243" class="videoplayer">
</div>
<script type="text/javascript" src="//www.uvic.ca/video/player/js/7.11.2/jwplayer.js"></script>
<script type="text/javascript">jwplayer.key="UJGcVouk597phvGZrziZMHAb3IRluP27vKFmTIMbWyw=";</script>
<script type="text/javascript">
var p = jwplayer('VidPlayerPlaceholder_1243').setup({
flashplayer: "//www.uvic.ca/video/player/jwplayer.flash.swf",
playlist: [
{ title: "", image: "//hlsvod.uvic.ca/vod/mediaservices/UVic-one.jpg", sources: [{ file: "//hlslive.uvic.ca/hls-live/livepkgr/_definst_/livestream/livestream.m3u8"}]}
],
primary: 'html5',
hlshtml: 'true',
width: '100%',
aspectratio: '16:9',
autostart: 'false',
repeat: 'false',
controls: 'true',
logo: {
file: '//www.uvic.ca/systems/assets/images/video-player/uviccopyright.png',
link: 'http://www.uvic.ca',
position: 'top-left',
hide: 'false'
},
rtmp: {
bufferlength: '5'
}
});
p.setVolume(50);
</script>
</div>
<!-- Closes video player -->
Got the first editorial intro material from DH, and it sort of fits in the projectDesc, so I've created a new file called data/project/project.xml to host it. Eventually there will be a full editorialDecl in there, and I'll convert that stuff into an editorial info page on the site.
Wrote some XSLT to align and normalize all the Masher files, and was able to rebuild the whole site with the new HotPot 7. Posted it for a look by MN and LB. There may be a bit more work, but this is now mostly done, I think. The site builds quite rapidly (around 20 minutes) on my new machine.
Met with KS and EP to discuss building a site in support of a German 400 course; this will be hosted on our server, the data will live in our svn, and the site will be an XML/XHTML5 site like Keats and Close Reading. KS will organize and name the images, and build spreadsheets linking images to categories. When we start to receive that info, we'll start building out the site with browsable category pages, including categories for each of the four plays done so far.
I've now built validation into the repo for the HTML output, and tested it with units 1 and 2, which are now all valid; I've fixed a bunch of table resources that were using old attributes, in anticipation of being able to use a global class in the CSS; and I've nuked a bunch of obsolete materials from units 1 and 2, making everything a bit cleaner.
I think we're now good to go; waiting for KB to press the button.
KB sent a list of changes, which I've done; what's left is to decide where to put the site, and what to do with the WordPress instance.
Hot Potatoes 7 is now at a point where it's ready to be put to real use, and the Latin site has been waiting for it, since the site needs a refresh, a new style, and support for audio. I started by creating an svn repo for it (svn/latin), into which I've put the current Wheelock materials; future editing will be done using the repo, which makes everything a bit more robust. I've already created a customized index page and one other custom file which adds a second keypad at the top of JCloze exercises. I built Unit 1 and posted it for some feedback from MN. One thing I'm running into is the issue that the source folder setting in the Masher is an absolute path; that's obviously not what anyone wants, because we need to be able to check out and build the site on multiple machines, so I've gone back to the Masher codebase in Delphi and I'm adding support for relative paths (making them the default, in fact). Once this is all working and tested, it should make building the site much easier, and in fact enable editors to build entire units themselves with the right source files to test their work.
I've finished porting all the Close Reading material over to an Oxygen-based editing format, and posted a test version of the site on the hcmc server; waiting on KB to let me know how he wants to proceed in terms of replacing the old site. More could be done on the documentation for editors, but if it turns out not much more editing is going to be done anyway, it may not be worth the time.
For some time I've been working on version 7 of Hot Potatoes, mostly at weekends because the work has to be done on a Windows XP vm which is on my home desktop. Today I was working remotely because of snow and other reasons, and I got a good amount of work done, releasing the fourth or fifth beta (can't remember which), numbered 7.0.0.7. This is (I think) feature-complete, and has all existing reported bugs fixed. Changes from version 6 include:
- Pure XHTML5 output, fully validated.
- Responsive interface using CSS flex to better support small form-factor devices.
- Touch support in drag-drop exercises.
- Reading texts now available to accompany drag-drop exercises (screens are bigger so this is more practical now).
- Cosmetic refresh for exercises, including rounded corners and drop-shadows on buttons, etc.
- New version of the base HotPot application background graphic, with version 6 replaced by version 7.
- Removal of obsolete code and simplification of existing code, due to better options in JS and CSS (there's more of this to do).
- Fully updated help file, readmes, example exercises, and installer.
- Updated interfaces for Italian and European Portuguese (that's all we have so far).
I think we're now in a position to re-work the Latin and Greek sites with their own customized versions of the new source files, but beta-testers are still getting started so there may be more bugs to fix.
Set up a document type with a schema from the ODD file, a build for the ODD file (still based heavily on Keats), and rescued the existing content into XHTML files rooted on div elements, as in the Keats site. Shouldn't take much more work to get to a basic build.
Many changes on the codebase, so rebuilt the schema and did a build, which threw up some errors; fixed those, rebuilt, pushed to server, pushed to repo.
After many hundreds of thousands of search/replace and other tweaks on the code base, along with local testing using a tweaked hosts file and Apache with a self-signed cert, I've now pushed all 26GB of stuff up to hcmc/www/eol, where it will live, and asked sysadmin to organize new virtual hosts files on HCMC's clusters followed by switching the DNS. Not everything works, but most things do, and many will be easier to fix in-place live than in a test setup.
I now have (I hope) full crawls of the ISE site, and I will carry on to get full crawls of the others. I'm now at the stage where I'm trying to fix problems, such as the prominent message touting a user survey which needs to be removed from every page, and the fact that the nbsp entity is referenced everywhere (it should be a numerical entity). A few more hours of this sort of work will be needed, given the scale of the project (19GB+ for ISE itself).
Modified the code in the db_to_servitengasse_xml, db_to_person_html and db_to_building_inc files so that for those few instances where a person has two addresses on Servitengasse, we don't get two entries e.g. in the list of people associated with a building or in the servitengasse.xml file (which is used to generate the servitengasse.json file).
Also made some tweaks to the punctuation etc. in the output and ensured that the occupancy_notes field is displayed under all output circumstances in voluntary and collected instances of occupancy.
Meeting with MN and SA today:
- Wheelock site: add page for ANKI deck download.
- Athenaze site: add page for ANKI deck download.
- Wheelock site: redesign as the testbed for HotPot redesign/HTML5. (Deadline one month.)
- Wheelock site: re-encode tabular exercise layouts to make them mobile-friendly. (No deadline; examples to be done by HCMC; exhaustive implementation to happen in the fall with next workstudy student.)
- Both sites: Add page for credits, and find out all the historical contributors and add them.
- Both sites: get logs added to our current set of log dumps.
From MT, got a list of all the additional files that will need to be downloaded. I've built a test script for QME, as the smallest, and I'm now testing that.
Following the download, I'll still have to build a local server setup so I can test using the actual URLs, because they're hard-coded throughout, and can't be changed because the same fragment may be pulled into any page at any point in the tree. Grrr.
I've now also pulled the content from the DRE and QME sites. All three sets of data have the same range of issues, the most serious of which is a preponderance of hard-coded links to the domain (http://internetshakespeare.uvic.ca for example) rather than relative links. The WGET tool I used to pull the content should have been able to make those links relative, but in most cases it didn't, possibly because the content is not valid HTML so couldn't easily be parsed.
So I'm going to have to fix all those links using a script, which I'm writing in Python now. It's doubly tricky because of the mobile-vs-desktop issue, which results in links that look like this:
<a href="../../../frommobile.html%3Fto=http:%252F%252Fdigitalrenaissance.uvic.ca%252FFoyer%252Fcopyright.html">
So it will take a while before I can get this all normalized and working. Meanwhile, I don't yet have the list of AJAX files that need to be pulled from the server, so I won't be able to get those until MT sends it to me.
After several false starts due to work still being done on the site despite yesterday being the deadline, eventually got a standard WGET of the site. Got over 67,000 files, but a) lots of them still link to the live site with a full URL, despite telling WGET to fix that, and b) MT tells me that there are AJAX files that I won't have because there's no sitemap linking them. Will have to try to address this on Monday.
Following the issue of the missing NFLD newspaper documents, we determined that the problem was bad linking in the documents themselves, which DH fixed; I then had to rebuild the schema to get things to validate. I also got tired of the tedious process whereby XML documents are validated one by one, something that was necessary because jing suffered from a stack overflow due (I think) to the complexity of the folder structure within which the XML files were found. I discovered that if I just copied the XML files to a temp directory in a flat layout, jing would validate them just fine, and I could also give it a bit more memory anyway by setting my ANT_OPTS like this in .bashrc:
ANT_OPTS="-Xmx8G -Xss8m"; export ANT_OPTS
I think the -Xss solves the stack overflow. This means that I can now validate the XML in a few seconds. I then started looking at the Schematron, which has never been run as part of the build. Borrowing from other more recent project builds, I'm now generating a static .sch file as part of the ODD build process, and then compiling that immediately to create an XSLT file, following the model of DVPP. The validation process then runs that against the document collection, stores the results in a temp directory, and a second process parses those results to generate errors and fail the build if necessary. Found and fixed several errors in this process.
So now the NFLD documents are included in the site, and our build process is WAY faster.
Usual weekly meeting with DH, at which we hacked at the problem of the missing Newfoundlander documents, and determined that they're on the site, but just lack entries in contents documents, because their metadata doesn't match the taxonomies. DH will fix.
We had a file called "autum.mp3", for shame, so links to it weren't working. Now fixed. I've also added blockquote styles (grey background, rounded corners) to initiate a discussion about how to handle the larger ones.
M is back working, and needed a little help with images; KB stopped by to get assistance with some svn and linking issues. All solved, site rebuilt and uploaded.
The code for vihistory used "<?" to open all the PHP blocks. Newer versions of PHP and our server infrastructure require "<?php", so I've replaced all the instances that needed to be changed. Also made changes in protocol from "http" to "https". NB for both of those, there are a small number of instances (e.g. <?xml and http used in headers) that must remain as they were, so a simple global search and replace is not possible. Site seems to work at e.g. webserver2.hcmc.uvic.ca/~taprhist/search/searchcensus.php and hcmc.uvic.ca/~taprhist/search/searchcensus.php
No code committed yet because it will break stuff -- I should be working in a branch but I'm not. But I'm trying an append function that would add a new GeoJSON feature collection to the existing one. If this isn't practical, an alternative approach will be to combine two GeoJSONs in JS before reading them, but that's probably no simpler.
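If it comes to the JS route, combining two GeoJSON FeatureCollections is essentially just concatenating their features arrays; a minimal sketch (the function name is mine, and nothing like this is committed yet):
function appendFeatureCollection(base, extra) {
  // A GeoJSON FeatureCollection is { "type": "FeatureCollection", "features": [...] },
  // so combining two of them is a matter of concatenating the features arrays.
  return {
    type: 'FeatureCollection',
    features: base.features.concat(extra.features)
  };
}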
MM has joined the team and will start tomorrow. I also fixed some CSS issues (adding handling for nested quotes), and updated the documentation.
Fixed a couple of bugs from tickets, did some CSS work to clean up the Team page, fixed a bunch of badly-linked and badly-named images there, and checked and cleared a bunch of older GitHub tickets.
I've now added the new check to the diagnostics for incomplete sequences of images. In addition to poems which have no images at all, there are also a few poems which are listed in the diagnostics output like this:
Poem #9095 Book VI (Blackwood's Edinburgh Magazine) (expected page count: 5; actual images: 4.)
This check can't really take account of the relatively small number of cases where page numbers are not pure numbers; for instance, if the page-range is specified as:
354a-354b
or
xx-iv
it's not practical to try to figure out how many pages should be in there.
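For what it's worth, the numeric part of the check boils down to something like this (an illustrative sketch in JavaScript; the actual check is implemented elsewhere in the diagnostics, and the function name is mine):
function expectedPageCount(pageRange) {
  var m = /^(\d+)-(\d+)$/.exec(pageRange);
  if (m === null) { return null; }  // non-numeric ranges like 354a-354b or xx-iv are skipped
  return parseInt(m[2], 10) - parseInt(m[1], 10) + 1;  // e.g. "354-358" gives 5 expected pages
}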
I've normalized all instances of abbreviated pages so that e.g. 227-28 becomes 227-228. But again, I can't easily do that with non-numeric pages, so there may be examples like xx-iv where only a human could really deduce that it should be xx-xxiv.
AC wants all page-ranges to be shown in their entirety, so I ran some XQuery to generate fixes for the abbreviated ones:
let $commands :=
  for $p in //table_data[@name='poems']/row
  return
    let $pages := $p/field[@name='po_pages']/text(),
        $id := $p/field[@name='po_id']/text()
    return
      if (matches($pages, '^\d\d\d-\d\d$'))
      then concat('UPDATE `poems` set `po_pages` = "', substring($pages, 1, 4), substring($pages, 1, 1), substring($pages, 5), '" WHERE `po_id` = "', $id, '";')
      else if (matches($pages, '^\d\d-\d$'))
      then concat('UPDATE `poems` set `po_pages` = "', substring($pages, 1, 3), substring($pages, 1, 1), substring($pages, 4), '" WHERE `po_id` = "', $id, '";')
      else ()
return string-join($commands, ' ')
then ran the commands (after testing in dev) to fix the existing problems. Added a new test to the diagnostics for this.
I was asked to open up access to the multimedia IPA chart. In order to do that I had to change some of the landing pages to redirects into the app directory. I also changed the name of the sowl content directory to sowl.
Did some minor reorganization and file renaming, as part of the long-term goal of having a standard set of naming conventions; images are where most of the problems are. Also removed a bunch of old obsolete HTML and plist files that were in the repo for no good reason. Finally, added detailed documentation on how to add images to the repo and use them on pages. Docs are coming along nicely.
Met with PAB to discuss progress; projected edits/additions are still not complete, so we'll hold off on moving data into svn for the moment. Rescheduled to October. Meanwhile, I updated my backup of the data from eXist, and we troubleshot a set of obsolete and duplicate data files. All sorted now.
Updated the stats and found a bug in the author count report (now people count), so I fixed that and updated again.
Per AM, pulled units 28-40 from Cadfael and rebuilt the web pages, then uploaded to the new_wheelock folder. Everything is now complete; it looks like it should be swapped in to replace the existing wheelock folder when approved by MN.
AC asked for a new diagnostic, so I've added it, and in the process updated the existing code to handle the change from authors to people, as well as enriching the info given about each poem in a diagnostic test result.
I now have text search actually working, after a lot of workarounds to handle eXist's fierce validation of the forms of URIs. However, the implementation of the search facets is not functional yet. Getting there.
Added four new images to the gallery per KB, and at the same time, documented the process for adding images to the gallery. Also re-encoded one of the poems (Lock of Milton) following KB's model, and added two images to an article page.
I have four to prep, so I'm starting early. :-) The Keats one, which is part TEI part Endings, is perhaps the easiest.
AM, working on the rebuild of the Wheelock exercises, provided a list of units she's worked on, so I rebuilt all of those and pushed the results up to hrd/latin/new_wheelock. Had to install HotPot first.
Posting some time spent over the last couple of days writing and documenting an ant + XSLT script to convert the Newfoundlander transcriptions into "final" form for the generation of XML. GL's skills are moving forward.
Wrote another section of the script to convert text-field illustrators into one-to-many. The only remaining issue is how to reliably recreate existing field data in the form of relational links. This is likely to be a little bit risky because of whitespace normalization issues. It should probably be done after the other work is done, so that the SQL can be created using XSLT.
Met with DH and gradually worked out a plan to revamp the category system completely.
Reorganized a lot of the file folders on the server; fixed the db pointers to match; added some new diagnostics; made the diagnostics more user-friendly and slightly prettier; added lots of documentation to the build process ahead of beginning the generation of TEI and HTML; and lots of other bits and pieces.
I'm now able to retrieve all fresh data from the dbs and the home1t filesystem, generate the stats and commit them as part of a single ant target. Much more convenient, and will work much better when I'm away.
List of requests from KB is now done. Added lots more images and linked them, and fixed a few typos.
Polished off one more ticket (a French caption not showing up) before meeting with DH and discussing remaining tickets.
Got the chart working, which was easier once I switched to separate versions of Chart.js and moment.js; tweaked it a little, learning some useful stuff about how it works. Then wrote some progress tracking calculations into the same section of the diagnostics, including end-date prediction. That's it for progress tracking for now.
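For reference, the shape of the working setup is roughly this (a hedged sketch for Chart.js 2 with its moment-backed time axis; the element id and data values are invented):
var chart = new Chart(document.getElementById('progressChart'), {
  type: 'line',
  data: {
    datasets: [{
      label: 'Poems encoded',
      data: [{ x: '2019-09-01', y: 120 }, { x: '2019-10-01', y: 260 }]  // invented figures
    }]
  },
  options: {
    scales: { xAxes: [{ type: 'time' }] }  // the time scale parses the x values via moment.js
  }
});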
At AC's request, added a new diagnostic to catch instances where no authors have been specified for a poem, but neither unsigned nor anonymous has been selected either. Also continued hacking at the chart functionality, but it seems that there's a commonly-encountered issue when trying to get the ChartJS bundle to work, so I'm going to switch to using the separate moment.js instead.
Chart.js code is now being imported and used, and data is being generated in a form that should enable the production of a chart, which already appears but is broken because of an issue with date parsing. From here it's bugfixing.
Still have to figure out the best way to build a chart from them.
I have the contributor names showing up in the debate-day pages, but I can't build the site because team.html is broken. Waiting for DH to fix it.
Attacking some of the issues in the diagnostics output.
Met with DH, and did the first half of ticket 37: I now have XSLT that can semi-reliably harvest user info from HOCR files and create respStmt elements in the TEI. I still have to figure out how to output that information in the HTML.
...to catch unwanted quotes or punctuation.
The diagnostics is now showing lots of errors in filenames and links, so I've handled the bulk of them in a semi-automated way because they were amenable to that. We're now down to 64 bad filenames and 4 bad links.
I've made the updates to the database, by rsyncing most of the dev version over the live codebase (keeping a backup).
The changes are:
- There's now a "First Line" field in the Poems table, where you can add the first line of a poem which otherwise would be indistinguishable from others with similar titles ("Song", "Sonnet" etc.).
- The "Related poems" field now uses a combination of the Title, the First line (if there is one), and the id of a poem to give you something which (I hope) is findable and unique for each poem. When poems start with punctuation (quotes or whatever) they'll sort to the beginning, so you usually need to scroll past the quoted titles to get to the regular ones.
- There is now a publicly-accessible read-only view of the database here. I haven't linked this from anywhere or publicized it yet; I think the plan is to hide the notes field in this view, which will take a bit of work.
One note: there's an additional field in the poems table which just amalgamates all the text values of the other major fields, to make for a simple search field, which was used for the WordPress plugin. Since that's no longer working, and it was going to be difficult to combine the trigger I needed for the new First line field with the original trigger that also ran on update on that table, I've removed the original trigger. That means the plugin will no longer work, so if we need it to do so at some point, we'll need to figure out how to combine those two triggers in a single operation.
We now have diagnostics giving stats, and providing tests for ill-formed image filenames on the filesystem, and database records that point at images that aren't there. This is a good start, and there's plenty for people to get their teeth into.
KB came in to get SVN and Oxygen installed, and is now working on edits. Initial setup and basic process documentation is good, but I also need to do more in the way of documenting markup practice as we move to more complicated edits.
Inserted one of the poem page images as requested by KB; this works as expected, so I can go ahead with the same model for the rest of them. Did some other fixes from the list of TODOS while I was in there.
The project needs some of the improvements to the adaptive db added after this project started, such as the read-only interface, so I've copied the current data over into the dev db, checked out the latest trunk from svn, and hacked away till everything worked. The process found some oddities in the adaptive db code, which I've copied back to the repo. I'm now in a position to add and alter some fields that need to be changed, doing that only in dev first.
61 of the original 2014 sample of 100 poems were marked up in the original repo, albeit a bit shakily; I've converted those, added them to the new repo, and fixed all the validation problems.
This morning we had the intro session for the RAs, and they'll be starting tomorrow. I've now begun work on the diagnostics which will track our progress, as well as adding ordinary backups to the build file; the build file now has two combining targets, the do_all, which is what will happen on Jenkins, and the admin, which is what I'll run locally, to transform or generate things that need to be committed to svn or stored locally. I've also laboriously added the English 500 encoded XML poems to the repo organized by journal and year (it's clearer than using the variable vol/issue kind of organization), and I'll be doing the same with the original hundred poems we did a few years ago. Steady progress.
I've converted the old VPN ODD and associated files (XML and schema-building files) to DVPP files, and created both a schema build with Schematron extraction, and a general build process. I've set up a Jenkins job, got validation with RNG and Schematron working, and set up a cron job which puts the XML version of the db on home1t where the build process can retrieve it. Coming along nicely.
A couple of TODOs came in just when I needed a break from something else, so I polished them off.
Finished off the poem that AN was working on, and then went through all the other outstanding things in the todo list; the only remaining issue is the first-publication images of poems, on which I don't think we have a clear plan yet.
Pushed forward with building a schema and converting the old data in consultation with SH and SA this morning, and was making good progress, but a possible change of direction from SH this afternoon means I've suspended work until it's resolved.
Did three more of the poems for which we have curated texts from KB.
The Google Custom Search page is a plain setup which doesn't allow for much configuration; I've started learning some of the ins and outs of Google context and annotation files which might enable us to give users the option of filtering by English site or French site, which would be the perfect solution to the problem of the mixed-up language situation. The annotation stuff is a bit puzzling, but I think we should be able to get it working in the end. Current results suggest that not all of the site has been indexed yet, but we have plenty of time to worry about that as we get nearer to completion.
There appear to be quite a few of the less common names which have been missed in the early tagging, so I've been hacking away at them. I think in the end I need some sort of diagnostic for this.
Working through the to-do list.
I have a to-do list of poems that need reformatting, and I did one today, as well as retitling two poems (which meant renaming files and fixing links). But we're getting closer.
Google search is now configured and working; a search button has been added to the pages whose content I control; links from popups in the map now open in a new tab; and various other minor tickets have been closed or referred on to the next person.
Cleared up a bunch of issues around English vs French documents being displayed and linked, and closed three tickets; now half-way through adding a Google Search. Meeting with DH.
Two new useful features added this weekend:
- The topic index now includes a link to each poem directly, as the first item in the entries listed under that poem, and the poems themselves each have links back to their heading in the index, so you can easily find all references to a specific poem.
- The JSON for search now includes a document type designator for each document, and the search page enables you to specify any of four document types to search, so you can (for instance) search only in the poems, or in the timeline articles.
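Purely for illustration (the real index format differs and these field names are invented), the type filtering amounts to something like:
var docTypes = ['poem', 'timeline'];  // the types the user ticked on the search page
var hits = allDocs.filter(function(d) {
  return docTypes.indexOf(d.docType) !== -1;  // keep only documents of the selected types
});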
Turned out the layout problem for the poem stanzas was caused by the footnote. Took forever to realize it.
I've been struggling for a while with the requirement to do nicely-formatted layout of multi-stanza poems with stanza headings while at the same time keeping the poem encoding simple and intuitive. It finally occurred to me that I can handle the problem at build time by adding an extra div into the structure, and I've implemented this today, testing with a corrected and formatted version of Grecian Urn. Looks good so far.
I think the new version of OL has solved one of the problems we had with Chrome, but it's not 100% clear yet.
Got the basic image markup working, but I'm still stuck using place elements instead of facsimile. However, it's not clear how best to support the range of GeoJSON geometries in a plain old facsimile element in TEI.
Over the weekend, I converted the old XSLT 2 code to create GeoJSON to XSLT 3, which means it's much easier to add more support for new TEI elements being converted to HTML in the GeoJSON, and to debug that whole module. Today I've added a new test project based on static image markup, and started extending the code so that it can tell the difference between when it's working on static images (using only pixel offsets) and when it's working on a genuine map (in which case various conversions have to happen between projections, since we render in EPSG:3857 but store in EPSG:4326 for GeoJSON conformity). I'm about a third of the way there, but so far it's very promising.
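For reference, the map case involves the standard OpenLayers projection transforms, roughly like this (a hedged sketch; the variable names and values are mine, and in the build itself the conversion may happen in XSLT rather than in browser JS):
// GeoJSON stores [lon, lat] in EPSG:4326, while the map view renders in EPSG:3857,
// so coordinates are transformed on the way in and out:
var lon = -123.36, lat = 48.43;                    // invented example values
var renderCoord = ol.proj.fromLonLat([lon, lat]);  // EPSG:4326 -> EPSG:3857 for display
var storedCoord = ol.proj.toLonLat(renderCoord);   // EPSG:3857 -> EPSG:4326 for the GeoJSON
// The static-image projects work purely in pixel offsets, so no transform is needed there.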
You can now do quite complex things in the smaller popup box, including linking to any document so that it pops up in the left iframe. Fixed a bunch of other bugs in the CSS to get rid of unwanted scrollbars and so on.
Time to rewrite the XSLT (see ticket on GitHub), and to do some serious documentation in the ODD file.
GN found more issues, and I found one too; those are now fixed (see GitHub issues). I've also added testing for local document links to the test projects.
AM has finished up to unit 15, and the new work is uploaded to the site.
As GN's project moves forward, he's finding things that haven't been properly worked out or tested, and I also found a new CSS bug arising out of PS's fixes for old Safari limitations. Two fixed, two more added to the queue.
We're getting down to the real nitty-gritty now. In steady conversation with KB, I've implemented a fix for the case where left- or right-floated images resulted in overly-narrow columns of text alongside them, by detecting window-width and switching them to centred with no float; fixed a problem in Safari where one menu button was larger than the others because it had a search glyph (char, not image) on it (set line-height to 1); added some additional responsiveness for window-widths intermediate between desktop browser and portable device, making the chronology column narrower; and fixed a bunch of other minor things. Updated the documentation to explain the search engine in detail, and expanded the documentation of the ODD file and the build process. Also went into the Google Custom Search control panel and discovered I could force a re-crawl of the site by using the Fetch functionality; did that, and saw some results right away (new pages appearing in the results), but some annoying hangovers (old pages persisting). That will presumably go away eventually.
AM has been working on the revisions in Hot Potatoes, and I've been building the units and uploading the results as we go. Moving along nicely. When the whole lot is finished and proofed by MN, it will replace the existing site. Right now it's at hrd/grs/latin/new_wheelock.
This is a note to self: the TEI rendering of ODD to HTML documentation has screwed-up whitespace inside the egXML elements. The solution is to add this to the CSS:
div.pre{ white-space: pre-line; }
This should probably be added into the TEI Stylesheets, via a ticket. Took me a while to figure it out.
Tagged and added a new article from KB, removed the old About page, which was redundant, and tidied the footer and the menu. Some questions still outstanding on the new article.
Added a new block of bios from KB, as well as a menu item pointing at the search page. Tweaked some processing and fixed some typos.
GN found two bugs arising out of recent changes. Fixed them in the JS.