Edition 3.8 of Dr. Kim Blank's excellent site Mapping Keats's Progress has been released.
Isolation log week 44
Work done from home and in the office 2021-01-18 to 2021-01-22.
Total: 37.25 hours
G&T hours: +2.25 hours
Monday 2021-01-18
0.25 hours: Update Jenkins servers.
0.25 hours: Mariage: correspondence on staticizing; message to sysadmin to change subdomain pointer.
2.25 hours: TEI: Write XSLT to parse the P5 specs to generate spreadsheet of all translatables whose English was updated since TEI 3.3, for Japanese translation team. Slightly harder than expected.
2.00 hours: LEMDO: Rewrite/re-organize processing of egXMLs to fix spacing issues; working through existing tickets, closing some already completed, raising questions on others.
2.00 hours: Mariage: More optimization for staticSearch, adding fragment ids and tweaking style, testing and debugging.
0.50 hours: Keats: Build, test and release version 3.8.
Total: 7.25 hours
Tuesday 2021-01-19
0.25 hours: Update Jenkins servers.
0.25 hours: MoEML: Dig into obscure ligature issue and ellipsis characters.
1.25 hours: HCMC Staff meeting.
0.50 hours: LEMDO: Look at implementation of detachable pinnable popups.
1.00 hours: Work on ideas for DHSI sustainability conference panel.
2.00 hours: TEI: Assemble materials and begin draft of proposed new Guidelines section on ruby.
1.00 hours: Endings/staticSearch: Port pilot CSS back from Mariage repo into staticSearch, and tweak. Write documentation for new custom data attributes. Add new issue on requirement for better handling and documentation of default values for boolean params.
0.75 hours: Wendat: Project meeting.
Total: 7.00 hours
Wednesday 2021-01-20
0.50 hours: Update Jenkins servers. Kernel updates required reboots and other cleanup.
1.50 hours: MoEML: Process LibraryOther and LibraryRoyal documents to add xml:ids to pbs for signatures; add those document types to the set that gets rendered with signature links; fix Schematron bug; update schema to allow gap[@reason="sampling"].
1.50 hours: Academic: Prep for and participate in RA's mock interview.
1.50 hours: Mariage: Tweak and rebuild for release; test subdomain pointer switch; push new release and test; make announcements of staticization.
3.00 hours: Endings/staticSearch: Complete first draft of French stemmer in XSLT (not passing tests yet -- lots of testing and fixing to do).
Total: 8.00 hours
Thursday 2021-01-21
0.25 hours: Update Jenkins servers.
1.50 hours: LEMDO: Diagnosed build-break problem with HTML; modified schema to disallow the corresponding XML structure; fixed broken file; fixed conversion code which generated it; fixed broken menu item; added ODD file to duplicate id check.
1.50 hours: MoEML: Project meeting, a couple of bugfixes.
0.50 hours: DVPP/Admin: Long-running saga re CC's timesheets finally getting worked out.
3.50 hours: Endings/staticSearch: Completed ticket to move CSS into a separate file for easier maintenance; continued rewrite of XSLT French stemmer, which is now passing the first 798 tests, but still has a ways to go.
Total: 7.25 hours
Friday 2021-01-22
0.25 hours: Update Jenkins servers.
1.50 hours: TEI: Meeting with NC and SB to work on ruby and Stylesheets issue ahead of group meeting; other discussions on the ruby ticket and repo.
0.75 hours: DVPP: Constructed new timesheets for CC based on the WS ones, submitted them, and notified her.
0.50 hours: LEMDO: Diagnosed and fixed repo conflict and build break; tweaked svn user id.
0.75 hours: MyNDIR: Zoom with PB and SDK to figure out and fix broken XPR file, validation problems, build break, and use of dating attributes.
4.00 hours: Mariage/Endings/staticSearch: Finally completed the XSLT French stemmer and got it to pass all 20,805 tests; some correspondence with Snowball maintainer about clarification of the algorithm's prose description.
Total: 7.75 hours
Posting hours from isolation log 2021-01-11 to 2021-01-15.
I'm just posting this because I've had to spend over an hour trying to figure this out. I had what I thought was a bog-standard Ubuntu 20.04 installation with ImageMagick installed from the repos. This gives you ImageMagick 7. On my desktop, it just wouldn't work properly; attempting to process images at the command line gave me missing delegate errors:
no decode delegate for this image format `JPEG' @ error/constitute.c/ReadImage/562
I remembered having hit this before once, and knew to go and change some settings in policy.xml, but that just didn't work. Eventually I had to do this:
- Uninstall and purge imagemagick
- Find all imagemagick-related folders and files and delete them (there are lots, including symlinks to the binaries all over the place)
- Reinstall imagemagick (still didn't work)
- sudo apt install graphicsmagick-imagemagick-compat
The last one seems to be the magic bullet.
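For future reference, one way to check which delegates a given install reports is to look at the "Delegates" line that `convert -version` prints. What follows is only a minimal sketch in JS, assuming that line is present and space-separated (its exact wording may vary between versions); `listDelegates` and `hasDelegate` are hypothetical helper names, and the sample text is illustrative rather than output captured from the machine described above.

```javascript
// Hypothetical helper: pulls the delegate names out of the text printed by
// `convert -version`, assuming it contains a line beginning "Delegates".
function listDelegates(versionText) {
  const match = versionText.match(/^Delegates[^:]*:[ \t]*(.*)$/m);
  return match ? match[1].trim().split(/\s+/).filter(Boolean) : [];
}

// Hypothetical helper: true if the named delegate appears in that list.
function hasDelegate(versionText, name) {
  return listDelegates(versionText).includes(name.toLowerCase());
}

// Illustrative sample only, not real output from the affected desktop.
const sample = [
  'Version: ImageMagick (version details omitted)',
  'Delegates (built-in): bzlib fontconfig freetype png tiff zlib'
].join('\n');

console.log(hasDelegate(sample, 'jpeg'));
console.log(hasDelegate(sample, 'png'));
```

If `jpeg` is missing from the list, that would be consistent with the "no decode delegate" error above, though it says nothing about why the delegate is absent.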
Isolation log week 43
Work done from home and in the office 2021-01-11 to 2021-01-15.
Total hours: 38.25
G&T hours: +3.25
Monday 2021-01-11
0.25 hours: Update Jenkins servers.
1.00 hours: LEMDO: Fixes for broken menu item and staticSearch page styling; switched content para width for born-digital documents from 38 to 44rem.
0.50 hours: MyNDIR: Meeting with PAB, troubleshooting, fix for permissions on newly-reconfigured repo setup.
0.25 hours: ColDesp: Write up TODO for disclaimer page.
5.50 hours: Mariage: Lay out plan for static version for CC; begin initial work; fix several thousand old http links that should be https; rewrite linkchecker code and run first 200 links through it; replace Saxon 9 with 10; begin rewrite of build files; debug issue with linebreaks in normalized text.
Total: 7.50 hours
Tuesday 2021-01-12
0.25 hours: Update Jenkins servers.
1.75 hours: Mariage: More progress on static site work, including taxonomies and other enhancements.
0.25 hours: Keats: fix for bad filename, re-encoding of poem to fix lineation issue.
0.50 hours: MoEML: Tweak the handling of mol:bibl links to allow for fragment hashes, and add documentation for it.
1.50 hours: HCMC Staff meeting.
2.00 hours: LEMDO: Project meeting; raising new tickets; fixes for bibliography encoding issues; other bugfixes.
2.00 hours: Wendat: Project meeting, prep and follow-up.
Total: 8.25 hours
Wednesday 2021-01-13
0.25 hours: Update Jenkins servers.
3.00 hours: Mariage: Build and release new version 6.4b; on upload to server it stopped responding twice and had to be restarted manually, but eventually succeeded. Create schema build process with taxonomy incorporation, currently for catRefs but eventually for other things; add reprocessing of output documentation as in other projects, and test. Write identity transform to add appropriate catRefs based on classCode content, run it, validate results, and add Schematron to enforce the presence of at least one catRef.
0.50 hours: MoEML: Fix CSS issue for semi-diplomatic texts.
2.00 hours: Wendat: Create 8 slides for our ICLDC presentation, with notes and graphics. Examine and test cognate material collection to see how searchable we might be able to make it.
1.75 hours: LEMDO: Fixed a bug in the anthology diagnostics where it was failing to find texts; Fixed several bugs in the getSiteFromJenkins script, caused (I think) by a change in Jenkins behaviour, and made the script more complete and robust. Tweaked XSLT to fix bad document type identification, and added mitigation for ridiculously long xml:ids which cause layout issues in the A-Z Index page.
Total: 7.50 hours
Thursday 2021-01-14
0.50 hours: Update Jenkins servers. Kernel updates took a while, and discrepancies between the two servers needed investigating.
0.25 hours: LEMDO: Further fix to A-Z Index page long-id problem.
6.25 hours: Mariage: Got the bulk of the work done to complete the staticization, with staticSearch now implemented, metadata for filters appropriately assigned, and various additional checks and filters done to exclude some documents from indexing.
0.50 hours: Endings/staticSearch: Arising out of the Mariage work, raised a number of new issues relating to captioning and sorting on non-English pages. Solved one of them (a nasty JS bug), and started work on others.
Total: 7.50 hours
Friday 2021-01-15
0.25 hours: Update Jenkins servers.
0.25 hours: Endings/staticSearch: Add French captions from CC into dev branch, and pull into other feature branches.
1.50 hours: TEI: With SB, continue work on Ruby in TEI, now continuing in a GitHub repo as well as on the ticket.
1.00 hours: Scancan: Enter author's corrections for review, trawling CMOS for guidance; fix a bug in processing endnote numbers; push new version of site to server.
4.50 hours: Mariage: Add document images and test; add fragment images and test; add override CSS to customize the search page; fix revealed errors in original encoding; fix bugs in yesterday's XSLT work; refine staticSearch indexing config; reorganize the repo a little bit.
Total: 7.50 hours
Martin Holmes and Joey Takeda have released the second production version of their staticSearch tool, one of the key products from Project Endings. This version adds several new features and improvements. See the release notes and full documentation.
Posting hours from isolation log 2021-01-04 to 2021-01-08.
Just a note to self for future reference: One of our source documents came in the form of full-page-spread images, and rather than manually crop them (there are 370), I wrote a bit of JS to generate a bash script which uses ImageMagick to do it. First I renamed all the input files so that they were -0001, -0003, -0005 etc. to allow room for the new ones, then used this:
// Emit two ImageMagick crop commands for one spread: left page, then right page.
function writeCropLines(num){
  let padNum = num.toString().padStart(4, '0');
  let output = 'convert mgchau-' + padNum + '.jpg -crop ';
  output += '1377x1936+0+0 out/mgchau-' + padNum + '.jpg ';
  output += '\u000a';
  // The right-hand page takes the next (even) number.
  let padNum2 = (num + 1).toString().padStart(4, '0');
  output += 'convert mgchau-' + padNum + '.jpg -crop ';
  output += '1497x1936+1096+0 out/mgchau-' + padNum2 + '.jpg ';
  output += '\u000a';
  return output;
}

// Spreads are numbered 1, 3, 5 ... 739, giving 740 output pages.
function doAll(){
  let output = '';
  for (let i = 1; i < 740; i += 2){
    output += writeCropLines(i);
  }
  return output;
}

console.log(doAll());
in the browser dev tools to generate the script.
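The renaming step mentioned above could be generated in the same way. This is only a sketch, assuming the spreads were originally numbered sequentially from mgchau-0001.jpg to mgchau-0370.jpg (the original naming isn't recorded in this note); `writeRenameLines` is a hypothetical helper, not the code actually used.

```javascript
// Hypothetical helper: builds 'mv' commands that renumber spread n to 2n-1,
// leaving the even numbers free for the right-hand pages.
// Assumes inputs named mgchau-0001.jpg ... mgchau-NNNN.jpg (an assumption).
function writeRenameLines(count){
  const pad = (n) => n.toString().padStart(4, '0');
  let output = '';
  // Work downwards so no file is overwritten before it has been moved;
  // spread 1 keeps its name, so the loop stops at 2.
  for (let n = count; n > 1; n--){
    output += 'mv mgchau-' + pad(n) + '.jpg mgchau-' + pad(2 * n - 1) + '.jpg\n';
  }
  return output;
}

console.log(writeRenameLines(370));
```

Running the renames from the highest number down matters here: going upwards, moving 0002 to 0003 would clobber the original 0003 before it had been moved.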
As of today, the Scandinavian-Canadian Studies Journal, and the Robert Graves Diary project, both maintained by HCMC, have migrated to pure static websites built according to Endings principles and using the staticSearch engine.
Isolation log week 42
Work done from home and in the office 2021-01-04 to 2021-01-08.
Total hours: 37.00
G&T hours: +2.00
Monday 2021-01-04
0.25 hours: Update Jenkins servers.
0.75 hours: Maint: Investigation/testing of dart-sass install to replace our current sassc, which is now deprecated; after next reboot of Jenkins I'll switch MyNDIR and LEMDO over to that, and we'll see how well it works.
6.00 hours: LEMDO: Complete rewrite of document type taxonomy, update of all XML documents and schemas, and fixes to processing required as a result. This will leave us in a better position to provide good search filter options. Lots of cleanup was needed, and weaknesses in our XSLT are evident.
Total: 7.00 hours
Tuesday 2021-01-05
0.25 hours: Update Jenkins servers.
0.50 hours: LEMDO: Figure out and fix overnight build break.
0.50 hours: Maint: Update work desktop, lab machines, and HCMC laptop.
1.50 hours: HCMC staff meeting.
0.50 hours: MoEML: Discuss solution to encoding five-period figure; implement schema and schematron changes.
3.00 hours: Scancan: Static site launched by LW; rewrote the build process to remove all eXist content, fixed a couple of bugs, updated some info pages, moved obsolete files, added publication tasks, tested, and published latest build.
0.25 hours: Endings: Added Winter 2020 update to site.
0.25 hours: MoEML: Find and fix reported bug in diagnostic.
1.00 hours: Wendat: Process MS 60 into facsimile document ready for encoding; examine McGill-Chaumonot page-images and try various approaches to splitting the spreads into individual pages. Will need to be scripted using ImageMagick, I think.
Total: 7.25 hours
Wednesday 2021-01-06
0.25 hours: Update Jenkins servers.
1.75 hours: Wendat: Split and remediate page-images to create 740 individual page-images for McGill-Chaumonot MS; create TEI file for the MS, add new doctype category for grammars, and update stats.
3.00 hours: LEMDO: Fix menus and listings pages broken by changes to taxonomies; fix bugs in rendering linebreaks; work out and test encoding of verse with rhyme; remediate FV_Q1 to fix hundreds of bad entity references.
1.25 hours: Endings: Project meeting and tweaks on bulletin board.
0.75 hours: Endings/staticSearch: Added new What's New section to documentation, and did other prep for the upcoming release.
0.50 hours: LEMDO: Track down and fix spacing issue in QME toolbox.
Total: 7.50 hours
Thursday 2021-01-07
0.25 hours: Update Jenkins servers.
2.50 hours: Maint: With GN, testing of Tomcat 9, eXist apps, and different JDKs to determine what we need on the replacement for Peach. Long discussions of staticization strategies and search functionality for CGWP, Francotoile, VIHistory and other sites.
2.50 hours: LEMDO: Working on overnight build break, discussions of tracking for document history versus document status and Schematron, fixes to many documents and to Schematron rules.
1.00 hours: MoEML: Project meeting and
1.00 hours: Endings: Update to website to add Symposium page; started remediating the horrible old HTML to create something processable, so I can more easily maintain the site with a build process.
Total: 7.25 hours
Friday 2021-01-08
0.25 hours: Update Jenkins servers.
0.75 hours: LEMDO: New taxonomy for document history.
1.50 hours: TEI: Weekly meeting with SB: working on the Japanese ruby proposal.
1.00 hours: MoEML: Fix for diagnostics and other tweaks to complete all my outstanding tickets.
2.25 hours: LEMDO: Write basic documentation for new taxonomies; update conversion processing chains to add in new categories when generating documents; addition of new categories to existing documents.
2.25 hours: Endings: Plans for tech paper; created build process for site, removing piles of unwanted JS and abstracting core components into XSLT variables; added new pages and tweaked existing pages; got everything to validate; published latest version.
Total: 8.00 hours