Drafted another section, revised some other bits and added supporting data.
Drafted the section on the Chicago surname lists, which involved doing some XSLT and other calculation to get solid numbers on names missing from the lists, etc.
We've been trying to figure out the best way to parse the Zotero library to find out what we have where, and it's quite a puzzle; the search function gives you results, but doesn't seem to let you find out where those results are in the tree, and the storage system is such that linked files (such as our crucial PDFs of documents) are in a centralized storage system addressed by pointers. Exporting in Zotero RDF or in TEI provides two different views of the data which are useful in different ways, but I'm realizing that if we need to do a lot of this, a custom Zotero exporter would be what we really need.
We have, inevitably, the issue that folks uploading to the NFS filespace end up creating folders and files owned by and grouped to themselves, instead of grouped to the loi group. WinSCP allows users to change group ownership, but for those on Macs it's not so simple, since FileZilla doesn't. Simple script to put in the root, for users to run:
# Change group of all stuff owned by current user to loi.
find . -user $USER -exec chgrp loi {} \;
# chmod directories to rwxrwx---, and set the setgid bit
find . -type d -user $USER -exec chmod 2770 {} \;
# chmod all files to rwxrwx---
find . -type f -user $USER -exec chmod 0770 {} \;
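A possible companion check, sketched here as an assumption rather than something already in use: before or after running the script, a user could list whatever still has the wrong group. The function names list_wrong_group and list_mine_wrong_group are hypothetical; they rely only on the standard find tests -user and -group, and change nothing on disk.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original script): read-only checks.

# List everything under directory $1 whose group is not $2.
list_wrong_group() {
    find "$1" ! -group "$2" -print
}

# List only the entries owned by the current user that are not in group $2,
# i.e. what the chgrp step of the script would still need to change.
list_mine_wrong_group() {
    find "$1" -user "$(id -u)" ! -group "$2" -print
}

# Example use from the root of the filespace (assumes the loi group exists):
#   list_mine_wrong_group . loi
```

If the second function prints nothing, everything the current user owns is already grouped to loi.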
Spent a few hours with AM comparing notes on the lists of datasets provided by the NNM for potential further treatment (likely transcription and markup by us, or recommendation for treatment by another appropriate cluster: Oral history, government records, legal records).
Created datasetPlans document to summarize datasets of potential interest: their current status, the scale of each dataset to treat, and the likely cluster to do the work. Incorporated information from List of Lists, Community Lists, Kagetsu file-by-file spreadsheets. Should be completed in another hour or two; will then discuss with AM and then contact colleagues in this and other clusters.
Also need to ask JSR whether the focus on dispossession extends to post-war protest or claims efforts (e.g. work done by various committees, the Bird Commission) dealing specifically with property.
Query to JSR about one issue; first draft of the text is coming along.
Drafted another section, on phonology and romanization, and did a bit of research arising out of it.
Spent most of the day reviewing report on about 90 datasets (varying from a single document to 17 boxes per dataset) supplied by Lisa U at NNM. Some datasets they've imaged, some not. For some datasets they have records at "item" level, for others at "file" level (less granular). I added an initial judgement of Yes, Maybe, or No for transcription and markup (and possibly translation), with reasons why, mainly looking for documents that provide details on individual cases to complement the directory-based stuff we have so far.
Will meet with AM on Tuesday to compare notes on this and start planning work for upcoming year.
Drafted sections on O'Neill, Jisho, and ENAMDICT. Also found and fixed some spacing issues in the Haney Nokai document (trailing spaces in e.g. foreign tags).
Research on previous surname classification work is coming along.