Working on my presentation materials for the BC Studies conference, and using the opportunity to look more closely at one or two specific addresses in the directories. We've learned that the 1941 phone directory covers only telephone subscribers (no one without a phone appears), while the 1941 zai-kanada directory looks as though it should contain everyone. However, the 1941 city directory lists a lot of Japanese people living in rooms (e.g. at 510 Alexander) who never appear in the zai-kanada, so there's clearly some additional process of selectivity going on. In any case, I've put together the first half of the presentation, mainly illustrated with images, and will finish up tomorrow. Some of the same materials can be repurposed for the Ottawa presentation.
Went through the spreadsheet updating info on Japanese directories, checking against UBC and Nikkei holdings, and found some more puzzling issues with dating. Wrote to the team to suggest we make some decisions on what to encode after examining the documents. Right now we only have material from a single year.
For any row whose previous title was held by the custodian, added a block of new data giving the info for the title preceding that one; that makes it easier for the stats processing to calculate the sale price while excluding the custodian transaction.
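The idea above can be sketched roughly as follows; this is a minimal illustration of the approach, and all field names (`holder`, `sale_price`, `pre_custodian_*`) are hypothetical, not the project's actual column names.

```python
# Hypothetical sketch: for each row whose previous title was held by the
# Custodian, attach the details of the title before that one, so that
# sale-price statistics can skip the Custodian transaction entirely.

def add_pre_custodian_info(rows):
    """rows is a title chain, oldest first; returns annotated copies."""
    out = []
    for i, row in enumerate(rows):
        row = dict(row)  # don't mutate the input
        prev = rows[i - 1] if i > 0 else None
        if prev and prev["holder"] == "Custodian" and i > 1:
            pre = rows[i - 2]  # the title preceding the Custodian's
            row["pre_custodian_holder"] = pre["holder"]
            row["pre_custodian_price"] = pre["sale_price"]
        out.append(row)
    return out

chain = [
    {"holder": "Tanaka", "sale_price": 1200},
    {"holder": "Custodian", "sale_price": 0},
    {"holder": "Smith", "sale_price": 900},
]
annotated = add_pre_custodian_info(chain)
print(annotated[2]["pre_custodian_price"])  # 1200
```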
Testing of the Mac version of the jpeg-to-pdf app showed that it is now broken, presumably because of some change to OS X since last May. After much pain and suffering, involving installing a fresh version of Platypus and doing a lot of testing, we have ended up with a new version of the .app file which works. Lessons learned:
- Platypus settings should specify bash as the script interpreter, but check those settings manually; when I just accepted the default, the result was an app which tried to execute bash from inside bash (I think).
- Do the build on a Mac (obviously), and make sure the result is committed to svn from a Mac, otherwise stuff gets broken (the .app is actually a folder, and is treated like one by every other OS).
- Also commit a zipped version of the .app to svn, just to be safe.
- Use the Cactus DMG distribution of ImageMagick rather than trying to install it from MacPorts or using binaries.
Began work on preparing for the presentation in May, and started by gathering some screenshots and making a rough plan. In the process, we began to examine the page-images for the community directories, and I think (based on my limited Japanese) that I've discovered a discrepancy; a directory we believed (based on UBC metadata) to be 1939 actually looks to me as though it's dated 1941 (Showa 16). Waiting for some expert help to confirm or dispel this suspicion.
Titles with no dates at all were causing problems, so the CSV generation now excludes them by default, per JSR.
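A minimal sketch of that default exclusion, assuming hypothetical column names (`title_id`, `date`, `holder`) rather than the project's real schema:

```python
# Hypothetical sketch: drop any title row with no date at all before
# writing the CSV, unless the caller explicitly asks to keep them.
import csv
import io

def write_titles_csv(titles, out, include_undated=False):
    """Write title rows to the file-like `out`, skipping undated titles."""
    writer = csv.DictWriter(out, fieldnames=["title_id", "date", "holder"])
    writer.writeheader()
    for t in titles:
        if not t.get("date") and not include_undated:
            continue  # undated titles distort the statistics
        writer.writerow(t)

buf = io.StringIO()
write_titles_csv(
    [{"title_id": "T1", "date": "1941-05-02", "holder": "Tanaka"},
     {"title_id": "T2", "date": "", "holder": "Unknown"}],
    buf,
)
print(buf.getvalue())
```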
One typo fixed, and some new bits and pieces added; other things discussed, explained and debated. Organizing by property is confusing for multiple-property titles; organizing by title is unworkable because propsets change along title chains. What to do?
There was a bug in last week's spreadsheet, and in the process of figuring it out I found some institutions not tagged as such in the db; I also incorporated some requested changes to parts of the spreadsheet. We also had a Skype call on the AtoM issue.
Had some back and forth with HR, who is building the AtoM box; provided some test data from UT, which imported successfully (except for one error in the data), and he's now working on switching from nginx to Apache. Should be ready in a week.
Skype meeting with two folks from SFU for consultation on metadata and AtoM; they confirm we're on the right track with the RAD CSV approach, and will help us to finalize our fields since they're working on similar processes right now.
Ethnicity issues have finally been sorted out, and eight new constructed variables are needed for the SPSS CSV output; I'm still talking with JS-R about exactly how they should work, and about changes to the way sixteen others are calculated.
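For the record, a constructed variable here is just a column computed from existing fields at export time. The sketch below illustrates the general pattern only; the variable names (`sold_by_custodian`, `price_per_acre`) and source fields are invented for illustration, since the real definitions are still being settled with JS-R.

```python
# Hypothetical sketch of constructed variables: derived columns computed
# from source fields when the SPSS CSV is generated.
def construct_variables(row):
    out = dict(row)
    # a derived flag combining two source fields
    out["sold_by_custodian"] = int(row["seller"] == "Custodian")
    # a derived ratio, guarding against division by zero
    out["price_per_acre"] = (row["sale_price"] / row["acres"]
                             if row["acres"] else None)
    return out

r = construct_variables({"seller": "Custodian", "sale_price": 5000, "acres": 10})
print(r["sold_by_custodian"], r["price_per_acre"])  # 1 500.0
```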
All generation is completed; now we're working on provisional Japanese names, and figuring out some new constructed variables for the spreadsheet for SPSS.