HCMC Journal: Past Wrongs, Future Choices Programmer

Past Wrongs, Future Choices Programmer

12 September 2022: Stewart Arneil
Minutes: 60

Zoom meeting 2022-09-08 on budget, timelines

Jordan has about $73,000 for a programmer to work on “infrastructure” for Past Wrongs, Future Choices. That amounts to roughly 1800 hours (one year) at $40.00 per hour, though benefits or a percentage in lieu have not been taken into account.
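To see how much an in-lieu percentage would cut into the hours, here is a rough sketch of the arithmetic. The function name and the 10% figure are placeholders of mine, not numbers from the meeting; the actual percentage still has to be confirmed.

```python
from decimal import Decimal

def affordable_hours(budget, hourly_rate, in_lieu_pct=Decimal("0")):
    """Hours the budget covers when a percentage in lieu of benefits
    is paid on top of the hourly wage."""
    loaded_rate = hourly_rate * (1 + in_lieu_pct / 100)
    return budget / loaded_rate

budget = Decimal("73000")   # "about $73,000" from the meeting
rate = Decimal("40.00")     # hourly rate from the meeting

# No in-lieu percentage: close to the 1800-hour estimate.
print(round(affordable_hours(budget, rate)))                 # 1825

# Hypothetical 10% in lieu (assumed figure, not confirmed).
print(round(affordable_hours(budget, rate, Decimal("10"))))  # 1659
```

At the assumed 10%, the same budget buys about 165 fewer hours, so the percentage matters for the timeline.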

There will be a series of seven international spotlights added to the existing data structure. Each comprises approximately 120 digitized items (photos, letters, reports, possibly oral histories). They may be presented to the user as 7 additional collections or as one additional collection with 7 items. For these spotlight collections, data and interface will be in English, Portuguese and Japanese.

Jordan wants the necessary infrastructure changes and 2 spotlights done by September 2023, another 2 or 3 done by September 2024, and the final 3 or 2 by September 2025.

Need to find and hire a contract programmer.

Specs from CFI-JELF proposal

The database will also serve as the foundation for the creation of a web-based application to support discovery by researchers, members of the public around the world, and Nikkei community members, joining a suite of digital archival outputs at UVic in this topic area.

The web-based outputs entail the production of searchable indexes and multilingual web pages, maps and spreadsheets created by a build process from intermediary eXtensible Markup Language (XML) data files. The value of the database for research hinges significantly on its ability to facilitate integrative analysis of complex archival materials that have, in past research, been isolated from one another as paper records in archival repositories across the world. The XML markup and searchability of the database provide the infrastructural integration that underlies integrative analysis.
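As a note to self on what "multilingual web pages built from XML data files" could look like in miniature, here is a sketch. It is in Python only for illustration; the proposal specifies XSLT run from a build process. The element names (item, title), the file-naming pattern and the sample record are all my assumptions, not the project's schema.

```python
import xml.etree.ElementTree as ET
from html import escape

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record; the real schema is not defined yet.
SAMPLE = """<item id="item001">
  <title xml:lang="en">Letter</title>
  <title xml:lang="pt">Carta</title>
  <title xml:lang="ja">手紙</title>
</item>"""

def build_pages(xml_text, languages=("en", "pt", "ja")):
    """Return {filename: html}, one static page per language present."""
    item = ET.fromstring(xml_text)
    titles = {t.get(XML_LANG): (t.text or "") for t in item.findall("title")}
    pages = {}
    for lang in languages:
        title = titles.get(lang)
        if title is None:
            continue  # no page for a language with no data
        pages[f"{item.get('id')}.{lang}.html"] = (
            f'<!DOCTYPE html><html lang="{lang}"><head><meta charset="utf-8">'
            f"<title>{escape(title)}</title></head>"
            f"<body><h1>{escape(title)}</h1></body></html>"
        )
    return pages

for name in build_pages(SAMPLE):
    print(name)
```

Each output is a plain file that can be copied to any web server, which is the point of generating everything at build time.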

In addition to holding the seven digital collections on Nikkei internments, the database will offer four unique features designed to meet the needs of users:

Multilingual capacity: The database will offer multilingual (English, Portuguese, and Japanese) data, database tools and user interfaces to users.
Geographic Referencing Features: The database will report on specific geo-referencing features that will enable researchers to generate spatial representations (for example to see the distribution of items that are about women) to visually demonstrate the regional context of experiences across the study areas, and to locate features mentioned in the texts. The objective is to integrate these capabilities with the browse and text-based query and report capabilities of the site in a manner that will long outlive the active development phase of the project. To do this, we need to build appropriate data structures, write the code processing those structures and prepare and populate a map-tile server provisioned for this project.
Sophisticated Searches: The database will comply with professional archival standards and enable sophisticated searches by researchers around the world. These capabilities for the end user are normally provided by a back-end database server and executable code running on the web server to communicate with the database. What distinguishes our approach is that our build process eliminates the need for the back-end database and for executable code running on the web server, thus dramatically increasing the long-term sustainability of the project at minimal cost and eliminating technical dependencies that are a common cause of failure over time. After it is completed and made available to the research team, features of this database will become part of the standard build environment used by the staff programmers at the University of Victoria and thus available to be used by other projects. The executable files and technical documentation will be openly accessible.
Diagnostic Reporting Capabilities: The database will also include new diagnostic reporting capabilities within the code to pinpoint data and processing errors, to minimize costs of debugging and correcting errors. As a result, the database will be designed to last for decades with minimal technological dependencies and technical support demands.
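The "no back-end database" approach to search described above can be pictured roughly as follows: the build precomputes an index, writes it out as a static JSON file, and a query becomes a lookup in that file. This is a minimal sketch under my own assumptions (sample texts, function names, simple word tokenization), not the project's search code. In particular, splitting on word characters does not segment Japanese, which would need separate handling.

```python
import json
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on word characters (does not segment Japanese)."""
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Build time: map each token to the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}

def search(index, query):
    """Query time: ids of documents containing every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        result &= set(index.get(token, []))
    return sorted(result)

# Hypothetical item texts for illustration.
docs = {
    "item001": "Letter from an internment camp",
    "item002": "Photo of a family before internment",
}

# The JSON string stands in for a static file published with the site.
index_json = json.dumps(build_index(docs), ensure_ascii=False)
index = json.loads(index_json)

print(search(index, "internment"))         # ['item001', 'item002']
print(search(index, "internment letter"))  # ['item001']
```

On the live site the lookup would run in the browser rather than in Python, but the division of labour is the same: all indexing cost is paid once at build time.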

UVic’s Humanities Computing and Media Centre brings critical expertise to the project and will implement the following workflow over a period of four years to develop the database.

Enter data, transfer and convert format, and merge existing data to create a collection of structured data files including metadata, transcription and markup of features of interest (XML documents validated against a schema).
Create technical diagnostic programs to ensure data files will support the features required in the output products, in particular the multilingual capabilities.
Design, develop and test a suite of programs that processes validated data files into output products on a development server and beta-test those outputs, in particular customizations to support the map-based and multilingual features (a build process, invoked every time the repository is updated, that uses Apache Ant, a Java-based build tool, to run code written in XSLT and related technologies).
Copy the output products to a publicly accessible production server and test those outputs for usability, standards compliance and long-term viability (i.e. “publication” of a digital edition on the instruction of the principal investigator).
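For the diagnostic step in the workflow above, one plausible check is that every item carries a non-empty title in each of the three languages, with failures reported by item id so they are quick to locate. The sketch below assumes a hypothetical collection/item/title structure with xml:lang attributes; the real diagnostics would more likely be written in XSLT or Schematron as part of the build.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
REQUIRED_LANGS = ("en", "pt", "ja")

def missing_translations(xml_text, element="title"):
    """Return one message per item and language lacking a non-empty element."""
    root = ET.fromstring(xml_text)
    problems = []
    for item in root.iter("item"):
        present = {
            el.get(XML_LANG)
            for el in item.findall(element)
            if (el.text or "").strip()  # whitespace-only counts as missing
        }
        for lang in REQUIRED_LANGS:
            if lang not in present:
                problems.append(f"{item.get('id')}: no <{element}> for '{lang}'")
    return problems

# Hypothetical data: item002 has a blank Portuguese title and no Japanese one.
SAMPLE = """<collection>
  <item id="item001">
    <title xml:lang="en">Letter</title>
    <title xml:lang="pt">Carta</title>
    <title xml:lang="ja">手紙</title>
  </item>
  <item id="item002">
    <title xml:lang="en">Photo</title>
    <title xml:lang="pt"> </title>
  </item>
</collection>"""

for line in missing_translations(SAMPLE):
    print(line)
```

Running a check like this before the build means a missing translation shows up as a named data problem rather than as a blank page on the site.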