How to Help
This page provides detailed information on how the reborn site at UVic actually works, and how you can help improve and maintain it, either as a scholar or as a programmer.
- How to report problems
- How the build process works
- How to build your own copy of the site
- How to contribute as a programmer
How to report problems
If you are a scholar or a researcher working with the dataset, you may notice problems or omissions in the data that could be corrected. You can report these problems to us through the GitHub repository Issues system:
https://github.com/UVicHCMC/BBTI/issues
You will need to register as a user on GitHub if you don't yet have a login; this process is free and straightforward.
To raise an issue, click on the New issue button, and just fill in the form with a heading and a detailed description of the problem. Please provide as much information as you can, along with sources if applicable. If the problem you are reporting is an obvious typographical error or spelling mistake, you can expect someone to act on it fairly quickly; if it's a substantive change to the data, then we will obviously wait for scholars from the community to weigh in, and act only when there is a consensus on what to do. The discussion related to an issue will take place in the form of comments on the issue ticket itself, so you will be able to track and participate in the discussion.
If you are not able to work in GitHub, you may also send information to Janelle Jenstad. Put the phrase "BBTI Correction" in the subject line of your email. In the body of your email, provide a heading for the problem and give a detailed description. Janelle will transfer this information to a GitHub issue ticket and send you a link to the ticket. (Note that reporting errors to Janelle is a short-term fix. We will need a community member to take responsibility for creating tickets on behalf of others.)
When it comes to releasing new versions of the site publicly, we follow Endings Principles, trying to ensure that each edition of the site is clearly labelled and dated, and is coherent, consistent, and complete. That means that any changes made will not show up on the public site until the next planned release.
How the build process works
The GitHub repository contains both data and programming code. The data is in two formats:
- TSV (simple spreadsheet) files, in the sources folder, derived from the original database dump provided by the Bodleian Libraries through the Oxford University Research Archive (OURA). At the time of writing (December 24, 2024), these files are still the canonical source data; in other words, we are making corrections in these files, and then generating the TEI XML from them.
- The TEI P5 XML files in the tei folder, from which the website itself is generated. TEI P5 is a worldwide standard for encoding humanities datasets which has been in use for over 35 years (see the online TEI Guidelines). Eventually, we hope to leave the TSV files behind, since, as a data format, spreadsheets are rather limited and difficult to constrain and process. It would be better to make the TEI files the canonical dataset, but this decision can only be taken with the collaboration and approval of the community.
Corrections, therefore, are made initially to the TSV data. The TEI files are then regenerated from the TSV files (using the programs Ant and Saxon, and the language XSLT), and all the resulting changes are committed to the repository.
To build the site, we also use Ant, Saxon, and XSLT, this time to convert the TEI files to HTML web pages. We also copy over some other resources such as CSS and JavaScript files to support the site functionality.
The result is a fully Endings-compliant static website, which does not require any back-end services other than a web server to function. Endings-compliance makes the site relatively immune from hackers, since nothing is ever sent to the server except for requests for files. Endings-compliance also future-proofs the site. HTML, CSS, and JavaScript have been working for decades; just as simple static websites from the mid-1990s still function without any issues, we believe that our static sites will be functioning without maintenance decades from now.
How to build your own copy of the site
Anyone can build a copy of the site at any time, and host it on their own server if they wish. As the LOCKSS Program says, Lots of Copies Keeps Stuff Safe, although we like to say that Lots of STATIC Copies Keeps Stuff Safer (LOSCKSS). To run a build, this is what you will need:
- A computer running a recent version of Linux or Mac OS. (Sorry, we don't have resources or expertise to support Windows.)
- Java. Install this in the most appropriate way for your operating system.
- Git (on Linux, install from your package manager; on Mac OS, use Homebrew to install it).
- Apache Ant (again, use your package manager or Homebrew to install).
- The ant-contrib package (ditto).
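As a quick sanity check before cloning, a short shell function along the following lines can report which of the command-line tools listed above are on your PATH. This is only a sketch: check_tools is a hypothetical helper name (it is not part of the repository), and it does not check for ant-contrib, which is a library rather than a command.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named tools are on the PATH.
# Prints one line per tool and returns 1 if any tool is missing, 0 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the three command-line prerequisites named in the list above.
check_tools java git ant || echo "Install the missing tools before running the build."
```

If any tool is reported as missing, install it through your package manager or Homebrew and run the check again.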
Next, clone the GitHub repository if you haven't yet done so, change into its folder, and run the build:
git clone git@github.com:UVicHCMC/BBTI.git
cd BBTI
ant
The build process may take anywhere from fifteen minutes to an hour, depending on the configuration of your computer. At the end, you should have a folder called site which contains around 3GB of content; this is the website, and if you copy the contents of that folder to any web server, the site should work.
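Once the build finishes, one way to confirm that the output looks plausible is to check the size of the site folder and count the pages in it. The sketch below assumes that the generated pages use the .html extension; summarize_site is a hypothetical helper name, not something provided by the repository.

```shell
#!/bin/sh
# Hypothetical helper: print the total size of a build output folder and
# the number of files in it ending in .html (an assumed extension).
# Returns 1 if the folder does not exist.
summarize_site() {
    dir="${1:-site}"
    if [ ! -d "$dir" ]; then
        echo "No folder named $dir; has the build finished?" >&2
        return 1
    fi
    du -sh "$dir"
    echo "HTML pages: $(find "$dir" -type f -name '*.html' | wc -l | tr -d ' ')"
}

# Usage, from the repository root after the build has finished:
#   summarize_site site
```

A total size far below the expected 3GB, or a page count of zero, would suggest that the build did not run to completion.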
How to contribute as a programmer
If you have the skills and interest and would like to contribute programming code, please do so in the form of pull requests on GitHub. Also, please note that you should first familiarize yourself with our working methods and principles. Check out these resources:
- The Endings Principles for Digital Longevity
- The TEI Guidelines
- Project Resiliency (a special issue of Digital Humanities Quarterly), and in particular From Tamagotchis to Pet Rocks: On Learning to Love Simplicity through the Endings Principles
We limit our dependencies quite severely; Ant, ant-contrib, Saxon, Jing, XSLT, HTML, JavaScript, and CSS should be all we need for this project, so please do not make PRs which bring in other dependencies: no Python, no Ruby, no React, no NPM, no PHP, no TypeScript... you get the picture.