Fren : Vesalius Histories of Medicine website
HeleneC of French asked to meet to discuss a website which could be used to demonstrate the "reception and repercussion" of Vesalius' famous anatomical drawings over time and space. Here are my notes from our meeting:
A number of different accounts of Vesalius and his work developed over time. You're primarily interested in tracking the growth, dispersion and distribution of those accounts as indicated by features in the language of numerous secondary documents.
Each secondary document will include:
- a number of metadata fields about the document (author attributes, publisher, date, location etc.)
- a marked-up transcript of the document
- an image of the original.
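To make the three-part record concrete, here is a minimal sketch of how one secondary document might be represented. The class name, the field names and the sample values are all placeholders I made up for illustration; the real fields are for Helene and Erin to decide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SecondaryDocument:
    """One secondary document: metadata, marked-up transcript, image of the original."""
    doc_id: str                                    # hypothetical identifier
    metadata: dict = field(default_factory=dict)   # author attributes, publisher, date, location etc.
    transcript_xml: str = ""                       # marked-up transcript (assumed here to be XML)
    image_path: Optional[str] = None               # path or URL of the scanned original


# Invented placeholder values, not taken from any real document.
doc = SecondaryDocument(
    doc_id="example-001",
    metadata={"author": "Author A", "publisher": "Publisher A",
              "year": 1600, "location": "Place A"},
    transcript_xml="<transcript>placeholder text</transcript>",
    image_path="scans/example-001.png",
)
```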
Helene and Erin need to look at a sample of documents and decide which metadata fields to include and, for each field, the range of acceptable values (e.g. is it open text such as the name of the author, a restricted set of values such as religious affiliation, a yes/no value, a number, or a date).
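One possible way to write those decisions down so the database can enforce them is sketched below. The field names and the allowed values are hypothetical examples of each kind of field, not a proposal for the actual list.

```python
from datetime import date

# Hypothetical field definitions: field name -> (kind of value, allowed values if restricted)
FIELD_SPECS = {
    "author_name": ("text", None),                      # open text
    "religious_affiliation": ("choice",                 # restricted set of values
                              {"protestant", "catholic", "other", "unknown"}),
    "illustrated": ("boolean", None),                   # yes/no
    "page_count": ("number", None),                     # number
    "publication_date": ("date", None),                 # date
}


def validate(field_name, value):
    """Return True if value is acceptable for the named metadata field."""
    kind, allowed = FIELD_SPECS[field_name]
    if kind == "text":
        return isinstance(value, str) and value.strip() != ""
    if kind == "choice":
        return value in allowed
    if kind == "boolean":
        return isinstance(value, bool)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "date":
        return isinstance(value, date)
    raise ValueError(f"unknown field kind: {kind}")
```

For example, `validate("religious_affiliation", "catholic")` is accepted, while a value outside the agreed set is rejected.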
Helene and Erin also need to look at the text of each of the sample documents and decide which features you wish to identify explicitly (i.e. mark up) in the transcript, along with guidelines on what counts as an instance of each feature. Those decisions depend on what kinds of patterns you hope to find (or disprove) in the texts. It would be a good idea to make a short list of the kinds of questions you'd like to be able to ask the database and what kind of response you'd like back from it.
We talked about marking up words that have non-modern spellings so that you could include both the "as-found" spelling and the normalized form of the word. Another, slightly more abstract example: say you think Protestants and Catholics use theological words in different ways, whether or not they agree on the conclusions. You might have a "theological_word" feature to pick out any word or phrase that carries particular theological weight. Many features may not need anything more elaborate in the markup, but for some you may want to include an attribute. For example, each instance of a "theological_word" might include the attribute "whose_theology", which could take one of the values "protestant", "catholic", "protestant and catholic" or "other".
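As a rough illustration of what such markup could look like and how it could be read back, here is a sketch that assumes the transcript is stored as XML. The "orig_spelling" element and its "normalized" attribute are names I invented for the spelling example, and the sample sentence is made up rather than taken from a real document.

```python
import xml.etree.ElementTree as ET

# Invented sample transcript; element and attribute names are assumptions.
SAMPLE = (
    '<transcript>'
    'The <orig_spelling normalized="anatomy">anatomie</orig_spelling> of the body '
    'shows the work of '
    '<theological_word whose_theology="protestant and catholic">providence</theological_word>'
    ' and of '
    '<theological_word whose_theology="catholic">grace</theological_word>.'
    '</transcript>'
)

root = ET.fromstring(SAMPLE)

# Pairs of (as-found spelling, normalized form)
spellings = [(el.text, el.get("normalized")) for el in root.iter("orig_spelling")]

# Pairs of (marked word, value of the whose_theology attribute)
theological = [(el.text, el.get("whose_theology")) for el in root.iter("theological_word")]

print(spellings)
print(theological)
```

Keeping the as-found spelling as the element text and the normalized form as an attribute means the transcript still reads as the original when the markup is stripped away.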
We also discussed that some features you care about are more easily anchored to specific words in the document, while others may be better anchored to the entire document. For example, a "theological_argument" feature might be better thought of as applying to the entire document rather than to the presence of a few specific words in it.
We'd like to be able to do queries of various forms. In one type, we'd filter on values in metadata fields (for example identify a date and then compare the frequency or pattern of "theological_word" instances in documents written before or after that date).
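A minimal sketch of this first type of query, assuming each document carries a "year" metadata field and an XML transcript. The three tiny documents are invented placeholders, and "rate" here simply means marked instances divided by the number of words, which is only one of several measures you might prefer.

```python
import xml.etree.ElementTree as ET

# Invented placeholder documents: a year plus a very short marked-up transcript.
DOCS = [
    {"year": 1580, "xml": "<t>a <theological_word>grace</theological_word> b c</t>"},
    {"year": 1590, "xml": "<t><theological_word>sin</theological_word> x</t>"},
    {"year": 1620, "xml": "<t>w x y z</t>"},
]


def feature_rate(xml_text, tag):
    """Marked instances of `tag` per word of transcript text."""
    root = ET.fromstring(xml_text)
    n_instances = sum(1 for _ in root.iter(tag))
    n_words = len("".join(root.itertext()).split())
    return n_instances / n_words if n_words else 0.0


def mean_rate_before_after(docs, cutoff_year, tag):
    """Average feature rate for documents dated before the cutoff and from the cutoff on."""
    before = [feature_rate(d["xml"], tag) for d in docs if d["year"] < cutoff_year]
    after = [feature_rate(d["xml"], tag) for d in docs if d["year"] >= cutoff_year]

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(before), mean(after)


before, after = mean_rate_before_after(DOCS, 1600, "theological_word")
print(before, after)
```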
In another type, we'd filter on the markup in the document (for example, rank the documents by density of "theological_word" instances and include in the report the author, publisher, location and date for each document).
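The second type of query could be sketched along these lines. Again the documents and their metadata are invented placeholders, and "density" is taken here to mean marked instances per 100 words, which is an assumption on my part.

```python
import xml.etree.ElementTree as ET

# Invented placeholder documents with the metadata fields named in the report.
CORPUS = [
    {"author": "Author A", "publisher": "Publisher A", "location": "Place A", "date": 1580,
     "xml": "<t>one <theological_word>grace</theological_word> three four</t>"},
    {"author": "Author B", "publisher": "Publisher B", "location": "Place B", "date": 1610,
     "xml": "<t><theological_word>sin</theological_word> "
            "<theological_word>faith</theological_word></t>"},
    {"author": "Author C", "publisher": "Publisher C", "location": "Place C", "date": 1640,
     "xml": "<t>one two three four five</t>"},
]


def density_per_100_words(xml_text, tag):
    """Marked instances of `tag` per 100 words of transcript text."""
    root = ET.fromstring(xml_text)
    n_instances = sum(1 for _ in root.iter(tag))
    n_words = len("".join(root.itertext()).split())
    return 100 * n_instances / n_words if n_words else 0.0


def rank_by_density(docs, tag):
    """Return one report row per document, densest first."""
    rows = [
        {"author": d["author"], "publisher": d["publisher"],
         "location": d["location"], "date": d["date"],
         "density": density_per_100_words(d["xml"], tag)}
        for d in docs
    ]
    return sorted(rows, key=lambda row: row["density"], reverse=True)


report = rank_by_density(CORPUS, "theological_word")
for row in report:
    print(row)
```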
If you have the hypothesis that there is a "northern version" and a "southern version" of accounts of Vesalius, you'd presumably want to include features in the text, and possibly in the metadata, whose presence or values would tend to provide evidence for inclusion in one or the other category. That is, we should be able to run queries on the database that distinguish members of those two categories based on counts or patterns of instances of unambiguous (and non-question-begging) features.
The most important things for you two to be thinking about are the kinds of questions you'd like to be able to ask of the data set and the kinds of answers you'd like to get back. I can help with the translation of those questions into forms that are computationally viable.