CB now has a user id on the eXist db, and is a member of the editors group. In the process of testing this, we discovered that lots of files and dirs in /db/data did not have group write, or were assigned to the dba group, which meant that he couldn't overwrite them. This is fixed for most files, but we need to watch out for it. I think it may happen when I upload stuff as admin; although admin is in the editors group, it's also dba, and it may cause uploaded data to be set to group dba.
I have a task, which needs to be clarified a bit before I start work on it. We have two types of links: those which open up popup windows, and those which navigate off the page or off the site. It would help users if they could tell the difference before they click on one. A number of options are under consideration.
CB reported that on the People index page, the name Æthelred II was sorting at the end. The sort is done in XQuery, and I've fixed this issue by adding a collation parameter to the order by clause:
order by $p/persName/reg/text() collation "?lang=en&amp;strength=primary&amp;decomposition=full"
Another issue is that Disraeli and D'Israeli are out of order; the apostrophe sorts before the letters. This can't be solved with a standard collation, so I could either strip the punctuation prior to the sort:
order by $p/persName/reg/text()/replace(., "'", "") collation "?lang=en&amp;strength=primary&amp;decomposition=full"
(untested), which would probably slow the page down noticeably, or write a custom collation and move the sort into XSLT (very disruptive and also slow). If it proves important, though, I'll have to do one of these. The former, for preference, if it works.
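To make the intended ordering concrete, here is a minimal Python sketch of the "strip the punctuation before sorting" idea. It is not the site's XQuery and does not use a real collation: the ligature table, the `sort_key` function, and every sample name other than Disraeli, D'Israeli, and Æthelred II are assumptions made up for illustration.

```python
import unicodedata

# Assumed ligature expansions; a real collation handles these for us.
LIGATURES = {"Æ": "AE", "æ": "ae", "Œ": "OE", "œ": "oe"}


def sort_key(name: str) -> str:
    """Build a rough 'primary strength' key: no ligatures, accents, apostrophes, or case."""
    expanded = "".join(LIGATURES.get(ch, ch) for ch in name)
    decomposed = unicodedata.normalize("NFKD", expanded)
    kept = "".join(
        ch
        for ch in decomposed
        if not unicodedata.combining(ch) and ch not in "'\u2019"
    )
    return kept.casefold()


# Hypothetical sample data, apart from the three names discussed above.
names = [
    "Disraeli, Benjamin",
    "Zouche, Edward",
    "D'Israeli, Isaac",
    "Æthelred II",
    "Dekker, Thomas",
]
ordered = sorted(names, key=sort_key)
```

With this key, Æthelred II sorts under A, and D'Israeli sorts alongside Disraeli instead of ahead of every other D name.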
I've submitted the first set of timesheets. Takes a while to get all the info together. Watching to make sure they get processed correctly.
Links on the people list weren't working, due to a missing block of XHTML that should have been supplied by the XQuery. Due to a problem with Flow, I didn't know about this bug report until today, but it's now fixed.
Greetings all,
I am really enjoying my time with MoEML so far. While working through the BIBL1 and PERS1 files I have noticed a few things that we will need to be thinking about in the near (and distant) future. The following is a log of my tasks so far, including notes about what we may need to think about looking forward.
My first task was to delete the dates and names of contributors who have added files to BIBL1 in the past. JJ decided that there was no longer any need for this info. Working in this file, I noticed numerous formatting inconsistencies (arising from different people adding different info at different times with different MLA conventions). I look forward to amending these errors in the coming weeks.
My second task was to ensure that all links in PERS1 were updated. The ODNB had made some changes to their website and so most of our links were broken. Again, I noticed many formatting/style inconsistencies that I am eager to amend in the coming weeks.
My third task was to add the medium (i.e. Print, Web) to each BIBL1 entry. This is a newer MLA convention and has not been used consistently since the website launched. While adding these, I made some changes to the more easily-spottable inconsistencies. This got me pretty excited about a large-scale tidy-up! I am hoping to have this "spring cleaning," as I am calling it, finished by mid-June.
Since I will be spring cleaning for the next few weeks, JJ has assigned me the task of creating an updated style guide for the website. This will ensure that everyone adding information to these files continues to follow a correct and consistent format. I cannot stress enough the importance of consistent formatting for even the most trivial matters like using an en-dash instead of a hyphen between a person's life dates. If we are to continue to assert MoEML as a serious academic publication, we cannot allow formatting errors to persist. I (think I) have attached a draft of the style guide (which is also available in the svn file "documentation," in case the file addition backfired) and I would love to hear your input. It is not yet completely implemented, but please begin referring to it when inputting information. If you have any questions about formatting, please don't hesitate to contact me.
I think that's all for now. Thanks to JJ and MH for all of their guidance so far.
CB
Note to Janelle and to the RAs:
The HCMC computers do not have MS-Word. If we are editing working files, use OpenOffice. If you are on another computer and need to convert a .doc or .docx file to .odt, do NOT use MS-Word to save the file as a .odt file. The comments (where we record so much information for our encoders) will be stripped away in the file conversion.
Instead, save and close the .doc/.docx file. Then, start up OpenOffice and open the .doc/.docx file. Now you can save it as a .odt file without losing the comments.
Remember: File names must not include spaces or punctuation.
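A quick way to check a name against this rule is sketched below in Python. The exact rule is an assumption: it allows letters, digits, and underscores (as in our existing "working_files" folder) plus a single extension dot, and rejects everything else, including hyphens. The function name `is_safe_file_name` is hypothetical.

```python
import re

# Assumption: letters, digits, underscores, and one ".ext" suffix are acceptable.
SAFE_NAME = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9]+)?")


def is_safe_file_name(name: str) -> bool:
    """Return True if the name has no spaces or punctuation under the assumed rule."""
    return SAFE_NAME.fullmatch(name) is not None
```

For example, `is_safe_file_name("conversion_process_notes.odt")` passes, while `is_safe_file_name("Stow notes.docx")` fails because of the space.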
JJ requested on Flow that when you click on a person's name, their info be shown in a popup. This is now implemented. Specifically, if you use this type of reference:
<name type="person" ref="mol:HOLM3">Martin Holmes</name>
then the name will generate a popup link, but if you put this:
<ref target="mol:HOLM3">Martin Holmes</ref>
then a link to that person's page will be generated.
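The distinction can be summarized in a small Python sketch. This is only an illustration of the rule as described here, not the site's rendering code: the function name `link_kind` and its return labels are made up, and it ignores the TEI namespace that real documents carry.

```python
import xml.etree.ElementTree as ET


def link_kind(xml_fragment: str) -> str:
    """Classify a reference as a popup link, a page link, or neither."""
    el = ET.fromstring(xml_fragment)
    if el.tag == "name" and el.get("type") == "person" and el.get("ref"):
        return "popup"  # <name type="person" ref="..."> shows the person's info in a popup
    if el.tag == "ref" and el.get("target"):
        return "page"   # <ref target="..."> links to the person's page
    return "none"
```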
Worked on bibliography encoding and rendering this morning:
<title level="a">, even though in the encoding it's (correctly) placed outside the <title> tag.
<title level="a"> tags throughout the bibliography file. These will be checked by CB.
All changes have been uploaded into the db.
Martin's Comments on the simple_template in an email to Janelle on 2012-05-15:
The template looks fine. The only thing I noticed that I'd reconsider is the use of the <segmentation> element. I'm not quite sure how that got into our documents in the first place:
"<segmentation> describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc. [2.3.3 The Editorial Practices Declaration 15.3.2 Declarable Elements]"
We don't really have much to say about that; segmentation isn't a major issue for us, especially in the born-digital documents such as Location files.
I think what we should probably have is:
<encodingDesc>
<p>See
<ref target="mol:modernEncoding">modernEncoding.xml</ref> for full details of transcription and encoding used in this document.
</p>
</encodingDesc>
We do have to produce those files first, though. We should start with the original document encoding practices, which we need to lay out in some detail before the work on Stow begins in earnest.
First Pass: Preparing a basic text for encoding.
Second Pass (depends on time and the instructions from JJ).
If you are putting the information into comment bubbles:
If you are printing out the file, highlighting locations, people, bibl items, and other features you want to flag for the encoder:
Finally...
JJ, MS, and CB have started to use the title level="a" markup for articles, both in the BIBL1.xml file and in the markup of pages where an article title appears.
Current rendering code puts quotation marks around article titles. We note, however, that MLA style and MoEML house style call for commas and periods to be INSIDE quotation marks, even if they are not part of the quotation. Colons, semi-colons, exclamation marks, and question marks remain outside the quotation marks.
The rendering code we need will pull a following period or comma into the quotes but leave a following colon, semi-colon, or question mark outside the quotes.
If the question mark is part of the title, then it will be inside the title tag anyway and will be automatically included inside the quotation marks.
Priority: Sometime this summer. We can live with a few stray commas until MH has time to write the code.
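The rule itself is small enough to sketch. The Python below is only an illustration of the logic, not the rendering code MH will write (which would live in the site's own pipeline); the function name `render_article_title` and the sample titles are hypothetical.

```python
def render_article_title(title: str, following: str) -> str:
    """Quote an article title, pulling a following period or comma inside the quotes."""
    if following[:1] in {".", ","}:
        # Period or comma: move it inside the closing quotation mark.
        return f'"{title}{following[0]}"{following[1:]}'
    # Anything else (colon, semi-colon, question mark, plain text) stays outside.
    return f'"{title}"{following}'
```

For example, `render_article_title("The Agas Map", ", in")` gives `"The Agas Map," in`, while a following colon is left where it is. A question mark that belongs to the title is already part of `title`, so it ends up inside the quotes without any special handling.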
Right now, the titles of articles are not showing up in our handy list of all XML:ids used in the site. Let's try adding the title level="a" tag to a couple of articles, then checking out what displays on http://mapoflondon.uvic.ca/ids.htm. I've asked MS to undertake this task next week. If titles do not show up, we'll ask MH to adjust the programming.
Added a new folder to the Subversion repository called "working_files." This folder is where we can store spreadsheets, workbooks, Word files, and OpenOffice files that we are sharing and storing in the process of preparing articles and texts for the site.
Important! File names must not contain spaces or punctuation.
Examples of files we might store here:
I expanded the XML Encoding document today to include a set of instructions for using SVN in Linux (previously it covered only Windows). The new material can easily be tweaked so that it applies to both Mac and Linux, and the screenshots should look the same for both.
This was prompted by SM showing up for her first day of work (I wasn't expecting her till Tuesday), but she's not using SVN yet anyway.
I've made a number of changes to both Stow documents using regex and search-and-replace, and I'm half-way through an XSLT conversion that will do such things as add all the forme works, add the long s, and fix various other things. When that's ready, I'll add documentation to the blog of all the changes we've made. At present I'm tracking them in a wp document (because I'm going to send some of this info to PS as feedback on the TCP documents).
I've moved forward with my detailed documentation of the Stow encoding and conversion. I've also added P5 versions of both documents to the repository, and re-worked the encoding of the "ye" for "the" typographical convention, following MUFI guidelines.
I've continued work on my conversions of Stow 1598 and 1633, creating a full list of issues that need to be addressed, and we've discussed them in detail. The results are in my conversion_process_notes.odt file, which still needs to be completed with an action list before I start work on actual fixes. The 1633 needs to be ready for RA work in 10 days.
I gave this list of resources to our incoming RAs:
I've done a preliminary analysis of the TCP encoding of the Stow 1598 and 1633 texts, along with a test conversion using the latest TEI stylesheets for TCP. There are some problems, which I've detailed in my report, but they seem fairly minor, and I think with some post-processing we'll have usable texts. All the placename markup will have to be added, of course, and some of the current encoding will have to be extended and elaborated, but the core is sound.
Met with JJ to start planning the next four years. Decisions made on:
This project allows literary and scholarly works (primary and secondary) to be associated with locations in London, providing the reader with a richer understanding of the works.