I've been trawling through the Moses XML files trying to figure out a useful structure for the XML, and I'm realizing that the way we've been working -- moving files between directories to indicate their status -- is going to be an unhealthy way to work in Subversion. Every time you move a file, the move has to go through Subversion itself: either an "svn move", or an "svn remove" to take the file in its original location out of tracking plus an "svn add" to add it in its new location. This will be tiresome and error-prone.
We really ought to be making use of the TEI revisionDesc element to track the changes we make to files.
Each change we make to the file would be documented in a <change> element, and the <revisionDesc> element itself has a "status" attribute which we can use to track what stage each file is at. I can fix it so that the web application looks at this status attribute to decide whether to display the content of the file or not, which means that all the files could always be stored in the database; we wouldn't have to put them in a special folder to "publish" them.
All the files would stay in one folder, and wouldn't have to be moved around; we could open any file to find out its situation and status.
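To make the idea concrete, here is a minimal sketch of what such a header might look like and how a script could read the status from it. It assumes the files use the TEI P5 namespace; the sample dates, the "#SK" pointer, the change notes and the function names are all invented for illustration, and the real web application would do the equivalent check in its own code rather than in Python.

```python
import xml.etree.ElementTree as ET

# Assumption: the files use the TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical header: the dates, "#SK" pointers and notes are invented.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample document</title></titleStmt>
      <publicationStmt><p>Unpublished draft.</p></publicationStmt>
      <sourceDesc><p>Placeholder source description.</p></sourceDesc>
    </fileDesc>
    <revisionDesc status="editing">
      <change who="#SK" when="2000-01-02">Corrected two typos.</change>
      <change who="#SK" when="2000-01-01">Started editing.</change>
    </revisionDesc>
  </teiHeader>
  <text><body><p>Content.</p></body></text>
</TEI>"""


def get_status(root):
    """Return the status attribute of <revisionDesc>, or None if it is absent."""
    rev = root.find(f".//{{{TEI_NS}}}revisionDesc")
    return None if rev is None else rev.get("status")


def should_publish(root):
    """Only files marked status="complete" would be shown by the web application."""
    return get_status(root) == "complete"


root = ET.fromstring(SAMPLE)
print(get_status(root), should_publish(root))
```

The point of the sketch is that the file's stage lives inside the file, so nothing about its location needs to change when its status does.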
So this is the proposal:
- Go through all the files in these folders, and identify _which_ of the folders each file should actually be in, based on its status:
  - rescued
  - tei_for_xform
  - tei_xml_ECH
  - tei_xml_done
  - tei_xml_editing
  - ready_to_edit_xformed
- For all of the files, create a <revisionDesc> element with its @status attribute showing one of the following values:
  - status="unedited" (files from ready_to_edit...)
  - status="editing" (files from tei_xml_editing)
  - status="rescued" (files from rescued)
  - status="edited" (files from tei_xml_ECH)
  - status="complete" (files from tei_xml_done)
- Merge all these directories into one, called "xml", containing all the files.
- Use this to create the Subversion repository (which will also include the cocoon code, documentation, schema etc.).
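The status-stamping step in the list above could be scripted rather than done by hand. The following is a rough sketch only: it assumes TEI P5 namespaced files with a <teiHeader> directly under the root, it assumes that "ready_to_edit..." refers to the ready_to_edit_xformed folder, and the function name and the wording of the initial <change> note are hypothetical. No status is listed for tei_for_xform in the proposal, so the sketch refuses to guess one.

```python
import xml.etree.ElementTree as ET

# Assumption: the files use the TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize without ns0: prefixes

# Folder-to-status mapping from the proposal. "tei_for_xform" is left out
# on purpose because the proposal does not assign it a status.
FOLDER_STATUS = {
    "ready_to_edit_xformed": "unedited",  # assumed to be "ready_to_edit..."
    "tei_xml_editing": "editing",
    "rescued": "rescued",
    "tei_xml_ECH": "edited",
    "tei_xml_done": "complete",
}


def stamp_status(root, folder, when):
    """Create or update <revisionDesc> with the status implied by the folder."""
    status = FOLDER_STATUS[folder]  # raises KeyError for an unmapped folder
    header = root.find(f"{{{TEI_NS}}}teiHeader")
    if header is None:
        raise ValueError("no teiHeader element found")
    rev = header.find(f"{{{TEI_NS}}}revisionDesc")
    if rev is None:
        # revisionDesc is the last child of teiHeader, so appending is correct.
        rev = ET.SubElement(header, f"{{{TEI_NS}}}revisionDesc")
    rev.set("status", status)
    if len(rev) == 0:
        # An empty revisionDesc is not valid TEI, so record the migration itself.
        change = ET.SubElement(rev, f"{{{TEI_NS}}}change", {"when": when})
        change.text = f"Status set to {status} (file came from {folder})."
    return rev


# Hypothetical minimal document, just enough to exercise the function.
doc = ET.fromstring(
    f'<TEI xmlns="{TEI_NS}"><teiHeader><fileDesc/></teiHeader></TEI>'
)
stamp_status(doc, "tei_xml_done", "2000-01-01")
print(ET.tostring(doc, encoding="unicode"))
```

Note that ElementTree drops comments and may reorder namespace declarations when it rewrites a file, so for the real migration an XSLT identity transform might be the safer tool; the logic would be the same.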
After that, I'll be able to upload all files into the db without worrying about what status they're at, because the db will only "publish" files which have a status of "complete". ECH will be able to do a find-in-files to discover which files have status="edited", meaning that they're awaiting tweaks and approval from her, and SK will be able to do the same to determine which files she's currently working on. When I run XSLT transformations, I'll be able to target the transformations at files with a specific status value.
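A find-in-files in the editor will do the job, but the same lookup is easy to script if we ever want a report. This is a sketch under a few assumptions: the merged folder is called "xml" as proposed, the files are TEI P5 namespaced, and the function name is hypothetical.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# Assumption: the files use the TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def files_with_status(folder, status):
    """Return the .xml files in folder whose <revisionDesc> has the given status."""
    matches = []
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            # Skip files that are not well-formed; a real script might report them.
            continue
        rev = root.find(f".//{{{TEI_NS}}}revisionDesc")
        if rev is not None and rev.get("status") == status:
            matches.append(path)
    return matches


if __name__ == "__main__":
    # List the files awaiting tweaks and approval.
    for path in files_with_status("xml", "edited"):
        print(path)
```

The same function with "editing" or "complete" would give the other two views described above.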
Every time a file is edited, we'll add an entry to the top of the <revisionDesc> element explaining the changes we've made (very briefly, unless there's a good reason to go into detail).
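In practice we would type these entries by hand in the editor, but as an illustration of the "newest entry first" convention, here is a small sketch that inserts a <change> as the first child of <revisionDesc>. It assumes TEI P5 namespaced files; the function name, the "#SK" pointer, the dates and the notes are invented.

```python
import xml.etree.ElementTree as ET

# Assumption: the files use the TEI P5 namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def add_change(root, who, when, note):
    """Insert a new <change> as the first child of <revisionDesc>."""
    rev = root.find(f".//{{{TEI_NS}}}revisionDesc")
    if rev is None:
        raise ValueError("file has no revisionDesc yet")
    change = ET.Element(f"{{{TEI_NS}}}change", {"who": who, "when": when})
    change.text = note
    rev.insert(0, change)  # index 0 puts the newest entry at the top
    return change


# Hypothetical document with one existing entry.
doc = ET.fromstring(
    f'<TEI xmlns="{TEI_NS}"><teiHeader><revisionDesc status="editing">'
    '<change when="2000-01-01">Started editing.</change>'
    "</revisionDesc></teiHeader></TEI>"
)
add_change(doc, "#SK", "2000-01-02", "Fixed line breaks.")
notes = [c.text for c in doc.iter(f"{{{TEI_NS}}}change")]
print(notes)
```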
Waiting for approval from the team before moving forward with this.