Add a TimeWasted button to the quicktags. Completed 20/11/06.
Create these three blogs and assign appropriate permissions. Completed 20/11/06.
- Add Francela to the users.
- Make sure C and F can log in and post, and change their pws.
- Add a link from the project Website to the blog.
- Create an inc file for the project, for linking into the HCMC Website.
The Mariage blog has been set up, and Claire Carlin added as a user.
Had a long discussion with Stew, Greg and David, and latterly Scott, about how to handle a number of issues in our customization and usage of the blog tool. Outcomes:
- Tracking of CTO and vacation days will be carried out in a separate, simple PHP/MySQL database that we build. Building it into the blog is going to be too complicated, while creating a dedicated tool shouldn't be too onerous. This might get going in the New Year.
- All projects worthy of the name will get their own blogs.
- All projects deemed by their owners/administrators to be moribund can be hidden using their property settings; the blog would still be accessible, e.g. via a link from the project home page, but it would not appear in the list of blogs on the blog site. Any project blog could be re-activated by unhiding it. This manual control is deemed preferable to any kind of system we could imagine for detecting moribundity algorithmically.
- All project blogs will have, at some level in their category system, at least one (possibly more) instance of each of the following: "Activity log", "Announcements", and "Tasks". These categories will all be distinct records in the db category table, but can nonetheless be used for filtering records for reports using the category names (hence they must be precisely as shown above).
- Any post which includes "Activity log" as one of its categories should include a value for "Minutes worked". I need to find an efficient way of adding a check for that in the posting interface. This will be added as a task.
- Where necessary, an "Academic" category will be added to a project blog to accommodate conference presentations, authoring of papers, etc.
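The category rules above could be checked along the following lines. This is only a minimal sketch in Python rather than in the blog tool's own PHP; the Post class and both function names are hypothetical, and only the three category names and the "Minutes worked" rule come from the discussion.

```python
from dataclasses import dataclass, field

# Category names exactly as agreed in the meeting.
REQUIRED_CATEGORIES = ("Activity log", "Announcements", "Tasks")


@dataclass
class Post:
    """Hypothetical stand-in for a blog post record."""
    title: str
    categories: list = field(default_factory=list)
    minutes_worked: int = 0


def missing_required_categories(blog_categories):
    """Return the required category names that a blog does not yet have."""
    present = set(blog_categories)
    return [name for name in REQUIRED_CATEGORIES if name not in present]


def needs_minutes(post):
    """True when an 'Activity log' post has no minutes-worked value."""
    return "Activity log" in post.categories and post.minutes_worked <= 0
```

A posting interface could call something like needs_minutes() before saving and warn the author if it returns True.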
The following blogs will be added to handle more general and non-project work:
- Maint (already added): covers HW and SW maintenance and updating.
- Depts: will cover simple support for departments, website updates and so on.
- Admin: will cover bureaucratic work, general meetings, and so on.
The default view of a single blog, under any skin, did not provide a simple, clear link for adding a new post; it was necessary to scroll down the right menu to find an "Admin" entry, and then click on that to make a post. I trawled through the code to figure out how to add such a link, then added appropriate code (slightly different in different skins) to all of the _main.php skin files (not including the RSS feed skins).
B2Evolution Customization Notes
All customizations are prefaced by comments thus:
MDH_Custom begin... [explanation]
and end with comments thus:
MDH_Custom end... [explanation]
Files changed so far:
1. Adding new custom tag generators to the toolbar:
The toolbar is created by the _quicktags.plugin.php file in the plugins folder. Each button is created by a call to a script function. NOTE: It seems that the shortcut key assignments don't work as advertised (e.g. Alt + W), but they do work with an additional Shift (Alt + Shift + W).
2. Adding a "Minutes worked" field to the db and to the post interface:
The evo_posts table contains the fields which store individual posts. An integer field has been added to that table:
Field: post_minutes_worked; Type: int(11); Attributes: UNSIGNED; Null: No; Default: 0
The blogs/inc/MODEL/items/_item_class.php file contains the item class definition which interfaces with the db.
The blogs/inc/MODEL/items/_itemlist.class.php handles lists of items.
New variables, functions and codeblocks have been added to both of these files modelled on the handling of item_priority, which is also an unsigned int(11) field. The only slight difference is that when priority is displayed, it has to be converted to a string value representing what a particular priority setting means (there are five options); our minutes_worked field is simpler, because we can simply output a number.
The blogs/inc/VIEW/_item.form.php file implements the interface to the item class, and provides the form elements on the page. This has been modified in several places to create the necessary form field and label, and re-populate it when editing an existing post.
The blogs/inc/VIEW/items/_browse_posts_exp.inc.php file implements post browsing. This has again been modified to display the minutes worked value in the post header.
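As a rough illustration of what the new field makes possible, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. Only the evo_posts table name and the post_minutes_worked field come from the notes above; the other column names, the sample rows and the total_minutes() helper are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for MySQL; post_ID and post_title are assumed column names.
# The CHECK constraint plays the role of MySQL's UNSIGNED attribute.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE evo_posts (
        post_ID INTEGER PRIMARY KEY,
        post_title TEXT NOT NULL,
        post_minutes_worked INTEGER NOT NULL DEFAULT 0
            CHECK (post_minutes_worked >= 0)
    )
    """
)

# Two posts with a minutes value, and one that relies on the default of 0.
conn.executemany(
    "INSERT INTO evo_posts (post_title, post_minutes_worked) VALUES (?, ?)",
    [("Set up blogs", 30), ("Skin fixes", 15)],
)
conn.execute("INSERT INTO evo_posts (post_title) VALUES (?)", ("Announcement",))


def total_minutes(connection):
    """Sum the minutes worked across all posts."""
    row = connection.execute(
        "SELECT COALESCE(SUM(post_minutes_worked), 0) FROM evo_posts"
    ).fetchone()
    return row[0]
```

Because the field is a plain number with a default of 0, posts that never set it still take part in sums without special handling.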
PHP SKINS etc.:
The blogs/a_noskin.php file outputs a simple representation of a list of posts. This has been modified to show the value of minutes worked in post headers.
The blogs/skins/*/_main.php file is the main template for each particular skin. Again, a modification has been added in each case to include minutes worked in post display. Modifications vary considerably; some files are outputting XHTML for the Web, others XML syndication feeds.
Phase 1: Rescuing the old data (basically complete)
The initial task was to retrieve the original data from its DOS/WordPerfect/Lexware form. The bulk of this work was done by Greg, with the assistance of a piece of software (Transformer) written by me to make certain complicated search-and-replace operations easier.
- Convert the binary files to text (a process of converting escape characters and WordPerfect codes to predictable, readable text-strings).
- Identify each of the escape sequences used to represent non-ASCII characters and character/diacritic combinations, and select an appropriate Unicode representation for each.
- Implement search-and-replace operations in sequence to convert the data to Unicode.
- Use the Lexware Band2XML converter (http://www.ling.unt.edu/~montler/Convert/Band2xml.htm) to turn the original data into rudimentary XML markup.
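The sequenced search-and-replace step can be pictured roughly as follows. This is a sketch only, not the Transformer software itself: the escape sequences in the table are invented for illustration, since the real ones used in the WordPerfect/Lexware files are not listed here.

```python
# Hypothetical escape sequences mapped to Unicode. Order matters: the longer
# sequence "<schwa>'" must be handled before the shorter "<schwa>".
REPLACEMENTS = [
    ("<schwa>'", "\u0259\u0301"),  # schwa plus combining acute accent
    ("<schwa>", "\u0259"),         # plain schwa
    ("<glottal>", "\u0294"),       # glottal stop
    ("<s-hacek>", "\u0161"),       # s with caron
]


def to_unicode(text, replacements=REPLACEMENTS):
    """Apply each search-and-replace pair in sequence."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text
```

Running the replacements in a fixed order is what keeps overlapping sequences from being converted wrongly, which is presumably why the operations had to be sequenced.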
Phase 2: XML Encoding
(ongoing: this ball is currently in Ewa's court)
- Decide on a suitable XML format for the data. The requirements were:
- portability (format must be easily parsed and transformed)
- efficiency (data should not have to be duplicated -- for example, the same information about the same item should not have to be encoded in two places, as an independent entry, and as a nested component of another entry).
- standards-compliance (format must be based on an existing, well-accepted and documented standard; we don't want to have to rescue it again in future).
We chose TEI P5 (http://www.tei-c.org/P5/), and we decided to avoid all nesting and do all linking through xml:id attributes. We also decided that each entry would be marked up in such a way as to break it down into individual morphemes, each of which would be linked through xml:id to the entry for that morpheme. In this way, most feature information for most entries need not be encoded at all, because it can be retrieved from the entries of the morphemes that constitute it. This makes the encoding simpler and cleaner, offloading much of the work onto the XML database that will store and handle the data.
- Devise a method for migrating the rudimentary Band2XML data to the new format. This was achieved using a two-stage XSLT transformation:
- Un-nesting all the entries. Nested entries were extracted and made into siblings, and derivations encoded as part of main entries were also split off into separate entries. After this stage, all entries are siblings at the same level.
- Elaboration. The rudimentary entry information from the Lexware bands was expanded to produce a more elaborate and explicit TEI P5 structure; xml:ids, transcriptions and morphological breakdowns were created by the XSLT based on the original data, and where appropriate, linguistic descriptors were added using the TEI feature structure system (http://www.tei-c.org/release/doc/tei-p5-doc/html/FS.html).
- Check and correct the results, based on the original printed/handwritten data (this is the real work!).
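The un-nesting stage was done in XSLT; the same idea can be sketched in Python with the standard xml.etree.ElementTree module. The element names (dictionary, entry, form) and the plain id attribute are simplified assumptions, not the actual Band2XML or TEI P5 markup.

```python
import copy
import xml.etree.ElementTree as ET


def unnest_entries(root):
    """Return a new root in which every <entry> is a sibling at one level."""
    flat = ET.Element(root.tag)
    # iter() walks the tree in document order, so nested entries follow
    # the entry that originally contained them.
    for entry in root.iter("entry"):
        flat_entry = ET.SubElement(flat, "entry", dict(entry.attrib))
        for child in entry:
            if child.tag != "entry":  # nested entries get their own turn
                flat_entry.append(copy.deepcopy(child))
    return flat


SAMPLE = (
    "<dictionary>"
    "<entry id='a'><form>a</form>"
    "<entry id='b'><form>b</form></entry>"
    "</entry>"
    "<entry id='c'><form>c</form></entry>"
    "</dictionary>"
)
flat_root = unnest_entries(ET.fromstring(SAMPLE))
```

In the sample, entry b starts out nested inside entry a and ends up as its sibling; any link between the two would then be expressed through ids rather than through nesting.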
Phase 3: Storage and Presentation
- The data will be stored in an XML database, from which all presentation forms will be created algorithmically. The database system we're using is eXist (http://www.exist-db.org/), an open-source native XML database system which we have used for some years. This project will eventually use the next-generation version of eXist (1.1 or 1.2) which is currently in development, but pilot work is being done with the 1.0 beta version.
- The interface to the data will be built in Cocoon, an open-source web application framework (running in a servlet container) which provides a good basis for browser-based interaction with the eXist XML database.
- The first output format we aim at will be a browser-based system which works like this:
- A list of headwords is retrieved through a search.
- Clicking on a headword retrieves the full entry for that item, which is inserted into the page. This is done on-the-fly using AJAX/XMLHttpRequest, which sends a query to the XML database; the database responds with a block of XML, which Cocoon converts to XHTML through an XSLT transformation before sending it back to the page; the page then inserts the data into the appropriate place.
- Each morpheme in an entry is itself a link, and clicking on it retrieves the data for that morpheme's entry, which is then inserted into the page. Thus a kind of expanding tree is generated, and an entry can be expanded until it contains all the information on all its constituent morphemes.
- Each morpheme entry includes a "see also" link which would call back through AJAX to get a link to all other items in the database which include that morpheme. Clicking on one of these would retrieve its entry and insert it.
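The expanding tree and the "see also" lookup described above might work roughly like this. The sketch uses an in-memory Python dictionary in place of the eXist database, and the entry ids, glosses and function names are all invented for illustration.

```python
# Hypothetical entries: each lists the ids of its constituent morphemes.
ENTRIES = {
    "word1": {"gloss": "example word", "morphemes": ["root1", "suffix1"]},
    "root1": {"gloss": "example root", "morphemes": []},
    "suffix1": {"gloss": "example suffix", "morphemes": []},
}


def expand(entry_id, entries=ENTRIES, seen=frozenset()):
    """Build the fully expanded tree for an entry, morpheme by morpheme."""
    if entry_id in seen:  # guard against entries that refer to each other
        return {"id": entry_id, "cycle": True}
    entry = entries[entry_id]
    path = seen | {entry_id}
    return {
        "id": entry_id,
        "gloss": entry["gloss"],
        "morphemes": [expand(m, entries, path) for m in entry["morphemes"]],
    }


def see_also(morpheme_id, entries=ENTRIES):
    """List the ids of all entries that include the given morpheme."""
    return sorted(
        eid for eid, entry in entries.items() if morpheme_id in entry["morphemes"]
    )
```

In the browser each level would be fetched only when the user clicks, whereas expand() here builds the whole tree at once to show the structure.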
The search functionality basically works like this:
- There is a drop-down list of "what fields to search": Orthography, Transcription or English.
- If you choose one of the first two, you get a button bar to enter special characters.
- The choice also determines which fields will be searched.
- A checkbox also allows fuzzy searching. This involves translating IPA to plain Latin, and searching against plain Latin conversions of the fields searched. This will probably require the construction (through XSLT, on the fly) of parallel indexes of plain Latin fields, so that it can be relatively fast.
- Searches will pull back results which can be added selectively to your "notebook"; you can then take your notebook to the "print shop" where you can configure a page (accessible through URL) or a PDF printable output to create a vocab sheet or mini-dictionary.
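The fuzzy search idea (folding IPA to plain Latin and searching a parallel index) can be sketched as below. The plan above is to build the indexes with XSLT; this Python version is only an illustration, and the character mapping is a small invented sample, not the project's actual conversion table.

```python
import unicodedata

# Hypothetical IPA-to-Latin mapping; the real table would be much larger.
IPA_TO_LATIN = {
    "\u0259": "e",  # schwa
    "\u0294": "",   # glottal stop is dropped
    "\u026c": "l",  # voiceless lateral fricative
    "\u02b7": "w",  # labialization modifier
}


def fold(text):
    """Convert a transcription to a plain Latin search key."""
    out = []
    # NFD separates base letters from combining diacritics.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            continue  # drop accents and other combining marks
        out.append(IPA_TO_LATIN.get(ch, ch))
    return "".join(out).lower()


def build_index(headwords):
    """Parallel index: folded key for each original headword."""
    return {word: fold(word) for word in headwords}


def fuzzy_search(query, index):
    """Return headwords whose folded form contains the folded query."""
    key = fold(query)
    return [word for word, folded in index.items() if key in folded]
```

Building the folded index once, rather than folding every field on each query, is the point of the parallel indexes mentioned above.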
- Other projected output formats include:
- Print dictionary (XSLT used to produce XSL-FO, which is then converted to PDF using the RenderX XEP engine). We envisage linguistic dictionaries as well as dictionaries aimed at language-learners.
- Printable wordlists (Nxa'amxcin to English and English to Nxa'amxcin).
- Graphical navigation devices for browsing the data, such as the KirrKirr Java model used by Christopher Manning (http://www-nlp.stanford.edu/kirrkirr/doc/ach-allc2000-ver5-single.pdf).
Phase 4: Media
We plan to integrate audio and visual media into the database in the future.
Met with Ewa to discuss the future of the project.
- Decided to set up the blog.
- Updated the plan to post it here for comment, so that we can later break it down into tasks.
- Discussed options for the search functions, and the possibility of user-generated vocabulary sheets.
- Tracked down the source of the Chinese interface bug; it's a Delphi bug in TypInfo.pas: Borland bug report.
- Figured out a workaround using an overloaded function. Tested this -- seems to work well.
- Did a release (188.8.131.52).
- Worked on the menu background colour issue, and found a workaround there too (unassigning and then reassigning the Images list when changing captions).
- Got another release together (184.108.40.206).
- Posted the package.
- Updated the source files on the server.
- Updated the Website.