Some time ago Stew asked me to do some work on a jwplayer instance for the Thomson mystery. I did the work then, but we recently needed to make some basic changes and hack the surrounding page to make things work in the site.
Using the new v3 for work on the Arabic writing site, I found a bug whereby the trailing backslash was missing from the backup folder, causing files to be backed up one folder up in the tree. Fixed the bug and tested successfully.
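This is the classic failure mode for hand-built paths: without the trailing separator, the folder name fuses with the file name and the result lands one level up the tree. A minimal Python sketch of the pitfall (paths illustrative, not from the actual code):

```python
import os

def backup_path(backup_dir: str, filename: str) -> str:
    # Buggy version: plain concatenation. With backup_dir = "C:/work/backup",
    # the result is "C:/work/backupfile.txt" -- a sibling of the backup
    # folder, one directory up from where the backup should go.
    return backup_dir + filename

def backup_path_fixed(backup_dir: str, filename: str) -> str:
    # Fixed version: let the path library supply the separator.
    return os.path.join(backup_dir, filename)
```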
JJ reported a problem with the site today. The eXist service wasn't running so I started it, but the experimental map wasn't running. Toolbar and so forth appear but no map.
I've tailed the access and error logs and find no errors.
Using 3.0 to help mark up a ScanCan doc, I noticed that changing the Checked Only checkbox didn't set the Modified flag on the current file; fixed that, and then fixed a bug resulting from that fix. Also added a new "secret" Action invoked by F9 which takes content directly from the clipboard, transforms it, and places it back on the clipboard, also showing it in the GUI (there was a way to do this without the GUI before, but when you don't see anything happen, it's a bit difficult to know if you hit the right key or not).
Trying to use version 3 of Transformer for marking stuff up in ScanCan, I hit a bug whereby the first time the app starts (when the layout XML files are not there, so it's starting from zero), the resize code which ensures that columns in the main TListView are not resized out of existence or out of view does not work properly, because the initial values for column widths are not readable, or not read. Added a get-out clause in that resize routine so it detects that situation and does nothing; that allows the app to start up with columns not too badly set up, and thereafter you can resize normally, and it will save and read back correct column size values. That makes Transformer 3 usable on Wine.
string(PAnsiChar(message)). This seems to do the job. The error messages will always be in English anyway.
I also discovered today that my "is it UTF-8?" detection code gives a false positive on GB18030 files (PRC Chinese encoding). This is annoying, but there's not much I can do about it. It just means that when opening such files in Transformer, it always suggests UTF-8 instead of GB18030.
Since I now have robust text file encoding conversion available, I think it would make sense to make this a batchable feature. I'll need to think about how best to do that, but it could be part of the main batch screen: a new, initial tab entitled "Loading files", with a drop-down where you can specify the input encoding. You could discover the right encoding using the normal file-loading capabilities in the Source tab.
Fixed a bug I found in Transformer when I was using it to prep some texts in the workshop yesterday.
Wrote the back-end code to check input, run XSLT transformations and save the results; added appropriate error messages to the error-logging code for when failures occur, and did some basic testing. Everything seems to be working. This basically means the application is code-complete. Next:
- A serious, comprehensive test with a big batch of big files, including Unicode. Encoding is likely to prove problematic, so we need files in more than one original encoding.
- Updating of the Help system.
- Building a new installer for a beta release.
- Adding the beta to the website.
- Emailing users who might test it.
- Final release.
Completed all the dialog box functionality for the new XSLT transformation item type. Created a couple of new icons and added them to nuvola.dll, one for an XML document and one for adding an XML document; the latter is used for "Add new XSLT Transformation Item" in Transformer 3. Spent a little time making the file paths for external XSLT files robust; I'm storing both a relative and absolute path for referenced external XSLT files, so that if the sequence file is moved, and the XSLT file is moved relative to it, the absolute path can be reconstructed on load. This is working well.
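The dual-path scheme is easy to sketch: keep both paths, prefer the absolute one, and fall back to resolving the stored relative path against the sequence file's current location. A Python sketch of the idea (the Delphi original differs in detail; names here are illustrative):

```python
import os

def resolve_xslt_path(seq_file: str, abs_path: str, rel_path: str) -> str:
    """Return a usable path to an external XSLT file.

    abs_path is the absolute path recorded when the sequence file was
    saved; rel_path is the same file recorded relative to the sequence
    file's folder.
    """
    if os.path.isfile(abs_path):
        return abs_path
    # The sequence file (and the XSLT moved along with it) may have been
    # relocated: rebuild the absolute path from the sequence file's
    # current folder plus the stored relative path.
    return os.path.normpath(
        os.path.join(os.path.dirname(os.path.abspath(seq_file)), rel_path))
```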
Now the only thing left to do is the integration of the actual XSLT transformation, using the xsltproc dll from libxml2. This code can be adapted from the Image Markup Tool, and the adapted code will then be re-used in the IMT version 2 when I write it.
The testbed for this app will probably be the Perseus dataset, which is in an old version of TEI (P4 or P3, not sure which). We'll want to massage it into P5, and at the same time add some extra markup and do some fixes. The data will then be used for the GRS myth-mapping project.
Converted over the dpr and the main application form. Also brought in a stripped-down version of the old SystemFunctions.pas, and did a huge amount of rewriting of the file i/o functions in the application. I'm also using the SelectFileEncoding functions and dialog box I wrote for the Apparatus application when loading files. Transformer 3 now builds and runs, but of course there's a lot of testing to do, and many issues where character encoding will need to be carefully checked. I also have a problem with the main menu having disappeared; it's there, but it seems to be covered up by other components for some reason. Still working on that.
Next link in the chain...
Began focusing on the more Transformer-specific libraries for porting to Delphi 2009, specifically the ReplacePair form/unit, and then the TransformItems library, which defines the core classes of transformation items (currently only TScriptItem, but soon to include more). The ReplacePair form was pretty straightforward, but there was one problem with TPerlRegEx, which I'm using to replace TURESearch. Compilation would fail with a fatal error, for no discernible reason. I eventually tracked down this bug, and was able to use the workaround (adding a pointless call to the affected unit). I also had to edit js15decl.pas, to overcome an ambiguity with StrDispose, which now has two variants, one for PAnsiChar and one for PWideChar. I chose the latter version, so it now compiles, but I have no idea whether it will work or not. However, I did also find this project: http://code.google.com/p/rawfpcjs/ (the blog is blocking Google URLs in links at the moment), which is very recent, and looks promising; if there are problems with the use of JSBridge in D2009, I can probably move to this alternative, and it might even make life simpler. There's still a lot of work to do on TransformItems, and then on the individual ScriptItem units, but I'm getting close to working on the actual application, finally.
Added the code to map the old output filename variables onto the new template system. The Batch window functionality is now complete.
Of the three items listed in the previous posting, I've fixed the menu positioning issue, and I've cleared out the old unused message TStaticText controls. I've also made a start on a version conversion system for batch files, by adding a version element to the output file, and checking its presence/absence/value when loading a file. Now I just need to add the code for converting the complicated old settings (5 variables!) to the new single OutputFilenameTemplate string value.
Finished the output filename mask code, with the live demo of the mask in action and the file i/o fully tested. A couple of things remain:
- That popup menu is still in the wrong place.
- I haven't decided how to handle old Transformer files; if it's even possible to map the old setup onto the new, I should try it, but I should also warn people when they open an old file that they should review the filename settings.
- I probably haven't cleaned out all the obsolete messages stored in labels in the Batch form.
This stuff shouldn't take long, and then I'll be in a position to take on Transformer itself, starting with the data structure for replace items.
As part of rewriting the Batch file processing screen, I've been looking closely at the clunky old system by which the results of processing each file were saved to a designated location and name. I'm going for a placeholder-based system similar to that used in oXygen's transformation scenarios, but a bit more flexible (it has more human-readable placeholders, and has one for the original file extension). This also integrates with controls for choosing an actual folder as part of the filename. I have most of the work done, but there are some GUI issues to fix -- I need to add a plain folder image to the Nuvola dll, add images to the popup menu, and figure out a problem with the position of the popup menu, which currently displays in the wrong place. I've also refactored a lot of the original code, renaming components to remove the leading "u" which was used to designate a TTntUnicodeControls component.
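The placeholder idea can be sketched in a few lines; the actual placeholder names and syntax in Transformer may differ, so treat these as illustrative:

```python
import os

def expand_output_name(template: str, input_path: str, out_folder: str) -> str:
    """Expand a human-readable filename template for one input file.

    Supported placeholders (illustrative, not Transformer's real set):
    ${name} = base name without extension, ${ext} = original extension
    without the dot.
    """
    name, ext = os.path.splitext(os.path.basename(input_path))
    expanded = (template
                .replace("${name}", name)
                .replace("${ext}", ext.lstrip(".")))
    # The chosen output folder is combined with the expanded name.
    return os.path.join(out_folder, expanded)
```

So a template like "${name}_converted.${ext}" applied to "report.sgm" would yield "report_converted.sgm" in the chosen output folder.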
A great addition to the application, so I've rebuilt the installer (same version, of course -- there's no change to the executable), and updated the documentation and the Web site. I've also updated the roadmap and future features information to show my current plans for the application.
I found this excellent implementation of the PCRE library wrapped for Delphi by Jan Goyvaerts. The Delphi code is MPL 1.1, and the PCRE engine is BSD, so it's all usable in any of our projects, and it's perfect for Transformer because it handles UTF8. I initially compiled and installed the component, which is designed to work with the dll that's shipped with it, but every time I destroyed a created instance of the component, I was getting access violations, so I edited the source to link to the C object files instead of linking to the dll. This uses a slightly newer version of PCRE, and more important, it doesn't generate the AVs. Built a test app, wrapping the component in my own class to suit what Transformer will want to do. Everything works fine!
Made fantastic progress today. This is basically what I've implemented:
- Several LoadFileToString functions with a range of different input parameters.
- Functions for detecting the character set on load, including peeking into XML and HTML headers, and detecting UTF-8 byte-sequences.
- An inventory (TDictionary) of code pages known to Windows, which makes it possible to look up any code page id found in a header.
- A dialog box which will let you test any of the different code pages against your text until you find the right one, with live conversion and font control.
- Lots of testing with a variety of languages, encodings and BOMs.
With the exception of UTF32, I now have all of this stuff working. I'll have to add the UTF32 handling, and then work on finding a decent open-source implementation of regular expressions for Delphi. At some stage, it might be worth trying to take the broken port of Mozilla code, which has functions for recognizing likely ANSI encodings by their byte sequences, but that might be overkill.
This really has been hard, but quite rewarding, and infinitely valuable. I can add to Transformer the ability to specify an input code page as well as an output encoding.
Delphi 2009 has better file i/o for Unicode text files than any previous version, but there are still lots of holes. It's good at loading a file which has a BOM, but if there's no BOM, it just uses the system's default encoding. I need to do much better than that, so I'm writing a lot of new code, and repurposing old code, to make that happen. What I've got so far is a function which automatically detects any UTF8, 16 or 32 BOM, and failing that, checks the bytes of the file to see if it's likely UTF8. Now that's working, I need to go further and check for explicit character encodings named in the preamble of the file itself, in HTML, XHTML or XML files. This will involve assuming ANSI, which is reasonable, and loading it that way, then searching all the likely locations, allowing for case, etc. I've done something a little like this before, but it has to be a bit more bulletproof. Then I have to follow Marco Cantu's example of defining a custom encoding to create the second of the UTF32 encodings, so my apps can load files in UTF32.
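The detection order described above (BOM first, then a UTF-8 byte check, then a declared encoding in the file's preamble) looks roughly like this in Python; the Delphi code is more thorough, and the regex here is a deliberate simplification of the "search all the likely locations" step:

```python
import re

# 4-byte BOMs must be checked before the UTF-16 ones they start with.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]

def sniff_encoding(data: bytes) -> str:
    # 1. A BOM is decisive.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # 2. No BOM: see whether the bytes are valid UTF-8.
    try:
        data.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # 3. Assume an ANSI-ish encoding, load the head of the file that way,
    #    and look for a declared encoding in an XML declaration or HTML
    #    meta charset (simplified pattern, case-insensitive).
    head = data[:1024].decode("latin-1", errors="replace")
    m = re.search(r'(?:encoding|charset)\s*=\s*["\']?([\w.-]+)', head, re.I)
    if m:
        return m.group(1).lower()
    return "system-default"
```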
This was a bit tricky, because of the use of AnsiString types and PChars in the original Pascal header conversions. I have it working, but I don't have much confidence it will work reliably with (say) filenames in Japanese. That will need some testing.
Ported to Delphi 2009 a progress bar dialog box which is required for the batch file window in Transformer.
Ported and simplified my DocLauncher console app, which is used to launch help documentation for IMT and Transformer, to Delphi 2009. In the process, I removed dependencies on old libraries (SHBrowseU) by substituting actual ShellExecute calls for calls to my own wrapper functions.
Updated the TRecentFiles class to use ADOM for saving/loading, and also simplified the constructor (there were two of them, but one was actually enough).
Started working out the graphics needs for e.g. IMT, and found that Delphi 2009 has built-in support for JPEG, PNG and GIF (add the corresponding units, such as GIFImg, to the uses clause). TIFF and other formats are not supported, though, so I found a port of GraphicEx to Delphi 2009; adding this to the uses clause adds lots of formats including TIFF (but it generates lots of warnings about "unsafe code"). I think adding GraphicEx before the other graphics units in the uses clause will ensure that Delphi code is used for those formats which are handled natively, while GraphicEx code is used for other formats it can handle.
Began porting RecentFiles library. This unfortunately relies on my old-style XML i/o code instead of ADOM, so the disk i/o stuff will have to be rewritten, but it shouldn't take more than an hour or so.
Also, with Greg, began testing D2009 test apps on Darwine. It seems that the Kronenberg download of Darwine (which is at 1.0.1, the same as the Ubuntu version) works perfectly as long as you run your apps from the command line using the wine binary; if you try to use his WineHelper app, they blow up. What this means is that Wine works great on OSX (we even tested with Japanese GUI strings), so we have a viable platform, but we don't have a user-friendly Wine front-end for Mac users. That may change over time, but in any case we can provide a script-based installer that would put the app in the right place, and then create shortcuts that would run it properly. So all three platforms are a go.
More progress porting my code to D2009:
- Ported the Preferences unit and dialog box, in the process creating a Transformer replace sequence that does most of the work on PAS and DFM files.
- Built a test app for Preferences, and tested it.
- Built a universal test app for all my libraries; as each is ported, it'll be added to the universal app, and be tested automatically.
- Tested Unicode (with Japanese) in GUI translation system. Works great.
- Tested the same thing, along with Preferences, in Wine -- again, it works a treat!
I've also, finally, downloaded and built the Help file updates, so I have a working help file. Not that it's much use, in practice.
Ported the translation code, which again proved a little simpler than expected; the test application is working like a charm. The only unexpected thing was that although I thought I'd be looking for properties which were tkString (previously tkWideString), it turns out that I needed tkUString; presumably the TypInfo.pas unit is not quite as "Unicode-everywhere" as the rest of the VCL and RTL. My guess is that this is a result of its hooking into Windows fairly closely, so being dependent on Windows types rather than Delphi types, so tkString has some specific relationship to a Windows string type.
Simplified the FormState library so that it only uses XML files, and also added a new feature so that non-modal forms can be shown at start-up if they were showing at shutdown.
Next, the translation code...
Ported several libraries to D2009 today, including GenFunctions, SplashAbout, VersionInfo, Icons, and FileOverwriteConfirm, and I'm beginning work on FormState, which requires several of the others. Everything is going very smoothly indeed so far.
The next iteration of Transformer needs to get rid of its dependence on the buggy TURESearch component from jclUnicode.pas, which has serious issues. I've been searching for ages for a decent open-source alternative, and now I think I've found one. It requires Delphi 2009, but that's on order (I think), and I do need a relatively straightforward project with which to pilot a move to D2009; Transformer is probably that project. This would be the migration path:
- Get D2009 installed and working.
- Create a new projects tree.
- Migrate Transformer files into it, along with other key libraries from my current tree (such as Batch, Translate, etc.).
- Install XDOM 4.2 (or ADOM, if that's the current name of the appropriate version), and Project JEDI JCL and JVCL.
- For each sub-project (Batch, Preferences, Translate, FormState, SplashAbout, RecentFiles...), create a new test/dev app in which to develop it.
- For each sub-project, remove all TNT dependencies, and rationalize all dependent code so that Unicode strings are now used. Pay special attention to any SystemFunctions, FileFunctions and StringFunctions code which may be invoked. It may be necessary to start a new version of each of those files, into which we only add functions that we turn out to need.
- Once all the dependent projects are working, bring in Transformer and strip out TNT from that.
- Get the PERL RegExp wrapper package and install that in Delphi.
- Rewrite the string replace code based on it.
- Once everything is working, think about adding the XSLT support through libxml, as in IMT.
You can now hold down the control key when pressing the Do Transformations button in Transformer, and it will take input off the clipboard, process it through the transformations, and put the result back on the clipboard. That's really handy for these ad-hoc usages that I find for it in markup projects.
I'm using Transformer to auto-markup some bits of text while working on ScanCan documents, and found it annoying that after every operation in the main screen there's a popup you need to dismiss. Added a command-line parameter, /suppressPopups, which prevents this. May or may not document it. Transformer is deuced handy...
I scanned the poem at 300dpi for EdeR and gave him a PDF of the whole thing.
I also created a layered Photoshop file with the scans and adjusted it so that the layers were properly aligned vertically and horizontally. An overlaid grid can now be used to make quite precise measurements for placing fragments.
Paolo Cutini (as always) completed a new Italian interface translation within hours of the release of version 2.0. Great thanks go to him. I've rebuilt the installer to include the new translation, and posted it on the site.
Originally developed in 2006 as part of a project to rescue old DOS word-processor files from a Linguistics project, Transformer has since been used extensively on the Colonial Correspondence project.
Find out more at the Transformer site...
This morning I finished working on the tutorial, including the interactive screenshots, which will also be on the main Transformer Website. It took longer than I expected, because the originals were done with a pre-release (0.9) version of the Image Markup Tool, using a file format which can't easily be converted to any of the release versions, and in any case much has changed in the main interface. In the process of doing this, I also:
- Found and fixed a bug in the column-sorting code, which was supposed to be sorting by length of items for Find and Replace items, but was actually sorting (most of the time) alphabetically.
- Fixed a long-standing annoyance with column sizing in the TTntListView control that displays the sequence items. There's no onresize event for column headers, so they can end up being sized too wide for the control. Now they resize themselves appropriately when you exit the control, or click anywhere on it, which is better than nothing.
- Tweaked a couple of other resize functions, which were causing scrollbars to disappear in some extreme circumstances.
- Fixed a label which should have been updated before.
I've now finished the section on scripting, so I just have a handful more topics to update with new screenshots etc.
JS-R is working on providing a new angle on Canadian census data by pre-calculating, and making available through a Web interface, two measures which express how segregated or integrated individual groups are within Canadian cities. He has a contract programmer working on a pilot of this, using PHP and mySQL, and wants us to take over and maintain/extend the project after the first phase is done.
Wrote to sysadmin to get a TAPOR project id and group set up, along with a domain (segregation.uvic.ca). JS-R will provide some basic intro material for the site, which we'll set up ahead of a presentation he's giving on June 3; the site will initially point at the pilot application on the external developer's site, for the purposes of that presentation, and then the code will be moved over to our server. At that point, we'll most likely move from mySQL to Postgres, to take advantage of better support for Views, since the queries are very calculation-heavy.
With more feedback from DB on what individual escape sequences should be converted into, I was able to add a lot more replace sequences to the set that I've developed for his Waterloo Script files. In the process, I noticed a need for a "Clone Transformation Item" feature, because many of the new replacement items were minor variations on old ones, with one character changed. I've now added that feature (including adding a new icon for it -- twin wands -- in the icon library dll). It seems to be working fine.
I've run the extended conversion sequence on DB's files, zipped up the results, and posted them again for him to download.
Ended up just tidying up what I'd done so far (centring images etc.); no time to get any further with it. Back next week.
The tutorial is quite densely-populated with screenshots, all of which have to be redone, so it's quite time-consuming to update it. So far, I've done the first five topics. I'll also have to add another one or two topics to cover the new scripting functionality. Should be done by tomorrow, I hope, in time to do a beta release before the weekend.
Batch operations in Transformer can be very long lists, and sometimes you just want to run a subset of the list, to test things out on a few files, or to re-convert a few files which failed previously, because of something that didn't affect the bulk of them. I've now added an option so that you can select one or more files in the list box of the batch screen, then right-click and choose to transform only those files.
The Help file is now complete for Transformer 2.0, and seems to be working fine. It might need a little more indexing, and perhaps some additional help for the scripting component, but the tutorial might take care of that.
With help from the oXygen forum staff, I reinstalled oXygen 9.2 in a new folder, and removed the old folder; problem solved. It must have been caused by a file not deleted during the pre-install uninstallation of the old version.
Spent most of the afternoon documenting new features in Transformer, with screenshots, and tweaking the app where I came across something less than ideal. At the end, I wanted to do a test build of the help file, but ran into an indecipherable error with Saxon 9B running in oXygen 9.2:
Saxon 9B null
When trying to validate the XSLT stylesheet, I got this equally unhelpful feedback:
SystemID: C:\Documents and Settings\mholmes\My Documents\Borland Studio Projects\my_projects\mdhHelp\mdh_docbook_to_html.xsl Description: net.sf.saxon.style.XSLVariable.getReferenceList()Ljava/util/List;
I don't know from this whether Saxon is broken, or whether it's been updated in oXygen 9.2 to a new version which stumbles over an error it was previously happy to let slide. But this stylesheet was working under 9.1, because I used it to build the IMT Help file, and that no longer builds either. Posted a message on the oXygen forums; we'll see if anyone else has seen it. Failing that, I'll try reinstalling oXygen, and perhaps also running the transformation on other machines.
Several new code libraries had been added to Transformer since the last release, and they were lacking header information, licensing info, descriptions for the Website, etc. The Website source code and requirements page, as well as the installer, are now up to date for version 2, but the rest of the site will have to be updated before I can do a release. I need to do this (a beta, perhaps) within the next three weeks.
Working on DB's files, which are numerous and for which the processing is complicated, I found I really wanted to be able to cancel a batch operation, so I've repurposed the Progress form I wrote for Markin 4 to show how the batch is going, and enable the user to press a Cancel button and stop the process. This is a great improvement for large batches. The cancel flag is checked at the end of each replace or script operation, so as long as those operations are completing in good order, it's pretty snappy.
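The cancel mechanism is cooperative: a flag set from the GUI, checked between operations. A Python sketch of the same pattern, with threading.Event standing in for the Delphi cancel flag:

```python
import threading

def run_batch(operations, cancelled: threading.Event) -> int:
    """Run each operation in turn, checking the cancel flag between them.

    As in Transformer's batch screen, cancellation is only noticed once
    the current replace/script operation has finished, so responsiveness
    depends on the individual operations completing in good order.
    Returns the number of operations actually run.
    """
    done = 0
    for op in operations:
        if cancelled.is_set():
            break
        op()
        done += 1
    return done

# In a GUI, the Cancel button's handler would call cancelled.set()
# from the main thread while the batch runs in a worker.
```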
The problem of how to monitor and abort a frozen or looping script operation while it's happening still remains, though. I've been doing a lot more thinking and reading about this, and it seems likely that only something fairly aggressive such as TerminateProcess could do it. The script code would all have to run as a separate thread, and there would have to be a monitor thread with a timer, which was initiated when the process began, and which gave the user the option to kill a script process after a timeout had been reached without the thread terminating. That code could also check for the Cancelled flag, and kill a process when that was True but the process had been running for a (perhaps shorter) timeout. It'll take some work to get this implemented and tested. I'm still not sure there's a clean way to kill a process running in a C++ dll from Delphi without orphaning some resources.
There are now bridging Delphi / JS variables (JSInputFilePath among them) which are populated with the appropriate values during transformation operations.
In ColDesp, we're making a point of using the original .scx filename as the @xml:id attribute of the XML file created from it. This is easier if the original input filename and path are exposed to the transformation system, so I've added hooks which expose that information through two variables tied to placeholders. If you include...
Ever since the project began using the Preferences dialog box to control application font settings, there's been a bug in the repainting of TTntListView column headers, which afflicts the main window of the app. This might just be a bug in some graphics drivers, though; it doesn't show up on some other machines. What happens is that when a font is larger than the default, the original lower border of the column headers is left painted, obscuring part of the text. This works around it:
if LV.ShowColumnHeaders then begin
  LV.ShowColumnHeaders := False;
  LV.Invalidate;
  Application.ProcessMessages;
  LV.ShowColumnHeaders := True;
  LV.Invalidate;
  Application.ProcessMessages;
end;
I've been trying to find a fix for this for a couple of years.
I've implemented a more sophisticated system for handling out-of-memory errors:
- TTransformList now has the ability to retrieve and store error messages when executing a script operation.
- The main form routine ProcessExternalFile can now keep track of these as they go by, and report them to the calling procedure.
- The Batch screen can now keep a list of all these errors, and show them to the user via a temp file saved to disk.
- I've confirmed that the operation was running out of memory about 68 files in, so I upped the minimum memory to 1000000 by default, but I've also implemented a system whereby you can pass a memory value to Transformer on the command line (-jsmem=1000000), and if it's larger than the minimum, then the JSEngine will be created with the larger value. This (coupled with proper documentation) will give users a way to get around out-of-memory errors if they occur.
- I had to tweak one of the Italian translation strings, to add a new formatting variable for the number of script runs during a batch operation, since this is now reported at the end of the process. If the placeholder is missing, then you get an exception.
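The memory-override logic above amounts to: parse -jsmem=N from the command line, and use it only if it exceeds the built-in minimum. A Python sketch (the parameter name is from the post; the parsing details are illustrative):

```python
MIN_JS_MEMORY = 1_000_000  # default minimum, per the post

def js_engine_memory(argv, minimum=MIN_JS_MEMORY) -> int:
    """Return the memory value to create the JS engine with.

    A -jsmem=N argument can raise the value, but never lower it below
    the built-in minimum.
    """
    for arg in argv:
        if arg.startswith("-jsmem="):
            try:
                requested = int(arg.split("=", 1)[1])
            except ValueError:
                continue  # malformed value: ignore it, keep the minimum
            return max(minimum, requested)
    return minimum
```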
Working on the ColDesp project, I discovered that at a certain point in working through a batch job, the script stops executing (it does nothing). I'm pretty sure this must be a memory issue. Hopefully it's not a leak in the SpiderMonkey dll. More likely, I'm failing to free something, or I can do a better job of creating and freeing objects so the memory's retrieved when it's needed. This will need a bit of work.
I've also realized that there's only really one way to do monitoring/policing of operations: spawning a thread to do all the JS stuff, and leaving the main thread counting time, with a dialog box always open which enables you to kill the JS thread. That will also take some work.
But the app is now so useful and convenient that it's worth the time. I'm using it for all sorts of little tasks now.
Tried various ways to police the execution of JS code from Transformer, but no joy. No responses to my question either. This will take some work.
On the other hand, I made a lot of progress on the Waterloo Script conversion sequence; I'm now producing valid XHTML files from most of the script files I throw at it. I still have handlers to add, and there are issues such as what to do with index (.ix) commands, which only make sense in the context of a larger document, but if I can get DB to help me reconstruct a complete file for each book, I can operate on the single file. There are also some accented characters I'm not handling, and I think it will be wise to run an XSLT transformation on the end result, to rationalize some of the block elements which handle (for instance) indenting. The transition from serial switch commands to hierarchical XHTML is slightly bumpy.
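The "serial switch" bumpiness comes from word-processor markup that turns a feature on at one point and off at another, while XHTML needs a properly nested element around the affected text. A toy Python sketch of the conversion for a single toggle (".us on" / ".us off" are hypothetical stand-ins for a Waterloo Script underline switch, not the real commands):

```python
def toggles_to_xhtml(tokens):
    """Convert a flat token stream with on/off switches into nested markup.

    tokens is a list of strings; ".us on" / ".us off" are stand-in switch
    commands, everything else is passed through as text.
    """
    out, underlining = [], False
    for tok in tokens:
        if tok == ".us on" and not underlining:
            out.append("<u>")
            underlining = True
        elif tok == ".us off" and underlining:
            out.append("</u>")
            underlining = False
        else:
            out.append(tok)
    if underlining:  # unbalanced switch: close it so the XML stays well-formed
        out.append("</u>")
    return "".join(out)
```

A real converter has to handle overlapping switches (e.g. underline crossing a paragraph boundary) by closing and reopening elements, which is where most of the bumpiness lives.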
Started work on converting one of DB's script files from Waterloo Script to XHTML, using the new Transformer. Got quite a long way -- headings, paragraphs, and inline underlines are all handled, and I'm building up a Doc object which can process the input script effectively. XHTML is the best option because it's XML, but it can also be loaded directly into a word processor, which I think is what DB wants. Some points re the app itself:
- Added some counters to give back info at the end of the process on how many scripts were run, as well as the original total of replacements made. The counting is being done in both the single-doc GUI and the file processing for multiple docs, but in the latter case, nothing is returned to the calling function; that reporting will have to be added.
- Added detailed error reporting when code is checked in the Script editing form. Works great! Line, line num, and offending symbol are all reported correctly.
- Discovered that I need to handle such situations as endless loops, which just tie up the app. One option is to add a bound JS function to the beginning of each code block I run, which starts a timer that calls back to Delphi every second or so, so I can monitor and offer to kill the process if it runs too long. But I don't know how to kill it yet. I have a message into the DelphiMoz list about this.
- Minor GUI tweaks.
Then I got file i/o working, and added JS Bridge code to the dialog box to check errors. I figured out how to get an error report back from JS Bridge (a new feature, not documented properly). That seems to work a treat.
Tested, bugfixed, and tested again. Then I added and tested handling for the old file format.
Now we're ready for a real test, using DB's Waterloo Script files. I also need to make sure the Unicode stuff really works in the UniSynEdit.
- Finished the code for the transform item classes and the list class.
- Stripped all the old code out of the main.pas unit (where it should never have been anyway).
- Substituted the new classes.
- Refactored to rename all items, objects, methods etc.
- Updated existing method code so that it still works as before (all replacements etc.).
- Added new icons (see previous post), and assigned one to the old
- Added a new aAddScriptItem action; no method body yet, but it has menu and toolbar items which invoke it.
- Built and tested: old code still works, new code remains to be added.
Remaining to do:
- Create a ScriptItem editing form, with a JS Bridge object which can be used to compile/verify JS code.
- Give it two tabbed panels, one for local and one for global JS.
- Give it Check, OK and Cancel buttons.
- Implement the aAddScriptItem method.
- Complete the aEditTransformationItem action.
- Start testing! Emphasis on the Unicode...
- Update the tutorial.
- Update the Help.
- Update the installer.
- Update the Website codebase.
- Update the Website content and do the release.
Having established that we have a licence for Visual Studio (so we can distribute the MS dll), and that we can get JS Bridge working, I'm starting into the actual updating of Transformer, taking it to version 2; this will be targeted initially at producing a tool which can do the whole transformation of DB's Waterloo Script files.
Today I wrote a new unit (about 1,000 lines) with four classes:
TTransformItem (base class for the next two),
TTransformList. Much of the code for these is adapted from the old TReplaceItem/TReplaceList, but there are new properties and a slightly more complex hierarchy. Everything is done (although not tested) with the exception of TTransformList.ExecuteScript, which will execute script directly. It's a method of the list rather than of the item, because the list has a GlobalScript field where global stuff can be stashed, and that'll need to be bundled in with any code from the item.
Looking forward to getting into the use of JS Bridge...
I got a test app working with JS Bridge, and confirmed that:
- I can compile and execute a script from a string value.
- I can pass data into the script.
- I can retrieve data from the script.
- I can successfully pass in and retrieve WideString values.
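In plain JavaScript, the round trip the test app confirms (compile from a string, pass data in, get data out, with non-ASCII text intact) looks roughly like this. The real JS Bridge API is a Delphi wrapper around the embedded engine, so this is only an analogy, with made-up names:

```javascript
// A rough stand-in for the pattern the test app exercises: compile a
// script from a string, pass a value in, and read a value back out.
// (The real bridge hands WideStrings between Delphi and the embedded
// JS engine; this just shows the round trip in one language.)
function runScript(source, input) {
  // Compile the source with a named parameter, then execute it.
  const compiled = new Function('input', source);
  return compiled(input);
}

// Non-ASCII input survives the round trip, like a WideString should.
const result = runScript('return input.toUpperCase() + "!";', 'très bon');
```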
jsconfig.inc file that seems like it might enable it); and licensing/redistribution (the Moz components are MPL, but it also requires
I'll build a test app as soon as I get a chance.
DB from Pacific and Asian has a batch of old SCRIPT files that need converting. Plain search-and-replace won't do for this, and I don't want to create yet another custom application just for this job. It occurred to me that I might be able to build a Pascal interpreter into Transformer, so that Transformer operations could also be actual script blocks. I've done some research and testing today, and discovered that there are two candidates: the JEDI TJVInterpreter, and RemObjects Pascal Script. The former is too simple and completely undocumented. The latter would be great except that it has no packages for Delphi 2005, so I'm stuck with no GUI components or property editors.
I spent the day getting a test application organized, and got a proof-of-concept code block to run. I'm able to execute basic functions that use basic types. However, I've not yet successfully imported classes such as TStringList; when I try to do so, I get no errors, but the function produces no result. I need to figure this out, because I'll also have to be able to use TTntStringList or TWideStringList, which will mean writing my own import units and calling them on the model of the built-in ones.
However, if we get this working, Transformer will be twice the tool it is right now; so it's worth a couple of days of hacking and learning.
Incorporated a minor tweak to the way output filenames are configured in the Batch window, making it easier to preserve the original file extension, as well as incorporating updates to the Preferences dialog box arising out of the IMT project. Updated documentation and created a new release.
Paolo Cutini reported an oddity in a tooltip hint, and also provided a new Italian translation, so I fixed the hint and built a new release.
While video was processing itself in the other room, I added the new "ASCII with numeric entities" output format to the output text area in the main screen of Transformer, then updated all the documentation, built an installer, and updated the Website for the new release.
Transformer is a great tool for working on Unicode texts, but today I hit a problem, in that I needed to work on Hot Potatoes data files, which are not actually Unicode; they're 8859-1 files with all upper-ascii characters escaped to numeric escapes.
It seemed a shame not to be able to work on those files, since the actual underlying format is Unicode (they use Unicode characters, but encode them as numeric entities). So I made a couple of changes to Transformer to make that possible. First, I analysed the file load routines: there are two, one for the source text in the main screen, and one used to load files during batch operations. These were both failing to load plain ascii files, because they appear to be UTF-8 with no BOM, but they're not. So I added a couple of lines so that, in the event of a failure to load a file as UTF-8, it will be loaded as plain ASCII and then turned into a WideString.
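The fallback logic might be sketched like this in JavaScript, using the WHATWG TextDecoder as a stand-in for the Delphi load routines (an analogy only; the actual code works on Delphi WideStrings):

```javascript
// Try a strict UTF-8 decode first; if the bytes are not valid UTF-8,
// fall back to treating the file as 8-bit text and widening each byte
// to one character -- the equivalent of "load as plain ASCII and then
// turn it into a WideString".
function bytesToWideString(bytes) {
  try {
    // fatal: true makes decode() throw on malformed UTF-8 input.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (e) {
    // Latin-1 semantics: byte value N becomes code point U+00NN.
    return Array.from(bytes, b => String.fromCharCode(b)).join('');
  }
}
```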
Secondly, I needed a way to save a file as ASCII or ANSI, but deal with any characters over 127. I added an option to the batch window to save as "ASCII with numeric entities", which escapes all characters above 127 to HTML-style numeric entity references, and then saves as ASCII.
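A minimal sketch of the escaping step, assuming HTML-style decimal character references as described:

```javascript
// Every character above 127 becomes a decimal character reference,
// leaving a pure 7-bit ASCII string that can be saved as plain text.
function toAsciiWithEntities(text) {
  let out = '';
  for (const ch of text) {            // iterates by code point
    const cp = ch.codePointAt(0);
    out += cp > 127 ? '&#' + cp + ';' : ch;
  }
  return out;
}
```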
This all seems to be working well, but it needs to be documented, and the same save function should also be added to the output text save dialog and routine, for symmetry. Adding this as a task so that I get around to doing that.
Problems with the horrible phpMyAdmin interface and unreliably-encoded tab-delimited text files eventually succumbed to a two-headed attacking force armed with two operating systems.
Paolo's bug report was absolutely correct; the "Files changed: " message was actually embedded in the code. Added a TTntStaticText component to hold it, which makes it translatable. In the process, normalized the names of two other message components to "ust[blah]" instead of "ustmsg[blah]", to match all the others in that form.
The task below is complete -- it turned out I needed to make an explicit call in the toolbar resize event to reposition the sequence TTntListView control. Fixed the bug, built a new installer, and released version 220.127.116.11. Also fixed links on the Transformer and Image Markup Tool sites to the project blogs (which have now changed location).
Testing on various platforms, especially on Vista yesterday, where we set up a non-standard font/DPI setting, reveals a minor bug in the toolbar sizing in Transformer. It seems to afflict the top-left toolbar (sequences) in the main screen. When icons are set to larger than 24px, the toolbar does not auto-size. This might just be the AutoSize setting being false (check all toolbars in the app), but it could also relate to the resize code handling the display of the grid component below it.
Integrated the Nuvola dll, as done previously for IMT. In the process, found a couple of other libraries common to IMT which were still linked to the original icons.pas file, so I fixed those and rebuilt the IMT (just making the executable smaller).
Also figured out a workaround for the issue of restricted characters in replace sequences reported by Paolo Cutini (they could be saved, but then the file could not be reloaded). I'm now escaping those characters to maintain valid XML 1.0.
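The post doesn't record the exact escape scheme used, so the following is only one plausible illustration (using the `_xHHHH_` convention borrowed from other tools, not necessarily what Transformer does) of escaping XML-illegal control characters on save and restoring them on load:

```javascript
// Characters that are illegal in XML 1.0 (most control characters
// below 0x20) are turned into a private marker such as _x001C_ before
// saving, and turned back into the original characters on load.
function escapeIllegalXmlChars(s) {
  return s.replace(/[\x00-\x08\x0B\x0C\x0E-\x1F]/g, ch =>
    '_x' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0') + '_');
}

function unescapeIllegalXmlChars(s) {
  return s.replace(/_x([0-9A-F]{4})_/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}
```

Whatever the scheme, the key property is that it round-trips: save then reload must reproduce the original sequence exactly.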
Built an installer, did the release, updated the Website and source code archive, etc. etc.
Created a workaround for this on Feb 7.
Had some correspondence over the weekend with Dieter Köhler (author of the OpenXML code I'm using to save and load files), and he confirmed that the XML 1.0 specification disallows characters in this range. The question now is how to handle situations in which people insert these characters. The XDom code is asymmetrical in that it fails to raise any error when saving a node containing illegal characters, but it does raise an error when trying to read them back in; I need to allow for this by somehow escaping these characters myself.
Posted time spent researching this and checking into my code. Also made it a task to add the relevant code to the app.
Complete: created a reasonable workaround for this on Feb 7.
Paolo Cutini is using Transformer to recover some old WordStar word-processor files, and encountered a problem reloading the sequence file he had saved. The file appears to be corrupted; a control character occurs throughout the document. oXygen reports:
An invalid XML character (Unicode: 0x1c) was found in the element content of the document.
That character is "INFORMATION SEPARATOR 4" or "file separator". It wouldn't normally be found in a Unicode document. However, that character is in the Unicode specification, so it ought to be somehow encoded in a format that UTF-8 can handle. This may be a limitation of the XDom engine I'm using for XML file handling, it could be a bug in my code, or it could be that I should automatically exclude control characters on the basis that they shouldn't show up in a text document. I'll look into it. Transformer is intended for working on Unicode texts, rather than ancient word-processor formats, but I do like the idea of using it to retrieve this old data; the program was written as part of a project to rescue some old DOS WordPerfect data, after all.
Entering this as a task with a long deadline, because it's not a major thing; what needs to be done is to investigate the code which uses XDom to save files, and see if the file data is being correctly encoded in UTF-8; if so, look into the specs and see if UTF-8 is supposed to handle this character, and if so, whether it should be somehow encoded or escaped.
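The check described can be expressed directly from the XML 1.0 Char production; a sketch in JavaScript (the app itself is Delphi):

```javascript
// XML 1.0 legal characters: #x9, #xA, #xD, #x20-#xD7FF,
// #xE000-#xFFFD, and #x10000-#x10FFFF. Note that U+001C (the file
// separator found in the WordStar data) fails this test, which is why
// the parser rejects the saved file regardless of the UTF-8 encoding.
function isValidXml10Char(cp) {
  return cp === 0x9 || cp === 0xA || cp === 0xD ||
         (cp >= 0x20 && cp <= 0xD7FF) ||
         (cp >= 0xE000 && cp <= 0xFFFD) ||
         (cp >= 0x10000 && cp <= 0x10FFFF);
}
```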
While using the program, Stewart found the following bug:
Regular expression syntax was checked in a replace pair even when "Use regular expressions" was not selected. This meant that an expression which was not a valid regular expression would be rejected and could not be used, even if it was not intended as a regular expression.
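The intended behaviour, sketched in JavaScript with made-up names (the app itself is Delphi, with its own regex engine):

```javascript
// Only check regex syntax when the replace pair actually has
// "Use regular expressions" turned on; a plain-text search string
// is never rejected, whatever characters it contains.
function validateReplacePair(pair) {
  if (!pair.useRegex) return true;   // plain text: anything goes
  try {
    new RegExp(pair.search);         // syntax check only
    return true;
  } catch (e) {
    return false;
  }
}
```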
Fixed this bug, documented it on the site, built a new release package, and released 18.104.22.168. In the process, I converted the Transformer installer build system to use the Inno Setup Preprocessor like IMT does, to make future releases easier and quicker.
Did a new release of the program (22.214.171.124) incorporating:
- use of DocLauncher for Help files and Tutorial inside the app.
- setting of Scaled = False and AutoScroll = False on all TTntForms.
- bugfixes for the translation implementation arising out of work on the Image Markup Tool.
Following the development of a PDF documentation system built on the existing DocBook Help system, another release should be made, also incorporating fixes for any bugs which emerge before the end of the year.
Help for Transformer is built using a system we are developing which incorporates the Image Markup Tool (for interactive screenshots) and DocBook files. Currently this generates only an interactive HTML Help system, but it will eventually also generate printable PDF documentation. Transformer is the testbed project for this documentation system, which will be used to document all our projects in the future.
The first full version (1.1) of UVic's Transformer open-source Unicode search-and-replace tool has been released.
Back in February, we announced a beta version of this application; the full package is now complete, including documentation and source code, and is available from here: http://www.tapor.uvic.ca/~mholmes/transformer/ Transformer was created as part of a project to rescue some very old linguistics data, which was stored in a combination of Lexware and DOS WordPerfect files, by converting it to Unicode. Non-ASCII characters were represented in the data by nasty sequences of control characters used to switch between obsolete character sets and long-gone fonts in WordPerfect. In order to convert the data, we had to create and test a huge sequence of search-and-replace operations which would find these strings and replace them with the correct Unicode codepoints for IPA characters. To make this process easier for ourselves, we created Transformer, a Windows application which enables you to create, organize and test sequences of search/replace operations (including regular expressions), then run them in batch mode on a set of files. It is released as open-source under the MPL 1.1.
November 1 2006:
- Added a Help topic for the Translation screen (interactive, with IMT).
- Added an Acknowledgements topic (DocBook XML directly coded).
- Began testing the output across various browsers.
- Determined that IE7 still has the bug whereby a div responds to onmouseover only if it contains text; an empty div cannot.
- Added some text to mitigate this, but it's impossible to make the text large enough and position it correctly to make it work well. This will have to be studied in more depth.
- Determined that Safari and Konqueror have a bug related to positioning, so offsets of click areas are wrong.
- Discovered that offsetTop and offsetLeft are calculated differently by the different browsers -- some are relative only to the offsetParent, which may be the document or may be the actual parent. Adapted a function from the Web for calculating the offset recursively, which fixes that problem.
- Found another problem related to the hash in the location.hash property; Safari and Konqueror may sometimes double it. Worked around that.
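The last two points can be sketched as follows: `cumulativeOffset` walks the offsetParent chain so the result is document-relative in every browser, and `normalizeHash` strips a doubled `#`. This is a reconstruction of the approach, not the original code, and it is written so it will run against any object exposing offsetTop/offsetLeft/offsetParent (a real DOM element or a stub):

```javascript
// Sum offsets up the offsetParent chain. Whether each browser measures
// offsetTop/offsetLeft against the document or against the immediate
// parent, accumulating the whole chain yields the same document-relative
// coordinates everywhere.
function cumulativeOffset(el) {
  let top = 0, left = 0;
  for (let node = el; node; node = node.offsetParent) {
    top += node.offsetTop;
    left += node.offsetLeft;
  }
  return { top, left };
}

// Some browsers could report location.hash with a doubled '#';
// collapse any run of leading hashes to a single one before use.
function normalizeHash(hash) {
  return hash.replace(/^#+/, '#');
}
```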
One more problem with Safari not finding the data it needs to show in a popup. The problem may relate to this:
- I'm using innerHTML to copy the contents of a node to another node.
- The contents which are copied contain elements with unique ids.
- Therefore the copy operation will create duplicate ids.
This might be a solution:
- Move elements which are to be displayed as popups, rather than copying them.
- Move them back to be children of the body element and set them to display: none; before moving another element into the popup box.
- Tested on IE6 and found it's close enough to working to make it worthwhile to hack the popup code and make the popup position itself correctly in IE6. Will also do this tomorrow.
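The move-rather-than-copy idea relies on a standard DOM property: calling appendChild on a node that is already in the document moves it rather than copying it, so ids stay unique. A sketch with hypothetical names (`parkingArea` is the hidden holding element under the body):

```javascript
// Show an element in the popup box by *moving* it there. Any previous
// occupant is first parked back under a hidden holding area, so the
// document never contains two copies of the same id.
function showInPopup(popupBox, el, parkingArea) {
  // Return the previous occupant(s) to the parking area, hidden.
  while (popupBox.firstChild) {
    const prev = popupBox.firstChild;
    prev.style.display = 'none';
    parkingArea.appendChild(prev);   // appendChild moves, not copies
  }
  el.style.display = 'block';
  popupBox.appendChild(el);          // moves el into the popup
}
```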
November 2 2006:
- Rewrote the DOM code to avoid innerHTML and intelligently move elements around instead of copying them.
- After proving the concept works, encapsulated this code in a DisplayHost object which can handle popups automatically.
- Tested on all browsers -- OK!
- Decided it's worth trying to support IE6, so added some special handling to make the popup work on IE6 too.
- Tested -- working.
- Began working on the problem of empty div areas not being clickable in IE. No decent solution, so we currently have a hack which does the best job possible but still leaves the edges of divs unclickable. Sad, but unavoidable; IE is crap all round (even 7, which has the same bug).
- Updated the source code on the server, and documented mdhHelp.pas library.
- Built an installer, and tested it. Seems OK.
- Tested the Help invocation system when IE is the default browser. It launches the browser, but fails to add the hash to the URL, so it doesn't navigate to the context. Will have to work on this.
- Updated the source code on the server, and added new description files where required.
November 3 2006:
- Tested help system on IE6 and IE7. 6 is OK, but 7 can't handle the hash in the path which directs to a specific topic, so it's hopeless for context-sensitive help. Tested on Opera and that works fine, so:
- Rewrote the Help launching system so it looks specifically for Firefox, and failing that for Opera, before falling back to the default handler for .htm. That means that if Opera or FF is installed on the system, they will be used in preference to IE.
- Reworked the Transformer Website. In the process, added more files to the Help system, and more items to the glossary.
- Rebuilt the Help and tested it.
- Built the installer, and tested it.
- Released the application by updating the Website information.
- Posted a topic to the TAPoR news thread.
- Created an updated description of the project for the new HCMC site, and posted an inc file that can be pulled in to the HCMC projects page.