series of GREP searches in TC to isolate subject codes

Posted by Stewart on 19 Jan 2007 in Activity log

Ran a large number of regular expression searches on the three data files to extract all the subject codes embedded in the documents (defined as anything matching \r(\w{1,16}/), that is, a carriage return followed by 1 to 16 word characters and a slash).
Returned a list of those to John Lutz, along with a count of records that have multiple subject codes.
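For reference, the same extraction could be roughly sketched in Python rather than as searches in TC. This is only an illustration under assumptions: the function name extract_codes, the blank-line record separator ("\r\r"), and the sample text are all hypothetical, since the actual record layout of the data files is not described here. Only the pattern itself comes from the searches above.

```python
import re
from collections import Counter

# Pattern from the log entry: a carriage return followed by 1-16 word
# characters and a slash. The group captures the code with its slash.
SUBJECT_CODE = re.compile(r"\r(\w{1,16}/)")


def extract_codes(text, record_sep="\r\r"):
    """Return (Counter of codes, number of records with more than one code).

    record_sep is a hypothetical delimiter; the real data files may differ.
    """
    codes = Counter()
    multi = 0
    for record in text.split(record_sep):
        # Re-attach a leading \r so a code on the record's first line
        # still matches the pattern after splitting.
        found = SUBJECT_CODE.findall("\r" + record)
        codes.update(found)
        if len(found) > 1:
            multi += 1
    return codes, multi


# Made-up sample: two records, the first carrying two subject codes.
sample = "HIST/ Fur trade letter\rGEOG/ Map of the coast\r\rHIST/ Census return"
codes, multi = extract_codes(sample)
```

When reading real files this way, they would need to be opened with newline="" so that Python does not translate the carriage returns the pattern depends on.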

Recommended that he have his work-study students rationalize and normalize those codes (unless the idiosyncratic coding in there now is part of the data to be maintained).

This entry was posted by Stewart and filed under Activity log.