Archives for: April 2012, 27

27/04/12

Permalink 04:11:41 pm, by sarneil, 38 words, 165 views   English (CA)
Categories: Activity log; Mins. worked: 60

only basic search for building permits

Modified global variables so that only the basic search is available to the user. Also added a help file specifically for the business permits, changed the global variable that writes the link to it, and edited the content of the help file.
Permalink 04:08:02 pm, by sarneil, 39 words, 56 views   English (CA)
Categories: Activity log; Mins. worked: 30

etcl : update WordPress

Updated the instance of WordPress on the dev site for ETCL. No apparent issues. Before updating, backed up the SQL db to a .sql file on my Mac and also copied the front-end pages to my Mac, just in case.
Permalink 04:06:41 pm, by sarneil, 24 words, 53 views   English (CA)
Categories: Activity log; Mins. worked: 30

etcl : slight modification to scraper

Final tests and documentation. Modified the scraper so that it lists the number of contributors and the total number of revisions for each page in the log.
Permalink 02:47:58 pm, by mholmes, 37 words, 51 views   English (CA)
Categories: Activity log; Mins. worked: 90

Graves: P5 transformation of TEI P4 almost

My P4 to P5 conversion is now working, and producing valid output on abstracts, entries, and the project metadata file. I may do more work on this, but I'll be moving on to the -ography stuff next.

Permalink 02:46:25 pm, by mholmes, 13 words, 104 views   English (CA)
Categories: Activity log; Mins. worked: 90

TEI tickets

More work ahead of release next month: introducing new recommendation to use xml-model.

Permalink 02:37:50 pm, by mholmes, 386 words, 65 views   English (CA)
Categories: Activity log; Mins. worked: 180

Generating candidate duplicate owners

I have a script currently generating a list of candidate duplicate owners. This is how it was done:

  • Export the owners table as XML.
  • Process with XSLT to sort by surname and forename fields.
  • Edit so that each record constitutes a single line.
  • Remove bracketing XML container, and add five blank lines.
  • Run the following script:
#!/bin/bash

#This script is designed to run a series of comparison tests of xml-encoded owner
#records in an attempt to discover possible duplicates, which are then to be investigated
#by the PI manually.


#Threshold below which to consider a possible dupe
MINSIM=0.1

#First, paths to files.
USM_JAR=/home/mholmes/WorkData/netbeans/uniSimMetric/dist/uniSimMetric.jar
NCD_COMMAND="ncd -l "


INPUTFILE=/home/mholmes/WorkData/history/stanger-ross/properties/xml/owners_12_04_27_flattened.txt
OUTFILE="/home/mholmes/WorkData/history/stanger-ross/properties/xml/owner_dupe_candidates_`date +%Y%m%d`.txt"

#Echo the start out to the output file.
echo "Possible duplicate owners found by string comparison using USM">$OUTFILE
#Append (>>) the blank line; a second > would overwrite the header just written.
echo "">>$OUTFILE

#Initialize a counter
C=0
#Read in the inputs line by line
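cat "$INPUTFILE" | while read -r line; 
do 
#Count every line, including empty ones, so that C matches the line number awk looks up below.
	let "C=$C+1"
#Ignore empty lines. This ensures we can read five lines forward (there are five empty lines at the end of the file).
	LEN=${#line}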
	if [ $LEN -gt "3" ]; 
		then
		for ((N=$C+1; N<$C+6; N++))
		do
			STR2=`awk NR==${N} $INPUTFILE`; 
#Call the USM to compare them.
			USM=`java -jar $USM_JAR -compare -str1="$line" -str2="$STR2"`
#Call NCD to compare them
#			NCD=`$NCD_COMMAND "$line" "$STR2"`
#NCD outputs the second string on the command line before the score; we need to remove it.
#			NCD=${NCD/$STR2}
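#If the USM score is below the threshold (MINSIM), output info to the output file.
#The comparison is done numerically in awk; [[ ... < ... ]] would compare the two values as strings.
			if awk -v usm="$USM" -v min="$MINSIM" 'BEGIN { exit !(usm + 0 < min + 0) }';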
			then
				echo "Found similarity"
				echo "$line" | sed -n 's/.*<owners><own_owner_id>\(.*\)<\/own_owner_id>.*/\1/p'>>$OUTFILE
				echo "$STR2" | sed -n 's/.*<owners><own_owner_id>\(.*\)<\/own_owner_id>.*/\1/p'>>$OUTFILE
				echo "">>$OUTFILE
			fi
		done
	fi
		 
done

#Display the output file. (No backticks: they would try to run gedit's output as a command.)
gedit "$OUTFILE"

echo "Done!"
exit

This is successfully producing a list of candidate matches right now, outputting the ids of the two candidates followed by a blank line, for each candidate match.
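
For reference, here is a rough sketch of the same sliding-window idea that reads the file into an array once instead of calling awk for every lookup. It is not the script I am running: the names WINDOW, candidate_pairs and is_below are made up for illustration, it needs bash 4 or later for mapfile, and it only prints pairs of line numbers where a real run would call the comparison tool. It also skips a blank second record, which the script above does not do, so the five padding lines at the end of the file would not be needed.

```shell
#!/bin/bash
# Sketch only: pair each non-empty record with the records that follow it.
# WINDOW, candidate_pairs and is_below are illustrative names, not part of the script above.
WINDOW=5

# Numeric "less than" for decimal strings, delegated to awk because
# bash's own arithmetic handles integers only.
is_below() {
	awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 < b + 0) }'
}

candidate_pairs() {
	local infile=$1
	local -a records
	# Read the whole file once; array index 0 holds line 1.
	mapfile -t records < "$infile"
	local total=${#records[@]}
	local i j
	for ((i = 0; i < total; i++)); do
		# Skip blank or near-empty lines, using the same length test as the script above.
		(( ${#records[i]} > 3 )) || continue
		for ((j = i + 1; j <= i + WINDOW && j < total; j++)); do
			(( ${#records[j]} > 3 )) || continue
			# Print the two 1-based line numbers; the comparison call would go here.
			echo "$((i + 1)) $((j + 1))"
		done
	done
}
```

Calling candidate_pairs with the flattened owners file as its only argument prints one pair of line numbers per line; is_below could then wrap whatever score the comparison returns, along the lines of: if is_below "$SCORE" "$MINSIM"; then ... fi.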

Permalink 01:27:47 pm, by jnazar, 20 words, 34 views   English (CA)
Categories: Activity log; Mins. worked: 15

Hispanic and Italian Studies - Cascade website

Arranged and confirmed meeting next week with DF, DR, SA and myself to discuss next steps with their Cascade website.

Permalink 01:24:15 pm, by jnazar, 13 words, 34 views   English (CA)
Categories: Activity log; Mins. worked: 15

Accounting - HB

Received payment for HB's 2012 1st quarter.
Deposited payment; receipt filed in HCMC records.
