Potential dupe defs tagged with XSLT
Posted by mholmes on 25 Oct 2013 in Activity log
A couple of days ago I ran this to get a candidate list of erroneous defs which should have been quotes:
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare namespace util="http://exist-db.org/xquery/util";

for $t in //TEI
for $e in $t//entry
let $quoteSegs := for $s in $e//quote/seg
                  where string-length(normalize-space($s)) gt 0
                  return translate(normalize-space($s), '*', ''),
    $defs := for $d in $e//def/seg
             where string-length(normalize-space($d)) gt 0
             return translate(normalize-space($d), '*', ''),
    $dupes := distinct-values($quoteSegs[. = $defs])
where count($dupes) gt 0
order by util:document-name($t), $e/@xml:id
return concat(util:document-name($t/@xml:id), ' / ', $e/@xml:id, ' : ', string-join($dupes, ' | '))
I've now written XSLT (utilities/flag_superfluous_defs.xsl) to insert a comment before each of these candidates.
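For illustration only, here is a minimal XSLT 2.0 sketch of how such a flagging pass could work. It is not the contents of utilities/flag_superfluous_defs.xsl. It assumes the same matching rule as the XQuery above (normalized seg text with asterisks stripped), that the comment goes immediately before the def element, and that a def is compared only against quotes in its nearest ancestor entry. The comment wording is made up.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Identity transform: copy everything through unchanged by default. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Flag a def when one of its non-empty segs has the same cleaned-up
       text as a quote/seg in the same entry. -->
  <xsl:template match="def">
    <!-- Assumption: compare against the nearest ancestor entry only. -->
    <xsl:variable name="quoteSegs"
        select="ancestor::entry[1]//quote/seg[normalize-space(.)]
                /translate(normalize-space(.), '*', '')"/>
    <xsl:variable name="defSegs"
        select="seg[normalize-space(.)]
                /translate(normalize-space(.), '*', '')"/>

    <!-- General comparison: true if any def seg equals any quote seg. -->
    <xsl:if test="$defSegs = $quoteSegs">
      <!-- Hypothetical marker text; the real stylesheet may word it differently. -->
      <xsl:comment> Candidate: this def duplicates a quote in the same entry </xsl:comment>
    </xsl:if>

    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because everything else passes through the identity template, the output should differ from the input only by the inserted comments, which makes the result easy to check with a diff before anything is committed.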