List of duplicate xml:ids in the db
Posted by mholmes on 19 Nov 2010 in Activity log
I had intended to generate and post a complete list of duplicate xml:ids in the db, but there turn out to be 1355 of them, so listing them all here wouldn't be helpful. I've sent the full list to SML and ECH. Here are the first hundred, each followed by the first two files in which it occurs:
- mix: affix.xml, m.xml
- kaʔ: affix.xml, k.xml
- kiyˀ: affix.xml, n.xml
- ɬəm: affix.xml, t.xml
- maʔ: affix.xml, m.xml
- nas: affix.xml, k.xml
- sal: affix.xml, n.xml
- t: affix.xml, pron.xml
- taʔ: affix.xml, n.xml
- wap: affix.xml, ww-glot.xml
- xit: affix.xml, x.xml
- cʼalˀ: c-glot.xml, k.xml
- cʼalˀən: c-glot.xml, k.xml
- cʼəl: c-glot.xml, k.xml
- sn̩cʼaʔqatkW: c-glot.xml, rescued.xml
- cʼax: c-glot.xml, k.xml
- cʼaʔx: c-glot.xml, k.xml
- kcʼaʔxmenən: c-glot.xml, k.xml
- cʼalˀ_2: c-glot.xml, n.xml
- kən_nacʼalˀsən: c-glot.xml, n.xml
- skacʼcʼalˀ: c-glot.xml, k.xml
- n̩cʼəlˀcʼalˀsən: c-glot.xml, n.xml
- cʼaɬ: c-glot.xml, k.xml
- nacʼaɬən: c-glot.xml, n.xml
- n̩cʼcʼaɬn̩: c-glot.xml, n.xml
- ncʼcʼaɬənˀtxW: c-glot.xml, rescued.xml
- kcʼaʔɬn̩čut: c-glot.xml, k.xml
- sqʼəlˀnaskint: c-glot.xml, k.xml
- cʼař: c-glot.xml, n.xml
- cʼařt: c-glot.xml, k.xml
- cʼawˀ: c-glot.xml, k.xml
- sascʼawˀoxW_sqəlawʔ: c-glot.xml, q.xml
- snacʼawʔsən: c-glot.xml, n.xml
- nacʼawˀəlqWpm̩: c-glot.xml, n.xml
- nacʼawˀɬcʼaʔ: c-glot.xml, n.xml
- nacʼəwmən: c-glot.xml, n.xml
- neʔcʼawpqən: c-glot.xml, n.xml
- kacʼawˀəloptn̩: c-glot.xml, k.xml
- ʔawˀt: c-glot.xml, glottal.xml
- cʼow: c-glot.xml, n.xml
- cʼək: c-glot.xml, k.xml
- ncʼkʼcʼkʼax̣ən: c-glot.xml, n.xml
- cʼəkW: c-glot.xml, n.xml
- n̩cʼkWopsən: c-glot.xml, rescued.xml
- kcʼkWicʼaʔən: c-glot.xml, k.xml
- cʼəkʼW: c-glot.xml, k.xml
- kčˀəkʼWxən: c-glot.xml, k.xml
- cʼəl_2: c-glot.xml, k.xml
- necʼəlot: c-glot.xml, n.xml
- cʼəlˀ: c-glot.xml, n.xml
- cʼəlˀxW: c-glot.xml, n.xml
- cʼəɬ: c-glot.xml, k.xml
- cʼəɬt: c-glot.xml, k.xml
- n̩cʼaʔɬstonən: c-glot.xml, rescued.xml
- kɬcʼmˀosənc: c-glot.xml, rescued.xml
- kɬcʼəmcʼəmtwaxW: c-glot.xml, rescued.xml
- cʼən: c-glot.xml, n.xml
- nacʼə̣np: c-glot.xml, n.xml
- cʼəpq: c-glot.xml, n.xml
- neʔcʼəpq: c-glot.xml, n.xml
- cʼəpʼqʼ: c-glot.xml, n.xml
- kcʼəpʼqʼən: c-glot.xml, k.xml
- n̩cʼəpʼqʼsalos: c-glot.xml, n.xml
- cʼəqʼ: c-glot.xml, k.xml
- ncʼqʼaɬcʼaʔən: c-glot.xml, rescued.xml
- n̩cʼqʼosən: c-glot.xml, rescued.xml
- cʼəsən: c-glot.xml, k.xml
- katcʼsatkWən: c-glot.xml, k.xml
- cʼəxW: c-glot.xml, k.xml
- cʼəxWən: c-glot.xml, k.xml
- ka·cʼəxW: c-glot.xml, k.xml
- na·cʼəxWoxW: c-glot.xml, n.xml
- kcʼxWanaʔn: c-glot.xml, k.xml
- kcʼxWus: c-glot.xml, k.xml
- kcʼxWosč: c-glot.xml, k.xml
- snecʼxWawˀsoxW: c-glot.xml, rescued.xml
- sn̩cʼəxWoxWwel: c-glot.xml, rescued.xml
- skacʼacʼəxW: c-glot.xml, k.xml
- kɬn̩cʼxWapəntaʔ_t_šawɬkW: c-glot.xml, k.xml
- sxWskcʼx̣Wapl̥aʔəm: c-glot.xml, k.xml
- neʔcʼikos: c-glot.xml, n.xml
- cʼikʼ: c-glot.xml, k.xml
- nacʼepʼsmən: c-glot.xml, n.xml
- nacʼipʼcʼipʼšəm: c-glot.xml, n.xml
- cʼex̣WoxW: c-glot.xml, k.xml
- cʼqʼ: c-glot.xml, n.xml
- kʼɬcʼaqʼWonˀən: c-glot.xml, rescued.xml
- cʼaʔqʼWonˀəm: c-glot.xml, rescued.xml
- n̩cʼoʔqa·pasən: c-glot.xml, rescued.xml
- n̩cʼowˀqulˀoxWən: c-glot.xml, rescued.xml
- nacʼopkW: c-glot.xml, n.xml
- niʔcʼuqʼWuʔšən: c-glot.xml, rescued.xml
- leyən_t_ʔencʼoqʼWmaʔ: c-glot.xml, l.xml
- cʼos: c-glot.xml, k.xml
- scʼosəm: c-glot.xml, k.xml
- kcʼosəmtən: c-glot.xml, k.xml
- cʼxW: c-glot.xml, k.xml
- cəs: c-rtr.xml, k.xml
- ʔa: glottal.xml, particles.xml
- ʔacʼx̣: glottal.xml, n.xml
This XQuery will generate them (although it's off-the-cuff and doubtless much slower than it could be):
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare namespace f="http://exist-db.org/f-functions";
declare namespace util="http://exist-db.org/xquery/util";

(: Return every non-empty @xml:id that occurs on more than one entry. :)
declare function f:getIds() as xs:string* {
  let $e := collection('/db/moses')//entry
  for $id in distinct-values($e/@xml:id[string-length(.) gt 0])
  where count($e[@xml:id = $id]) gt 1
  return xs:string($id)
};

declare variable $ids := f:getIds();

(: For each duplicate id, report the documents holding its first two occurrences. :)
for $i in $ids
return concat($i, ': ',
  util:document-name(collection('/db/moses')//entry[@xml:id=$i][1]), ', ',
  util:document-name(collection('/db/moses')//entry[@xml:id=$i][2]))
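The query above goes back to the collection several times for every id, which is probably where most of the time goes. A single-pass variant might look like the following. This is only a sketch and has not been run against the db: it assumes an eXist release that supports the XQuery 3.0 "group by" clause, and it lists the file of every occurrence of an id rather than just the first two.

```xquery
xquery version "3.0";

declare default element namespace "http://www.tei-c.org/ns/1.0";
declare namespace util="http://exist-db.org/xquery/util";

(: Sketch only: assumes XQuery 3.0 "group by" is available in this eXist version. :)
(: Walk the entries once and group them by their non-empty @xml:id. :)
for $entry in collection('/db/moses')//entry[string-length(@xml:id) gt 0]
group by $id := string($entry/@xml:id)
(: After grouping, $entry holds every entry that shares this id. :)
where count($entry) gt 1
return concat(
  $id, ': ',
  string-join(
    for $e in $entry return util:document-name($e),
    ', '
  )
)
```

The order of the groups is not guaranteed without an "order by" clause, so the output may not come back in the same order as the list above.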