XPath Practice (1)

Which XPath expression will find all the paragraphs in the text of a TEI document?

//p
//text/descendant::p
//text/ancestor::p
//p[ancestor::text]

If the current context is the <body> element of a TEI file, how would you find the @xml:id attribute on the root TEI element?

ancestor::TEI[@xml:id]
ancestor::TEI/@xml:id
//@xml:id

How could you find all paragraphs which contain the word "elephant"?

//p['elephant']
p[//elephant]
//p[contains(., 'elephant')]

How might you find all <list> elements which have more than one child <item>?

//list[item gt 1]
count(//list/item) gt 1
//list[count(item) gt 1]
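To try candidate expressions against real data, it can help to compute the expected result by hand first. The following is a minimal Python sketch of "lists with more than one child item". It assumes the standard library's xml.etree.ElementTree, which supports only a small subset of XPath, so the counting is done in Python. The sample document and the function name are hypothetical, and the TEI namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real TEI elements live in the TEI namespace,
# which is omitted here to keep the sketch short.
SAMPLE = """<body>
  <list n="a"><item>one</item></list>
  <list n="b"><item>one</item><item>two</item></list>
  <list n="c"><head>No items yet</head></list>
</body>"""


def lists_with_multiple_items(root):
    """Return every <list> element that has more than one child <item>."""
    # findall("item") looks at direct children only, so items belonging
    # to a nested list are not counted for the outer list.
    return [lst for lst in root.iter("list") if len(lst.findall("item")) > 1]


root = ET.fromstring(SAMPLE)
multi_ids = [lst.get("n") for lst in lists_with_multiple_items(root)]
print(multi_ids)
```

Whichever expression you choose should select the same elements when run on the same sample in an XPath 2.0 processor.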

How could you find all the <cit> elements which have a <quote> but no <ref>?

//cit[quote and not(ref)]
//cit[child::quote][nochild::ref]
//cit[with(child) and without(ref)]
//quote[parent::cit and not(ref)]

How might you find all <ref> elements that link to an external source on the Web?

//ref[@type='external']
//ref[@target]
//ref[starts-with(., 'http')]
//ref[starts-with(@target, 'http')]
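As a cross-check outside an XPath processor, the same idea can be sketched in Python. This assumes, as a simplification, that an external link is any <ref> whose target attribute begins with "http". The sample data and function name are hypothetical, and the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with one Web link, one internal pointer,
# and one <ref> that only mentions a URL in its text.
SAMPLE = """<body>
  <ref n="web" target="https://www.tei-c.org/">TEI</ref>
  <ref n="internal" target="#ch2">Chapter 2</ref>
  <ref n="textonly">http://example.org</ref>
</body>"""


def external_refs(root):
    """Return <ref> elements whose target attribute starts with 'http'."""
    # get() with a default of "" keeps refs without a target from raising.
    return [r for r in root.iter("ref") if r.get("target", "").startswith("http")]


root = ET.fromstring(SAMPLE)
external_ids = [r.get("n") for r in external_refs(root)]
print(external_ids)
```

The third <ref> is a useful test case: its text looks like a URL, but it has no target attribute, so it links to nothing.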

If a <date> element has a @calendar attribute, but it has no textual content, something is wrong (since @calendar identifies what calendar is used for the textual content of the date). How could you find such errors?

//date[@calendar and not(text)]
//date[@calendar and string-length(normalize-space(.)) lt 1]
//date/calendar[not(text())]

The previous question was: "If a <date> element has a @calendar attribute, but it has no textual content, something is wrong (since @calendar identifies what calendar is used for the textual content of the date). How could you find such errors?"

Now imagine that you must account for <birth> and <death> elements as well as <date>s. How would you do this in one expression?

//*[self::date or self::birth or self::death][@calendar and string-length(normalize-space(.)) lt 1]
//date[self::birth or self::death][@calendar and string-length(normalize-space(.)) lt 1]
//birth or death or date[@calendar and string-length(normalize-space(.)) lt 1]
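The empty-content check from the last two questions can also be sketched in Python, which is handy for confirming what an expression ought to return. This is a rough sketch under stated assumptions: it uses the standard library's xml.etree.ElementTree, strips only the four XML whitespace characters to mirror normalize-space(), and uses hypothetical sample data and names, with the TEI namespace omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample covering filled, empty, whitespace-only,
# and nested-content cases.
SAMPLE = """<body>
  <date n="ok" calendar="#julian">1 March 1650</date>
  <date n="empty" calendar="#julian"/>
  <birth n="blank" calendar="#julian">
  </birth>
  <death n="nested" calendar="#julian"><hi>1650</hi></death>
  <date n="nocal" when="1650-03-01"/>
</body>"""

# normalize-space() treats only these characters as whitespace.
XML_WHITESPACE = " \t\r\n"


def empty_calendar_elements(root, names=("date", "birth", "death")):
    """Return elements in `names` that have @calendar but no textual content."""
    hits = []
    for el in root.iter():
        if el.tag in names and "calendar" in el.attrib:
            # itertext() gathers text from the element and all descendants,
            # like the string value of "." in XPath.
            text = "".join(el.itertext())
            if not text.strip(XML_WHITESPACE):
                hits.append(el)
    return hits


root = ET.fromstring(SAMPLE)
flagged = [el.get("n") for el in empty_calendar_elements(root)]
print(flagged)
```

The "nested" case matters: its text sits inside a child element, so a test that looks only at direct text nodes would wrongly report it as empty.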

Imagine that your transcribers/encoders are making a diplomatic transcription using (for example) the long s character ſ. However, you want to make sure they don't use that character in metadata in the TEI Header. How can you find examples of this?

//teiHeader[ſ]
//teiHeader/descendant::text()[contains(., 'ſ')]
//teiHeader/descendant::text[contains(., 'ſ')]
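For comparison, here is a small Python sketch that looks for the long s in text nodes under the header only. It assumes the standard library's xml.etree.ElementTree, where an element's own text nodes are its `text` plus the `tail` of each child. The sample document and function name are hypothetical, and the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one long s in the header, one in the transcription.
SAMPLE = """<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A Mi\u017fcellany</title></titleStmt>
      <publicationStmt><p>Unpublished</p></publicationStmt>
    </fileDesc>
  </teiHeader>
  <text><body><p>Mo\u017ft excellent.</p></body></text>
</TEI>"""

LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S


def header_elements_with_long_s(root):
    """Return tag names of header elements whose own text nodes contain a long s."""
    header = root.find("teiHeader")
    if header is None:
        return []
    hits = []
    for el in header.iter():
        # Text nodes that are direct children of el: its leading text
        # plus the tail text following each child element.
        own_text = (el.text or "") + "".join(child.tail or "" for child in el)
        if LONG_S in own_text:
            hits.append(el.tag)
    return hits


root = ET.fromstring(SAMPLE)
header_hits = header_elements_with_long_s(root)
print(header_hits)
```

The paragraph in the body also contains a long s, but it sits outside the header and is therefore not reported.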

Your encoders are using the @next and @prev attributes to link elements together. You have decided that every @next must be matched with @prev. What's a simple way to check that the number of @next and @prev attributes in a document is the same?

//@next = //@prev
//@next = count(@prev)
count(//@next) = count(//@prev)
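The same sanity check can be sketched in Python by counting attributes directly. This is a minimal illustration that assumes the standard library's xml.etree.ElementTree. The sample data and the helper name are hypothetical, and the TEI namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in which one @prev is missing.
SAMPLE = """<body>
  <l n="1" next="#l2">first part</l>
  <l n="2" prev="#l1" next="#l3">second part</l>
  <l n="3">third part, not linked back</l>
</body>"""


def count_attribute(root, name):
    """Count elements anywhere in the tree that carry the named attribute."""
    return sum(1 for el in root.iter() if name in el.attrib)


root = ET.fromstring(SAMPLE)
next_count = count_attribute(root, "next")
prev_count = count_attribute(root, "prev")
balanced = next_count == prev_count
print(next_count, prev_count, balanced)
```

Equal counts are only a quick first test: they do not show that each @next actually points at an element whose @prev points back.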
