Annotation guidelines for dates
Introduction
Date segments can be recognized by the header model (for example the date of publication, or the online date) or by the citation model. These date segments are further structured by the date model, which tries to normalize the date into the standard ISO 8601 format.
For convenience, the structuring of dates (training files *.date.tei.xml) exceptionally does not follow the TEI, but uses a basic XML format based on <day>, <month> and <year> elements.
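To illustrate what normalizing to ISO 8601 means once the day, month and year have been separated, here is a minimal Python sketch. It is not the date model's actual normalization code: the function name to_iso8601 and the MONTHS table are hypothetical, and only English month names and numeric months are assumed.

```python
from typing import Optional

# Hypothetical lookup table: English month names only (an assumption of this sketch).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june",
         "july", "august", "september", "october", "november", "december"],
        start=1,
    )
}


def to_iso8601(year: str, month: Optional[str] = None, day: Optional[str] = None) -> str:
    """Build an ISO 8601 date string from already-separated fields.

    When the month or the day is missing, a reduced-precision form
    (YYYY or YYYY-MM) is returned.
    """
    iso = f"{int(year):04d}"
    if month is None:
        return iso
    cleaned = month.strip().lower()
    # Accept either a numeric month ("8") or an English month name ("August").
    month_number = int(cleaned) if cleaned.isdigit() else MONTHS[cleaned]
    iso += f"-{month_number:02d}"
    if day is not None:
        iso += f"-{int(day):02d}"
    return iso


print(to_iso8601("2005", "August", "17"))
```

An unknown month name raises a KeyError in this sketch; a real pipeline would need to handle abbreviations, other languages and ambiguous inputs.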
Analysis
Each date is enclosed in a <date> element. Day, month and year are identified with <day>, <month> and <year> elements, respectively. Additional text or characters that do not belong to one of these specific elements (punctuation, etc.) must be left untagged under the <date> element.
For example:
<?xml version="1.0" encoding="UTF-8"?>
<dates>
<date>Received <month>August</month> <day>17</day>, <year>2005</year>. </date>
</dates>
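As a quick way to check an annotated file, the following Python sketch reads the example above with the standard library and separates the tagged fields from the untagged text. The helper name read_date and the SAMPLE constant are hypothetical and only serve the illustration; this is not part of the training tooling.

```python
import xml.etree.ElementTree as ET

# The annotated example from these guidelines, embedded for illustration.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<dates>
<date>Received <month>August</month> <day>17</day>, <year>2005</year>. </date>
</dates>"""


def read_date(date_el):
    """Return (tagged fields, untagged text pieces) for one <date> element."""
    # Tagged parts: one entry per child element (<day>, <month>, <year>).
    fields = {child.tag: (child.text or "").strip() for child in date_el}
    # Untagged parts: text before the first child, then the tail after each child.
    untagged = [date_el.text or ""] + [child.tail or "" for child in date_el]
    return fields, untagged


root = ET.fromstring(SAMPLE.encode("utf-8"))
parsed = [read_date(date_el) for date_el in root.iter("date")]

for fields, untagged in parsed:
    print(fields)
    print(untagged)
```

The untagged pieces ("Received ", the separating space, the comma and the final period) stay directly under the <date> element, which is what the guideline above asks for.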