Research Articles

Archiving grammatical descriptions

  • Sebastian Nordhoff
  • Harald Hammarström


Language documentation projects produce and collect audio, video, and textual data, which they usually deposit in archives. Documenters’ understanding of best practices in archiving the primary content of their domain has made considerable progress over recent years. Methods for archiving derived content, such as dictionaries and especially grammatical descriptions, have received less attention. In this paper, we explore what the goals of archiving grammatical descriptions are, and what tasks an archive has to fulfill. We first discuss a number of parameters which help us to classify archives with regard to the objects they host and the roles they play in their user community. We argue that the text of grammatical descriptions should be archived in a fashion which allows retrieval of individual elements such as sections, paragraphs, and examples, and that for this to work, grammatical descriptions have to be provided with semantic markup. We discuss the Text Encoding Initiative (TEI), originally a philological enterprise, and the TEI tools which are useful for this purpose. Grammatical descriptions contain a number of elements which are not yet found in TEI, and we identify and describe these. We then discuss how annotation of both legacy and future grammatical descriptions can be accomplished...

Keywords: language documentation, archiving, derived content, grammatical descriptions, Text Encoding Initiative

How to Cite:

Nordhoff, S. & Hammarström, H. (2014) “Archiving grammatical descriptions”, Language Documentation and Description 12, 164-186. doi:

Published on 31 Jul 2014
Peer Reviewed