21.438 Language Technology for Cultural Heritage Data (LaTeCH 2008)

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty_at_kcl.ac.uk>)
Date: Sat, 29 Dec 2007 21:40:37 +0000

               Humanist Discussion Group, Vol. 21, No. 438.
       Centre for Computing in the Humanities, King's College London
                     Submit to: humanist_at_princeton.edu

         Date: Thu, 27 Dec 2007 09:59:15 +0000
         From: dobreva_at_math.bas.bg
         Subject: LaTeCH 2008

                        CALL FOR PAPERS

                     LREC 2008 Workshop on

        Language Technology for Cultural Heritage Data
                          (LaTeCH 2008)

                          Special Theme:
           "Resources and Tools for Studying Language
                      Variety and Change"

                 1 June 2008, Marrakech, Morocco


               Submission deadline: 20 February 2008


The Second Workshop on Language Technology for Cultural Heritage Data
(LaTeCH 2008) will be held in conjunction with LREC 2008, and will
take place on June 1 in Marrakech, Morocco.


Museums, archives, and libraries around the world maintain large
collections of cultural and scientific heritage objects, such as
archaeological artefacts, audio and video recordings, manuscripts,
archival documents, and other written sources. Such collections are a
potentially very valuable resource for specialists and laypersons
alike, provided they can be easily accessed and automatically
processed. Furthermore, textual cultural heritage resources, such as
old manuscripts and early printed books, are not only interesting for
their information content, but are also an invaluable source for
linguistic research on diachronic and synchronic language variety and
change. While several large-scale digitisation projects are currently
underway to make cultural heritage resources more accessible, it is
equally important to develop powerful tools to search, link, enrich,
and mine the digitised data. Language technology has a crucial role to
play in this, even for collections which are primarily non-textual,
since text is the pervasive medium used for meta-data. At the same
time, the cultural heritage domain poses special challenges for the
NLP community, including the use of historical or non-standard language,
the presence of OCR or transcription errors in the data, and the
need to deal with data from various media.

For LaTeCH 2008, we invite papers on language technology for cultural
heritage data in general and on the special theme of "Studying
Language Variety and Change". Topics of interest include, but are not
limited to, the following:

     - enriching cultural heritage data by inducing meta-data
     - dealing with linguistic variation and non-standard or
       non-contemporary use of language
     - automatic error detection and cleaning
     - adapting existing NLP tools for the cultural heritage domain
     - linking and retrieving information from different sources, media,
       and languages
     - representing cultural heritage data to different audiences
       (personalisation, text simplification, text summarisation, text
       generation from databases, hypertext generation)
     - knowledge discovery in cultural heritage data
     - complex annotation tools
     - determination of word and sentence boundaries within manuscripts
     - resources for and treatment of dialects (general solutions)
     - annotations of language variety on the orthographic, morphological,
       and syntactic level
     - global language resource management systems
     - repositories of cultural and scientific heritage data

Received on Sat Dec 29 2007 - 16:54:26 EST