Humanist Discussion Group

Humanist Archives: Dec. 23, 2021, 9:40 a.m. Humanist 35.418 - events: technologies for historical & ancient languages

				
              Humanist Discussion Group, Vol. 35, No. 418.
        Department of Digital Humanities, University of Cologne
                      Hosted by DH-Cologne
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org




        Date: 2021-12-22 12:26:45+00:00
        From: Sprugnoli Rachele (rachele.sprugnoli) <rachele.sprugnoli@UNICATT.IT>
        Subject: CFP: Second Workshop on Language Technologies for Historical and Ancient LAnguages (LT4HALA 2022) - EvaLatin - EvaHan

Second Workshop on Language Technologies for Historical and Ancient LAnguages (LT4HALA 2022) - EvaLatin - EvaHan
https://circse.github.io/LT4HALA/2022

   *     Place: co-located with LREC 2022, Marseille, France
   *     Date: 25 June 2022 (post-conference workshop)

LT4HALA 2022 is a one-day workshop that seeks to bring together scholars
who are developing and/or using Language Technologies (LTs) for
historically attested languages, so as to foster cross-fertilization
between the Computational Linguistics community and the areas in the
Humanities dealing with historical linguistic data, e.g. historians,
philologists, linguists, archaeologists and literary scholars. LT4HALA
2022 follows LT4HALA 2020, which was organized in the context of LREC 2020
(proceedings: https://aclanthology.org/volumes/2020.lt4hala-1/). Despite the current
availability of large collections of digitized texts written in
historical languages, such interdisciplinary collaboration is still
hampered by the limited availability of annotated linguistic resources
for most of the historical languages. Creating such resources is a
challenge and an obligation for LTs, both to support historical
linguistic research with the most up-to-date technologies and to preserve
those precious linguistic data that survived from past times.

Relevant topics for the workshop include, but are not limited to:

   *     handling spelling variation;
   *     detection and correction of OCR errors;
   *     creation and annotation of digital resources;
   *     deciphering;
   *     morphological/syntactic/semantic analysis of textual data;
   *     adaptation of tools to address diachronic/diatopic/diastratic
     variation in texts;
   *     teaching ancient languages with NLP tools;
   *     NLP-driven theoretical studies in historical linguistics;
   *     evaluation of NLP tools.

       SHARED TASKS

LT4HALA 2022 will host two shared tasks:

   *     the second edition of EvaLatin, an evaluation campaign entirely
     devoted to the evaluation of NLP tools for Latin. The second edition
     of EvaLatin will focus on three tasks (i.e. Lemmatization, PoS
     tagging, and Morphological Feature Identification), each featuring
     three sub-tasks (i.e. Classical, Cross-Genre, Cross-Time).

   *     the first edition of EvaHan, the first evaluation campaign devoted
     to NLP tools for Ancient Chinese. The first edition of EvaHan has one
     task (i.e. a joint task of Word Segmentation and POS Tagging).

Training data for both shared tasks are available on the conference website:

   *     EvaLatin 2022 training data:
     https://circse.github.io/LT4HALA/2022/EvaLatin#training-data

   *     EvaHan 2022 training data:
     https://circse.github.io/LT4HALA/2022/EvaHan#training-data


       SUBMISSIONS

For the workshop, we invite papers of different types, such as
experimental papers, reproduction papers, resource papers, position
papers, and survey papers. Both long and short papers describing original
and unpublished work are welcome. Long papers should deal with
substantial completed research and/or report on the development of new
methodologies. They may consist of up to 8 pages of content plus 2 pages
of references. Short papers are instead appropriate for reporting on
work in progress or for describing a single tool or project. They may
consist of up to 4 pages of content plus 2 pages of references.

We encourage the authors of papers reporting experimental results to
make their results reproducible and the entire process of analysis
replicable, by making the data and the tools they used available. The
form of the presentation may be oral or poster, whereas in the
proceedings there is no difference between the accepted papers. The
submission is NOT anonymous. The LREC official format is requested. Each
paper will be reviewed by three independent reviewers.

As for EvaLatin and EvaHan, participants will be required to submit a
technical report for each task (with all the related sub-tasks) they
took part in. Technical reports will be included in the proceedings as
short papers: the maximum length is 4 pages (excluding references) and
they should follow the LREC official format. Reports will receive a
light review (we will check for the correctness of the format, the
exactness of results and ranking, and overall exposition). All
participants will have the possibility to present their results at the
workshop: we will allocate an oral session and a poster session fully
devoted to the shared tasks in the afternoon.


       IMPORTANT DATES

Workshop

   *     8 April 2022: submission due
   *     29 April 2022: reviews due
   *     3 May 2022: notifications to authors
   *     24 May 2022: camera-ready (PDF) due

Shared Tasks - PLEASE NOTE THAT NO EXTENSION IS PLANNED FOR THE SHARED TASKS

EvaLatin

   *     20 December 2021: training data available

   *     Evaluation Window I - Task: Lemmatization

       o         17 March 2022: test data available
       o         23 March 2022: system results due to organizers

   *     Evaluation Window II - Task: PoS tagging

       o         24 March 2022: test data available
       o         30 March 2022: system results due to organizers

   *     Evaluation Window III - Task: Features tagging

       o         31 March 2022: test data available
       o         6 April 2022: system results due to organizers

   *     26 April 2022: reports due to organizers
   *     10 May 2022: short report review deadline
   *     24 May 2022: camera-ready version of reports due to organizers

EvaHan

   *     20 December 2021: training data available

   *     Evaluation Window

       o         31 March 2022: test data available
       o         6 April 2022: system results due to organizers

   *     26 April 2022: reports due to organizers

   *     10 May 2022: short report review deadline

   *     24 May 2022: camera-ready version of reports due to organizers

Identify, Describe and Share your LRs!

   *     Describing your LRs in the LRE Map is now a normal practice in the
     submission procedure of LREC (introduced in 2010 and adopted by
     other conferences). To continue the efforts initiated at LREC 2014
     about “Sharing LRs” (data, tools, web-services, etc.), authors will
     have the possibility, when submitting a paper, to upload LRs in a
     special LREC repository. This effort of sharing LRs, linked to the
     LRE Map for their description, may become a new “regular” feature
     for conferences in our field, thus contributing to creating a common
     repository where everyone can deposit and share data.

   *     As scientific work requires accurate citations of referenced work so
     as to allow the community to understand the whole context and also
     replicate the experiments conducted by other researchers, LREC 2022
     endorses the need to uniquely Identify LRs through the use of the
     International Standard Language Resource Number (ISLRN,
     www.islrn.org).

     Workshop Organizers

   *     Marco Passarotti, Università Cattolica del Sacro Cuore, Milan, Italy

   *     Rachele Sprugnoli, @RSprugnoli, Università Cattolica del Sacro
     Cuore, Milan, Italy


[...]

     Contact

rachele.sprugnoli[AT]unicatt.it

Please write “LT4HALA” or “EvaLatin” in the subject of your e-mail.

For more information on EvaHan, please write to libin.njnu[AT]gmail.com
with “EvaHan” in the subject of the e-mail.

Follow @ERC_LiLa and the hashtag #LT4HALA2022 on Twitter for updates.



