Humanist Discussion Group

Humanist Archives: May 17, 2022, 6:18 a.m. Humanist 36.17 - working with unnormalised historical texts

              Humanist Discussion Group, Vol. 36, No. 17.
        Department of Digital Humanities, University of Cologne
                      Hosted by DH-Cologne

        Date: 2022-05-16 13:46:31+00:00
        From: Gabor Toth
        Subject: Re: [Humanist] 36.12: working with unnormalised historical texts

Dear Crystal,

Many thanks for your detailed answer, and congratulations on working all
this out; it sounds great.

Could you please explain what you mean by the following two points:

1. "I hand coded the entries for part of speech using a
simplified Penn Tree Bank system, marked known irregulars for hand
processing, and built out the other possible variations algorithmically."

I understand the hand-coding part, but I am not sure about the part that
follows it.

2. "The result was a first draft of over 3 million forms"

That sounds like a very big number. By "forms", do you mean types?
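
For reference, the token/type distinction behind this question can be
sketched as follows. This is only a minimal illustration, assuming naive
whitespace tokenisation and lowercasing; the function name and the sample
sentence are hypothetical and are not taken from the project under
discussion.

```python
from collections import Counter


def type_token_counts(text):
    """Return (number of tokens, number of types) for a text.

    Assumption: tokens are whitespace-separated, lowercased strings.
    Unnormalised historical text would need more careful tokenisation.
    """
    tokens = text.lower().split()
    types = Counter(tokens)  # one key per distinct form
    return len(tokens), len(types)


# Hypothetical sample with spelling variants ("kynge" / "king").
sample = "the kynge and the quene and the king"
n_tokens, n_types = type_token_counts(sample)
print(n_tokens, n_types)  # prints: 8 5
```

Here "kynge" and "king" count as two separate types, which is one way a
list of distinct forms can grow much larger than a list of lemmas.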


