        Date: 2022-05-16 13:46:31+00:00
        From: Gabor Toth <>
        Subject: Re: [Humanist] 36.12: working with unnormalised historical texts

Dear Crystal,

Many thanks for your detailed answer, and congratulations on working all
this out; it sounds great.

Could you please explain what you mean by the following two points:

1. "I hand coded the entries for part of speech using a
simplified Penn Tree Bank system, marked known irregulars for hand
processing, and built out the other possible variations algorithmically."

I understand the hand-coding part, but I am not sure about the part that
follows it.

2. "The result was a first draft of over 3 million forms"

That sounds like a very big number. By "forms", do you mean types?



