Date: Sat, 26 Aug 89 20:25:27 EDT
From: Ian Lancashire <IAN@vm.epas.utoronto.ca>
Subject: Digitized Pages of Text

One strong argument for distributing both transcriptions AND
digitized images of literary texts is that the images keep the
transcribers honest.

It's long been standard for thesis editions in some parts of the
world to format image and edited text as facing pages. One
notorious failure of existing text archives is uncertainty about
the accuracy of their texts. (This is no slight on those archives,
just recognition of a problem that begins with those of us who try
our hand at editing machine-readable texts.)

A second reason has to do with tagging. What the editor tags may
well not be what the user wants tagged. Distributing the image
along with the text allows the user to adjust the tagging to
her/his own needs.

The second reason meets a very serious need in modern English
editorial work. Editors in English Renaissance studies, for
instance, argue that word-spacing, kerning, broken typeface, and
other aspects of the visual layout of a text are just as important
as stage directions, speech prefixes, and running titles for
understanding the transmission of a text. Yet many people would
object to tagging at
this level of layout. Including digitized images of the pages we
transcribe gives all users the option of editing the texts

If we wish our text archives to meet the scholarly needs of the
current (and next) generation, we will have to include digitized
images of the copytext along with the transcribed text, as Roy
Flannagan has implied. Coffee stains have nothing to do with the