6.0368 Info wanted on Manuscript Scanning (1/162)

Fri, 20 Nov 1992 17:51:26 EST

Humanist Discussion Group, Vol. 6, No. 0368. Friday, 20 Nov 1992.

Date: Fri, 20 Nov 1992 15:44:07 +0100
From: Knut Hofland <knut@x400.hd.uib.no>

Bergen, 19 November 1992


In cooperation with Oxford University Press the Wittgenstein
Archives at the University of Bergen is currently undertaking
a feasibility study of high-quality scanning of manuscripts
for publication on CD-ROM.

With this letter we hope to get in touch with suppliers of
relevant hardware and software, consultancy services, scanning
services, and others who might provide us with experience from
similar projects, offer products or services, etc.

Ludwig Wittgenstein, the Austrian philosopher who is regarded
as perhaps the most important philosopher of our century, left
behind approx. 20,000 manuscript pages when he died in 1951.
Many of the manuscripts are unpublished, and accessible only
by visits to the archival institutions or in photocopies of
variable quality.

We are considering the publication of a facsimile CD-ROM with
bit-mapped raster images of the collected papers as a possible
first step in this process. At high resolution and in full
color, an electronic facsimile would offer better reproduction
quality and easier access than traditional media, thus serving
both long-term archival purposes and the more short-term needs
of individual scholars. Later, the electronic facsimile would be
supplemented with transcriptions currently in progress at the
Wittgenstein Archives.

We intend to scan from the original manuscript volumes and
store the images in a high-definition primary format for
archival purposes. Since such a format may place excessive
demands on the storage space and processing speed of potential
users' workstations today, we will consider a lower quality
for the distribution format to be used on the CD-ROM.
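
One simple way to derive a lower-quality distribution image
from a high-definition archival master is to average
neighbouring pixels. The following is only a minimal sketch in
Python, not a description of any particular product: the
function name `downsample_2x` and the representation of a
greyscale image as a list of rows of integers are assumptions
made for illustration.

```python
def downsample_2x(image):
    """Halve the resolution of a greyscale image by 2x2 averaging.

    `image` is assumed to be a list of equally long rows of
    integer grey values. A trailing odd row or column is dropped.
    """
    out = []
    for y in range(0, len(image) - 1, 2):
        row = []
        for x in range(0, len(image[y]) - 1, 2):
            block_sum = (image[y][x] + image[y][x + 1]
                         + image[y + 1][x] + image[y + 1][x + 1])
            row.append(block_sum // 4)  # integer mean of the block
        out.append(row)
    return out


if __name__ == "__main__":
    master = [
        [0, 4, 8, 12],
        [4, 8, 12, 16],
        [10, 10, 20, 20],
        [10, 10, 20, 20],
    ]
    print(downsample_2x(master))
```

Halving the resolution in this way cuts the pixel count to a
quarter, which is the kind of trade-off between the archival
and the distribution format described above.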

Our most pressing need right now is for a high-quality, fast,
flexible and robust scanning system which provides

- absolutely gentle handling of the manuscripts
- a pixel depth of at least 4 (preferably 8, possibly 24)
  bits at 75 - 300 dpi ("real" colors, no dithering)
- full user control over features such as contrast,
  threshold level, color correction, etc.
- a widely used image file format and compression /
  decompression algorithm like TIFF, GIF, JPEG or the like,
  for which there is a wide range of off-the-shelf software
  available, and which goes well with DOS systems as well
  as Macintosh, UNIX, and possibly other environments
- simple and robust data management facilities such as
  procedures for indexing and backup
- a fast, high-capacity storage medium
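
To give a rough sense of the storage these requirements imply,
the sketch below computes the uncompressed size of a scanned
page. The A4 page size (210 x 297 mm), the 650 MB CD-ROM
capacity and all function names are assumptions made for
illustration; actual page sizes vary, and compression would
reduce the figures considerably.

```python
import math

MM_PER_INCH = 25.4
PAGES = 20_000                    # approximate page count given above
CD_ROM_BYTES = 650 * 1024 * 1024  # assumed nominal CD-ROM capacity


def uncompressed_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Raw size in bytes of one scanned page, before compression."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return math.ceil(width_px * height_px * bits_per_pixel / 8)


def discs_needed(total_bytes, disc_bytes=CD_ROM_BYTES):
    """Number of discs required to hold total_bytes."""
    return math.ceil(total_bytes / disc_bytes)


if __name__ == "__main__":
    # Assumed A4 page at the low, middle and high end of the range.
    for dpi, bits in [(75, 4), (150, 8), (300, 24)]:
        page = uncompressed_bytes(210, 297, dpi, bits)
        total = page * PAGES
        print(f"{dpi} dpi, {bits} bit: {page / 1e6:.1f} MB per page, "
              f"{discs_needed(total)} disc(s) uncompressed")
```

Under these assumptions a single A4 page at 300 dpi and 24
bits comes to about 26 MB uncompressed, so 20,000 such pages
would run to roughly 520 GB.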

Given these requirements, the scanner seems to be the most
critical component. The images will have to be captured on
location in the archival institutions (Bodleian Library,
Oxford; Trinity College Library, Cambridge; Austrian National
Library, Vienna).

Most of the manuscripts are bound volumes, in sizes
varying from small pocket notebooks to large ledgers, none of
which exceeds A3. Some manuscripts are on loose sheets or
folios, varying in size from small, postcard-sized sheets to
large, A2-sized sheets.

The writing implements used are typewriter (approx. 15%) and
ink or lead pencil in various colors. Paper types include
blank paper, lined paper, paper of various colors, etc. The
contrast between writing and background is often low. There is
occasional damage, staining, bleed-through from verso pages,
and the like.

Our aim is a reproduction quality which will allow scholars to
read the text from the scanned image without recourse to
inspection of the originals in all but exceptional cases. At
this stage, image enhancement techniques are of interest to
the extent that readability may be improved. (Only at a later
stage may image processing techniques be relevant for the
automatic extraction of textual elements.)
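
As an illustration of the kind of enhancement that might help
with low-contrast pages, the sketch below applies a linear
contrast stretch to a list of grey values. This is one common
technique chosen for illustration, not a method prescribed by
the project; the function name `stretch_contrast` and the
0 - 255 output range are assumptions.

```python
def stretch_contrast(pixels, out_min=0, out_max=255):
    """Linearly rescale grey values so that they span out_min..out_max.

    `pixels` is assumed to be a non-empty list of grey values.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A uniform image has no contrast to stretch; return a copy.
        return list(pixels)
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (p - lo) * scale) for p in pixels]


if __name__ == "__main__":
    # Faint writing on a grey background: values bunched together.
    faint = [100, 110, 120, 130]
    print(stretch_contrast(faint))
```

Values that were bunched into a narrow band are spread over the
full range, which makes faint writing stand out more clearly
from the background.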

The above has convinced us that neither black-and-white
nor grey-scale scanning will suffice: color is an absolute
requirement. We are more uncertain about the
resolution, however, and would like to perform some
experiments on this point.

Moreover, the originals are fragile and should be handled with
the utmost care. Flatbed scanners and sheetfeed scanners are
out of the question. The scanner should be able to keep in
focus all parts of loose sheets, which may be creased or
crumpled, and also of pages of bound volumes, which tend to
curve because of the binding. We assume that a camera scanner
or a digital camera is required.

We would prefer digital scanning directly from the original
documents. However, we are also prepared to consider
photographic reproduction, if only we can be assured that the
photographic images can be scanned with the required quality.

Time considerations are also important: we are hoping for a
total throughput of at least 20 pages per hour. Therefore, a
fast and tightly integrated system of hardware and software,
i.e. a "scanning work station", is what we really need.
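
The throughput figure can be put in perspective with a little
arithmetic. The sketch below is illustrative only: the
7-hour working day and the function names are assumptions,
while the 20,000 pages and 20 pages per hour come from the
text above.

```python
import math


def scanning_hours(pages, pages_per_hour):
    """Total scanning time in hours at a constant throughput."""
    return pages / pages_per_hour


def working_days(hours, hours_per_day):
    """Whole working days needed to cover the given hours."""
    return math.ceil(hours / hours_per_day)


def seconds_per_page(pages_per_hour):
    """Time budget for one page, including handling and storage."""
    return 3600 / pages_per_hour


if __name__ == "__main__":
    hours = scanning_hours(20_000, 20)
    print(f"{hours:.0f} hours in total")
    print(f"{working_days(hours, 7)} working days at 7 hours per day")
    print(f"{seconds_per_page(20):.0f} seconds per page")
```

At 20 pages per hour, each page has a budget of three minutes
for positioning, capture, checking and storage, and the full
collection would take 1,000 hours of scanning.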

Other aspects of this project on which we would like to be
informed are suitable image database systems for CD-ROM, and
premastering, mastering and production of CD-ROM. We would
also like to be informed if other high-density storage media
might be relevant for our purpose.

Shortly after the feasibility study has been finished, a final
agreement will be sought between all parties concerned,
necessarily including the Trustees of the copyright in the
Wittgenstein Papers, as to whether, when, and in what form the
scanning project will be implemented. We expect the
recommendations given by the feasibility study to be decisive
for this agreement.

Our schedule is very tight: the feasibility study will be
carried out in the period November 18 - December 17, 1992.

So if this is of interest to you and you think that you may be
able to provide us with useful information on services or
products, either directly or by referring us to others, please
contact us at your earliest convenience.

We are looking forward to hearing from you.

Yours sincerely,

Claus Huitfeldt

Please contact:

Claus Huitfeldt
The Wittgenstein Archives at
the University of Bergen
Harald Haarfagresgt 31
N-5007 Bergen

Tel: +47 (0)5 212950
Fax: +47 (0)5 322656
e-mail: claus@pc.hd.uib.no

Oystein Reigem
Norwegian Computing Centre
for the Humanities
Harald Haarfagresgt 31
N-5007 Bergen

Tel: +47 (0)5 213242
or +47 (0)5 212954/55/56
Fax: +47 (0)5 322656
e-mail: oystein@pc.hd.uib.no