7.0058 Text Models (1/40)

Wed, 16 Jun 1993 15:46:36 EST

Humanist Discussion Group, Vol. 7, No. 0058. Wednesday, 16 Jun 1993.

Date: 15 Jun 1993 11:53:38 +0000 (GMT)
From: Arjan Loeffen C&L/RUU <Arjan.Loeffen@let.ruu.nl>
Subject: PhD Chapter on Text Models

I have almost finished writing a chapter on text models for my PhD
thesis on the definition of a textbase management system (TBMS). I
expect the chapter to be ready for comment in July 1993. I am willing
to make the text available in PostScript via FTP, but only if some
people are interested and if I may expect readers to comment on the
text (as it is, as they say, 'classified information'). I would like
to hear from HUMANIST subscribers (or any scholars in the field
of electronic text handling) who are interested in the
draft chapter.

The text models are:

TDM (Desai, '85)
P-string (Gonnet & Tompa, '87; Gyssens et al., '89; Tague et al., '91)
PAT (Baeza-Yates & Gonnet, '89; Salminen & Tompa, '91)
Bayan (King, '90)
TOMS (Deerwester, '92)
Containment model ("THS") (Burkowsky, '92)
MdF (Doedens, '93)

Note that the chapter is neither a software description nor a software
comparison. It deals with the way electronic texts are perceived by the
designers, and the models are described in terms of text structure,
operations, and constraints.

Other products that support an implicit text 'model' (full-text
retrieval systems, concordance programs, word processors and the like)
are treated separately, and will not be included in the chapter.
As the TBMS is founded on SGML, that model is treated in a chapter
of its own, and is not included either.

The report will be about 60 pages long.
If you don't have a PostScript printer or previewer, only a very
rough MRT may be obtained via FTP.


Arjan Loeffen
(Computer & Humanities, Utrecht, The Netherlands)