5.0633 New TEI Character & Writing System Doc. (1/119)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Sun, 2 Feb 1992 20:38:36 EST

Humanist Discussion Group, Vol. 5, No. 0633. Sunday, 2 Feb 1992.

Date: 31 January 1992 15:28:22 CST
From: "Wendy Plotkin (312) 413-0331" <U49127@UICVM>
Subject: New TEI Character Set and WSD Document (TR1W4)

New Character Set and Writing System Declaration Documents (TR1W4)

From Harry Gaylord, the TEI Character Sets work group chair

At the present time, character sets are woefully inadequate for the
needs of humanities scholars encoding texts. Though this will be
remedied by the creation of new character sets in the future, it will be
some time before these sets are widely supported on many computers and
across networks. It is not yet certain whether a single universal
character set will be agreed upon by all parties. Even if this occurs,
variants of that character set may exist. I hope that this does not
happen, but it is possible.

In the SGML standard, the entity mechanism was introduced, in part to
provide a way around the limitations of existing character sets. A
number of public entity sets were included in an appendix to the
standard. In a new ISO technical report (being issued in parts), the
ISO central secretariat is working to expand these entity sets.
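
As a rough illustration of the entity mechanism described above, the
following Python sketch expands entity references of the form &name;
into characters. The names expand_entities and ENTITIES are hypothetical,
the four entries are only a small sample of ISOlat1 and ISOgrk1, and the
use of present-day Unicode characters as replacement values is an
assumption made for this example, not something specified in TR1W4.

```python
import re

# Illustrative subset of the ISOlat1 and ISOgrk1 public entity sets.
# The replacement characters are modern Unicode equivalents (an assumption
# for this sketch; software of the period would have mapped them differently).
ENTITIES = {
    "eacute": "\u00e9",  # ISOlat1: small e, acute accent
    "uuml": "\u00fc",    # ISOlat1: small u, dieresis
    "agr": "\u03b1",     # ISOgrk1: small alpha
    "bgr": "\u03b2",     # ISOgrk1: small beta
}

# An entity reference: ampersand, a name, then a semicolon.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def expand_entities(text, table=ENTITIES):
    """Replace &name; references found in the table; leave others untouched."""
    def substitute(match):
        return table.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(substitute, text)


if __name__ == "__main__":
    print(expand_entities("caf&eacute; &agr;&bgr; &unknown;"))
```

References that are not in the table are left as they are, so a text
remains readable and lossless even when only some entity sets are loaded.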

The TEI TR1 work group is also working on public entity sets for
scholarly purposes, and sharing its efforts with ISO. We intend to
make public entity sets available in TEI, and to submit them for
inclusion in the ISO technical report.

If anyone would like to help with this work, please contact Harry
Gaylord (galiard@let.rug.nl). Currently, we are working on Hebrew and
Early Slavic sets. It is a chance to establish standard usage throughout
the scholarly computing community and make it universally available.

Also, if you have any corrections to the material already prepared
(available from TEI-L, as described below), I would appreciate hearing
from you.

The preliminary fruits of this effort are now available as TEI TR1W4.
TEI TR1W4 is available in marked-up form electronically from the
TEI-L file-server and in hard copy from the Chicago TEI office.
This 70-page document includes a general discussion of entities, entity
sets and writing system declarations, together with appendices
containing the ISO and TEI entity sets and the TEI writing
system declarations.

Because of the length of the document, TR1W4 is available in parts on
the file-server. The general discussion is in the file "TR1W4 TEI1",
while the entity sets and writing system declarations are included as
individual files, as shown below (the TEI-L filename and filetype are in
parentheses).

TR1W4 Files on TEI-L File-server

General Discussion: Character entities and public entity sets
(TR1W4 TEI1)

1. ISO public entity sets from ISO 8879

ISOlat1 for western European languages (ISOlat1 Entities)
ISOlat2 for other Latin-based languages (ISOlat2 Entities)
ISOdia for diacritics (ISOdia Entities)
ISOgrk1 for basic modern Greek (ISOgrk1 Entities)
ISOgrk2 for accented letters in modern Greek (ISOgrk2 Entities)
ISOcyr1 for modern Russian Cyrillic (ISOcyr1 Entities)
ISOcyr2 for modern non-Russian Cyrillic (ISOcyr2 Entities)
ISOnum for numeric and special graphic characters (ISOnum Entities)
ISOpub for publishing (ISOpub Entities)

2. New TEI public entity sets

TEIarb basic Arabic (TEIarb Entities)
TEIcop Coptic (TEIcop Entities)
TEIgrk Classic Greek supplement (TEIgrk Entities)
TEIipa for IPA phonetic characters for interchange (TEIipa Entities)

3. Two TEI writing system declarations (wsd)

TEIgreek wsd (TEIgreek wsd)
TEIphon wsd (TEIphon wsd)

These declarations include

a) entity names and
b) the hexadecimal values of the characters in the latest version
of the draft ISO standard 10646. This draft standard is
currently being voted upon.

The Greek writing system declaration also includes the Thesaurus
Linguae Graecae (TLG) codings.
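
To make the pairing of entity names with hexadecimal values more
concrete, here is a minimal Python sketch of such a lookup. It does not
reproduce the layout of an actual TEI writing system declaration: the
record format and the names GREEK_SAMPLE, hex_value and format_table are
hypothetical. The code values are those of present-day ISO/IEC 10646 and
may differ from the 1992 draft, and the TLG column assumes the commonly
documented Beta Code letters.

```python
# Hypothetical sample records: (entity name, code point, TLG Beta Code).
# Code points follow present-day ISO/IEC 10646; the 1992 draft may have
# assigned different values. The Beta Code letters are an assumption too.
GREEK_SAMPLE = [
    ("agr", 0x03B1, "A"),  # small alpha
    ("bgr", 0x03B2, "B"),  # small beta
    ("ggr", 0x03B3, "G"),  # small gamma
    ("dgr", 0x03B4, "D"),  # small delta
]


def hex_value(entity_name, records=GREEK_SAMPLE):
    """Return the four-digit hexadecimal code for an entity name, or None."""
    for name, code_point, _beta in records:
        if name == entity_name:
            return f"{code_point:04X}"
    return None


def format_table(records=GREEK_SAMPLE):
    """Render the records as a plain-text table, one entity per line."""
    lines = [f"{'entity':<8}{'hex':<6}{'TLG':<4}char"]
    for name, code_point, beta in records:
        lines.append(f"{name:<8}{code_point:04X}  {beta:<4}{chr(code_point)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table())
    print(hex_value("ggr"))
```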

To obtain a file from the TEI-L fileserver, send a note to
Listserv@UICVM with no subject, and the following message:

Get (Filename) (Filetype)

For example, to obtain the general discussion of TR1W4, send a note
to Listserv@UICVM with the message (and no subject):

Get TR1W4 TEI1

To obtain the list of all files available on the TEI-L file-server,
send a note to Listserv@UICVM with the message (and no subject):

Get TEI-L Filelist
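
The request procedure above can also be sketched in Python. The sketch
only composes the message and does not send it; the function name
build_get_request and the sender address are hypothetical, and delivery
to a BITNET address such as Listserv@UICVM would depend on a mail
gateway that is assumed here, not described in this announcement.

```python
from email.message import EmailMessage

LISTSERV_ADDRESS = "Listserv@UICVM"  # address as given in the instructions


def build_get_request(filename, filetype, sender):
    """Compose a Listserv request: no Subject header, one 'Get' line as body."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = LISTSERV_ADDRESS
    # Deliberately no Subject header, as the instructions ask for none.
    message.set_content(f"Get {filename} {filetype}")
    return message


if __name__ == "__main__":
    # Hypothetical sender address, for illustration only.
    request = build_get_request("TR1W4", "TEI1", "scholar@example.edu")
    print(request.as_string())
```

Calling build_get_request("TEI-L", "Filelist", ...) in the same way
would produce the request for the list of all available files.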

To obtain the hard copy version from the TEI Chicago office, contact

Wendy Plotkin
Computer Center (M/C 135)
P.O. Box 6998
Chicago, IL 60680 USA
E-Mail: U49127@uicvm.bitnet
Phone: (312) 413-0331
Fax: (312) 996-6834

Harry Gaylord
Chair TEI TR1 (Work Group on Character Sets)