6.0461 CETH 1993 Summer Seminar on Humanities E-Texts

Wed, 27 Jan 1993 17:34:18 EST

Humanist Discussion Group, Vol. 6, No. 0461. Wednesday, 27 Jan 1993.

Date: Wed, 27 Jan 1993 17:18 EST
Subject: CETH 1993 Summer Seminar on Electronic Texts in the Humanities


Electronic Texts in the Humanities: Methods and Tools

The Second Annual Summer Seminar

at Princeton University, New Jersey
August 1-13, 1993

organized by
The Center for Electronic Texts in the Humanities, Princeton and Rutgers
with the co-sponsorship of
the Centre for Computing in the Humanities, University of Toronto

The Center for Electronic Texts in the Humanities (CETH) is again offering an
intensive two-week seminar during August 1993. The seminar will address a
wide range of challenges and opportunities that electronic texts and software
offer to teachers, scholars and librarians in the humanities. Discussions on
the capture, markup, retrieval, presentation, transformation, and analysis of
electronic text will prepare students for extensive hands-on experience with
illustrative software, e.g., MTAS, Micro-OCP, WordCruncher, Tact, and
hypertext. Resources on CD-ROM and Internet, such as the OED, Perseus, CDWORD,
and several large textual collections in classical Greek, Latin, French,
Italian, and English, will be demonstrated so that participants may make
informed evaluations of their significance in the light of current and future
technologies. Approaches to markup, from ad hoc schemes to the systematic
design of the Text Encoding Initiative, will be surveyed and considered.

The focus of the Seminar will be practical and methodological, with the
immediate aim of assisting participants in their own teaching, research, and
advising. It will be concerned with the demonstrable benefits of using
electronic texts, with typical problems and how to solve them, and with the
ways in which software fits or can be adapted to common methods of textual
study. Participants will be expected to work on coherent projects, preferably
of their own devising, and will be given the opportunity to present them on
the last day.

Throughout the Seminar, the instructors will provide assistance with designing
projects, locating sources for texts and software, and solving practical
problems. Ample computing facilities will be available 24 hours per day.
A small library of essential articles and books in humanities computing will
be on hand to supplement printed seminar materials, which include an extensive

Special lectures will describe current research in the field and address
research topics, as well as the role of the library in the use of electronic

The Seminar is intended for faculty, students, librarians, technical advisers,
and academic administrators with direct responsibilities for humanities
computing support. It assumes basic computing experience, but not necessarily
experience in applying computers to academic research and teaching. The
number of participants will be limited to 30.

Provisional Schedule

Week 1, August 1-6, 1993

Sunday, August 1. Registration and introductions

Monday, August 2. The electronic text

a.m. What electronic texts are and where to find them; survey of
existing inventories, archives, and other current resources.
History of computer-assisted text analysis in the humanities.
Introduction to simple concordancing with MTAS, including
practical session.

p.m. Creating and capturing texts in electronic form; keyboard entry
vs. optical scanning. Demonstration of optical
character-recognition technology. Introduction to text encoding,
surveying ad hoc methods, e.g. COCOA, WordCruncher, TLG beta code;
problems of these methods. Practical exercise in deciding what to
encode in typical texts.

Tuesday, August 3. Concordancing

a.m. A focussed look at computer-assisted concordance generation; types
of concordances, their specific advantages and disadvantages.
Alphabetization, character sequences, sorting, and forms of
presentation. Introduction to Micro-OCP; practical session in its

p.m. Further work on concordancing with Micro-OCP.

Wednesday, August 4. The interactive concordance

a.m. Indexed, interactive retrieval vs. batch concordance generation.
Textual problems and interpretative approaches particularly
suitable to an interactive system; the continuing use of
concordances in hardcopy. Preparation of text for indexed
retrieval; differing roles of markup and external "rules"; kinds
of displays and their augmentation through post-processing.
Introduction to Tact.

p.m. Practical work using Tact: simple markup, compilation of a textual
database, and methods of inquiry.

Thursday, August 5. Stylistics; SGML

a.m. Stylistic comparisons and authorship studies using concordance
tools; basic statistics for lexical and stylistic analysis. Case
studies, e.g. Federalist Papers, Kenny on Aristotle, Burrows on
Jane Austen.

p.m. Introduction to the Standard Generalized Markup Language (SGML)
and the Text Encoding Initiative (TEI). Document structure and
SGML elements. Start-tags, end-tags, and empty tags. Document
type declarations. Group tagging of simple examples. SGML
entities and their uses: character representation, boilerplate
text, file management. Introduction to TEI Core tags and base
tags for prose. Group tagging of examples using TEI tags.

Friday, August 6. SGML and TEI

a.m. The TEI Header: documentation for electronic texts. The file
description; the encoding description; the text profile; the
revision history. Overview of the TEI DTDs: base tag sets,
additional tag sets, and auxiliary document types.

p.m. Using TEI in practice. Overview of available commercial and
public-domain software (the latter will be distributed to
participants). Creating TEI texts; validation; processing. Tools
for processing SGML texts: commercial and public-domain.
Examples: translating a TEI text into COCOA (for OCP),
WordCruncher format, Tact format. Practical session creating and
validating TEI-conformant texts.

Week 2, August 9-13, 1993

Monday, August 9. Scholarly editions

a.m. Overview of tools for preparing critical editions. Constructing
glossaries and material for commentary; application of Micro-OCP
and/or Tact. Collation; single-text vs. multiple-text methods.
Overview of software tools. Introduction to Collate.

p.m. Electronic publication. Discussion of methods and implications.

Tuesday, August 10. Electronic Dictionaries

a.m. The electronic dictionary; from machine-readable dictionary to
computational lexicon. What the New OED and other online
dictionaries can do for the scholar. Uses of lexical knowledge
bases in text retrieval. Building a simple online lexicon with

p.m. Individual project work.

Wednesday, August 11. Hypertext

a.m. Hypertext and hypermedia: techniques of presentation and
organization of textual data for analysis; possible combinations
of hypertext and concordancing methods. Reading and writing the
hypertextual book; hypertextual note-taking and annotating.
Practical introduction to constructing a hypertext.

p.m. Further practical session on building a hypertextual system.
Demonstration and discussion of Perseus, StorySpace and Voyager

Thursday, August 12. Evaluation; Projects

a.m. Review of the previous week's work. Discussion on the limitations
of existing software. Advanced analytical tools not commonly
available, e.g. pattern recognizers, lemmatization systems,
morphological analyzers, parsers; overview of these. The
contributions of computational linguistics and artificial
intelligence, and where research in these areas is headed.
Examination of some existing resources.

p.m. Completion of project work.

Friday, August 13. Projects

a.m. Presentation of participants' projects.

p.m. Concluding discussion of basic questions. What from a scholarly
and methodological perspective is to be gained? What are the
probable effects on research and teaching? What can one learn
from the collision of automatic methods with intuitive
perceptions? What is the role of humanities computing: merely
an efficient facilitator of traditional work or a fundamental
component for pursuing new questions? Where do we go from here
with software, and with its application? How can the machine
better assist us in educating the imagination?

The Center for Electronic Texts in the Humanities

The Center for Electronic Texts in the Humanities was established in October
1991 by Rutgers and Princeton Universities with external support from the
Mellon Foundation and the National Endowment for the Humanities. As a
national focus of interest in the U.S. for those who are involved in the
creation, dissemination and use of electronic texts in the humanities, it also
acts as a national node on an international network of centers and projects
which are actively involved in the handling of electronic texts. Developed
from the international inventory of machine-readable texts which was begun at
Rutgers in 1983 and is held on RLIN, the Center is now reviewing the records
in the inventory and continues to catalog new texts. The acquisition and
dissemination of text files to the community is another important activity,
concentrating on a selection of good quality texts which can be made available
over Internet with suitable retrieval software and with appropriate copyright
permission. The Center also acts as a clearinghouse on information related to
electronic texts, directing enquirers to other sources of information.


The seminar will be taught by Susan Hockey and Willard McCarty, with
assistance from Michael Sperberg-McQueen (SGML and TEI), Elli Mylonas
(Hypertext) and staff of Computing and Information Technology, Princeton.

Susan Hockey is Director of the Center for Electronic Texts in the Humanities.
Before moving to the USA in October 1991, she spent 16 years at Oxford
University Computing Service where her most recent position was Director of
the Computers in Teaching Initiative Centre for Textual Studies. At Oxford
she was responsible for various humanities computing projects including the
development of the Oxford Concordance Program (OCP), an academic typesetting
service for British universities, and OCR scanning. She has taught courses on
humanities computing for fifteen years and has given numerous guest lectures
on various aspects of computing in the humanities. She is the author of three
books and numerous articles on humanities computing and has been Chair of the
Association for Literary and Linguistic Computing since 1984. She is a member
(currently Chair) of the Steering Committee of the Text Encoding Initiative.

Willard McCarty has been active in humanities computing since 1977. With its
founding Director, Ian Lancashire, he helped to set up the Centre for
Computing in the Humanities, University of Toronto, of which he is now the
Assistant Director. He was the founding editor of Humanist, the principal
electronic seminar for computing humanists, and has edited several other
publications in the field. He regularly gives talks, papers, and lectures
throughout North America and Europe. McCarty took his Ph.D. in English
literature in 1984; his current literary research is in classical studies,
especially the Metamorphoses of Ovid. In support of a forthcoming book, he has
an electronic edition of that poem underway for the text-retrieval program

Elli Mylonas is a Research Associate in Classics at Harvard University, and is
currently the Managing Editor of the Perseus Project. She has co-taught
tutorials on "Teaching with Hypertext" at the Hypertext meetings in San
Antonio and Milan (1991, 1992). In addition to coordinating the Perseus
Project, her responsibilities cover the creation and structuring of the
textual component of the project, and working together with the user interface
designers and documentation specialists. She is the project leader for
Pandora, a Macintosh search program for the TLG and PHI disks. Elli Mylonas
is a founding member and one of the two organizers of CHUG (Computing in the
Humanities User's Group), a humanities computing seminar that has been meeting
biweekly at Brown University for the last 4 years. She is also on the Text
Representation Committee of the Text Encoding Initiative, where she has worked
on identifying SGML structures for tagging reference systems, drama and verse
in literary texts. She has published and spoken on hypertext, descriptive
markup and literary texts, and the use of computers in education.

C. M. Sperberg-McQueen studied Germanic medieval literature in the comparative
literature program at Stanford University; since 1980 he has been working to
bring computing technology to bear on problems of textual research. In 1985
and 1986, he served as a consultant for humanities computing in the Princeton
University Computer Center; since 1987 he has worked at the academic computer
center at the University of Illinois at Chicago, where he is now a senior
research programmer. He is a member of the steering committee, and the
editor in chief, of the Text Encoding Initiative.


The cost of participating in this Summer Seminar will be $895, including
tuition, use of computer facilities, student accommodation, breakfast and
lunch at Princeton for the two weeks, and banquet and reception. Students pay
a reduced rate of $795. For those who prefer hotel accommodations, the cost
is $645 ($565 for students), which covers tuition, lunch, the banquet and
reception. There will be 24-hour access to networked microcomputers in the
student accommodation throughout the seminar.

Application Procedure

To apply for participation in this Summer Seminar, submit a one-page
statement of interest. The statement should indicate (1) how
participation in the Seminar would be relevant for your teaching, research,
librarianship, advising or administrative work, and possibly that of your
colleagues; (2) what project you would like to undertake during the Seminar,
or what area of the humanities you would most like to explore; and (3) the
extent of your computing experience. Applications must be attached to a
cover sheet specifying your name, current institutional affiliation and
position, postal and email addresses, and phone and fax numbers, as available,
as well as natural language interest and computing experience. Currently
enrolled students must also include a photocopy of a valid student ID. E-mail
submissions should have a subject line `Summer Seminar Application'. The
statement must be received by the reviewing committee, consisting of members
of the Center's Governing Board, by APRIL 15, 1993, at the address below.
Those who have been selected to attend will be notified by May 15, 1993.
Payment will be requested at this time.

Summer Seminar 1993
Center for Electronic Texts in the Humanities
169 College Avenue
New Brunswick, NJ 08903

phone: (908) 932-1384
fax: (908) 932-1386
bitnet: ceth@zodiac
internet: ceth@zodiac.rutgers.edu