7.0617 Susanne Corpus

Wed, 13 Apr 1994 00:00:25 EDT

Humanist Discussion Group, Vol. 7, No. 0617. Wednesday, 13 Apr 1994.

Date: 12 Apr 1994
From: ide@cs.vassar.edu (Nancy M. Ide)

Release 3 of the SUSANNE Corpus is now complete and is available, like
earlier releases, by anonymous ftp from the Oxford Text Archive.
Release 3 incorporates several thousand modifications dealing with
errors and inconsistencies in the Corpus which came to light during the
process of preparing the book ENGLISH FOR THE COMPUTER for publication.
It also includes additional information in the documentation file.

To obtain a copy of SUSANNE Release 3, log in by anonymous ftp to
black.ox.ac.uk, move to the directory ota/susanne, and follow the
instructions in the README file in that directory.
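
For readers who would rather script the retrieval than type the ftp
commands by hand, the steps above can be sketched in Python with the
standard ftplib module. This is only an illustrative sketch: the host and
directory are the ones given in this announcement, while the function
names fetch_readme and fetch_readme_from_archive and the file name
argument are assumptions made for the example. It fetches only the README,
since the further steps are whatever that file instructs.

```python
from ftplib import FTP

HOST = "black.ox.ac.uk"    # host named in the announcement; it may since have changed
DIRECTORY = "ota/susanne"  # directory named in the announcement


def fetch_readme(ftp, directory=DIRECTORY, filename="README"):
    """Log in anonymously, move to the directory, and return the README lines.

    `ftp` is an already-connected ftplib.FTP object (or anything offering
    the same login/cwd/retrlines methods).
    """
    lines = []
    ftp.login()                # login() with no arguments is an anonymous login
    ftp.cwd(directory)         # equivalent of "cd ota/susanne"
    ftp.retrlines(f"RETR {filename}", lines.append)  # text-mode retrieval
    return lines


def fetch_readme_from_archive(host=HOST):
    """Hypothetical convenience wrapper: connect, fetch the README, disconnect."""
    with FTP(host, timeout=30) as ftp:
        return fetch_readme(ftp)
```

Calling print("\n".join(fetch_readme_from_archive())) would display the
README, provided the server is still reachable at that address.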

A number of users have enquired about the publication schedule for the
book. The manuscript of ENGLISH FOR THE COMPUTER was delivered to
Oxford University Press in August 1993, and the copy-editing process was
completed in March 1994. Publication is expected late in 1994. I am
sorry that it is taking a long time, but it is a very long and complex
book, and the Press are putting a great deal of effort into getting
details right.

For those not familiar with the SUSANNE Corpus: this is an annotated
sample comprising about 130,000 words of written American English text,
produced to exemplify a set of annotation standards which attempt
to specify an explicit notation for all aspects of the surface and logical
grammar of real-life English in sufficient detail that analysts
independently applying the standards to the same text must produce
identical annotations. These standards are defined in the book ENGLISH
FOR THE COMPUTER; a skeleton outline of the scheme is included in
the electronic documentation file which accompanies the Corpus. The
texts of the SUSANNE Corpus are a subset of the texts included in the
(unannotated) Brown University Corpus.

Geoffrey Sampson

School of Cognitive and Computing Sciences
University of Sussex