Willard McCarty (MCCARTY@vm.epas.utoronto.ca)
Fri, 16 Feb 90 22:38:41 EST

Humanist Discussion Group, Vol. 3, No. 1064. Friday, 16 Feb 1990.

(1) Date: Thu, 15 Feb 90 05:20:00 EST (17 lines)
From: "HELEN ARISTAR-DRY" <islhad@es.uit.no>
Subject: RE: 3.1037 e-Tennyson, e-Browning (74)

(2) Date: 15 Feb 90 21:42:01 EST (37 lines)
From: James O'Donnell <JODONNEL@PENNSAS>
Subject: Augustine-Pelagius e-texts

(3) Date: Fri, 16 Feb 90 15:53 GMT (40 lines)
From: Oxford Text Archive <ARCHIVE@VAX.OXFORD.AC.UK>
Subject: spoken archive corpora

(4) Date: Fri, 16 Feb 90 16:21:29 MST (30 lines)
From: "R. Jones" <JONES@BYUVM>
Subject: Pfeffer Spoken German Corpus

(1) --------------------------------------------------------------------
Date: Thu, 15 Feb 90 05:20:00 EST
From: "HELEN ARISTAR-DRY" <islhad@es.uit.no>
Subject: RE: 3.1037 e-Tennyson, e-Browning (74)

Having just read Lou Burnard's message on pricing, I think I'll put
in a plug for the Oxford Text Archive. Not only do they have the
most extensive list I know of for e-texts, but the people who fill
the orders have occasionally gone out of their way for me--getting
rush orders filled before I left the country, talking about prices
and formats from Oxford to Texas, etc. I don't know of any other
"supplier" who offers that quantity of product with the quality
of service I've always received--allied to such a low price. As
Burnard reminds us, one megabyte tape is approx. 3 Victorian novels!

Helen Aristar-Dry

(2) --------------------------------------------------------------42----
Date: 15 Feb 90 21:42:01 EST
From: James O'Donnell <JODONNEL@PENNSAS>
Subject: Augustine-Pelagius e-texts

From: Jim O'Donnell (Classics, Penn)

The complete corpus of Augustine's writings has been put into the computer by
the editors of the Augustinus-Lexikon at the University of Wuerzburg. This
includes all A.'s own works plus such works of others as are usually
incorporated in editions of A.: letters from others, the extensive fragments
of Julian, Faustus, and others, to which A. responds by quoting passages. It
does not include other works such as the ep. ad Demetriadem. Some of those
others may be in machines controlled by the two corpora of editions (CCSL and
CSEL, in Turnhout and Vienna respectively), esp. texts from relatively (???
post-1970 or so ???) recent editions published in those series.

Access to all these e-texts is difficult. CETEDOC has all of Augustine and
threatens to publish a microfiche concordance (12,000 pages on microfiche): I
have advised our university library not to purchase this $1000 dinosaur.
Vienna has about half the total corpus of Augustine on machine and is
publishing fragments of a linguistic lexicon of Augustine's language. None of
these sources is willing to part with copies of their texts to interested
customers: experto credite, Teucri, I've tried. Oh, I've tried.

The Wuerzburg corpus, however, is willing to release some information directly
and they have made rather better access possible indirectly. If you write to
Wuerzburg and ask for specific word-searches to be done on the data-base, they
will comply, sending you floppy disks and an invoice. But they have also now
released a copy of the entire data-base to Villanova University (both
Wuerzburg's Aug.-Lex. and Villanova U. are run by the Order of Saint
Augustine), where steps are being taken to make it more accessible. I have
been allowed to do word-searches by modem; within limits of practicality, they
are willing to work out similar arrangements with other interested scholars
and would probably require only minimal reimbursement for out-of-pocket
expenses. Interested parties should e-mail to FITZGERAL@VUVAXCOM.bitnet. Fr.
Fitzgerald is not on HUMANIST, but I will send him a copy of this note to warn
him that others may write.

(3) --------------------------------------------------------------52----
Date: Fri, 16 Feb 90 15:53 GMT
From: Oxford Text Archive <ARCHIVE@VAX.OXFORD.AC.UK>
Subject: spoken archive corpora

Thank you for the nice things you say about the Text Archive; I'm
sending in a separate message a slightly more up to date list of
our holdings than that which your report derives from. You will see
that we no longer include information about the holdings of other
centres: this is because of the difficulty of getting such information
in a reasonably accurate form that can also be easily loaded into
our database. We do however maintain lists of such information and
On written English corpora, you don't mention the Lancaster-Oslo-Bergen
corpus, which is probably the most important (and certainly the most
extensively analysed) corpus after the Brown. It was designed as an
exact counterpart to the Brown, but of British rather than American
English. It has been extensively syntactically analysed and tagged and
(unlike the Brown) is freely available in this form. You list this
under spoken English, which is wrong.

There are other corpora in preparation or available which follow the Brown
Corpus model and represent other varieties of English. A detailed survey of
these and several other corpora was recently (last summer) made available -
I think on Humanist - by Lita Taylor of the University of Lancaster. If you
have not read this file, you should. It contains a description of
the Polytechnic of Wales corpus of child language, for example, which
should interest you.

Two recent acquisitions at Oxford (not yet in the catalogue) which might
interest you include a copy of the Ulm Textbank, to which you refer, and
the PIXI Project Data, which you probably don't know of. This is a highly
detailed analysis of a corpus of bookshop encounters in both English and

I wish you every success in gathering information about corpora of
spoken language. I hope you will share the information with us. And
of course we are always very interested in new acquisitions...

Lou Burnard
Oxford Text Archive

(4) --------------------------------------------------------------39----
Date: Fri, 16 Feb 90 16:21:29 MST
From: "R. Jones" <JONES@BYUVM>
Subject: Pfeffer Spoken German Corpus


The Pfeffer Spoken German Corpus was collected in 1961 under
the direction of Professor J. Alan Pfeffer, then of the
University of Buffalo. It contains 400 12-minute spontaneous
interviews covering 25 different topics, recorded in 60 locations
in The Federal Republic of Germany, the German Democratic
Republic, Austria and Switzerland. The speakers reflect
demographic statistics with regard to gender, age, education and
geography. The interviews were transcribed and encoded for the
computer at Stanford University in 1984.

For information on ordering the Pfeffer Spoken German Corpus
write to:

Randall L. Jones
Department of German
4096 JKHB
Brigham Young University
Provo, Utah 84602


Tel: 801-378-3513