16.595 LLC 17.3 table of contents

From: Humanist Discussion Group (by way of Willard McCarty willard.mccarty@kcl.ac.uk)
Date: Thu Apr 03 2003 - 02:44:17 EST


               Humanist Discussion Group, Vol. 16, No. 595.
       Centre for Computing in the Humanities, King's College London
                   www.kcl.ac.uk/humanities/cch/humanist/
                     Submit to: humanist@princeton.edu

         Date: Thu, 03 Apr 2003 08:24:20 +0100
         From: Edward Vanhoutte <evanhoutte@kantl.be>
         Subject: TOC Literary & Linguistic Computing 17/3

Literary and Linguistic Computing

Volume 17, Issue 3, September 2002

Articles

- 'Delta': a Measure of Stylistic Difference and a Guide to Likely
Authorship
John Burrows
pp. 267-287
This paper is a companion to my 'Questions of authorship: attribution
and beyond', in which I sketched a new way of using the relative
frequencies of the very common words for comparing written texts and
testing their likely authorship. The main emphasis of that paper was not
on the new procedure but on the broader consequences of our increasing
sophistication in making such comparisons and the increasing (although
never absolute) reliability of our inferences about authorship. My
present objects, accordingly, are to give a more complete account of the
procedure itself; to report the outcome of an extensive set of trials;
and to consider the strengths and limitations of the new procedure. The
procedure offers a simple but comparatively accurate addition to our
current methods of distinguishing the most likely author of texts
exceeding about 1,500 words in length. It is of even greater value as a
method of reducing the field of likely candidates for texts of as little
as 100 words in length. Not unexpectedly, it works least well with texts
of a genre uncharacteristic of their author and, in one case, with texts
far separated in time across a long literary career. Its possible use
for other classificatory tasks has not yet been investigated.
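
[Editorial illustration, not part of the abstract.] The abstract does not
spell out the computation. The sketch below follows the commonly cited
formulation of Delta: take the most frequent words, convert each text's
relative frequencies to z-scores, and average the absolute differences
between the test text and each candidate. The function names, the
whitespace tokenizer, the tiny sample texts and the choice of computing
means and standard deviations from the candidate set alone are all
assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(text, vocabulary):
    """Relative frequency of each vocabulary word in a text (naive tokenizer)."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in vocabulary]


def delta_scores(test_text, candidate_texts, n_words=30):
    """Return {author: delta}; a lower score means a closer stylistic match."""
    # Pool the candidate texts to pick the most frequent words.
    pooled = Counter()
    for text in candidate_texts.values():
        pooled.update(text.lower().split())
    vocabulary = [w for w, _ in pooled.most_common(n_words)]

    profiles = {
        author: relative_frequencies(text, vocabulary)
        for author, text in candidate_texts.items()
    }
    test_profile = relative_frequencies(test_text, vocabulary)

    # Per-word mean and standard deviation across the candidate profiles.
    columns = list(zip(*profiles.values()))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]

    def z_scores(profile):
        # Words with no variation across candidates contribute nothing.
        return [
            (f - m) / s if s > 0 else 0.0
            for f, m, s in zip(profile, means, sds)
        ]

    test_z = z_scores(test_profile)
    return {
        author: mean(abs(a - b) for a, b in zip(z_scores(profile), test_z))
        for author, profile in profiles.items()
    }


if __name__ == "__main__":
    candidates = {
        "A": "the cat and the dog and the bird",
        "B": "of mice of men of course it is",
    }
    print(delta_scores("the cat and the dog and the bird", candidates, n_words=3))
```

With realistic data the candidate texts would be long samples per author
and n_words would be far larger than in this toy call.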

- The Pascal Digital Archive
Shuji Shiraishi, Yutaka Wada and Shou Fujimura
pp. 289-310
This paper presents an overview of the Pascal Database System. The
Pascal Database includes all the text from the Œuvres complètes de
Blaise Pascal in four volumes. The online database was released
experimentally in October 2000. It is possible to display material,
perform a vocabulary search, and make frequency lists of material in the
database via the Internet. The content display can access each volume,
plus manuscript data, edition, references, annotations of J. Mesnard,
and other documents, which is a great advantage when studying the
material. The vocabulary search can perform Boolean searches with 'And',
'Or', and 'Not', and can also use the wild card '*'. Frequency
lists can be made using alphabetical or frequency order, and it is even
possible to create a list based on the alphabetical order of the
reversed words. Finally, we comment on the personal pronouns in Pascal's
letters and discuss the uses of the word figure in the second volume of
Pascal's work.
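
[Editorial illustration, not part of the abstract.] The three orderings of
the frequency list, and a wild-card lookup, can be sketched as follows.
This is not the Pascal Database System's own code: the function names, the
whitespace tokenizer and the assumption that the wild card behaves like
the familiar '*' of shell-style patterns are all hypothetical.

```python
from collections import Counter
import fnmatch


def frequency_list(text, order="alphabetical"):
    """Return (word, count) pairs sorted in one of three orders."""
    counts = Counter(text.lower().split())
    if order == "alphabetical":
        key = lambda item: item[0]
    elif order == "frequency":
        # Most frequent first; ties broken alphabetically.
        key = lambda item: (-item[1], item[0])
    elif order == "reverse":
        # Alphabetical order of the reversed word, which groups
        # words sharing an ending.
        key = lambda item: item[0][::-1]
    else:
        raise ValueError(f"unknown order: {order}")
    return sorted(counts.items(), key=key)


def wildcard_search(words, pattern):
    """Return the words matching a shell-style pattern such as 'fig*'."""
    return [w for w in words if fnmatch.fnmatchcase(w, pattern)]


if __name__ == "__main__":
    sample = "figure nature figure mesure"
    print(frequency_list(sample, order="frequency"))
    print(frequency_list(sample, order="reverse"))
    print(wildcard_search(["figure", "figuratif", "nature"], "fig*"))
```

Sorting on the reversed word is what makes a list of this kind useful for
studying suffixes and rhymes.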

- How Accurate Were Scribes? A Mathematical Model
Matthew Spencer and Christopher J. Howe
pp. 311-322
Until printing was invented, texts were copied by hand. The probability
with which changes were introduced during copying was affected by the
kind of text and society. We cannot usually estimate the probability of
change directly. Instead, we develop an indirect method. We derive a
relationship between the number of manuscripts in the tradition and the
mean number of copies separating a randomly chosen pair of manuscripts.
Given the rate at which the proportion of words that are different
increases with the mean number of copies separating two manuscripts, we
can then estimate the probability of change. We illustrate our method
with an analysis of Lydgate's medieval poem The Kings of England.

- Computer-Assisted Teaching of Translation Methods
Chi-Chiang Shei and Helen Pain
pp. 323-343
This paper introduces an intelligent tutoring system designed to help
student translators learn to appreciate the distinction between literal
translation and liberal translation, an important and forever debated
point in the literature of translation, and some other methods of
translation lying between these two extremes. We identify four prominent
kinds of translation methods commonly discussed in the translation
literature (word-for-word translation, literal translation, semantic
translation, and communicative translation) and attempt to extract
computationally expedient definitions for them from two researchers'
discussions on them. We then apply these computational definitions to
the preparation of our translation corpus to be used in the intelligent
tutoring system. In the basic working mode the system offers a source
sentence for the student to translate, compares it with the inbuilt
versions, and decides on the most likely method of translation used
through a translation unit matching algorithm. The student can guess
where on the literal and liberal continuum their translation stands by
viewing this verdict and by comparing their translation with other
versions for the same sentence. In the advanced working mode, the
student learns some translation techniques such as the contrastive
analysis approach to teaching translation, while appreciating the
working of translation methods in relation to these techniques.

- Encoding Medieval Abbreviations for Computer Analysis (from
Latin-Portuguese and Portuguese Non-literary Sources)
Stephen R. Parkinson and António H. A. Emiliano
pp. 345-360
This paper proposes a solution to the problem of handling scribal
abbreviations in TEI-conformant transcriptions of medieval texts,
following a conservative editorial strategy. A key distinction is drawn
between alphabetic abbreviations, which represent sequences of letters,
and logographic abbreviations which represent whole words. The TEI
elements <expan> and <abbrev> can be used
systematically to separate these two types: alphabetic abbreviations
will be expanded in the main text, recording the abbreviated form
(including TEI entities representing the main abbreviation marks) as an
attribute of <expan>, while logographic abbreviations will be
represented in their abbreviated form, with the expanded form recorded
as an attribute of <abbrev>. The proposals are illustrated
from common abbreviations and short text samples from tenth-century
Latin-Portuguese and thirteenth-century Old Portuguese.
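
[Editorial illustration, not part of the abstract.] The two-way scheme can
be sketched as a pair of helper functions that emit the markup. The
element names come from the abstract; the attribute names abbr and expan,
the helper names and the sample forms ("q~" standing in for an abbreviated
"que", "&" for "et") are assumptions chosen for illustration and are not
taken from the paper.

```python
from xml.sax.saxutils import escape, quoteattr


def encode_alphabetic(expanded, abbreviated):
    """Alphabetic abbreviation: expansion in the text, abbreviated form
    kept as an attribute (attribute name 'abbr' is an assumption)."""
    return f"<expan abbr={quoteattr(abbreviated)}>{escape(expanded)}</expan>"


def encode_logographic(abbreviated, expanded):
    """Logographic abbreviation: abbreviated form in the text, expansion
    kept as an attribute (attribute name 'expan' is an assumption)."""
    return f"<abbrev expan={quoteattr(expanded)}>{escape(abbreviated)}</abbrev>"


if __name__ == "__main__":
    # Hypothetical sample forms, not drawn from the paper's corpus.
    print(encode_alphabetic("que", "q~"))
    print(encode_logographic("&", "et"))
```

A search over the main text then sees full letter sequences for
alphabetic abbreviations while word signs stay as the scribe wrote them.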

Reviews

- David Crystal: Language and the Internet
Reviewed by Jean Aitchison
pp. 361-367

- Darrel Ince: A Dictionary of the Internet
Reviewed by Jean Aitchison
pp. 361-367

- I. Dan Melamed: Empirical Methods for Exploiting Parallel Texts
Reviewed by Dan Tufis
pp. 368-370

- M. Stubbs: Words And Phrases
Reviewed by Oliver Mason
pp. 370-372

--
=============
Edward Vanhoutte
Co-ordinator
Centrum voor Teksteditie en Bronnenstudie - CTB (KANTL)
Centre for Scholarly Editing and Document Studies
Reviews Editor, Literary and Linguistic Computing
Koninklijke Academie voor Nederlandse Taal- en Letterkunde
Royal Academy of Dutch Language and Literature
Koningstraat 18 / b-9000 Gent / Belgium
tel: +32 9 265 93 51 / fax: +32 9 265 93 49
evanhoutte@kantl.be
http://www.kantl.be/ctb/
http://www.kantl.be/ctb/vanhoutte/
