9.669 CHWP announcement

Humanist (mccarty@phoenix.Princeton.EDU)
Fri, 29 Mar 1996 19:30:42 -0500 (EST)

Humanist Discussion Group, Vol. 9, No. 669.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)
Information at http://www.princeton.edu/~mccarty/humanist/

[1] From: Russon Wooldridge <wulfric@epas.utoronto.ca> (196)
Subject: Announcement of Computing in the Humanities Working

***Computing in the Humanities Working Papers***
ISSN 1205-5743

[French version follows the English text.]

The editors of TCH Working Papers are pleased to announce a number of
changes and additions to our online publication series.

First, we have decided to change its name, from TCH to CH Working Papers
(Computing in the Humanities Working Papers, or CHWP). The new name is
intended to reflect more clearly the international nature of an Internet
publication with readers and contributors from around the world. The new URL
of the series is now


(CHASS, Computing in the Humanities and Social Sciences, is the new
administrative entity at the University of Toronto that provides the online
service.) Please make a note of this URL. The old address will
continue to work for some time.

Second, to encourage participation from computing humanists world-wide and
signify the importance of our field, we have created an international
Advisory Board, or Comite de patronage, of distinguished scholars, as follows:

John Burrows (University of Newcastle, Australia),
Susan Hockey (CETH, Rutgers University, USA),
Stig Johansson (University of Oslo, Norway),
Wilhelm Ott (University of Tuebingen, Germany),
Claude Poirier (Universite Laval, Canada) and
Bernard Quemada (Conseil superieur de la langue francaise, France)

Third, we have added a new member to the Editorial Board, Michael
Sperberg-McQueen, who has also agreed to serve as Associate Editor for
German. His active competence in that language allows us to extend the scope
of possible contributions. (We continue to welcome contributions in
languages other than English, French, and now German, and are pledged to
respond accordingly, as we receive them.) In addition, Michael's leading
role in the Text Encoding Initiative signifies our intention to move toward
adoption of the TEI for our publications.

Fourth, we have established in principle a complementary relationship with
Humanist, which we would like to serve as the regular means of
conversational and epistolary response to the more formal papers in CHWP. In
turn, we would like to suggest to the members of Humanist that they consider
CHWP to be the natural venue for publishing arguments developed from topics
raised in the online seminar.

Fifth, a set of new or recent CHWP titles provides, we think, the basis for
considerable discussion. These titles are as follows:

-- BURROWS, J.F. "Numbering the streaks of the tulip? Reflections on a
Challenge to the Use of Statistical Methods in Computational Stylistics"
(February 1996)
-- KLING, Rob and Lisa COVI. "Electronic Journals and Legitimate Media in
the Systems of Scholarly Communication" (January 1996)
-- SIEMENS, R.G. "Lemmatization and Parsing with TACT Preprocessing
Programs" (February 1996)
-- WINDER, W. "Reading the Text's Mind: Lemmatisation and Interpretation
from a Peircean Perspective" (March 1996)

Here we present a brief summary of their highlights in order to provoke this
discussion. Comments are most welcome.


Online electronic publishing shares some properties with traditional print
publishing and differs from it in others. The analysis of these
differences is receiving attention from a number of scholars, such as KLING
and COVI. Whereas SIEMENS discusses lemmatisation from a traditional
perspective, WINDER takes the differences between the two media into account
in his examination of lemmatisation in computational criticism. In this
context,
"lemmatisation can be defined as the generation of a derivative text
through an algorithm that combines a database (dictionary and tagging
rules) and a source text. In this general acceptation of lemmatisation,
the source text is interpreted -- reformulated -- in the context of the
knowledge stored in the dictionary. How external information, both
intratextual and extratextual, is used to generate such
(re)categorisations is a fundamental problem that traverses all levels
of the interpretative process."
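
The definition quoted above can be illustrated with a minimal sketch. It is
not WINDER's procedure, nor the TACT preprocessing programs discussed by
SIEMENS; the names DICTIONARY, TAGGING_RULES, lemmatise_token and lemmatise,
and the tiny English word list, are assumptions made up for illustration.
The sketch only shows the general shape: a database (dictionary plus tagging
rules) is combined with a source text to generate a derivative text.

```python
import re

# Hypothetical dictionary: surface form -> (lemma, part-of-speech tag).
DICTIONARY = {
    "texts": ("text", "NOUN"),
    "is": ("be", "VERB"),
    "was": ("be", "VERB"),
    "read": ("read", "VERB"),
    "reads": ("read", "VERB"),
}

# Hypothetical fallback tagging rules: (suffix, replacement, tag), tried in order.
TAGGING_RULES = [
    ("ies", "y", "NOUN"),
    ("ing", "", "VERB"),
    ("s", "", "NOUN"),
]


def lemmatise_token(token):
    """Return (lemma, tag) for one token: dictionary first, then suffix rules."""
    form = token.lower()
    if form in DICTIONARY:
        return DICTIONARY[form]
    for suffix, replacement, tag in TAGGING_RULES:
        # Require a stem of at least three letters before stripping a suffix.
        if form.endswith(suffix) and len(form) > len(suffix) + 2:
            return form[: len(form) - len(suffix)] + replacement, tag
    # Nothing in the database applies: keep the form and mark it as unknown.
    return form, "UNKNOWN"


def lemmatise(source_text):
    """Generate a derivative text of (token, lemma, tag) triples."""
    tokens = re.findall(r"[A-Za-z]+", source_text)
    return [(token,) + lemmatise_token(token) for token in tokens]
```

Under these assumptions, lemmatise("The critic reads texts") returns the
triples ("The", "the", "UNKNOWN"), ("critic", "critic", "UNKNOWN"),
("reads", "read", "VERB") and ("texts", "text", "NOUN"). The suffix rules are
deliberately crude and will mislemmatise many words; resolving such cases
requires the intratextual and extratextual information the quotation refers to.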

In WINDER's article,

"Peirce's type/token/tone trichotomy is used to explore some of the
ramifications of the text-generation model of lemmatisation. It is
argued that interpretation in the new medium is ultimately founded
on a kind of quotation, called an attestation. To know what a text
means is to know how it may be involved in attestation generation.
This semantic model establishes a practical, useful, and theoretically
coherent junction between lemmatisation in the narrow sense and complex
critical interpretation."

Traditional type/token tagging is based on the Saussurean langue/parole
dichotomy. The less well-known Peircean type/token/tone trichotomy may well
pose interesting conceptual and methodological problems for endeavours such
as the Text Encoding Initiative.

BURROWS questions whether the statistician's general assumption of
randomness can properly be applied to language (in the Saussurean sense of
langue). He asks:

"Are statistical methods, which take randomness of data as their
starting-point, appropriate to the study of something so highly
systematic as the English language?"

For "predictive stylistics", he proposes:

"the idea of *specimens from a repertoire* instead of the statistician's
usual *samples from a population*, and looks forward to the establishing
of a 'grammar of probabilities' to replace the abstract postulate of
randomness."


Russon Wooldridge & Willard McCarty
Editors, CHWP
University of Toronto
E-mail: epc-chwp@chass.utoronto.ca


[French version]

***Computing in the Humanities Working Papers***
ISSN 1205-5743

The editors of the TCH Working Papers are pleased to announce several
changes and additions.

First, we have decided to change the name of this series of online
publications on the Internet. It is now called the CH Working Papers
(Computing in the Humanities Working Papers, CHWP). The purpose of the
name change is to reflect more clearly the international character of an
electronic publication with readers and contributors in several countries.
The address (URL) of CHWP is


(CHASS, "Computing in the Humanities and Social Sciences", is the name of
the new administrative entity at the University of Toronto responsible for
the server that we use.) The old address will continue to work for some
time.

Second, to encourage participation by humanists who use computing, wherever
they may be, and to signify the importance of this field of study, we have
created an international Advisory Board made up of the following members,
all distinguished scholars:

John Burrows (University of Newcastle, Australia),
Susan Hockey (CETH, Rutgers University, USA),
Stig Johansson (University of Oslo, Norway),
Wilhelm Ott (University of Tuebingen, Germany),
Claude Poirier (Universite Laval, Canada) and
Bernard Quemada (Conseil superieur de la langue francaise, France)

Third, the Editorial Board has gained a new member, Michael
Sperberg-McQueen, who will also serve as Associate Editor for German. (In
addition to English, French and, from now on, German, manuscripts written
in other languages are welcome.) Moreover, Michael's leading role in the
Text Encoding Initiative signals our intention to adopt, in due course, the
TEI encoding scheme for our publication.

Fourth, we have established in principle a complementary relationship with
Humanist, which we would like to see serve as the means of conversational
and epistolary response to the papers in CHWP. In return, we would like to
suggest to the members of Humanist that they consider CHWP as a venue for
publishing formalized arguments arising from the topics raised in this
online seminar.

Finally, several new or recent CHWP papers offer, in our view, ample matter
for debate. The titles are as follows:

-- BURROWS, J.F. "Numbering the streaks of the tulip? Reflections on a
Challenge to the Use of Statistical Methods in Computational Stylistics"
(February 1996)
-- KLING, Rob and Lisa COVI. "Electronic Journals and Legitimate Media in
the Systems of Scholarly Communication" (January 1996)
-- SIEMENS, R.G. "Lemmatization and Parsing with TACT Preprocessing
Programs" (February 1996)
-- WINDER, W. "Reading the Text's Mind: Lemmatisation and Interpretation
from a Peircean Perspective" (March 1996)

We present below a brief summary of the main themes of these papers in
order to invite discussion of them.


Besides the aspects common to both, electronic publishing has properties
that differ from those of traditional print publishing. The analysis of
these differences is attracting more and more attention from scholars such
as KLING and COVI. Whereas SIEMENS treats lemmatisation from a traditional
point of view, WINDER takes the differences between print and the
electronic medium into account in his examination of lemmatisation in
computational criticism. In this context,

"lemmatisation can be described as the generation of a second text
by an algorithm combining a database (dictionary and tagging rules)
and a source text. In this sense of lemmatisation, the source text
is interpreted -- reformulated -- in the context of the information
stored in the dictionary. How information, whether external or
internal to the source text, serves to generate such
(re)categorisations is a fundamental problem that arises at every
level of interpretation." (Excerpt from the French abstract.)

WINDER uses

"a Peircean trichotomy, type/token/tone, to explore the consequences
of this model of lemmatisation based on the generation of texts."
(Ibid.)

Traditional type/token encoding is based on the Saussurean langue/parole
dichotomy. The less well-known Peircean trichotomy may well pose conceptual
and methodological problems for projects such as the Text Encoding
Initiative.

BURROWS questions the validity of applying the concept of randomness to
language.

"Are statistical methods, which take the randomness of data as their
starting-point, suited to the study of an object as highly
systematized as the English language?" (Excerpt from the French
abstract.)

For "predictive statistics", he proposes:

"the idea of *specimens taken from a repertoire* in place of the
statistician's usual *samples drawn from a population*" as well as
"the establishing of a 'grammar of probabilities' that would replace
the abstract postulate of randomness." (Ibid.)


Russon Wooldridge & Willard McCarty
Editors, CHWP
University of Toronto
E-mail: epc-chwp@chass.utoronto.ca
Russon Wooldridge, Department of French, Trinity College,
University of Toronto, Toronto M5S 1H8, Canada
Tel: 1-416-978-2885 -- Fax: 1-416-978-4949
E-mail: wulfric@epas.utoronto.ca
Internet: http://www.epas.utoronto.ca:8080/~wulfric/