Date: Fri, 18 Dec 1998 22:22:37 -0500
From: John Unsworth <jmu2m@virginia.edu>
Subject: ACH-ALLC Keynotes

The organizers of the 1999 joint international conference of the Association
for Computers and the Humanities and the Association for Literary and
Linguistic Computing are pleased to announce that the conference's two keynote
speakers will be Cathy Marshall and George Farr.

Cathy Marshall is a member of the research staff at the Xerox Palo Alto
Research Center. She has led a series of projects investigating analytical
work practices and collaborative hypertext, including two system
development projects, Aquanet (named after the hairspray) and VIKI. Her
recent publications include "Making Metadata: a study of metadata creation
for a mixed physical-digital collection" in Proceedings of the ACM Digital
Libraries '98 Conference, Pittsburgh, PA (June 23-26, 1998), pp. 162-171
(winner of the 1998 Vannevar Bush Best Paper Award), and "Toward an ecology of
hypertext annotation" in Proceedings of ACM Hypertext '98, Pittsburgh, PA
(June 20-24, 1998), pp. 40-49 (winner of the 1998 Engelbart Best Paper Award).
More information about Ms. Marshall can be found at

George Farr is the Director of the Division of Preservation and Access at the
National Endowment for the Humanities. This division provides support for
projects that will create, preserve, and increase the availability of
resources important for research, education, and lifelong learning in the
humanities. It is from this division that NEH makes its grants for
textbases in the humanities and research and demonstration projects that
focus on the use of digital technology in the humanities. Mr. Farr has
also been the principal NEH representative to the second round of the
Digital Library Initiative, co-sponsored by the National Science
Foundation, NEH, the Defense Advanced Research Projects Agency, The
National Library of Medicine, The Library of Congress, and The National
Aeronautics & Space Administration. At the request of the Library of
Congress, Mr. Farr also designed and helped implement the evaluation of
proposals submitted to the Ameritech/LC National Digital Library
Competition. More information about the Division of Preservation and
Access is available at: http://www.neh.gov/html/div_pres.html

Date: Mon, 21 Dec 1998 16:14:53 -0500
From: David Green <david@ninch.org>

December 21, 1998

December RLG DigiNews Available
"Digital Archiving: Approaches for Statistical Files,
Moving Images, and Audio Recordings."

The December 1998 (Volume 2, Number 6) issue of RLG DigiNews is now
available at:

http://www.rlg.org/preserv/diginews/ (from all points other than Europe)
http://www.thames.rlg.org/preserv/diginews/ (from Europe)

Oya Rieger, Co-Editor of RLG DigiNews, opens this issue's feature article,
"Digital Archiving: Approaches for Statistical Files, Moving Images, and
Audio Recordings." The article addresses digital archiving from a different
viewpoint - responding to the digital preservation needs presented by
different types of digital material. A set of co-authors describe the
problems and issues associated with their specific material (statistical
files, moving images and audio files) in:

- Archiving Statistical Data: The Data Archive at the University of Essex
by Simon Musgrave and Bridget Winstanley;
- Universal Preservation Format (UPF): Conceptual Framework
by Thom Shepard; and
- Norwegian Digital Radio Archive Initiative
by Svein Arne Brygfjeld and Svein Arne Solbakk.

This issue's technical feature addresses a question often raised at the
beginning of digital imaging projects - "What kind of digital camera or
scanner should I use?" Peter Hirtle, Assistant Director, Cornell Institute
for Digital Collections, and Carol DeNatale, Registrar, Herbert F. Johnson
Museum of Art, Cornell University, relate their experience in "Selecting a
Digital Camera: the Cornell Museum Online Project."

Rounding out this issue is a current calendar of events, project
announcements, a highlighted web site, and a FAQ regarding technical issues
associated with UMI's Digital Vault Initiative.

For more information about RLG or PRESERV, please contact Robin Dale

Date: Mon, 4 Jan 1999 15:35:45 -1000 (HST)
From: "Philip A. Bralich, Ph.D." <bralich@hawaii.edu>

Ergo Linguistic Technologies would like to announce its first
annual parsing contest based on a fixed set of sentences and a
fixed set of tasks to be performed on that set of sentences.
The area of NLP to be explored is that of increased syntactic
analysis to provide: 1) improvements in navigation and control
technology through more complex grammar, 2) improvements in the
implementation of question/answer, statement/response dialogs
with computers and computer characters, and 3) improvements in
web and database searching using natural language queries.

The contest will be based on a comparison of results for parses of
a fixed set of sentences (included at end of this message) and
various tasks that can be performed as a result of those parses.
That is, the comparison will be based on the actual parse tree and
the ability to use that parsed output to generate theory independent
parse trees and output and to perform various NLP tasks. The
judging will be based on the standards for evaluating NLP that have
been proposed previously on this list by myself and Derek Bickerton
and which are currently being developed into an ISO standard for the
Virtual Reality Modeling Language (VRML) as part of the VRML
Consortium's development efforts
(http://www.vrml.org/WorkingGroups/NLP-ANIM). The standards proposed are
theory- and field-independent standards which allow both linguists and
non-linguists to evaluate NLP systems in the areas of navigation and control,
question/answer dialogues, and database and web searching. I will also be at
the annual meeting of the Linguistic Society of America this week in Los
Angeles for those who would like to discuss this in more detail.

The sentences chosen for this contest are rather simple, but as we find
more and more parsers that can accomplish the tasks on this list, we
will add more complex sentences and tasks to the list. Please be aware
that systems that may be designed for large corpora of unrestricted text
actually cannot work in this domain. Thus, while such systems may be
useful for certain searching tasks, they are not useful in the domain
explored in this contest - and this is evidenced by their inability to
perform on tests such as the one provided here.

The full contest instructions and an HTML document of Ergo's results in
this area can be found at http://www.ergo-ling.com. The standards were
designed to allow the developers of a parsing system (statistical or
syntactic) to demonstrate the thoroughness and accuracy of the parses they
produce by using the parsed output to perform a number of straightforward,
traditional syntactic tasks such as changing a statement to a question or
an active to a passive as well as demonstrating an ability to create
standard trees (using the Penn Treebank II guidelines) and standard
grammatical analyses. All the standards chosen were chosen to be theory
independent measures of the accuracy of a parse through the use of standard
and ordinary grammatical and syntactic output.
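
As an illustration of the kind of bracketed "standard tree" output referred to
above, the following is a minimal sketch in Python. The tuple representation,
the function name bracket, and the particular (simplified, function tags
omitted) bracketing of the first test sentence are assumptions made for
illustration only; they are not taken from the contest materials or from
Ergo's parser.

```python
def bracket(node):
    """Render a (label, child, ...) tuple as a bracketed string.

    Leaves are plain strings (the words); every other node is a tuple whose
    first item is the category label. This layout is hypothetical.
    """
    if isinstance(node, str):
        return node
    label, *children = node
    return "(" + label + " " + " ".join(bracket(c) for c in children) + ")"


# One plausible, simplified bracketing of "there is a dog on the porch".
tree = ("S",
        ("NP", ("EX", "there")),
        ("VP", ("VBZ", "is"),
               ("NP", ("NP", ("DT", "a"), ("NN", "dog")),
                      ("PP", ("IN", "on"),
                             ("NP", ("DT", "the"), ("NN", "porch"))))))

print(bracket(tree))
```

A tree stored this way can be walked by other code, for example to pull out
the subject or to reorder constituents, which is the sort of follow-on task
the contest asks for.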

The contest officially begins on January 15th and will be closed on March
31st. This will allow developers 2.5 months to develop tools and to work
with trouble spots that they may have with the set of sentences offered in
this contest. The contest will be offered in subsequent years from January
to March. Over time, we hope the parsers, the contest rules, and the
test sentences will all grow in sophistication and scope. However, as most
parsers have existed many more years than ours, it is reasonable to think
these tools exist already.

Anyone who joins must submit an HTML document and the parser (source code
only) that created it. The parser can be in any format but it must require
a minimum of effort for the contest judges to set up and run. For example,
a WIN95 interface that takes input files and produces the HTML output file
would be considered a minimum effort parser. There will be tests to ensure
that the output is genuine parsed output rather than a synthesis such as a
series of print calls that merely present the correct output for a
particular string rather than generating it.

The HTML files of all contestants will be made available at the Ergo web
site (http://www.ergo-ling.com). Those who wish to join even though their
parsing system is not robust or complete enough for all the tasks or all
the sentences in the contest are also welcome. Reviewers will then
look at these documents as promising parsers for future contests. Their
results will be posted on our web site as well.

Judging will be based on the percentage of sentences that parsed, the
percentage of the tasks that are completed, and on the accuracy of the parses
that result and the success on the parsing tasks. Currently, the judges will
be Derek Bickerton and myself, but we will welcome others to join in the task.
Because of the home court advantage of the judges, there will be printed
reports of the judging available on the Ergo web site for review by the
overall community of professionals in this area. Complaints or criticisms
will also be posted.

Anyone who would like to review the judging and the comments on the judging
is welcome to do so. Anyone who wishes to be a volunteer judge may also
contact us. However, the criteria for all judging will be the accuracy of
the parser in creating a correct parse of all the sentences and completing
all the tasks set forth in the test materials.
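
As a rough illustration of the judging criteria described above (percentage of
sentences parsed, percentage of tasks completed), a tally might be computed
along the following lines. This is only a sketch: the record layout, the names
SentenceResult and summarize, and the sample figures are hypothetical and are
not part of the contest rules.

```python
from dataclasses import dataclass


@dataclass
class SentenceResult:
    parsed: bool          # did the parser produce a parse for this sentence?
    tasks_attempted: int  # tasks the judges tried on this sentence
    tasks_passed: int     # tasks completed correctly


def summarize(results):
    """Return (percent of sentences parsed, percent of tasks completed)."""
    total = len(results)
    parsed = sum(1 for r in results if r.parsed)
    attempted = sum(r.tasks_attempted for r in results)
    passed = sum(r.tasks_passed for r in results)
    pct_parsed = 100.0 * parsed / total if total else 0.0
    pct_tasks = 100.0 * passed / attempted if attempted else 0.0
    return pct_parsed, pct_tasks


# Invented sample data: four sentences, four tasks attempted on each.
sample = [
    SentenceResult(True, 4, 4),
    SentenceResult(True, 4, 2),
    SentenceResult(False, 4, 0),
    SentenceResult(True, 4, 3),
]

print(summarize(sample))
```

Keeping the two percentages separate, rather than folding them into a single
score, mirrors the way the criteria are listed in the announcement.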

We would like this contest to remain open not only to challengers but also
to those who would like to design and improve the contest itself through the
addition of more sentences or more tasks to the parsing task. There is
one condition, however, on being able to do this: we will hold rigidly to the
rule that those who would improve on or add to the contest must first meet
the original challenge at a minimum level of 75% accuracy before being
allowed to contribute. We are starting with a small set of relatively simple
sentences to make this as available as possible to as many people as possible.
In this manner researchers in industry, academia, and government will be able
to compare their results without exposing any proprietary or confidential
information. We also do not want the contest to be unduly influenced by those
who would like to target some ideal of parsing that is not thoroughly grounded
in what is currently possible in these domains.

At a Virtual Reality and Multi-Media Conference in Japan (VSMM '98), Ergo was
awarded the "Best Technical Award" for its NLP technology. I believe the
main reason that judges and others were able to notice this is because I wa[...]
IS BEING HELD HOSTAGE BY GRAMMAR." And then I went on to explain that the
main reason many VR and Multi-Media sites and programs are not catching on
is because their users cannot ask even a simple question of the characters
about the objects they encounter. Thus, a UNESCO virtual world such as a
reconstructed cathedral will receive many visitors but they will not stay
and explore because they cannot ask even the simplest questions like "How
many stairs in this Cathedral?" "When was the Nave built?" and so on. I
then pointed out that while speech and graphics were actually ready to work
with such projects, their grammatical abilities are so limited that
no one is using them with these products. The missing link between speech, [...]
and multi-media and users actually talking to avatars and sites is GRAMMAR.
When I then demonstrated that this was so with the use of the Ergo tools, we
won the award. The main reason I am sponsoring this contest is so that all
linguists and NLP researchers who would like to participate in this very large
future source of jobs can do so as soon as possible. So in order to
stimulate research and interest this contest is proposed.

AREAS.

The full set of sentences for the contest is available at the
http://www.ergo-ling.com web site. This list contains five from each of the
three sections: 1) theory independent parsing, 2) navigation and control, and
3) Question/answer, statement/response repartee. The full list contains 10[...]
sentences and will grow and be modified over the years as this annual contest
takes root.

Section 1: Theory independent parsing.
    1.  there is a dog on the porch
    2.  John's house is bigger than mary's house
    3.  the tall thin man in the office is reading a technical report
    4.  the man who mary likes is reading the book that john gave her
    5.  learning how to cope with stress is of primary importance in the work world

Section 2: Navigation and Control.
    1.  Erase all files that end in .doc
    2.  print the file called teach.doc
    3.  send an email to bob that says "meeting at eight"
    4.  send a fax to bob that says "there is a meeting at eight tonight"
    5.  go to yahoo and find information about golf courses in Georgia

Section 3: Question and Answer/Statement Response Repartee.
    1.  bill's email is bill@server.com
            what is bill's email address
            what is bill's email
    2.  john has romantic books
            what kind of books does john have
    3.  My appointment with bob is at six o'clock
            what time is my appointment
            what time are my appointments
    4.  the tall thin man in the office is reading a technical report book
            what is the man reading
            what is the man doing
            is the man reading a report
            who is reading a report
    5.  John gave mary a book because it was her birthday
            who gave mary a book
            who did john give a book
            what did john give mary
            why did john give mary a book
            did john give mary a book
            did john give mary a book because it was her birthday
            did john give mary a pencil
            did john give mary a book because it was bob's birthday

Philip A. Bralich, Ph.D.
President and CEO
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822

Tel: (808)539-3920
Fax: (808)539-3924

Date: Tue, 05 Jan 1999 23:15:56 +0200
From: Costis Dallas <dallas@hol.gr>
Subject: Entopia: Museums, Cultural Heritage and IT (GR/EN)

Announcing Entopia: Museums, Cultural Heritage and Information Technology

Athens, 3 January 1999. - Entopia is a new web site, mostly in the Greek
language, hosting news, personal commentary and brief presentations of
resources in the domain of museums, cultural heritage, archaeology,
cultural management and policy, new media and information technology. It is
intended as a communication medium for Greek museum professionals,
archaeologists and cultural information specialists on relevant national
and international developments. The notices section summarises and points to
items of interest (interactive exhibits, conferences etc.) as reported in
the major mailing lists of the field. A selective index of web sites
relevant to these subjects, with brief presentations, is planned for the

You may visit Entopia at the following address:


You may also subscribe to the entopia mailing list, providing a regular
digest of the web site:


Information items of interest to potential Entopia visitors, written
contributions, and comments on the content and form of the site, are
especially welcome.

Best regards,

Dr Costis Dallas (mailto:dallas@hol.gr)
Entopia site curator


[Message in Greek follows - requires monotonic Greek fonts]

Date: Thu, 07 Jan 1999 14:46:50 +0800
From: Toby Burrows <tburrows@library.uwa.edu.au>
Subject: Recent books: XML, SGML, and Web sites

Recent books: XML, SGML, and Web sites

If the number of books about it is any indication, XML - the eXtensible
Markup Language - is one of the most important recent developments in
computing. In the last twelve months, no less than 24 books on XML have
been published; another 15 will be appearing in the first half of 1999. The
good news is that XML is expected to revolutionize Web publishing. It is
much more sophisticated and flexible than HTML but less complicated than the
full SGML standard. The bad news is that there is as yet comparatively
little software for creating and viewing XML documents. XML has major
implications for computing in the humanities, but you may prefer to ignore
it until the Web browsers support it. On the other hand, if you want to
anticipate its effects and explore its implications now, here are some
places to start. Most are comparatively technical, but some are more
approachable than others.
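
To make the contrast with HTML concrete: in XML the element names are chosen
by the document's author rather than fixed in advance. The fragment below is a
minimal sketch using Python's standard xml.etree.ElementTree module; the
element and attribute names (booklist, book, year, author, title) are invented
for illustration and do not come from any of the books listed here.

```python
import xml.etree.ElementTree as ET

# A small XML document with author-defined element names.
doc = """<booklist>
  <book year="1998">
    <author>Flynn, Peter</author>
    <title>Understanding SGML and XML tools</title>
  </book>
  <book year="1998">
    <author>Simpson, John E.</author>
    <title>Just XML</title>
  </book>
</booklist>"""

# Parse the text into a tree and collect the text of each <title> element.
root = ET.fromstring(doc)
titles = [book.findtext("title") for book in root.findall("book")]
print(titles)
```

Because the markup names what each piece of content is, a program can select
titles or authors directly instead of guessing from presentation tags.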

Dr Toby Burrows
Scholars' Centre, University of Western Australia

Flynn, Peter. Understanding SGML and XML tools: practical programs for
handling structured text. Boston: Kluwer Academic Publishers, 1998. xxvi,
432 p. + 1 CD-ROM. ISBN 0-7923-8169-6 US$84.00

Flynn introduces and discusses a wide range of software applicable to SGML
and XML documents, including editors, parsers, converters, and viewers. A
selection of these programs - mostly freeware - is on the accompanying
CD-ROM. Though he also provides an introduction to SGML and XML, Flynn's
book is perhaps not the best starting-point for complete novices. But it's a
remarkably valuable and very practical resource for the slightly more
experienced user. A bonus for humanities scholars is the inclusion of
helpful material on the Text Encoding Initiative (TEI).

Goldfarb, Charles F., and Paul Prescod. The XML handbook. Upper Saddle
River, NJ: Prentice-Hall PTR, 1998. xliv, 639 p. + 1 CD-ROM. (Charles F.
Goldfarb series on open information management) ISBN 0-13-081152-1 US$44.

Goldfarb - the main force behind the development of SGML - now turns his
attention to XML. This is really two books in one: an authoritative, if
somewhat technical, exposition of the XML specifications, and a series of
case studies using specific commercial software. The case studies, sponsored
by the software companies involved, are descriptive rather than evaluative,
but they give a good idea of the range of realistic and practical
applications for XML. The accompanying CD-ROM contains no less than 55
different pieces of free software, as well as demonstrations from the
sponsors and copies of XML-related standards and specifications.

Harold, Elliotte Rusty. XML: Extensible Markup Language. Foster City, CA:
IDG Books, 1998. xxiii, 426 p. + 1 CD-ROM. ISBN 0-7645-3199-9 US$39.99

This XML guide is aimed at Web site developers, and assumes a familiarity
with such tools as HTML and JavaScript. Harold provides a thorough, but
fairly technical, explanation of the main features of XML, including the use
and creation of Document Type Definitions (DTDs) and style sheets. Among the
other topics covered are links and pointers, and the Channel Definition
Format (CDF). The full text of the XML specification is included in an
appendix. The accompanying CD-ROM contains the examples from the book.

Jelliffe, Rick. The XML & SGML cookbook: recipes for structured information.
Upper Saddle River, NJ: Prentice-Hall PTR, 1998. xxvii, 621 p. + 1 CD-ROM.
(Charles F. Goldfarb series on open information management) ISBN
0-13-614223-0 US$55.00

Document structures and patterns - as expressed in terms of SGML - are
Jelliffe's focus in this book, which is pitched at a specialized technical
level. He discusses techniques for designing and building Document Type
Definitions (DTDs), with helpful advice drawn from practical experience. The
second half of the book deals with character sets and the representation of
special characters in SGML. The accompanying CD-ROM contains the DTDs
developed in the text, as well as various character sets and some SGML- and
XML-based software.

Leventhal, Michael, David Lewis, Matthew Fuchs. Designing XML Internet
applications. Upper Saddle River, NJ: Prentice-Hall PTR, 1998. xxxii, 582 p.
+ 1 CD-ROM. (Charles F. Goldfarb series on open information management)
ISBN 0-13-616822-1 US$44.95

Though this book is intended mainly for programmers with experience in
constructing dynamic Web sites, the brisk introduction to XML concepts and
tools could prove very useful for a less technical audience. Leventhal and
his colleagues focus on ways of using Perl and Java to build XML Internet
applications, with six worked examples which include a bulletin board, a
search engine, and a document conversion tool. The explanations are clear
and detailed, and the broader architectural issues are nicely brought out.
The accompanying CD-ROM contains Java and Perl tools, as well as XML
material and software.

McGrath, Sean. XML by example: building e-commerce applications. Upper
Saddle River, NJ: Prentice-Hall PTR, 1998. xlviii, 470 p. + 1 CD-ROM.
(Charles F. Goldfarb series on open information management) ISBN
0-13-960162-7 US$49.95

XML is expected to have a significant impact on electronic commerce. Sean
McGrath's book gives managers and developers of commercial Web sites a
detailed look at the XML specifications, including hypertext links and
formatting with style sheets. He also discusses several current applications
of XML in the area of electronic commerce, as well as looking at the
benefits and commercial advantages of using XML. The accompanying CD-ROM
contains a selection of XML-based software, together with some sample
Document Type Definitions (DTDs) and various documents about XML.

Megginson, David. Structuring XML documents. Upper Saddle River, NJ:
Prentice-Hall PTR, 1998. xxxvii, 420 p. + 1 CD-ROM. (Charles F. Goldfarb
series on open information management) ISBN 0-13-642299-3 US$39.95

Document Type Definitions (DTDs) are crucial to both XML and SGML, and
provide specific markup languages for particular types of documents.
Megginson offers a detailed and exhaustive look at DTDs: how to analyse
them, how to build them or adapt existing models, and how to link DTDs using
the "architectural forms" methodology. Five DTDs are analysed, including
HTML 4.0, the Text Encoding Initiative's TEI-Lite, and ISO 12083 - the
publishing industry's DTD for books, serials, and articles. The treatment is
very thorough, but definitely for experts. The accompanying CD-ROM includes
the five DTDs, plus a selection of XML-related software.

Powell, Thomas A., with David L. Jones and Dominique C. Cutts. Web site
engineering: beyond Web page design. Upper Saddle River, NJ: Prentice-Hall
PTR, 1998. x, 324 p. ISBN 0-13-650920-7 US$39.95

As Web sites have grown bigger, their technical characteristics have become
more complex. Dynamic, programmed sites are replacing static collections of
HTML pages. Powell and his co-authors look at ways of designing and
engineering large Web sites, from defining the problem and analysing
requirements through to building, implementation, and testing. With its
pragmatic and realistic approach, pitched at a level which is not too
technical, this is a very valuable guide for managers who need to make
strategic decisions about Web site projects.

Simpson, John E. Just XML. Upper Saddle River, NJ: Prentice-Hall PTR, 1998.
xiv, 381 p. ISBN 0-13-943417-8 US$34.99

Simpson gives a straightforward introduction to the main features of XML,
with plenty of material on links and pointers, and styles and stylesheets.
Document Type Definitions (DTDs) are also covered quite fully. XML-related
software is listed and discussed. An entertaining example runs through the
book: using XML to describe and catalogue "B" movies. This is a clearly
written guide to the basics of XML, which is not overly technical in
approach. But the wider context is not covered: current and future
applications of XML are not examined in any detail, and there is little
attempt to relate it to HTML or SGML.

St. Laurent, Simon. XML: a primer. Foster City, CA: MIS:Press, 1998. xix,
348 p. ISBN 1-5582-8592-X US$24.99

St. Laurent's introduction to XML is aimed at people with substantial
experience in using HTML and developing Web sites. Though there is a
succinct explanation of XML's main features - particularly Document Type
Definitions (DTDs) - the main focus is on potential applications for XML,
and on its relationship to existing tools like HTML and Cascading Style
Sheets (CSS). Especially interesting are the author's comments on the likely
effects of XML on Web browser software and the architecture of Web sites.

Tittel, Ed, Norbert Mikula & Ramesh Chandak. XML for dummies. Foster City,
CA: IDG Books, 1998. xxviii, 377 p. + 1 CD-ROM. ISBN 0-7645-0360-X US$29.

This is one of the best guides to XML for non-technical people. Tittel and
his co-authors provide a lively and clear account of the main features of
XML, as well as a good explanation of its value and its relationship to SGML
and HTML. There is also an extensive look at the ways in which XML is
already being applied in various disciplines. All this is presented in the
familiar "Dummies" style, with easy-to-read layouts and plenty of graphics.
The accompanying CD-ROM contains the text of the book and examples from it,
together with a range of free and evaluation software tools for XML.

Vint, Danny R. SGML at work. Upper Saddle River, NJ: Prentice-Hall PTR,
1998. xvi, 845 p. + 1 CD-ROM. ISBN 0-13-636572-8 US$55.00

Vint covers the major stages of a publishing process based on SGML
documents: developing a Document Type Definition (DTD), converting non-SGML
("legacy") documents, constructing and editing SGML documents, delivering
documents in printed or on-line form, and managing documents. Each section
is closely linked to the use of specific software, with worked examples.
Technical knowledge and familiarity with SGML are assumed. The accompanying
CD-ROM contains a variety of free and shareware software tools, and sample
documents for use with commercial programs.

Date: Sat, 09 Jan 1999 17:19:15 +0000
From: SJ Stauffer <stauffes@gusun.georgetown.edu>
Subject: Platform Independent Perseus 2.0

4. Platform Independent Perseus 2.0
Classics Technology Center [.pdf]
Perseus Project -- Tufts University

The Perseus Project at Tufts University (discussed in the October 17, 1997
Scout Report) is an ongoing initiative to create a comprehensive,
interactive, multimedia digital library for the study of Archaic and
Classical Greece. Recently, the Perseus Project released a free beta
version of Platform Independent Perseus 2.0. PIP2 is a graphical user
interface for the Perseus 2.0 database, the latest version of the digital
library. Once users install the interface locally, they may seamlessly
access and navigate the numerous texts, maps, and images available on the
Perseus server via an Internet connection. The benefit of PIP2 is that
users are provided with a specialized interface for the online database and
are able to avoid the annoying encumbrances encountered when using an
unwieldy Web browser. Installation requirements and downloading
instructions are posted at the site for both Mac and Windows operating
systems. The Classics Technology Center, provided by Able Media, offers a
collection of free materials to help educators and students make the most
of Perseus. Provided in .pdf format, these include instructions for each
Perseus component, tips and tricks, curriculum guides, teachers'
companions, and a showcase of published works from students and educators
around the world. [AO], [MD]

Date: Sat, 09 Jan 1999 17:31:07 +0000
From: Willard McCarty <Willard.McCarty@kcl.ac.uk>
Subject: online dictionaries

Humanists may wish to know of the Online Dictionary page maintained by the
Neuroscience department, Florida State University,

Humanist Discussion Group
Information at <http://www.kcl.ac.uk/humanities/cch/humanist/>