14.0299 primitives

From: Humanist Discussion Group (willard@lists.village.Virginia.EDU)
Date: 10/04/00
Next message: Humanist Discussion Group: "14.0300 recommended readings?"
Previous message: Humanist Discussion Group: "14.0296 Latin letter frequency"
              Humanist Discussion Group, Vol. 14, No. 299.
      Centre for Computing in the Humanities, King's College London
              <http://www.princeton.edu/~mccarty/humanist/>
             <http://www.kcl.ac.uk/humanities/cch/humanist/>

        Date: Tue, 03 Oct 2000 13:16:02 +0100
        From: Wendell Piez <wapiez@mulberrytech.com>
        Subject: Re: 14.0295 primitives

Osher Doctorow writes:

> Could I prevail
>upon Wendell to possibly restate his thesis, if any, in one sentence
>comparable to my political history-prehistory declaration that permutations
>of A, B, and N in Shakespearean play contexts contain all the content of
>political history-prehistory?
>
>Yours Faithfully,
>
>Osher Doctorow

I'm afraid not: I really have no talent for such flights, at least in the
context of an e-mail list. Rather, let Osher take the post for what it's
worth to him -- if that's not much, that's perfectly fine; I don't expect
any post I write to be on target for all readers.

Instead (and as long as I'm being summoned back to the floor), I'd like to
try and take the discussion a step further -- I accept Mr. Doctorow's
challenge to be more abstract and far-reaching, even if I'm not more
concise and conclusive. There are five points; please feel free to use your
delete key (or the moral equivalent thereof).

1. There is apparently a difference between "methodological primitives" in
the sense that Ott, Bradley and I were taking them, which is to say core
operations to be performed on a specified data set via an automated
process, and in the sense that Prof. Unsworth means them, as irreducible
operations performed by a scholar as he or she goes about the work of
tracing, understanding, and presenting a thesis about a text or subject of
research. (I'll let Willard speak for himself.) There is also,
at least potentially, a relation between these two things, as many of us
have experienced in our own work. The implication has been that if we have
the first (paraphrase this as "if we can teach our computers to help us
read, find, sort, filter and so forth") we can facilitate the second.
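
As a rough illustration of the first sense only, here is a minimal sketch
in Python of "find", "filter" and "sort" as small operations that can be
chained over a bounded data set. The function names, the sample lines and
the naive substring matching are all assumptions made for this example;
none of them comes from the earlier posts.

```python
def find(lines, term):
    """Return (line_number, line) pairs whose text contains the term.

    Matching is a naive, case-insensitive substring test (an assumption
    made for brevity; a real tool would tokenize properly).
    """
    term = term.lower()
    return [(n, line) for n, line in enumerate(lines, start=1)
            if term in line.lower()]


def filter_by(hits, predicate):
    """Keep only the hits whose line text satisfies the predicate."""
    return [hit for hit in hits if predicate(hit[1])]


def sort_by(hits, key):
    """Order the hits by a key computed from the line text."""
    return sorted(hits, key=lambda hit: key(hit[1]))


# Hypothetical sample input: the whole "data set" is these four lines.
LINES = [
    "To be, or not to be, that is the question:",
    "Whether 'tis nobler in the mind to suffer",
    "The slings and arrows of outrageous fortune,",
    "Or to take arms against a sea of troubles",
]

hits = find(LINES, "to")
short = filter_by(hits, lambda line: len(line.split()) < 10)
ordered = sort_by(short, len)

for number, line in ordered:
    print(number, line)
```

Each step takes the output of the previous one, which is about all that
"primitive" means in this narrow, automated sense.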

2. A key difference between what a computer does in performing operations
on a text, and what a human reader does, is that the data set (the "input")
on which the computer operates is finite and bounded, whereas what the
human reader brings is unknown and variable. It may be finite, although
large, but since its bounds are unknown, and since no two human readers (or
even readings) bring the same context to bear on a text, practically
speaking, it is infinite and unknowable.

(Caveat: the Internet and the web now make it possible for a computer's
inputs themselves to be practically infinite and unbounded, because
unknowable; nevertheless we have hardly begun to think about what this may
mean for automated processing of texts.)

3. One ancient technique for bridging this gap is to teach the computer
something about what we know about a text, and to design its interfaces and
its processes in such a way as to give us better access to the full range
of this knowledge than we can ourselves achieve unaided. I say "ancient"
because this work is far older than digital processing. Add a table of
contents or an index to a text, or line and verse numbering, or lay out the
text on the page with chapter titles in a larger type face, and you are
beginning to "teach [the book] to help us read, find, sort, filter and so
forth". With computers, examples of this practice would include text
encoding, or markup, as well as the addition of external sources of
information such as databases, dictionaries, "knowledge bases" etc.
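
To make the analogy with line numbering and indexes concrete, here is a
minimal sketch in Python that numbers the lines of a short text and
builds a word index over them. The function names, the sample text and
the very crude tokenizing rule are assumptions made for illustration, not
a description of any particular tool.

```python
import re
from collections import defaultdict


def number_lines(text):
    """Attach 1-based line numbers, as a printed edition might."""
    return list(enumerate(text.splitlines(), start=1))


def build_index(numbered_lines):
    """Map each lowercased word to the sorted line numbers where it occurs.

    Words are runs of letters and apostrophes (a simplifying assumption).
    """
    index = defaultdict(set)
    for number, line in numbered_lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            index[word].add(number)
    return {word: sorted(numbers) for word, numbers in index.items()}


# Hypothetical sample text, chosen only to have a repeated word.
SAMPLE = """Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date:"""

index = build_index(number_lines(SAMPLE))
print(index["summer's"])
```

The index records only what we chose to tell the machine about the text
(here, where its lines begin and what counts as a word), which is the
sense of "teaching" meant above.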

4. Historically, one barrier to this work has been (as far as computers and
automation have been concerned) that to design these interfaces and
processes, we have had to invest in technologies and methods that mask the
processes as much as they reveal them. This has largely been because of the
design of our tools and the esoteric knowledge they have themselves
required. It is as if we had created indexed commentaries on Classical
Chinese poetry, but written them in English (finding that with our
keyboards it is easier to compose an alphabetical index in English),
thereby requiring our Chinese audience to learn English (on top of
Classical Chinese) to get the benefit of the commentaries. (Not only that,
but we have used a dialect of English that will be largely obsolete in five
years.)

This problem has been faced not only by "Computing Humanists" but also by
the culture as a whole (or marketplace, if you like), that has invested
untold millions in systems of computer-based automation that, whatever
benefits they have delivered, have always fallen short of promises.
Consequently, there have been waves of development working to ameliorate
the problem in one way or another. The emergence of object-oriented
programming methodologies, including the notion of "strong data typing", is
one such wave; the emergence of standards-based markup languages is
another. My earlier post tried to trace how these two developments should
in theory complement one another, and how industry is now moving forward
quickly on that basis to deal with its own analogous problems.
Nevertheless, I argued, in the context of Humanities research we have a
considerable way to go, even to match what has long been done with such
structures as indexes and footnotes in the printed book -- at least, that
is to say, if we want to do it on a basis that can reach beyond that
five-year half-life that computer applications have faced.

5. Even so, the gap remains between an automated process, working on known
inputs, and a human process, working with who-knows-what "extraneous" but
all-important -- all-pervasive and all-conditioning -- knowledge, memory,
intuition, assumptions, imagination. Human readers perceive in a text (just
for example) the implicit logics of narrative ordering; intertextual
references; metaphorical correspondences; ironies. What would it take to
teach a computer to perceive these on our behalf? Out of what
methodological primitives, subject to automation, can such operations be
built?

Respectfully,
Wendell


======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================



