Humanist Discussion Group, Vol. 14, No. 268.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>
[1] From: cbf@socrates.Berkeley.EDU (6)
Subject: Re: 14.0262 methodological primitives
[2] From: Stephen Ramsay <sjr3a@virginia.edu> (22)
Subject: Word lists
[3] From: Willard McCarty <willard.mccarty@kcl.ac.uk> (52)
Subject: level of granularity & other questions
--[1]------------------------------------------------------------------
Date: Mon, 25 Sep 2000 07:00:04 +0100
From: cbf@socrates.Berkeley.EDU
Subject: Re: 14.0262 methodological primitives
Does anyone besides me remember the set of little UNIX utilities that
Bill Tuthill wrote at Berkeley about 20 years ago? They were dumb as
paint, to quote one of my colleagues, and incredibly useful.
Wilhelm, as usual, makes some excellent points.
Charles Faulhaber The Bancroft Library UC Berkeley, CA 94720-6000
(510) 642-3782 FAX (510) 642-7589 cfaulhab@library.berkeley.edu
--[2]------------------------------------------------------------------
Date: Mon, 25 Sep 2000 07:01:02 +0100
From: Stephen Ramsay <sjr3a@virginia.edu>
Subject: Word lists
Willard's question regarding methodological primitives is quite apropos to
my current research, since I am in the process of creating a general-use
textual analysis tool (which I think of, in my more grandiose moments, as
a free, non-proprietary successor to TACT).
Much of what I'm working with right now involves the use of large word
lists, and I was wondering if my colleagues in computational linguistics
might be able to point me in the right direction. How do I go about
getting my hands on large word lists and corpora? I am particularly
interested in word lists that map individual words to parts of
speech.
Surfing the web has turned up a few possibilities, but I wonder if anyone
would be willing to supplement my scattershot approach with a professional
sense of what's out there?
Any help would be appreciated.
Steve
Stephen Ramsay
Senior Programmer
Institute for Advanced Technology in the Humanities
Alderman Library, University of Virginia
phone: (804) 924-6011
email: sjr3a@virginia.edu
web: http://www.iath.virginia.edu/
"By ratiocination, I mean computation" -- Hobbes
--[3]------------------------------------------------------------------
Date: Mon, 25 Sep 2000 08:10:04 +0100
From: Willard McCarty <willard.mccarty@kcl.ac.uk>
Subject: level of granularity?
Many thanks to Wilhelm Ott in Humanist 14.262 for his thoughtful response
to my posting about methodological primitives, and to Charles Faulhaber,
above, for his recollection of those UNIX tools. Indeed, the idea is an old
one. I suppose one could argue that programming languages comprise
statements that are methodological primitives of a sort and that the UNIX
toolbox approach identified the notion quite early on. Certainly TUSTEP is
an example of the kind of working environment I was asking about, and I'd
hazard to say that no work along these lines could afford to ignore it.
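The toolbox idea invoked here -- small, dumb, composable operations that a scholar chains together -- can be sketched in a few lines. The sketch below is illustrative only: the function names (tokenize, lowercase, count) are invented for the example and come from neither TUSTEP nor Tuthill's utilities; each primitive consumes and produces a stream, so they plug together like a pipeline.

```python
# A minimal sketch of the "UNIX toolbox" idea: each primitive is a small
# function over a stream of items, and an analysis is a chain of them.
import re
from collections import Counter

def tokenize(lines):
    """Primitive 1: split lines into a stream of word tokens."""
    for line in lines:
        for word in re.findall(r"[A-Za-z']+", line):
            yield word

def lowercase(words):
    """Primitive 2: normalize case, one token at a time."""
    return (w.lower() for w in words)

def count(words):
    """Primitive 3: tally the stream into a frequency table."""
    return Counter(words)

# Primitives arranged ad lib, like a shell pipeline:
text = ["To be or not to be", "that is the question"]
freq = count(lowercase(tokenize(text)))
```

Each stage is a black box that could be swapped or reordered without touching the others, which is the property that makes such pieces feel like primitives.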
I'll confess to having said nothing about previous work and thinking in order
to provoke whatever interest there was among us, and now must beg your
forgiveness if this seems a silly way to reopen an old topic.
Perhaps the real question is, what remains to be done? -- assuming, of
course, that we all agree the notion of methodological primitives is
worthy. I'd venture to say that on the humanities computing research agenda
there are at least two big items, or two big groups of items, metadata
(i.e. encoding) and primitives.
Francois Lachance asks what we gain from calling these things "primitives".
What I intended to suggest was a class of objects at the lowest practical
level as this is defined by the operations of humanities scholarship. For
practical purposes, black boxes (perhaps with switches and knobs, for minor
adjustments) that a scholar could select and arrange ad lib. As Wilhelm Ott
said in his message, lemmatising the words of an inflected language is at
the moment not a primitive (...thus the discontent provider?) -- because so
much intervention is required (as I know from having given up on a similar
project in the same language). For this example, what I do not understand
is whether it can ever be a primitive. I also rather ignorantly wonder if
sorting, given enough of the right sort of switches and knobs, could be
one, or if we could have distinct sort-primitives for groups of languages.
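The question about a sort with "switches and knobs" can be made concrete. The sketch below is a toy, not a real collation engine: the one knob is a key function chosen per language group, and the accent-folding key (which makes 'é' sort with 'e', as French dictionary order wants) stands in for the far more involved machinery of genuine locale-aware collation.

```python
# A toy "sort primitive" whose knob is a per-language collation key.
import unicodedata

def fold_accents(s):
    """Illustrative key for Romance languages: decompose characters
    and drop combining marks, so 'église' keys as 'eglise'."""
    return ''.join(c for c in unicodedata.normalize('NFD', s)
                   if not unicodedata.combining(c))

def sort_words(words, key=str):
    """The primitive: sort case-insensitively under a chosen key."""
    return sorted(words, key=lambda w: key(w.lower()))

words = ['église', 'étude', 'eau', 'enfant']
plain = sort_words(words)                    # raw code points: 'é' after 'e'
folded = sort_words(words, key=fold_accents) # accent-blind interleaving
```

Whether enough such knobs could ever cover the real variety of scripts and orthographies is, of course, exactly the open question.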
Would a productive approach begin with asking at what level of
"granularity" primitives can be defined, and whether this level is subject
to change (i.e. to rise) with technological progress? Is there
methodological value for the humanities in asking about algorithmically
specifiable primitives? Is there simply too much variation in approaches to
problems in the humanities ever to allow for significant progress beyond
what has already been done?
Two quite similar images stick in my mind from work done many years ago. One
is from some scientific visualisation software I saw demonstrated once: it
allowed the user to construct a computational process by plugging together
graphically represented sub-processes, allowing for various adjustments and
interventions along the way. Another is from a lecture given by Antoinette
Renouf (Liverpool, www.rdues.liv.ac.uk), who described her
neologism-processor by a similar sort of industrial representation. Both
have caused me to wonder if we couldn't have (with a great deal more work)
something like a set of computational Legos to play with, and if we had
such, whether we couldn't learn a fair bit by playing with them.
Comments, please, especially those which open the windows.
Yours,
WM
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Dr. Willard McCarty, Senior Lecturer, King's College London
voice: +44 (0)20 7848 2784 fax: +44 (0)20 7848 5081
<Willard.McCarty@kcl.ac.uk> <http://ilex.cc.kcl.ac.uk/wlm/>
maui gratias agere
This archive was generated by hypermail 2b30 : 09/25/00 EDT