Humanist Discussion Group, Vol. 14, No. 268.
Centre for Computing in the Humanities, King's College London
<http://www.princeton.edu/~mccarty/humanist/>
<http://www.kcl.ac.uk/humanities/cch/humanist/>

  [1]  From: cbf@socrates.Berkeley.EDU (6)
       Subject: Re: 14.0262 methodological primitives

  [2]  From: Stephen Ramsay <sjr3a@virginia.edu> (22)
       Subject: Word lists

  [3]  From: Willard McCarty <willard.mccarty@kcl.ac.uk> (52)
       Subject: level of granularity & other questions

--[1]------------------------------------------------------------------
Date: Mon, 25 Sep 2000 07:00:04 +0100
From: cbf@socrates.Berkeley.EDU
Subject: Re: 14.0262 methodological primitives

Does anyone besides me remember the set of little UNIX utilities that
Bill Tuthill wrote at Berkeley about 20 years ago? They were dumb as
paint, to quote one of my colleagues, and incredibly useful.

Wilhelm, as usual, makes some excellent points.

Charles Faulhaber
The Bancroft Library
UC Berkeley, CA 94720-6000
(510) 642-3782  FAX (510) 642-7589
cfaulhab@library.berkeley.edu

--[2]------------------------------------------------------------------
Date: Mon, 25 Sep 2000 07:01:02 +0100
From: Stephen Ramsay <sjr3a@virginia.edu>
Subject: Word lists

Willard's question regarding methodological primitives is quite
apropos to my current research, since I am in the process of creating
a general-use textual analysis tool (which I think of, in my more
grandiose moments, as a free, non-proprietary successor to TACT).

Much of what I'm working with right now involves the use of large word
lists, and I was wondering whether my colleagues in computational
linguistics might be able to point me in the right direction. How do I
go about getting my hands on large word lists and corpora? I am
particularly interested in word lists that map individual words to
parts of speech. Surfing the web has turned up a few possibilities,
but I wonder if anyone would be willing to supplement my scattershot
approach with a professional sense of what's out there?
Any help would be appreciated.

Steve

Stephen Ramsay
Senior Programmer
Institute for Advanced Technology in the Humanities
Alderman Library, University of Virginia
phone: (804) 924-6011
email: sjr3a@virginia.edu
web: http://www.iath.virginia.edu/

"By ratiocination, I mean computation" -- Hobbes

--[3]------------------------------------------------------------------
Date: Mon, 25 Sep 2000 08:10:04 +0100
From: Willard McCarty <willard.mccarty@kcl.ac.uk>
Subject: level of granularity?

Many thanks to Wilhelm Ott in Humanist 14.262 for his thoughtful
response to my posting about methodological primitives, and to Charles
Faulhaber, above, for his recollection of those UNIX tools. Indeed,
the idea is an old one. I suppose one could argue that programming
languages comprise statements that are methodological primitives of a
sort, and that the UNIX toolbox approach identified the notion quite
early on. Certainly TUSTEP is an example of the kind of working
environment I was asking about, and I'd hazard to say that no work
along these lines could afford to ignore it. I'll confess to having
said nothing about previous work and thinking in order to provoke
whatever interest there was among us, and must now beg your
forgiveness if this seems a silly way to reopen an old topic.

Perhaps the real question is, what remains to be done? -- assuming, of
course, that we all agree the notion of methodological primitives is a
worthy one. I'd venture to say that on the humanities computing
research agenda there are at least two big items, or two big groups of
items: metadata (i.e. encoding) and primitives.

Francois Lachance asks what we gain from calling these things
"primitives". What I intended to suggest was a class of objects at the
lowest practical level, as this level is defined by the operations of
humanities scholarship -- for practical purposes, black boxes (perhaps
with switches and knobs, for minor adjustments) that a scholar could
select and arrange ad lib.
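[A minimal sketch of what such black-box composition might look like, in
Python; the function names and the toy text are purely illustrative, not
anyone's actual tool:]

```python
from collections import Counter
import re

# Three "dumb as paint" primitives, each a black box with one job:

def tokenise(text):
    # split text into words (crude: ASCII letters and apostrophes only)
    return re.findall(r"[a-zA-Z']+", text)

def lowercase(words):
    # normalise case so "The" and "the" count as one word
    return [w.lower() for w in words]

def frequencies(words):
    # count occurrences of each word
    return Counter(words)

# ...which a scholar can select and arrange ad lib into a pipeline:
text = "The cat sat on the mat. The end."
freq = frequencies(lowercase(tokenise(text)))
print(freq.most_common(1))  # → [('the', 3)]
```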
As Wilhelm Ott said in his message, lemmatising the words of an
inflected language is at the moment not a primitive (...thus the
discontent provider?) -- because so much intervention is required (as I
know from having given up on a similar project in the same language).
What I do not understand about this example is whether lemmatisation
can ever become a primitive. I also rather ignorantly wonder whether
sorting, given enough of the right sort of switches and knobs, could be
one, or whether we could have distinct sort-primitives for groups of
languages.

Would a productive approach begin with asking at what level of
"granularity" primitives can be defined, and whether this level is
subject to change (i.e. to rise) with technological progress? Is there
methodological value for the humanities in asking about algorithmically
specifiable primitives? Or is there simply too much variation in
approaches to problems in the humanities ever to allow for significant
progress beyond what has already been done?

Two quite similar images stick in my mind from work done many years
ago. One is from some scientific visualisation software I saw
demonstrated once: it allowed the user to construct a computational
process by plugging together graphically represented sub-processes,
allowing for various adjustments and interventions along the way. The
other is from a lecture given by Antoinette Renouf (Liverpool,
www.rdues.liv.ac.uk), who described her neologism-processor with a
similar sort of industrial representation. Both have caused me to
wonder whether we couldn't have (with a great deal more work) something
like a set of computational Legos to play with, and, if we had such a
set, whether we couldn't learn a fair bit by playing with it.

Comments, please, esp those which open the windows.

Yours,
WM

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Dr. Willard McCarty, Senior Lecturer, King's College London
voice: +44 (0)20 7848 2784  fax: +44 (0)20 7848 5081
<Willard.McCarty@kcl.ac.uk> <http://ilex.cc.kcl.ac.uk/wlm/>
maui gratias agere
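[One way to picture the sort-primitive with switches and knobs that
Willard asks about above, sketched in Python; the accent-folding key
and its flags are purely illustrative, not a proposal for any
particular language group:]

```python
import unicodedata

def fold(word, strip_accents=True, ignore_case=True):
    """A knob-laden sort key: each flag is a switch on the black box."""
    if strip_accents:
        # decompose, then drop combining marks: 'Éclair' -> 'Eclair'
        word = ''.join(c for c in unicodedata.normalize('NFD', word)
                       if not unicodedata.combining(c))
    if ignore_case:
        word = word.lower()
    return word

words = ['Zebra', 'éclair', 'apple', 'Éclair']
print(sorted(words, key=fold))
# → ['apple', 'éclair', 'Éclair', 'Zebra']
```

[Whether any finite set of such knobs could cover the collation habits
of even one family of languages is, of course, exactly the open
question.]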
This archive was generated by hypermail 2b30 : 09/25/00 EDT