Humanist Discussion Group

Humanist Archives: March 9, 2021, 8:37 a.m. Humanist 34.264 - the first computer-generated concordance

				
              Humanist Discussion Group, Vol. 34, No. 264.
        Department of Digital Humanities, University of Cologne
                      Hosted by DH-Cologne
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org


    [1]    From: Dr. Herbert Wender <drwender@aol.com>
           Subject: Re: [Humanist] 34.263: the first computer-generated concordance: (81)

    [2]    From: Manfred Thaller <manfred.thaller@uni-koeln.de>
           Subject: Re: [Humanist] 34.263: the first computer-generated concordance: (34)

    [3]    From: Willard McCarty <willard.mccarty@mccarty.org.uk>
           Subject: Ellison's Concordance (58)


--[1]------------------------------------------------------------------------
        Date: 2021-03-08 22:52:08+00:00
        From: Dr. Herbert Wender <drwender@aol.com>
        Subject: Re: [Humanist] 34.263: the first computer-generated concordance:

The point of Thaller's statement seems to be that there was a UNIVAC-based Bible
concordance (KWOC?) before any of the IBM-based KWICs. Schaffer's remark points to
the problem of what is meant by 'literary' (vs. biblical) text.

Busa's priority seems to be beyond dispute if we ask not about concordances in
general but about computer-aided processing of textual items; the information held
on his punch cards at the end of the 1940s included linguistic
information. And one can probably say that Parrish stands on the shoulders of
Busa insofar as Luhn, the 'programming father' of the IBM KWICs, was influenced
by Busa's ongoing work.

Searching Google Books for "Luhn KWIC programme", I find:

"An auto-indexing method has been developed for producing for a given set of
documents a "Keyword-in-Context Index" or "KWIC Index. ... (4) Luhn,
H. P. "The Automatic Creation of Literature Abstracts," IBM Journal of
Research and Development, April 1958."
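
For readers unfamiliar with the term, the idea behind a keyword-in-context
index can be sketched in a few lines of Python. This is only an illustration
of the general technique, not a reconstruction of Luhn's program: the function
name kwic, the stop_words and width parameters, and the sample text are all
assumptions made for the example.

```python
import re


def kwic(documents, stop_words=frozenset(), width=30):
    """Build keyword-in-context entries, sorted alphabetically by keyword.

    documents  -- dict mapping a document id to its text (illustrative input)
    stop_words -- lower-case words that should not become keywords
    width      -- number of context characters kept on each side
    """
    entries = []
    for doc_id, text in documents.items():
        for match in re.finditer(r"[A-Za-z']+", text):
            word = match.group().lower()
            if word in stop_words:
                continue
            # Slice the context on both sides of the keyword and pad it so
            # that every keyword starts in the same column.
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            line = f"{left:>{width}}{match.group()}{right:<{width}}"
            entries.append((word, doc_id, line))
    # Sorting by keyword is what turns the running text into an index.
    entries.sort(key=lambda entry: (entry[0], entry[1]))
    return entries


if __name__ == "__main__":
    sample = {"d1": "In the beginning God created the heaven and the earth."}
    for keyword, doc_id, line in kwic(sample, stop_words={"in", "the", "and"}):
        print(f"{doc_id}  {line}")
```

Each keyword is printed with its neighbouring text, aligned in a fixed column
and sorted alphabetically.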

Regards, Herbert


-----Original Message-----
From: Humanist <humanist@dhhumanist.org>
To: drwender@aol.com
Sent: Mon, 8 Mar 2021 8:37
Subject: [Humanist] 34.263: the first computer-generated concordance:

        Date: 2021-03-07 16:42:04+00:00
        From: Henry Schaffer <hes@ncsu.edu>
        Subject: Re: [Humanist] 34.260: psychology of quantification

Responding to one historical DH tidbit:

On Sun, Mar 7, 2021 at 2:48 AM Humanist <humanist@dhhumanist.org> wrote:

>    [1]    From: Henry Schaffer <hes@ncsu.edu>
>            Subject: Re: [Humanist] 34.259: psychology of quantification
> (128)
>
>    [2]    From: Manfred Thaller <manfred.thaller@uni-koeln.de>
>            Subject: Re: [Humanist] 34.259: psychology of quantification
> (88)
> ...
> (What is frequently overlooked is, that Busa did NOT produce the first
> major computer generated concordance. That was John W. Ellison in 1956.
> ...


  I had been under the impression that the first one was attributed to
Stephen Maxfield Parrish. His publication of a book in this area
https://www.historyofinformation.com/detail.php?id=4017 was in 1959 and is
mentioned in the 1959-60 Cornell Univ. President's report at
https://ecommons.cornell.edu/bitstream/handle/1813/37495/CUA_v52_1960_X1_25.pdf?sequence=1&isAllowed=y,
but my recollection is that he had produced computer generated concordances
some years earlier. The IBM 704 mentioned was introduced in 1954, but
Cornell's first computer installation was in 1953 (The History of Computing
at Cornell University, by John W. Rudan, p. 1). It was an IBM CPC (Card
Programmed Calculator) (op. cit., p. 13). An IBM 650 was installed in 1956
(op. cit., p. 14), and I saw it in 1957. That book doesn't mention the arrival of
an IBM 704. So I'm wondering where Parrish did his computation, in addition
to wondering when he did his first concordance. I knew him (he was my
Freshman English instructor) but don't remember any mention of concordances.

--henry schaffer

--[2]------------------------------------------------------------------------
        Date: 2021-03-08 08:18:00+00:00
        From: Manfred Thaller <manfred.thaller@uni-koeln.de>
        Subject: Re: [Humanist] 34.263: the first computer-generated concordance:

Dear Henry,

apologies for failing to cite a source that is possibly not widely known:
>   I had been under the impression that the first one was attributed to
> Stephen Maxfield Parrish.

I based my statement on Roy A. Wisbey, “Computers and Lexicography”, in:
Dell Hymes (ed.), The Use of Computers in Anthropology, Mouton,
1965 (= Studies in General Anthropology II), 215-234, here: 225-226. The
publication dates quoted there (and verifiable in library catalogues)
are 1956 for Ellison and 1959 for Parrish. And it states explicitly that
Ellison's work also predates Parrish's.

I do not want to turn this into an antiquarian's session, but there is a
story which I have occasionally used as a fun fact to liven up presentations,
so one or another Humanist might also appreciate it. Ellison was
supported by Remington Rand Univac and their magnetic tape drives (which,
as far as I remember, was a major argument Busa used to convince IBM to
equip with computers what had so far been mainly a card-sorter exercise).
When, around 1980, I had to sort "large", i.e. more-than-one-mag-tape,
data sets on a UNIVAC and consulted the relevant manual, I was slightly
surprised to encounter huge excerpts from the Bible. Apparently Univac
had taken its support for Ellison as a motive to develop its sorting
facilities. Humanities triggering the development of missing technology.
That's exactly as it should be.

Kind regards,
Manfred Thaller


--
Prof. em. Dr. Manfred Thaller
Zuletzt Universität zu Köln /
Formerly University at Cologne

--[3]------------------------------------------------------------------------
        Date: 2021-03-08 08:04:25+00:00
        From: Willard McCarty <willard.mccarty@mccarty.org.uk>
        Subject: Ellison's Concordance

From the Preface of John W. Ellison, NELSON’S Complete
Concordance of the Revised Standard Version Bible, 2nd edn.
(Nashville TN: Thomas Nelson, 1984).

> An exhaustive concordance of the Bible, such as that of James
> Strong, takes about a quarter of a century of careful, tedious work
> to guarantee accuracy. Few students would want to wait a generation
> for a CONCORDANCE of the REVISED STANDARD VERSION of the HOLY BIBLE.
> To distribute the work among a group of scholars would be to run the
> risk of fluctuating standards of accuracy and completeness. The
> use of mechanical or electronic assistance was feasible and at hand.
> The Univac 1 computer at the offices of Remington Rand, Inc., was
> selected for the task. Every means possible, both human and
> mechanical, was used to guarantee accuracy in the work.
[...]
> John W. Ellison.
> Winchester, Massachusetts
> Whitmonday, 1956

The Preface is well worth reading. Ellison discusses the effort that went
into the work. He notes that,

> A study of the frequency and significance of the words of the Bible
> showed that certain words could well be omitted either because of
> their frequency, or because they would seldom be the key word under
> whose heading the passage would be sought. Following this preface is
> a list of words which were not included in this CONCORDANCE, and only
> a portion of the occurrences of the following words are listed:
> had, has, have, having, and will.

The object was, of course, still to produce a printed book, hence the
imperative to eliminate words unlikely to be useful as 'key words'
facilitating a reader's search for this or that passage. This brings to
mind the opening words of John Burrows' Computation into Criticism: A
Study of Jane Austen's Novels and an Experiment in Method (1987):

> It is a truth not generally acknowledged that, in most discussions of
> works of English fiction, we proceed as if a third, two-fifths, a
> half of our material were not really there. For Jane Austen, that
> third, two-fifths, or half comprises the twenty, thirty, or fifty
> most common words of her literary vocabulary... For most readers and
> most critics, it seems, such very common words as these are to be
> taken as perfect specimens of the harmless drudge, performing
> necessary tasks but deserving no particular attention. For
> lexicographers, the comparatively limited semantic ranges attaching
> to most — not all — of them can soon be spanned. For the makers of
> concordances, they are fit only to be excluded as ‘non-significant’
> words...
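
The two passages quoted above can be put side by side in a small Python
sketch: one function measures how much of a text its most frequent words
account for (Burrows' "third, two-fifths, a half"), the other builds a
concordance that skips a stop list in the way Ellison's preface describes.
This is a rough illustration only; the function names, the two sample verses
and the stop list are assumptions chosen for the example and are not taken
from either book.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lower-case word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_n_share(tokens, n):
    """Return the fraction of all tokens covered by the n most frequent words."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(tokens)


def concordance(verses, stop_words):
    """Map each word outside the stop list to the references where it occurs.

    verses     -- dict mapping a reference such as "Gen 1:1" to its text
    stop_words -- lower-case words left out of the concordance
    """
    index = defaultdict(set)
    for ref, text in verses.items():
        for word in tokenize(text):
            if word not in stop_words:
                index[word].add(ref)
    # Headwords and references are sorted, as in a printed concordance.
    return {word: sorted(refs) for word, refs in sorted(index.items())}


if __name__ == "__main__":
    verses = {
        "Gen 1:1": "In the beginning God created the heaven and the earth.",
        "Gen 1:3": "And God said, Let there be light: and there was light.",
    }
    stop = {"in", "the", "and", "there", "be", "was", "let"}
    tokens = [t for text in verses.values() for t in tokenize(text)]
    print(f"share of the 2 most common words: {top_n_share(tokens, 2):.2f}")
    for word, refs in concordance(verses, stop).items():
        print(f"{word:<10} {', '.join(refs)}")
```

Even on a sample this small, a couple of function words cover a sizeable
share of the tokens, and those are exactly the words a stop list removes.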



Yours,
WM
--
Willard McCarty,
Professor emeritus, King's College London;
Editor, Interdisciplinary Science Reviews;  Humanist
www.mccarty.org.uk

