4.0160 Concording: Micro OCP, WordCruncher, TACT (2/70)

Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Sat, 2 Jun 90 14:39:02 EDT

Humanist Discussion Group, Vol. 4, No. 0160. Saturday, 2 Jun 1990.

(1) Date: Fri, 1 Jun 90 16:57:23 BST (15 lines)
From: J J Higgins <Higgins@np1a.bristol.ac.uk>
Subject: Re: 4.0139 Queries

(2) Date: 1 June 90, 23:35:33 EMT (55 lines)
From: Knut Hofland +47 5 212954/55/56 FAFKH at NOBERGEN
Subject: Micro OCP (re: 4.0139 .0153)

Micro OCP

A survey review of concordancers including Micro OCP and WordCruncher
will appear in System (Pergamon Press) either Vol 18, 3 or 19, 1.

Micro OCP incorporates COCOA formatting (a metalanguage for embedding
codes) which allows sophisticated sorting, e.g. separating citations
according to which character is speaking in a play. On most direct
comparisons WordCruncher comes out ahead.
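
To illustrate the kind of sorting that embedded codes make possible, here is
a minimal Python sketch. It assumes COCOA-style references of the form
<S NAME>, where the category letter S stands for the speaker; that choice of
letter, the function name and the sample lines are all invented for
illustration and are not taken from Micro OCP or its manual.

```python
import re
from collections import defaultdict

# A COCOA-style reference: <CATEGORY value>, e.g. <S ANNA> (assumed convention).
COCOA_TAG = re.compile(r"<(\w)\s+([^>]+)>")

def citations_by_speaker(text, keyword, category="S"):
    """Group the lines containing `keyword` by the current value of `category`."""
    current = None
    groups = defaultdict(list)
    for line in text.splitlines():
        # Any tag on this line updates the current speaker before we search.
        for cat, value in COCOA_TAG.findall(line):
            if cat == category:
                current = value.strip()
        # Strip tags so that tag contents are never matched as text.
        plain = COCOA_TAG.sub("", line).strip()
        if re.search(rf"\b{re.escape(keyword)}\b", plain, re.IGNORECASE):
            groups[current].append(plain)
    return dict(groups)

# Invented sample text, not a quotation from any play.
sample = """<S ANNA>
The door is open.
<S BEN>
Close the door, please.
<S ANNA>
I will not close it.
"""

result = citations_by_speaker(sample, "door")
```

Here `result` groups the two lines containing "door" under ANNA and BEN
respectively; a real concordancer would of course offer many more categories
and sort keys.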

[ ... ]
Micro OCP (and the mainframe OCP) is very slow, so you will need
a rather fast machine if you don't want to drink a lot of coffee...
I have used "A Doll's House" (155 KB) by Ibsen to test several indexing and
concordance programs. On a 16 MHz 386 machine, Micro OCP took 3:00
minutes to pick out the contexts of two words (43 occurrences).
On a standard 4.77 MHz PC, OCP took 25:18 minutes and would have
taken more than 17 hours if I had wanted to search all of Ibsen's plays
(about 670,000 running words).
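
The timings above can be compared with a few lines of Python. The sketch
only restates the figures quoted in this message; the variable names are
invented, and the idea that run time grows in proportion to the amount of
text is an assumption, not something stated here.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes:seconds timing to seconds."""
    return minutes * 60 + seconds

pc_run = to_seconds(25, 18)    # 4.77 MHz PC, one play
fast_run = to_seconds(3, 0)    # 16 MHz 386, same search

# How many times faster the 386 was on the same job.
speedup = pc_run / fast_run

# How many "one-play" runs fit into the quoted lower bound of 17 hours.
implied_scale = 17 * 3600 / pc_run

print(f"386 speedup: {speedup:.1f}x")
print(f"17 hours is about {implied_scale:.0f} single-play runs")
```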

WordCruncher and TACT took 2:30 to index "A Doll's House", and then
the context of any word could be found in a matter of seconds.
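
The difference between scanning the text for every query and indexing it
once can be sketched in Python as follows. This is only an illustration of
the general technique (a word-to-positions index plus a keyword-in-context
display); it makes no claim about how WordCruncher or TACT are implemented,
and all names in it are invented.

```python
import re
from collections import defaultdict

# A token is a run of letters (Unicode-aware, so accented letters count).
TOKEN = re.compile(r"[^\W\d_]+")

def build_index(text):
    """One pass over the text: map each lowercased word to its offsets."""
    index = defaultdict(list)
    for match in TOKEN.finditer(text):
        index[match.group().lower()].append(match.start())
    return index

def kwic(text, index, word, width=20):
    """Return keyword-in-context lines for `word` using the prebuilt index."""
    lines = []
    for start in index.get(word.lower(), []):
        end = start + len(word)
        # Flatten line breaks so each citation prints on one line.
        left = text[max(0, start - width):start].replace("\n", " ")
        right = text[end:end + width].replace("\n", " ")
        lines.append(f"{left:>{width}}[{text[start:end]}]{right}")
    return lines

text = "the cat sat on the mat"
index = build_index(text)
hits = kwic(text, index, "cat", width=4)
```

Building the index costs one pass over the text; after that each lookup
touches only the stored offsets, which is why searches return in seconds
once the indexing step is done.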

Mark Zimmermann's "Free Text Browser" on a Mac SE/30 (also a 16 MHz
32-bit machine) indexed the text in 23 seconds. (This program is available
via mail or msg from MACSERVE at PUCC/IRLEARN or by ftp from sumex-aim.)

Other stand-alone KWIC programs are 5-6 times as fast as Micro OCP.

Micro OCP has a general and powerful command language, but it does
not seem to be optimised for speed. The command generator is
interactive (and a good help for the novice compared to
working with OCP on a mainframe), but the rest of the program is
a batch program. It should be possible to run several jobs in one
run, but this is only possible if you make your own copies of the
control file, rename these one after another and run only the
search program in between. (This is not mentioned in the manual.)
My experience with the program is with the first version for PC.

WordCruncher is mainly for retrieving the contexts of words, patterns of
words or combinations of words. In the later versions (4.2 or 4.3) it is
easy to save the results to a printer or file (F6). It has no option
to make a word list sorted by frequency or in reverse order; this has to
be done outside WordCruncher (with your own program or a standard
program), but the alphabetic word list made by WordCruncher could
be a starting point.
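
A program of your own for this need not be long. The Python sketch below
assumes that "reverse order" means sorting words by their endings (spelling
read right to left), and it starts from a plain list of words rather than
from any particular WordCruncher export format, which is not assumed here.
The function names and the sample words are invented.

```python
from collections import Counter

def frequency_list(words):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    counts = Counter(w.lower() for w in words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def reverse_list(words):
    """Return the distinct words sorted by their spelling read right to left."""
    return sorted({w.lower() for w in words}, key=lambda w: w[::-1])

# Invented sample input.
words = ["doll", "house", "a", "doll", "play", "house", "doll"]

by_frequency = frequency_list(words)
by_ending = reverse_list(words)
```

With the sample above, `by_frequency` starts with ("doll", 3) and
`by_ending` places "house" before "doll", because "e" sorts before "l".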

As a rival to WordCruncher I would recommend TACT from the University
of Toronto, which has several features not found in WordCruncher,
such as general pattern matching, better possibilities for having
references in the text and using them in a search, and collocations
(in version 1.2). For information about TACT send a note to
CCH@UTOREPAS.BITNET. TACT is freeware; for version 1.1 they charged
only CDN $30 for the printed manual and handling.

Knut Hofland

The Norwegian Computing Centre for the Humanities
Street adr: Harald Haarfagres gt. 31
Post adr: P.O. Box 53, University, N-5027 Bergen, Norway
Tel: +47 5 212954/5/6 Fax: +47 5 322656