18.463 indexing local machines

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty_at_kcl.ac.uk>)
Date: Thu, 6 Jan 2005 07:27:45 +0000

               Humanist Discussion Group, Vol. 18, No. 463.
       Centre for Computing in the Humanities, King's College London
                     Submit to: humanist_at_princeton.edu

         Date: Thu, 06 Jan 2005 07:22:05 +0000
         From: Willard McCarty <willard.mccarty_at_kcl.ac.uk>
         Subject: indexing local machines

Recently I have tried out two programs for indexing the text- and
email-files on my local machines and one for cataloguing my images. This
is, in effect, a query about such programs, with a long preamble on my
experience so far.

Like most others here, I suppose, I've accumulated enough texts and images
to make finding what I need sometimes quite difficult.
During 2003-4 I started a systematic and large-scale effort to accumulate
Web-pages, PDFs and other forms of text to support my research. (The
collection now stands at ca. 1/2GB -- it's small because I actually read
the stuff.) At first I evolved a reasonably complex directory structure for
these files, but soon I realised that I was spending significant amounts of
time deciding in which of the sub-sub-subdirectories to put a newcomer and
looking through the many such sub-sub-subdirectories for one I had
judiciously placed somewhere not too long before. So I set up a parallel
unstructured bit-bucket in which I put an identical copy of everything,
with the idea of seeing which way my wind was blowing. I also adopted the
practice of putting a copy of each newcomer in every place in the highly
structured collection where I thought it belonged.

It took only about a month before I deleted the highly structured
collection in favour of the unstructured one. Perhaps, if I had been able
to replicate myself and my equipment a number of times, I might have
assigned some of these imaginary selves to a cataloguing, metadata-writing
party, but under the circumstances I could only find that notion amusing.
Seriously, in the life of an interdisciplinary computing humanist nearly
every intellectual object falls under so many distinct categories, whatever
the scheme, that I cannot see any such thing working. Except, perhaps, for
those who devote themselves to the scheme rather than to what it schematizes.

Automatic indexing then became a priority. Eventually I gave up on Windows
XP's native indexing -- the finding mechanism is too slow and clumsy. A
visiting lecturer (may his tribe increase) drew my attention to X1
(www.x1.com/), which I tried out, then purchased. A friend then told me
about Google's free Desktop Search (desktop.google.com/), which I tried,
then discarded: what works for the Web at large does not, in my experience,
work well for one's private collection.

Meanwhile I picked up Google's Picasa Photo Organizer
(www.google.com/downloads/), which is as good as anything I've seen.

What have others done? What has your experience been?


[NB: If you do not receive a reply within 24 hours please resend]
Dr Willard McCarty | Senior Lecturer | Centre for Computing in the
Humanities | King's College London | Kay House, 7 Arundel Street | London
WC2R 3DX | U.K. | +44 (0)20 7848-2784 fax: -2980 ||
willard.mccarty_at_kcl.ac.uk www.kcl.ac.uk/humanities/cch/wlm/
Received on Thu Jan 06 2005 - 02:37:30 EST

