21.014 text visualization

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty_at_kcl.ac.uk>)
Date: Sat, 12 May 2007 07:10:56 +0100

                Humanist Discussion Group, Vol. 21, No. 14.
       Centre for Computing in the Humanities, King's College London
  www.kcl.ac.uk/schools/humanities/cch/research/publications/humanist.html
                        www.princeton.edu/humanist/
                     Submit to: humanist_at_princeton.edu

   [1] From: Daniel O'Donnell <daniel.odonnell_at_uleth.ca> (87)
         Subject: Re: 21.012 text visualization?

   [2] From: Desmond Schmidt <schmidt_at_itee.uq.edu.au> (39)
         Subject: Re: 21.012 text visualization?

   [3] From: Martin Mueller <martinmueller_at_northwestern.edu> (39)
         Subject: Re: 21.012 text visualization?

   [4] From: "Matt Kirschenbaum" <mkirschenbaum_at_gmail.com> (21)
         Subject: Re: 21.012 text visualization?

   [5] From: Stan Ruecker <sruecker_at_ualberta.ca> (70)
         Subject: Re: 21.012 text visualization?

   [6] From: Willard McCarty <willard.mccarty_at_kcl.ac.uk> (22)
         Subject: the liberty to experiment

--[1]------------------------------------------------------------------
         Date: Sat, 12 May 2007 06:53:54 +0100
         From: Daniel O'Donnell <daniel.odonnell_at_uleth.ca>
         Subject: Re: 21.012 text visualization?

This is a good question: we had quite a discussion about it in the halls
at the TEI meeting last year in Victoria. A couple of the papers there
used quite interesting text visualisations, but the issue of usefulness
came up.

I often wonder about literary text processing if it is not better to see
many techniques as tools of very focussed (a more positive term than
limited) application. One good visualisation used to be in the old
Multidoc SGML browser: it drew a cluster diagram of search hits in the
scroll bar, which really helped navigation. Another useful tool I think
I remember is the results of the stemma diagrams you can get for
individual variants in the Canterbury Tales Project.

This seems to me to be typical of the best usage of such things. I used
to be struck in dealing with digital editions of the number of tools
editors provided me with that I didn't think I'd need and certainly had
never imagined I would. In literary research, I think that's because the
goals of processors/tool designers and literary scholars are quite
different. Tools are good for doing a specific task very well along a
single plane. But literary and textual researchers rarely ask
single-plane questions--which is why literary research is such a magpie
field. So I think there is very often a sense of "that's it?" when we
see something like a text visualisation: it is a very complex tool that
tends to answer a relatively simple question--clustering, or the like.

-dan

On Fri, 2007-05-11 at 16:24 +0100, Humanist Discussion Group (by way of
Willard McCarty ) wrote:
> Humanist Discussion Group, Vol. 21, No. 12.
> Centre for Computing in the Humanities, King's College London
> www.kcl.ac.uk/schools/humanities/cch/research/publications/humanist.html
> www.princeton.edu/humanist/
> Submit to: humanist_at_princeton.edu
>
>
>
> Date: Fri, 11 May 2007 16:18:45 +0100
> From: Martyn Jessop <martyn.jessop_at_KCL.AC.UK>
>
> I'm researching a book on visualization in the digital humanities and
> need some views from those working in text visualization. I've
> reviewed visualization of quantitative data, spatial data and
> temporal data and there is plenty of support for the new
> visual methods.
>
> When I came to examine text visualization things were very
> different. My difficulty is that many of the people I've spoken
> to argue that the more imaginative visualizations of text are merely
> decorative and we have not seen much in the way of useful insight
> emerging from the use of such visualizations. Is this really an
> accurate assessment?
>
> I come from a background of visualization of spatial and quantitative
> data and can see how adaptions of statistical graphics mesh well with
> 'conventional' text analysis but I'm interested in the more creative
> applications such as word brush, word rain and similar imaginative
> tools present in projects like TAPoR and TextARC.
>
> I'd like to keep things simple and not risk limiting the discussion so
> I will not say more at this stage. So are the "blobby clusters" and
> "concept shapes" of text visualizations a waste of time or are visual
> analysis strategies providing useful insights which are being translated
> into concrete research outcomes?
>
> Regards
>
> Martyn Jessop
>
> ----------------------
> Martyn Jessop
> Centre for Computing in the Humanities
> King's College London
> Strand
> London WC2R 2LS
>
> email: martyn.jessop_at_kcl.ac.uk
> Phone: 0207-848-2470
> Fax: 0207-848-2980

-- 
Daniel Paul O'Donnell, PhD
Department Chair and Associate Professor of English
Director, Digital Medievalist Project http://www.digitalmedievalist.org/
Chair, Text Encoding Initiative http://www.tei-c.org/
Department of English
University of Lethbridge
Lethbridge AB T1K 3M4
Vox +1 403 329-2377
Fax +1 403 382-7191
Email: daniel.odonnell_at_uleth.ca
WWW: http://people.uleth.ca/~daniel.odonnell/
--[2]------------------------------------------------------------------
         Date: Sat, 12 May 2007 06:55:08 +0100
         From: Desmond Schmidt <schmidt_at_itee.uq.edu.au>
         Subject: Re: 21.012 text visualization?
I don't know what you mean by text visualisation. If you mean semantic
mapping I can suggest you look at Leximancer (www.leximancer.com). It
is a commercial product and I am ashamed to say on this discussion
group that I work on it. However, advertisements aside, Leximancer
deduces the semantic content of a set of texts, which can be huge, in a
variety of formats (pdf, word, rtf, html, xml, text, csv), works out
what the key concepts are and produces a semantic map that you can
click on and explore. There are some examples on the Leximancer website
- try the "case studies". What it gives you when run on its own is a
semantic key to the whole work - lets you see what a body of text says
without actually reading it. The web version is limited in that you
can't zoom in but the desktop version can expand the map to full screen
and then you can do two-concept combinations, say "emperor" and "power"
in the Gibbon example and it gives you all the passages where the two
CONCEPTs co-occur. The actual words "emperor" and "power" may not occur
in all passages cited because it is a semantic index. What I mean is
that passages where "kennel", "bone" and "barking" occur are obviously
about "dog" but that's a semantic link, not a textual one. If you want
to seed it with several concepts, to compare and contrast ideas in a
particular text you can do that too, or add tags by simply putting
texts into various folders - we did things like comparing debates from
different sides of the houses of parliament by just dropping the
liberal debates into one folder and labour ones in another. It also
does spreadsheets, taking the column headings as tags, so you can do
survey results. The good thing about this approach is that it is
unbiased. It doesn't rely on human tagging. It is grounded theory as it
should be - the meaning emerges from the text. The trial is free for
one month. There will be a new version soon for enterprise use that is
also much faster (this is what I am working on). Leximancer is not
decorative. It is a very practical tool that we have sold all over the
world to government agencies, lawyers, police forces, and academics
doing research.
--------------------------
Dr Desmond Schmidt
PhD student
School of Information Technology and Electrical Engineering
University of Queensland
Ph: 3365 7171 (wk)
http://www.itee.uq.edu.au/~schmidt
--[3]------------------------------------------------------------------
         Date: Sat, 12 May 2007 06:56:05 +0100
         From: Martin Mueller <martinmueller_at_northwestern.edu>
         Subject: Re: 21.012 text visualization?
Martyn Jessop's question is of deep interest to me since
visualization is a central concern of the Monk Project
(http://monkproject.org) for which my tag line is "Towards a billion words of
written English from four centuries in a metadata rich environment."
MONK stands for Metadata Offer New Knowledge.
One might begin with the observation that text is always already
visualization since it is 'drawn language'. And one might further
observe that there are a lot of graphic designers on and off the web
who will go to almost any length to make text less readable. From
that perspective you can understand the response of austere readers
who want readable text and for whom anything else interferes with the
only visualization that really matters: make it easy to see the text you read.
But if 'text' becomes 'data' in whatever environment and for whatever
reason you don't 'read' but look at results that are in some form
tabulated or quantified. For instance, the Philologic search engine
lets you scan across some 600 million words of English, pick out
144,000 occurrences of various spellings and grammatical forms of
'liberty' and will return results by decade and frequency per 10,000
words. These results are much more easily interpreted as a chart
because you "see" at once that there are quite sharp spikes in the
1650's and 1680's.
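
The tabulation described above (hits grouped by decade, normalised per
10,000 words, then charted) can be sketched in a few lines. This is only an
illustration, not Philologic's implementation: the function names, the
(year, text) input shape and the whitespace tokenisation are all assumptions
made for the example.

```python
from collections import defaultdict

def rate_per_10k_by_decade(docs, variants):
    """docs: iterable of (year, text) pairs (hypothetical input shape).
    variants: set of lower-case spellings to count as the same word."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in docs:
        decade = (year // 10) * 10
        tokens = text.lower().split()  # naive tokenisation, an assumption
        totals[decade] += len(tokens)
        # Strip surrounding punctuation before matching a spelling variant.
        hits[decade] += sum(1 for t in tokens if t.strip(".,;:!?\"'()") in variants)
    return {d: 10_000 * hits[d] / totals[d] for d in sorted(totals) if totals[d]}

def text_chart(rates, width=40):
    """Render the rates as a crude horizontal bar chart, one line per decade."""
    peak = max(rates.values(), default=0) or 1
    return "\n".join(
        f"{d}s {'#' * round(width * r / peak)} {r:.1f}" for d, r in rates.items()
    )
```

Printing text_chart(rate_per_10k_by_decade(docs, {"liberty", "libertie"}))
gives one bar per decade, so a spike stands out in a way that a column of
numbers does not.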
So there is a fairly simple but stringent test for visualization:
does it help you "see" things in the strong cognitive sense of the
word?  Such seeing is subject to no more or less distortion than in
other forms of visualization. State of the Union addresses are now
routinely analyzed in terms of word frequency and increased
frequencies are marked by bigger circles. If a word is used twice as
often and the radius of the circle is doubled, the area of the circle
is four times as large. Is that a new form of deception or does it
help you "see" something more clearly?
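
The arithmetic behind the circle example can be checked directly. The sketch
below uses hypothetical helper names; it shows that doubling the radius
quadruples the area, and that scaling the radius by the square root of the
frequency ratio keeps the area proportional to the frequency instead.

```python
import math

def area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2

def radius_for_value(value, base_value, base_radius):
    """Radius that makes the circle's *area* proportional to value,
    given that base_value is drawn with base_radius."""
    return base_radius * math.sqrt(value / base_value)

# A word used twice as often, drawn two ways:
naive = area(2.0) / area(1.0)                               # radius doubled
scaled = area(radius_for_value(2, 1, 1.0)) / area(1.0)      # area doubled
```

Here naive comes out as 4 and scaled as 2 (up to floating-point rounding),
which is the gap between the two readings of "twice as big".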
There is a statistic called Dunning's log likelihood ratio, which
shows you what words in text A are disproportionately common or
uncommon when compared with Text B. These lists provide surprisingly
effective keywords for interpretation. A plain list will certainly
do. But something might be said for representing this last as a cast
of characters that drift across the screen as thinly scrawled or big
and fat letters.
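
For readers who have not met the statistic, here is a minimal sketch of a
log-likelihood keyword list. It uses the two-term form of the statistic that
is common in corpus linguistics (observed against expected frequency in each
text), which is an assumption about which variant is meant; the sign
convention (positive for words over-represented in text A, negative for
under-represented) and the function names are likewise choices made for the
example. Both token lists are assumed to be non-empty.

```python
import math
from collections import Counter

def log_likelihood(a, b, size_a, size_b):
    """G2 score for a word occurring a times in text A and b times in text B,
    where size_a and size_b are the total token counts of the two texts."""
    total = a + b
    if total == 0:
        return 0.0
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    if a:  # a zero count contributes nothing to the sum
        g2 += a * math.log(a / expected_a)
    if b:
        g2 += b * math.log(b / expected_b)
    g2 *= 2
    # Sign the score: positive if relatively more common in A than in B.
    return g2 if a / size_a >= b / size_b else -g2

def keywords(tokens_a, tokens_b, top=10):
    """Words ranked by the magnitude of their signed G2 score."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    n_a, n_b = len(tokens_a), len(tokens_b)
    scores = {
        w: log_likelihood(freq_a[w], freq_b[w], n_a, n_b)
        for w in set(freq_a) | set(freq_b)
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]
```

The output is exactly the "plain list" mentioned above; any drifting or
fattening of letters would be a rendering layer on top of these scores.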
The trouble with this and other forms of visualization is that it is
more fun to be cool than true.
--[4]------------------------------------------------------------------
         Date: Sat, 12 May 2007 06:56:51 +0100
         From: "Matt Kirschenbaum" <mkirschenbaum_at_gmail.com>
         Subject: Re: 21.012 text visualization?
 >When I came to examine text visualization things were very
 >different. My difficulty is that many of the people I've spoken
 >to argue that the more imaginative visualizations of text are merely
 >decorative and we have not seen much in the way of useful insight
 >emerging from the use of such visualizations. Is this really an
 >accurate assessment?
I would strongly recommend soliciting Johanna Drucker for a copy of
her "Graphesis" essay, which addresses this question directly. The
frequent characterization of aesthetics that are beautiful or cool but
also, somehow, "merely" ornamental or decorative is one of the most
stifling in our field, IMO, and needs to be challenged with a robust
theoretical conversation. Johanna's piece is the first step. Matt
-- 
Matthew Kirschenbaum
Assistant Professor of English
Associate Director,
Maryland Institute for Technology in the Humanities (MITH)
University of Maryland
301-405-8505 or 301-314-7111 (fax)
http://www.mith.umd.edu/
http://www.otal.umd.edu/~mgk/
--[5]------------------------------------------------------------------
         Date: Sat, 12 May 2007 06:57:17 +0100
         From: Stan Ruecker <sruecker_at_ualberta.ca>
         Subject: Re: 21.012 text visualization?
Hi Martyn,
I work in this area and so would naturally say there is some good
work being done, but much of it may not yet have translated into
concrete research outcomes.
I make the distinction between scientific visualization and
humanities visualization in that the former converts primarily
numeric data into visual forms for manipulation, while the latter
works with elements consisting of text or image displayed in
unconventional ways. I'm guessing your own definition would have more
to do with content areas than with visual elements.
We've been working recently in the MONK project (www.monkproject.org)
on a visualization issue raised by Tanya Clement, who is interested
in repetition with variation in Gertrude Stein's novel The Making of
Americans. How can someone see patterns of repetitions across a
thousand pages, or as Tanya sometimes says, 3000 paragraphs? We have
a session at SDH-SEMI at the end of May outlining three projects
dealing with this topic. Catherine Plaisant and her postdoc Anthony
Don at the Human-Computer Interaction Lab at University of Maryland
have also created a working prototype that displays 3-grams. It is
online here: http://www.cs.umd.edu/hcil/textvis/featurelens
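
As a rough illustration of the underlying idea (not the FeatureLens
implementation, whose internals are not described here), repeated 3-grams can
be located by indexing every three-word sequence by the paragraphs it occurs
in. The function name, the regular-expression tokeniser and the
min_paragraphs threshold are assumptions for the sketch.

```python
import re
from collections import defaultdict

def repeated_ngrams(paragraphs, n=3, min_paragraphs=2):
    """Map each n-gram to the sorted indices of the paragraphs containing it,
    keeping only n-grams found in at least min_paragraphs paragraphs."""
    where = defaultdict(set)
    for index, paragraph in enumerate(paragraphs):
        words = re.findall(r"[a-z']+", paragraph.lower())
        for start in range(len(words) - n + 1):
            where[tuple(words[start:start + n])].add(index)
    return {
        gram: sorted(indices)
        for gram, indices in where.items()
        if len(indices) >= min_paragraphs
    }
```

The paragraph indices are what a display would then plot, for example as
marks along a strip representing the 3000 paragraphs.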
I hope this is some help.
yrs,
Stan Ruecker
Assistant Professor
Humanities Computing Program
Department of English and Film Studies
University of Alberta
Edmonton AB CANADA
--[6]------------------------------------------------------------------
         Date: Sat, 12 May 2007 06:58:31 +0100
         From: Willard McCarty <willard.mccarty_at_kcl.ac.uk>
         Subject: the liberty to experiment
My colleague Martyn Jessop's question is an excellent one. It's
interesting to me that questions of this form are being asked in all
sorts of other areas, perhaps because the level of activity is now
such that there are many examples of work to wonder about.
Presentation tools of various sorts make it relatively easy to get
something demonstrable out there, the most immediately striking of
which are visualizations. It's healthy to be called to account. But
at the same time, one has to be able to act on intuitions and
convictions not supported by much if any evidence. One has to have
the liberty to follow one's nose, to invest significant amounts of
resources to create something in order to see what will happen.
Otherwise true innovation becomes shackled by the known and predictable.
There are tensions here of course, and they get worse the more it
costs to work forward into the unknown. But we should be clear about
what's involved in going where no one has gone before. Sometimes it
may be better to go it alone and relatively unaided rather than
require a costly spaceship.
Yours,
WM
Dr Willard McCarty | Reader in Humanities Computing | Centre for
Computing in the Humanities | King's College London |
http://staff.cch.kcl.ac.uk/~wmccarty/. 
Received on Sat May 12 2007 - 02:20:50 EDT
