Humanist 35.467 - cautions about digital studies of words

        Date: 2022-01-17 23:01:48+00:00
        From: Alasdair Ekpenyong
        Subject: Re: [Humanist] 35.463: cautions about digital studies of words

Great insight, Dr. McCarty. Two thoughts in response. I guess I speak as a
burgeoning data scientist, currently working on a “sentiment analysis” literary
studies book chapter similar in method to the approach that these scholars take
in their rationality vs emotion word study. Natural language processing is one
of the relevant keywords for the methodology these scholars have taken; but I’m
probably preaching to the choir as you have way more academic experience than
me. Anyway, I appreciate the chance to practice speaking intelligently as a
current master's student, ha.

One: You’re very correct to ask “Is this not just the first step in study which
would go beyond isolated words?” Yes, a dominant trend in natural language
processing is to move beyond studying one word at a time (single words are
called unigrams) to looking at two-word sequences (bigrams) or even three-word
sequences (trigrams). So the scholars have an opportunity to refine their
analysis by looking at what sentiments the bigrams from 1850 to 1980 represent.
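
For anyone who has not seen this in practice, here is a minimal sketch of how
unigrams, bigrams and trigrams can be pulled out of a text using only the
Python standard library. The sample sentence, the crude tokenizer and the
function names are my own illustrative assumptions, not the method used in the
study under discussion.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    This is a deliberately crude tokenizer (letters and apostrophes only),
    used here for illustration.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, chosen only to show the mechanics.
sample = "the heart has its reasons which reason knows nothing of"
tokens = tokenize(sample)

unigrams = ngrams(tokens, 1)  # one word at a time
bigrams = ngrams(tokens, 2)   # two-word sequences
trigrams = ngrams(tokens, 3)  # three-word sequences

# Counting n-grams is the usual first step before any sentiment scoring.
bigram_counts = Counter(bigrams)
```

Counting the bigrams is only the starting point; deciding what sentiment a
given bigram carries is a separate, and much more contestable, step.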

I’m currently working on a literary comparison of F. Scott Fitzgerald vs one of
his contemporaries, Owen Johnson, looking at patterns in the bigrams that appear
in each author’s writing.

Two: You’re also insightful to ask deeper theoretical questions about the
validity and soundness of using quantitative computational analysis to derive
qualitative literary conclusions about trends in human thought. You might
appreciate the comments in this 2014 Big Data & Society article, “Big Data, new
epistemologies, and paradigm shifts.”

The article presents optimistic perspectives about the epistemological merit of
computational analysis. There seem to be two main camps of optimism:

The first group believes that new digital humanities techniques – counting,
graphing, mapping and distant reading – bring methodological rigour and
objectivity to disciplines that heretofore have been unsystematic and random in
their focus and approach (Moretti, 2005; Ramsay, 2010). In
contrast, the second group argues that, rather than replacing traditional
methods or providing an empiricist or positivistic approach to humanities
scholarship, new techniques complement and augment existing humanities methods
and facilitate traditional forms of interpretation and theory-building, enabling
studies of much wider scope to answer questions that would be all but
unanswerable without computation (Berry, 2011; Manovich, 2011).

The article also presents the pessimistic perspective that large-scale
computational analysis falls short as a viable humanities method.

The digital humanities has not been universally welcomed, with detractors
contending that using computers as ‘reading machines’ (Ramsay, 2010) to
undertake ‘distant reading’ (Moretti, 2005) runs counter to and undermines
traditional methods of close reading. Culler (2010: 22) notes that
close reading involves paying ‘attention to how meaning is produced or conveyed,
to what sorts of literary and rhetorical strategies and techniques are deployed
to achieve what the reader takes to be the effects of the work or passage’ –
something that a distant reading is unable to perform. His worry is that a
digital humanities approach promotes literary scholarship that involves no
actual reading. Similarly, Trumpener (2009: 164)
argues that a ‘statistically driven model of literary history … seems to
necessitate an impersonal invisible hand’, continuing: ‘any attempt to see the
big picture needs to be informed by broad knowledge, an astute, historicized
sense of how genres and literary institutions work, and incisive interpretive
tools’ (pp. 170–171). Likewise, Marche (2012) contends
that cultural artefacts, such as literature, cannot be treated as mere data. A
piece of writing is not simply an order of letters and words; it is contextual
and conveys meaning and has qualities that are ineffable. Algorithms are very
poor at capturing and deciphering meaning or context and, Marche argues, treat
‘all literature as if it were the same’.

Just throwing out a lot of theory because I wager some of this listserv’s
readers may find it interesting, enjoy reading the theory, want to refer to some
of this in their academic work, or all of the above.

If you are interested in diving further into some of the themes mentioned here,
some works I have found helpful are:

The Humanities in Transition from Postmodernism to the Digital Age

Reading Machines: Toward an Algorithmic Criticism

Hermeneutica: Computer-Assisted Approaches to the Humanities

Macroanalysis: Digital Methods and Literary History

Last, by way of closing out: hey everyone, I’m technically a librarian but have
oriented my library school studies mostly toward data science and helping get
people excited about the coming tech changes of the 21st century and their
social implications. So while this was definitely an info dump, I was also
pretty strategic in what I chose to include. Really appreciate any feedback from
any readers to help me be a better librarian/information scientist — otherwise
feel free to enjoy the goodies quietly! Haha.

Happy MLK day.


        Date: 2022-01-17 15:31:53+00:00
        From: François Lachance
        Subject: Re: [Humanist] 35.466: cautions about digital studies of words


Readers of Humanist who are keen on the conjecture/conclusion thread might be
interested in a short series of tweets by Marieke Dwarswaard, which concludes
thus:

Laatste punt: het is tricky om te doen alsof we nu weten hoe het precies gegaan
is. Als je met zo'n verhaal naar de media gaat weet je dat in de headlines de
nuances verloren gaan, dus daar moet je ontzettend mee uitkijken -
wetenschapscommunicatie 101.

Last point: it's tricky to pretend that we now know exactly how it went. If you
go to the media with a story like that, you know that the nuances are lost in
the headlines, so you have to be very careful with that - science communication
101.

Thanks to Joris van Zundert
for bringing this to my/our attention.

François Lachance, Ph.D.

living in the beginning of the long 22nd century; sequencing the  "future


