Humanist Archives: March 12, 2019, 8:15 a.m. Humanist 32.537 - the illusion of 'progress' and transfer of knowledge

                  Humanist Discussion Group, Vol. 32, No. 537.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                Submit to: humanist@dhhumanist.org

        Date: 2019-03-11 21:45:47+00:00
        From: desmond.allan.schmidt@gmail.com
        Subject: Re: [Humanist] 32.533: the illusion of 'progress' and transfer of knowledge


I don't understand why you react so strongly to the slight comment by
Joris that he had no need of hierarchies. In this I concur with him.
On this topic I'd like to quote Claus Huitfeldt, who in 1995 reacted
to the then recent adoption of SGML:

"I am not convinced that Wittgenstein's manuscripts are basically
hierarchical structures. Potentially, for all I know, any feature may
overlap with any other feature. Besides, I do not even know what the
hierarchies should consist of, or whether the identification of such
hierarchies would be particularly illuminating."

It is interesting to note that all the arguments we are having in this
thread and preceding ones are about these same issues: that no one can
definitively decide what the hierarchies are or even mount a
convincing case that they communicate useful information. The
hierarchies are mostly a requirement of the underlying markup system.
They are created in a self-justifying way to enable syntax checking of
themselves. So when our document parses correctly we feel chuffed that
the text is now properly encoded when in fact only the tags are.
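
A minimal sketch in Python of what I mean, assuming two hypothetical
TEI-like encodings of the same revised line (the element names, the
sample text and the helper parses() are illustrative only, not taken
from any real edition). The second encoding swaps the deletion and the
insertion by mistake, yet the parser accepts both, because it checks
the tags and nothing else:

```python
import xml.etree.ElementTree as ET

# Hypothetical encodings of one revised line (illustrative only).
correct = "<l>The <del>dark</del><add>deep</add> river</l>"
swapped = "<l>The <add>dark</add><del>deep</del> river</l>"  # encoder's mistake
broken = "<l>The <del>dark</l>"                              # mismatched tags

def parses(xml: str) -> bool:
    """Return True if the string is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False

for name, doc in [("correct", correct), ("swapped", swapped), ("broken", broken)]:
    print(name, parses(doc))
```

Only the mismatched tags are rejected. The swapped encoding, which
reverses the direction of the revision, passes the same check as the
correct one.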

On reading your article written with Raffaele I am reminded of the
many attempts I made to beat some consistency out of the original TEI
encodings of our Harpur manuscripts. Three times I thought I "had it"
in developing ways to determine the surrounding context of a
particular insertion or deletion or variant. But the result always
left me dissatisfied however much I tweaked my program. And I am an
experienced software engineer who knew the material well. In the end
reliability could only be achieved by combining an automated method
with human intervention in difficult cases.

The reason for this failure is that the information we need to enable
cross-document collation simply isn't present in the transcription.
I'll repeat what I said earlier: Encoding primarily for graphical
features (deletion, insertion) or physical layout (zones, x,y
coordinates) means that we can't hope to extract what amounts to
temporal information about how the text stood locally at any one time.
And we need that if we are going to collate.

To illustrate this point I refer to No. 2 of my "tough cases" observed
in making our edition of Harpur
(http://charles-harpur.org/tough-cases). Here you will see repeated
revision of various parts of a stanza extending to two pages. The
author doesn't even bother to delete earlier versions consistently.
Trying to work out from spatial data here what replaces what is
a decidedly difficult problem that only humans can ever hope to
resolve. Or take example 6 d), which contains revisions on a recto of a
printed text with corrections on the verso. Where do these changes on
the verso actually go? And which words are repeated (not crossed out)
or superseded by the ones on the recto?

What you appear to be claiming in this article is that you have
developed an automated way to collate any texts conforming to TEI by
removing the markup to a standoff representation. I would like to
point out that the transcribed base texts of corrected manuscripts in
many cases are jumbled-up nonsense. For that reason collation
algorithms cannot reliably compare texts with embedded variants. I
challenge you to show me an algorithm that can do it in a
mathematically proven way.
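
A minimal sketch in Python of the problem, assuming a hypothetical
line encoded with flat TEI-like <del> and <add> elements (the sample
line and the helper names base_text() and reading() are mine, for
illustration only). Stripping the markup leaves every variant in the
base text at once, whereas a coherent reading can only be recovered by
deciding which elements belong to which state of the text:

```python
import xml.etree.ElementTree as ET

# Hypothetical corrected line; in a real manuscript the pairing of
# deletions with insertions is rarely this tidy.
line = ("<l>The <del>dark</del> <add>deep</add> <del>river</del> "
        "<add>stream</add> <add>ran</add> <del>flowed</del> on</l>")

def base_text(xml: str) -> str:
    """Strip all markup and keep every word, as a standoff base text would."""
    root = ET.fromstring(xml)
    return " ".join("".join(root.itertext()).split())

def reading(xml: str, keep: str) -> str:
    """Keep plain text plus the children tagged `keep`; drop other children.

    Assumes a single flat layer of revision, which is exactly the
    assumption the tough cases break.
    """
    root = ET.fromstring(xml)
    parts = [root.text or ""]
    for child in root:
        if child.tag == keep:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return " ".join("".join(parts).split())

print(base_text(line))       # all variants run together
print(reading(line, "del"))  # the text before revision
print(reading(line, "add"))  # the text after revision
```

The base text here is "The dark deep river stream ran flowed on",
which is no state the line ever had. The two readings come out
cleanly only because the sketch assumes one layer of revision with
every deletion marked; with repeated or undeleted revisions no such
simple rule applies.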

Maybe it is OK for humanists who are used to fuzzy things. But if we
are truly going to interoperate as you say - and in this I agree that
it is a highly desirable property for our digital encodings - we need
a reliable way to express variation between documents and between
editorial projects, and XML cannot meet that requirement, however much
we might want it to. The reason is not that the XML format cannot
express it but that humans cannot write it consistently enough to
allow that to happen.

Desmond Schmidt
Queensland University of Technology

>         Date: 2019-03-10 18:17:52+00:00
>         From: Elisa Beshero-Bondar 
>         Subject: Re: [Humanist] 32.532: the illusion of 'progress' and
> transfer of knowledge
> Dear Joris,
> Perhaps this post was written in fatigue and ennui, but I was sorry to see
> that it feeds one of the persistent problems of dogmatic points of view,
> and that is the perpetuation of a falsely simplistic binary opposition.
> Here the problem is our notions that hierarchy and dimension are on
> opposite poles, with one singular and limited and the other multiple and
> free.
> One reason a debate continues is that the debaters have reduced their
> positions to the illusion that each is in entirely the opposite camp. I
> agree with you that dogma is reductive and tiresome, and will add that it
> is needlessly polarizing and does damage to the extent that, as you say, it
> stilts the way we form communities and whom we think we can respect.
> Let's consider the position that hierarchy and structure are inherently
> limiting and singular, and by implication are not to be found in
> multi-dimensional models. Dashed off in such a way, we confuse data
> models with culture and text and fail to perceive anything more than a
> polarized position--and it's so compelling that we turn people away from
> bothering to find out what they could do with XML to express hierarchy and
> connect to other hierarchies in multi-dimensional ways. Having no use for
> hierarchy seems a little self-limiting in real life, and I'd assert that
> eventually hierarchy must and will assert itself in non-XML models (yes, as
> soon as you need to nest something in JSON). Describing semantic
> relationships requires expressing dependencies regardless of the format we
> apply. A dogmatically polarized view, however, will abruptly dismiss XML
> and pretend it doesn't matter in HTML or in anything we decide to describe
> as "simple" and also "multidimensional". I suppose that's how we talk to
> each other when dogma asserts itself, but in real life, modeling texts and
> data is more complicated than convenient polarities permit us to express in
> debate.
> In practice, we learn how to express our ideas and perspectives in
> different formats. We become multi-lingual. We write code that maps across
> data models. We write XML and map to JSON. We write multiple formats of XML
> to mediate across hierarchies. And we respect the intellectual work of
> modeling and structure. Topographies and nestings are part of modeling, and
> we work, really, in intermediary ways--that can be the fun of it. Or so
> I've found. You can read more about our intermediating work (via the
> Frankenstein Variorum project), and our views of interchange and
> interoperation in our Balisage paper that Raff cited earlier, and I hope
> you will hear us speak about these things at Utrecht. Maybe we can be
> moving beyond tiresomely dogmatic debates into the real world where we work
> together on mediating in between data models.
> Joris, I do believe we're on the same page by the end of your post. I agree
> with you, whole-heartedly, in asking this question: "And why are we so bad
> at skilling scholars to choose a technology, data structure, or algorithm
> based on reasoning its applicability for purpose instead of community based
> or individual dogma?" I shake hands with you there, but I must dissent with
> your dismissal of "technical hype" and the rough treatment of hierarchy and
> structure. I have heard such rapid-fire dismissals from several
> quarters--not just from this post, but in conversations at DH conferences
> about what's supposedly wrong about markup. If we do want to honor the
> history of DH and we want to sustain an informed and respectful connection
> with each other's work, perhaps less debunking and more bridge-building
> would be helpful. When do we recognize that hierarchies can be plural and
> address multiple dimensions? And haven't we been aware of this all along?
> Best,
> Elisa
> Beshero-Bondar, Elisa E., and Raffaele Viglianti. “Stand-off Bridges in
> the Frankenstein Variorum Project: Interchange and Interoperability within
> TEI Markup Ecosystems.” Presented at Balisage: The Markup Conference 2018,
> Washington, DC, July 31 - August 3, 2018. In *Proceedings of Balisage: The
> Markup Conference 2018*. Balisage Series on Markup Technologies, vol. 21
> (2018). https://doi.org/10.4242/BalisageVol21.Beshero-Bondar01.
> --
> Elisa Beshero-Bondar, PhD
> Associate Professor of English
> University of Pittsburgh at Greensburg
> Humanities Division
> 150 Finoli Drive
> Greensburg, PA  15601  USA
> E-mail: ebb8@pitt.edu

