Humanist Discussion Group


Humanist Archives: March 14, 2019, 6:32 a.m. Humanist 32.544 - Illusions of Progress...

                  Humanist Discussion Group, Vol. 32, No. 544.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org


    [1]    From: Dr. Herbert Wender 
           Subject: Illusions of Progress? (Frankenstein Variorum) (48)

    [2]    From: Richard Cunningham 
           Subject: Re: [Humanist] 32.542: the illusion of 'progress' and transfer of knowledge (50)

    [3]    From: desmond.allan.schmidt@gmail.com
           Subject: Re: [Humanist] 32.542: the illusion of 'progress' and transfer of knowledge (177)


--[1]------------------------------------------------------------------------
        Date: 2019-03-14 00:32:42+00:00
        From: Dr. Herbert Wender 
        Subject: Illusions of Progress? (Frankenstein Variorum)

Dear HUMANISTs,

since the Frankenstein Variorum project is sermonizing about hierarchies in
general, they evidently will not answer the (stupid? annoying?) questions I
addressed to Raffaele Viglianti last week. Maybe the right hand doesn't know
what the left is doing? Or might someone else eventually refute my suspicion
that the encoding doesn't correctly represent the structure of the revision
process?

To recall the case, I excerpt today from "msCollPrep_c56.xml" (taken from
GitHub):

{line}Those events which materially influence our {w ana="start"/}fu{/line}
{line}ture{w ana="end"/} destinies {del rend="strikethrough"}are{/del} often
{mod}{del rend="strikethrough"}caused{/del}
{del rend="strikethrough"}by slight or{/del}
{add hand="#pbs" place="superlinear"}derive thier origin from a{/add}
{/mod}
{w ana="start"/}tri{/line}

{line}vial{w ana="end"/} occurence{del rend="strikethrough"}s{/del}....{/line}


My thesis: every critical apparatus that is aware of the sentence's structure
would distinguish two substitutions:

1. are ... caused by
--> derive from
(BTW: the Oxford companion has in its list of changes from MWS to PBS
"cause"/"derive from"; in the correct form this change would not support the
argument in that context)

2. slight or trivial occurences
--> a trivial occurence.
Interpretatively, I think it is disputable what was intended by changing one
side of the causal relationship from plural to singular, and the answer may
play a role in deciding whether the reduction from "slight or trivial" to
"trivial" depends on the change of grammatical number:
2a. slight or trivial
--> trivial
2b. occurences
--> a ... occurence
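
[Editorial sketch, not part of the original post.] The two states of the
sentence discussed above can be read off the excerpt mechanically. The
following is a minimal Python sketch under stated assumptions: the fragment is
a simplified, hypothetical rendering of the excerpt (angle brackets restored,
line elements and w milestones dropped, trailing dots omitted, everything
wrapped in an invented s element), and the names reading and normalise are
illustrative only. It is not the Frankenstein Variorum's own processing.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical rendering of the excerpt: angle brackets restored,
# <line> elements and <w> milestones dropped, wrapped in an invented <s>.
FRAGMENT = (
    '<s>Those events which materially influence our future destinies '
    '<del rend="strikethrough">are</del> often '
    '<mod><del rend="strikethrough">caused</del> '
    '<del rend="strikethrough">by slight or</del> '
    '<add hand="#pbs" place="superlinear">derive thier origin from a</add>'
    '</mod> trivial occurence<del rend="strikethrough">s</del></s>'
)


def reading(elem, skip):
    """Concatenate the text of elem, ignoring any subtree tagged `skip`."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != skip:
            parts.append(reading(child, skip))
        # The tail belongs to the parent's text flow, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


def normalise(text):
    """Collapse runs of whitespace to single spaces."""
    return " ".join(text.split())


root = ET.fromstring(FRAGMENT)
before = normalise(reading(root, skip="add"))  # state before revision
after = normalise(reading(root, skip="del"))   # state after revision

print(before)
print(after)
```

Both readings fall out of the tags alone, but nothing in this flat extraction
says which deletion belongs with which addition; grouping them into
substitutions 1 and 2 (or 2a and 2b) remains the interpretative step the
question is about.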

Having now consulted via Google Books the original facsimile edition
(Ch. Robinson), I have a further question: it seems that the insertion of "a"
before "occurences" at first resulted in a grammatical discrepancy. Why isn't
this phenomenon encoded?

Regards, Herbert

--[2]------------------------------------------------------------------------
        Date: 2019-03-13 16:13:24+00:00
        From: Richard Cunningham 
        Subject: Re: [Humanist] 32.542: the illusion of 'progress' and transfer of knowledge

I haven't been following the discussion entitled "the illusion of progress" with
any kind of faith, but perhaps this is because I am not much invested in any
part of it.  That said, I do want to comment on something Desmond said in post
32.537.

I'm generally suspicious of assertions of both progress and hierarchies, but
the jury will be forever out for me on the question of progress (e.g. I
dearly love having central heating, but my physical health was certainly
much better when I had to harvest, chop, and carry firewood to keep warm.  My
progress, like all progress, has come at a substantial cost).

But the same quality of suspicion doesn't apply to hierarchies, for me.  I have
a feminist colleague who can be nearly rabid at the mere whiff of a hierarchy,
and years ago she and I were on opposite sides of the issue.  Whether she
deserves credit for my conversion or not I can't say, but what I can say is that
I have had a lingering, nagging sense that the notion of documents as
hierarchies of objects has seemed, well, suspicious, for some time now.

When I read Desmond Schmidt's assertion yesterday that "when our document parses
correctly we feel chuffed that the text is now properly encoded when in fact
only the tags are" I felt the previously free floating sense of unease touch
ground.  Maybe I've been influenced by quantum theory, but I can't help feeling
that when we use a gauge like SGML, XML, or the TEI to measure things, whatever
things we measure are suspiciously liable to conform to exactly the kind of
measurements those technologies are designed to extract from data.  But I doubt
that is the same thing as meaning that the things "naturally" or "really" exist
always and only in conformance with the technologies used (to measure them, in
keeping with the metaphor I introduced above, or, more generally) to enable our
interaction with them.  At the risk of over-simplifying, an analogy might be to
say that the person measuring the field with a yard stick finds that the field
is naturally 50 yards long, while the person using a metre stick finds it to be
naturally 45 3/4 metres long.  It's not a very good analogy, but it communicates
the corruptive influence of the gauge in the process.

To conclude, what I think Desmond's comment communicates to me is the fact that
the hierarchy we "find" in documents we find because we are using tools that
look for and measure the precision of their own hierarchical deployment in the
tool-using process.

I apologize for the muddiness of this note.  I hope at least some can make
enough sense of what I'm trying to say to grant to that expression "when our
document parses correctly we feel chuffed that the text is now properly encoded
when in fact only the tags are" the beauty and value and explanatory force I
think it deserves.


Cheers,

Richard


--[3]------------------------------------------------------------------------
        Date: 2019-03-13 13:28:31+00:00
        From: desmond.allan.schmidt@gmail.com
        Subject: Re: [Humanist] 32.542: the illusion of 'progress' and transfer of knowledge

Elisa,

I confess to not making myself clear. I was talking about markup
hierarchies, not hierarchies in writing or thought, or conceptions of
grammar. My mistake.

We don't use any hierarchies in the text, in the markup, versions,
layers or metadata with which the text is embellished. Our edition is
complex and fully featured and covers a wide range of text types:
poems with sections, stanzas and lines with various indents, notes by
the author, letters written by various correspondents etc. If
hierarchies are as essential as you maintain, how is it that we can do
without them?

What you seem to be saying is that hierarchies are in our thoughts
therefore we should use them in our encodings of texts. This sounds a
lot like the old argument by Coombs, Renear and DeRose that
punctuation is a kind of markup, which justifies adding a whole lot
more foreign markup to the text. Then we can kid ourselves that it is
somehow just natural.

> we
> should stop pretending that we have "no use for" hierarchy and start
> thinking in terms of multiple hierarchies because that's what we do and how
> we work when we build our projects.

What makes you think that multiple hierarchies are somehow better? How
can it be better to keep piling more hierarchies onto a text that is
already overloaded by the first one?

> We can even describe the vexing troubles caused by overlap using hierarchy
> and twisting around it.

I have used XML extensively for marking up literary texts and I know
very well what "twisting around it" means. It means hacking, fudging,
misrepresenting, repeating text and introducing non-hierarchical
structures into the supposedly hierarchical markup. It's a deficiency
in the model and the resulting complexity drives encoders mad. It also
leads them to refuse to encode the difficult bits even when they are
properly trained.
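
[Editorial sketch, not part of the original post.] The overlap problem
referred to here can be stated as a simple test on character ranges. The
sketch below is a minimal Python illustration under stated assumptions: the
sample string, the two spans and the function name overlaps_without_nesting
are hypothetical, chosen to mirror the word "future" broken across two
manuscript lines in the excerpt quoted in post [1]. It does not represent the
data model of either edition under discussion.

```python
def overlaps_without_nesting(a, b):
    """True if half-open spans a and b share text but neither contains the other."""
    (a_start, a_end), (b_start, b_end) = a, b
    intersect = a_start < b_end and b_start < a_end
    a_in_b = b_start <= a_start and a_end <= b_end
    b_in_a = a_start <= b_start and b_end <= a_end
    return intersect and not (a_in_b or b_in_a)


text = "influence our future destinies"

# Hypothetical spans: the manuscript line ends after "fu", while the word
# "future" runs on into the next line.
line_span = (0, 16)   # "influence our fu"
word_span = (14, 20)  # "future"

print(text[slice(*line_span)])
print(text[slice(*word_span)])
print(overlaps_without_nesting(line_span, word_span))
```

When the test is true, the two spans cannot both be elements of a single tree,
which is why the excerpt resorts to empty w milestones with ana="start" and
ana="end" instead of an element wrapping the word.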

> We can give up
> the effort in despair, of course, and I'm afraid that's what happens to
> many a nascent very complicated edition project, but we can also do what we
> can by expressing what we observe.

You make the case SO very well for me here. Actually we have nearly
finished our edition in only a few years, and there were 4,600 complex
manuscript pages in 24 volumes, 700 newspaper cuttings of poems, notes
as voluminous as the poems, 300 letters, a mass of biographical
material and we got on top of all that because we didn't use markup
hierarchies. The Tagore edition at Jadavpur University in India was
much larger than ours and was also finished in record time by not
using XML.

> we may never agree that our
> hierarchies are adequate, but the least we can do is recognize that they
> are multiple and that they are going to intersect and overlap. And we can
> study those areas of intersection and overlap and find them worthy of
> continued discussion.

I don't see the point of this. If you admit that the hierarchies
overlap and intersect, doesn't that prove that the hierarchical model
is inadequate? We don't have overlap or intersections. The data of the
edition are not organised on hierarchical principles, but there is
still a place for everything and everything is in its place. And yet
our model is simple, not complex like the one you are describing.

> The
> difference is in the thinking we apply to our data models, not the presence
> or absence of hierarchy.

But you ARE thinking about how to express your thoughts in terms of
nested elements and attributes, aren't you?

> I don't pretend
> that working manuscript encoding into collation was easy work, but I did
> not consider myself having to "beat" anything violently out the encoding we
> were working with, and I'm grateful to the coders who came before me to
> make our variorum work possible.

Here you more or less confirm that the success of your collation
depended on a lot of manual intervention. I had a similar experience
of spending 8 months of my life - I'll say it again - beating sense
out of the XML encoding of our author's writings. Do you really think
this is the way forward?

> We
> opted to include deletion and insertion data in our collation to make the
> little-known deleted passages of the manuscript visible in comparison with
> the print editions that are better known.

This is a fine ambition. However, you didn't explain how you did it.
You didn't provide measurements of its accuracy, nor did you apply it
to any texts other than the Shelley-Godwin Archive. I believe that in
general we can't do this because the tags used to encode those changes
are subject to interpretation by the encoder(s), and they are focused
on graphical properties of the text not the temporal sequence of
corrections. And also because the text as you describe it is
non-linear and so can't be compared. So I am quite skeptical that you
got reliable and verifiable results here.

> We differ fundamentally here if you believe that the
> running stream of text to be wound and woven into collation is the essence
> of an edition. I don't believe it to be--I believe it to be an illusion
> generated to ease processing, and that the edition we build needs to raise
> hierarchies again.

I'd agree that in spite of its simplicity our versions and layers
model is hard for someone coming from an XML background to accept and
understand. However, I believe it is the right way forward. It doesn't
leave out any significant detail.

> The application of a shared vocabulary would be
> intellectually deprived if every edition were encoded according to the
> convenience of a programmer who wants a one-size-fits-all interface. I
> would like people to think about how to write code in TEI that communicates
> with other markup

The problem here is that these variations in the use of markup
vocabularies paradoxically prevent us collaborating or sharing texts.
It may be a purely technical point but interoperability is worth
having, and it can't be done if humans are left to encode texts
manually in XML.

> Perhaps we can be properly introduced someday
> in a context where we're not trying to prove one another wrong, but can
> find ways to bridge our differences and respect our data modeling.

While I would like that too, we shouldn't forget that this is a public
text not a letter, and people are reading it for the juicy bits.
Forgive me if I push my case too hard.

J. H. Coombs, A. Renear and S. DeRose (1987). Markup systems and the
future of scholarly text processing, Communications of the ACM 30.11,
pp.933-947.

James,

I would like to repeat the point I made above for Elisa. I was talking
about markup hierarchies but trying to simplify by talking about
hierarchies in general. I'm not trying to say that we don't use
hierarchies to organise our thoughts. People have always done that, of
course. My point was that we don't need hierarchies in markup to
encode what is already self-evident. I can't help noticing that all
the printers of Shakespeare until the advent of SGML had no need of
hierarchies to produce books that everyone could read. They just put
black marks on a page, not caring if the text was divided conceptually
into acts, scenes, speeches and lines because every reader could
already see it.

I feel bound to point out that all the cases you cite are imperfect
examples of hierarchy. May I ask if you can recognise topic sentences
100% reliably? And are you confident that someone analysing that same
text would always come up with the exact same result? And are you
quite certain that every sentence can be analysed unambiguously using
Chomskyan grammars and that our speech is perfectly regulated as if it
came from a machine?

If you can answer yes to those questions then you can feel at ease
marking up a text with rigid hierarchical structures that are an
interpretation of the encoder in spite of all the technical problems
that ensue from using an admittedly efficient but mechanical model to
represent what came from the mind of a human being. But you must ask
yourself: why did I do this? Where is the gain for all my pain?

Desmond Schmidt
eResearch
Queensland University of Technology



--
Dr Desmond Schmidt
Mobile: 0481915868 Work: +61-7-31384036




_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php


Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)

This site is maintained under a service level agreement by King's Digital Lab.