Humanist Discussion Group

Humanist Archives: June 15, 2021, 6:07 a.m. Humanist 35.87 - obsolescence of markup

				
              Humanist Discussion Group, Vol. 35, No. 87.
        Department of Digital Humanities, University of Cologne
                      Hosted by DH-Cologne
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org


    [1]    From: Dr. Herbert Wender <drwender@aol.com>
           Subject: Re: [Humanist] 35.86: obsolescence of markup (156)

    [2]    From: Tito Orlandi <orlandi@cmcl.it>
           Subject: Re: [Humanist] 35.86: obsolescence of markup (26)


--[1]------------------------------------------------------------------------
        Date: 2021-06-14 15:28:08+00:00
        From: Dr. Herbert Wender <drwender@aol.com>
        Subject: Re: [Humanist] 35.86: obsolescence of markup

Dear Manfred,

I thank you for your clarifications, because in many cases I obviously do not
understand well the intentions of the authors in this discussion forum. I
appreciate that you differentiate between technical and epistemic or
conceptual aspects in the discussion of markup technologies, and I see a third -
financial or political - point when you mention the investments that may be made
obsolete by technological advancements. The latter is IMHO the most important
when I look at projects in Germany like the Digitale Faustedition or now the
XML/TEI digitization of the obsolete book edition of Georg Büchner's Complete
Writings (Marburger Ausgabe); but this topic I would rather discuss in a
German-speaking context - here I think it would be annoying.

The questions you raise with respect to handling obsolete markup are questions of
'technical intelligence', questions of software engineering. Why should it be
conceivable that an advanced, 'intelligent' process automatically recognizes what
a TEI encoder in the past has put, e.g., into the Faust text, yet cannot be
trained to handle different sorts of encodings?

Probably you'll insist on the discussion of epistemics. As sufficiently old men
we can look back on a long series of 'conceptual' 'revolutions' in the sphere of
DBS and DBMS grounded in hierarchical, network, relational/algebraic, object ...
'philosophies'. Not to mention the ERS ;-) Could we perhaps involve Willard to
explain the difference between 'paradigm changes' and such conceptual
revolutions? In the end, I think, the wise wo/man does not trust in one
conception but is always ready to migrate to another framework.

Kind regards, Herbert

-----Original Message-----
From: Humanist <humanist@dhhumanist.org>
To: drwender@aol.com
Sent: Mon, 14 Jun 2021 9:43
Subject: [Humanist] 35.86: obsolescence of markup

                  Humanist Discussion Group, Vol. 35, No. 86.
        Department of Digital Humanities, University of Cologne
                          Hosted by DH-Cologne
                      www.dhhumanist.org
                Submit to: humanist@dhhumanist.org




        Date: 2021-06-14 07:11:53+00:00
        From: Manfred Thaller <manfred.thaller@uni-koeln.de>
        Subject: Re: [Humanist] 35.85: obsolescence of markup

Dear Herbert,

Sorry for a delayed answer.

I am afraid that my original mail to Humanist was a bit too
elliptical. So I have to ask for your patience for a slightly longer
explanation before responding to your remark below.

Willard's original question was:

> Currently (correct me if I am wrong) markup intervenes to embed human
> intelligence about an object where artificial processes of detection and
> analysis fall short. Does this not suggest that some kinds of markup will
> become obsolete at some point? (I do not have in mind scholarly
> commentary!) Has anyone speculated intelligently along these
> lines?
  In response to some reactions, he expanded the question to:
> Jonah Lynch responded to my speculation about the obsolescence of
> markup, asking what I had in mind by the distinction I made between the
> kind I thought would not ever prove obsolescent and the kind that would.
> My overall intention was to draw attention to the impermanence of work
> in computing, and so to raise the question of invasive curation. Of
> course every thing is impermanent, in constant flux &c, but some
> artefacts of scholarship do survive because we care about them. Adding
> to them with highly interpretative metatext would be regarded as a
> different sort of contribution than denoting layout, would it not?
>
> Thus an example: metatext that says "this is a paragraph" versus
> metatext that comments on the author's likely intention in breaking the
> flow of prose in the particular version in question. I think we can say
> that completely reliable automatic recognition of paragraphs is only a
> matter of time -- except in relatively rare circumstances. No
> hard-and-fast rules, only a doubtlessly annoying observation.
>
> Is there yet another argument here for standoff markup? For working even
> harder on statistical methods of analysis? Something else?

For me, this can be "operationalized" from two points of view:

A conceptual / epistemic one. An operational / algorithmic one.

The epistemic one is in my opinion the one which lies at the heart of my
own scepticism of the TEI, or, as I wrote in the opening of my "post"
you quote:

> The markup embedded into a document shall: (a) represent characters, which
> do not exist in the fonts available or which are non-alphabetic like
> interpunctuation. (b) Allow the representation of abstract texts resulting
> from the evaluation of various witnesses in a critical edition.
> (c) Annotate a text with interpretations.

As long as these three - at least for me - completely different
epistemic layers are inseparably mixed in <emph>one</emph> markup
system, conceptual chaos ensues. But be that as it may conceptually,
there is also a technical problem that lies behind Willard's question about
obsolescence.

How do you confront situations where (1) a heavy investment has been
made in the proofreading of a raw text, creating a perfect base for
further work, but (2) the conceptual comments added to the Sachapparat are
obsolete, as they reflect a theoretical paradigm that has
fallen into disgrace? Or: where (1) valuable interpretations have been
embedded into a very large text, which (2) is so large that, for the
sake of manageability, these comments have been embedded into a text
produced by slightly dirty OCR, when (3) ten years later, after the
author of these comments has died, an OCR breakthrough makes it possible to
create a significantly less dirty conversion of the original files? Or: when
you have (1) an admirably prepared text, in which however (2) only very
little has been done on the extraction of entities, which then become available
with the help of an entity extraction algorithm, (3) whose results shall
supersede the few in the original markup, (4) leaving the rest of
the markup unchanged? Or, to take up Willard's argument: when you want
to keep something which is admittedly obsolete side by side with "some
artefacts of scholarship [which] do survive because we care about them"?
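
The third scenario above (superseding only the entity markup while leaving the rest unchanged) can be pictured with standoff annotation, where each layer lives outside the base text and points into it by character offsets. What follows is only a minimal Python sketch under stated assumptions: the names `Annotation` and `replace_layer`, the layer labels, and the sample sentence are all hypothetical and are not taken from TEI or from any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    layer: str   # illustrative layer name, e.g. "structure", "entities", "commentary"
    start: int   # character offset into the base text (inclusive)
    end: int     # character offset into the base text (exclusive)
    value: str   # what the annotation asserts about that span


def replace_layer(annotations, layer, new_annotations):
    """Return a new list in which all annotations of `layer` are replaced
    by `new_annotations`; annotations of every other layer are kept as-is."""
    kept = [a for a in annotations if a.layer != layer]
    return kept + list(new_annotations)


# Invented sample data, for illustration only.
base_text = "Faust meets Gretchen in the street."

annotations = [
    Annotation("structure", 0, 35, "paragraph"),
    Annotation("entities", 0, 5, "person"),  # sparse hand-made entity markup
    Annotation("commentary", 12, 20, "interpretive note"),
]

# Output of a hypothetical entity-extraction run that supersedes the old layer.
extracted = [
    Annotation("entities", 0, 5, "person"),
    Annotation("entities", 12, 20, "person"),
]

updated = replace_layer(annotations, "entities", extracted)

for a in updated:
    print(a.layer, repr(base_text[a.start:a.end]), a.value)
```

The sketch covers only the easy case, in which the base text itself stays fixed. The OCR scenario, where the base text changes under the annotations, would additionally require re-aligning every offset against the new text, which this example does not attempt.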

While I've not been at this Wuppertal conference, I'm somewhat surprised
that these questions of how to replace an obsolete part of a marked up
text leaving the remainder unchanged have been solved by the Hofmeisters
(and yes, I'm also a quite regular reader of the Balisage proceedings).
And I have also somehow overlooked the discussion of how two data
objects might negotiate the consequences of updates in one of them for
the links between them, a bit after the line you quote from my blog.

Briefly on capta:

> Given, as she [Drucker] says,
> that all data is actually capta, current approaches to data visualization are
> misleading in that they suggest more certainty and stability than is actually
> the case.

Yes, to paraphrase (not quote) Darrell Huff, How to Lie with Statistics,
1954: "How to lie with statistics? Visualize them."

Whether all data is capta is of course only the starting point of a
discussion, which has to clarify what is "all", or rather whether all
data are capta up to exactly the same degree, or rather whether there is
only one captor, or possibly a whole set of interacting captors. Forgive
me: Or whether it is really epistemically a good idea to consider (1)
the reading of an individual character, (2) the relative weight of two
textual witnesses and (3) the interpretation of the intent of an author
as operating on exactly the same conceptual level ...

Kind regards,
Manfred


--[2]------------------------------------------------------------------------
        Date: 2021-06-14 10:57:43+00:00
        From: Tito Orlandi <orlandi@cmcl.it>
        Subject: Re: [Humanist] 35.86: obsolescence of markup

Manfred, benedicat te omnipotens deus!

On Mon, June 14, 2021 09:42, Humanist wrote:
> Humanist Discussion Group, Vol. 35, No. 86.
> Department of Digital Humanities, University of Cologne
> Hosted by DH-Cologne
> www.dhhumanist.org Submit to: humanist@dhhumanist.org
>
> Date: 2021-06-14 07:11:53+00:00
> From: Manfred Thaller <manfred.thaller@uni-koeln.de>
> Subject: Re: [Humanist] 35.85: obsolescence of markup
>
> The epistemic one is in my opinion the one which lies at the heart of my
> own scepticism of the TEI, or, as I wrote in the opening of my "post" you
> quote: ... etc. ...
>
>


--
Tito Orlandi  (professor emeritus, Univ. di Roma La Sapienza)
Socio Accademia Nazionale dei Lincei
Segretario Generale Unione Accademica Nazionale
Hiob Ludolf Zentrum (Asien-Afrika-Institut, Univ. Hamburg)
Institutum Patristicum Augustinianum, Roma
http://cmcl.it/~orlandi


_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php