15.222 runcible markup

From: by way of Willard McCarty (willard@lists.village.Virginia.EDU)
Date: Wed Sep 05 2001 - 02:10:40 EDT


                   Humanist Discussion Group, Vol. 15, No. 222.
           Centre for Computing in the Humanities, King's College London

             Date: Wed, 05 Sep 2001 06:54:00 +0100
             From: Wendell Piez <wapiez@mulberrytech.com>
             Subject: Re: 15.216 runcibles from love's cupboard


    At 05:39 AM 9/2/01, you wrote:
    >I am particularly intrigued as to what subscribers to Humanist might think
    >about the claim that appears to be advanced by McGann at the time that
    >markup models text as determinate hierarchy and not as recursive
    >network. The distinction doesn't seem to hold since markup can provide a
    >system of interlocking pointers.

    As you know, this is an issue that interests me, so I looked up the page
    you cited (thanks). I don't believe that McGann says quite what you
    attribute to him. The closest thing he does say to your paraphrase "that
    markup models text as determinate hierarchy and not as recursive network" is:

    >In the field of Humanities Computing the idea of text has been dominated by
    >conceptions practically realized in the TEI implementation of SGML markup.
    >Several key theoretical papers published by Steve DeRose, Allen Renear,
    >"et al." explain the ground of that implementation.
    >This ground, explicitly "abstract" (Renear 1997), represents a view of
    >text as essentially a vehicle for transmitting information and concepts
    >(final cause). Text is "hierarchical" (formal cause) and "linguistic"
    >(material cause), and it is a product of human intention (efficient cause).

    This is considerably more carefully stated than your formulation. Perhaps
    we could emend your paraphrase to say "markup *has modeled* text as
    determinate hierarchy and not as recursive network", which is something
    closer to what McGann says. And I think we can agree it is little less than
    the truth.

    Markup *can*, as you suggest, go further than DeRose's and Renear's
    OHCO (Ordered Hierarchy of Content Objects), or its present-day descendant,
    the XML infoset (or its close sibling, the XPath data model). There is
    currently quite a bit of interesting work going on in using markup to
    describe structures that are more than the "acyclic directed graph" (i.e.,
    the tree) described by SGML/XML. For example, work on both Topic Maps (see
    http://www.topicmaps.org) and RDF (Resource Description Framework, from the
    W3C) suggests directions which Humanists will be profitably investigating
    for years to come.

    In fact (as you know) even SGML/XML, via the ID/IDREF mechanism, can imply
    something more complex than the simple tree structure. But reflecting on
    that, immediately you can see, there's the rub. In order to take advantage
    of such a data structure, be it "a recursive network" or "set of
    interlocking structures", we need an application architecture (objects
    instantiated in memory? a relational database?) and software that
    understands the more complex data model.
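
    A minimal sketch of that point, assuming Python's standard
    xml.etree.ElementTree module. The sample document, the "corresp"
    attribute, and the build_links function are hypothetical names chosen
    for illustration. Without a DTD the parser does not know that any
    attribute is an ID or IDREF, so it is the application code that has to
    treat them as pointers:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "id" and "corresp" are illustrative attribute names,
# standing in for attributes a DTD would declare as ID and IDREF.
DOC = """<poem>
  <line id="l1" corresp="l3">first line</line>
  <line id="l2">second line</line>
  <line id="l3" corresp="l1">third line</line>
</poem>"""


def build_links(xml_text, id_attr="id", ref_attr="corresp"):
    """Return (elements keyed by id, {source id: target id})."""
    root = ET.fromstring(xml_text)  # the parser yields only a tree
    # Index every element that carries an identifier.
    by_id = {el.get(id_attr): el for el in root.iter() if el.get(id_attr)}
    links = {}
    for source_id, el in by_id.items():
        target_id = el.get(ref_attr)
        if target_id is None:
            continue  # this element points nowhere
        if target_id not in by_id:
            raise KeyError(f"dangling reference: {target_id}")
        links[source_id] = target_id
    return by_id, links


by_id, links = build_links(DOC)
```

    Here the first and third lines point at each other, a cycle that the
    tree alone cannot express. It exists only in the two dictionaries the
    application builds on top of the parsed tree.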

    In other words, markup itself is *not enough*. A gap opens up between the
    *notation* we use to describe and express an information set to our eyes
    and hands -- usually meaning, in this case, the actual text-and-markup, the
    lines of characters with all the pointy brackets, etc. -- and the abstract
    data model which our machine is designed to process. (You can understand
    this difference in the difference between processing markup with, say,
    regular expressions, which see only a sequence of characters and which do
    pattern-matching over that sequence, and something like XSLT, which only
    works after a parser has converted that character sequence into a tree
    structure. They work on different data models. Which is "truer" to the
    text?) We think we're doing something fluid and flexible -- markup -- but
    actually (like the evil imp in the legend) we're locking ourselves into
    something rigid and hierarchical, a tree. But -- Felix Culpa! -- we
    discover this gap is actually fortunate for us, a feature of our systems
    not a bug, as we discover these data models can be layered. Out of your
    stream of characters, if it is well-formed, you can render a tree. Out of a
    tree you can render a set of interlocking structures. Each layer, as a
    medium, "contains another medium" below it (you remember your McLuhan), but
    as a more elaborate and featured structure than its more rudimentary basis,
    can serve to represent something more complex. (So in XPath/XSLT we can say
    a title is "inside" a chapter. In markup alone, this requires assuming our
    parser recognizes containment.) Up until this point, markup systems were
    only being engineered to emulate what print media already did. Now we are
    poised, as you suggest, for markup-based systems to begin to do much more.
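
    The difference between the two data models can be sketched as follows,
    again assuming Python's standard library. ElementTree's path
    expressions stand in here for XSLT, which is a substitution and not the
    same tool, and the sample text and function names are hypothetical.
    The regular expression sees only characters, so it cannot tell which
    title is inside a chapter. The tree query can, because a parser has
    already established containment:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: one title belongs to the book, one to a chapter.
TEXT = (
    "<book><title>Whole</title>"
    "<chapter><title>One</title><p>Text</p></chapter></book>"
)


def titles_by_regex(text):
    """Character-sequence model: match anything that looks like a title."""
    return re.findall(r"<title>(.*?)</title>", text)


def chapter_titles_by_tree(text):
    """Tree model: select only titles that are children of a chapter."""
    root = ET.fromstring(text)
    return [title.text for title in root.findall("./chapter/title")]


flat = titles_by_regex(TEXT)
nested = chapter_titles_by_tree(TEXT)
```

    On this sample the pattern match returns both titles in document
    order, while the tree query returns only the chapter's title.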

    Yet this does not contradict anything in what McGann said. In fact, his
    statements can be taken to suggest (I paraphrase much more freely than you
    did) that the nature of poetry is such that it will continue to evade
    comprehensive "understanding" through markup (that is, there is no way we
    can explain or fully account for a poem, through markup), not because the
    structures of poetry are so elaborate and "intertwingled" -- that is not
    the point -- but because the very nature of poetry is to work at several
    levels at once, between what we are now calling notation and data model
    (each data model potentially providing a notation for another, higher
    model). That is, to be rather crudely geekish about it, the poet's work is
    to invent a notation to express a new data model, or at the very least, to
    explore the workings of notations ("texts") and data models (abstractions
    communicated by those texts) with respect to one another.

    Naturally, McGann (being a scholar of Byron and Rossetti) is inevitably
    very sensitive towards the complexity of those models: both systems of
    linguistic and literary-generic conventions, and more overt literary
    allusion, make for extremely complex, though hardly formalized, "networks"
    of meaning. But to get caught up in this -- imagining, for example, the way
    hypertext might represent such a system of knowledge and meaning -- would
    be to miss the main point: that the poets have done it already, using their
    own materials -- ink, paper, sound, silence, white space on the page -- and
    that the nature of their creation can no more be captured in another form,
    than a cinematic masterpiece, or even a home video, can be explained and
    comprehended in a movie review (or letter to grandma), however artful.

    That is, if we look past the tantalizing promises of technology to
    encapsulate and define, finally, such knowledge and meaning as we have --
    to build the system that could, say, "know" what Byron's _Don_Juan_ "knows"
    -- and recognize that the poets have always been, not merely users of
    media, but *inventors* of new media out of the old, we'll be closer to what
    McGann was trying to get at. In that sense, I do not take his remarks to
    indicate any problem to be solved. The only warning in it is, that although
    we may have shown we can erect buildings with our Lego set, we might still
    not have explained away the art of the builder who has learned to work in
    glass and stone.


    Wendell Piez mailto:wapiez@mulberrytech.com
    Mulberry Technologies, Inc. http://www.mulberrytech.com
    17 West Jefferson Street Direct Phone: 301/315-9635
    Suite 207 Phone: 301/315-9631
    Rockville, MD 20850 Fax: 301/315-8285
        Mulberry Technologies: A Consultancy Specializing in SGML and XML
