14.351 errors in e-books; XML & proprietary formats

From: by way of Willard McCarty (willard@lists.village.Virginia.EDU)
Date: 10/14/00

                   Humanist Discussion Group, Vol. 14, No. 351.
           Centre for Computing in the Humanities, King's College London
       [1]   From:    Matt Kirschenbaum <mgk@pop.uky.edu>                  (9)
             Subject: Re: 14.0345 errors in e-books, XML & proprietary
       [2]   From:    "P. T. Rourke" <ptrourke@mediaone.net>              (47)
             Subject: Reply to 14.0345 errors in e-books
       [3]   From:    Randall Pierce <rpierce@jsucc.jsu.edu>               (3)
             Subject: e-books
       [4]   From:    "David M. Seaman" <dms8f@etext.lib.Virginia.EDU>    (79)
             Subject: Re: 14.0345 XML & proprietary formats

             Date: Sat, 14 Oct 2000 12:05:33 +0100
             From: Matt Kirschenbaum <mgk@pop.uky.edu>
             Subject: Re: 14.0345 errors in e-books, XML & proprietary formats
    If Humanist is indeed an "edited" forum I have to wonder what purpose is
    served by the posting of recent comments from Norman Hinton (Humanist
    14.0345: errors in e-boos [sic]).
    I don't want to raise the spectre of censorship, but shouldn't contributors
    to this forum bear some burden of intellectual responsibility?
    Does "God forbid that we should want to know what Chaucer may have written
    or what he may have meant thereby....." (and the like) really contribute to
    the "high-level scholarly discussion of computing in the humanities" this
    forum aspires to? (See
    <http://www.princeton.edu/~mccarty/humanist/announcement.html>.)
    Matt

             Date: Sat, 14 Oct 2000 12:06:52 +0100
             From: "P. T. Rourke" <ptrourke@mediaone.net>
             Subject: Reply to 14.0345 errors in e-books
      > Some interesting responses to the fact of errors in e-text: the most
      > creative is the most recent -- 'if there are errors, it's the reader's
      > fault' !! I find that fascinating -- we know that denial of
      > responsibility is typical of our culture at its worst, but this is  some
      > sort of record-setting denial.
    With all due respect, if Dr. Hinton's comments quoted above are intended to
    refer either to Prof. McCarty's or Prof. Scaife's discussions of SOL, they
    at best reflect a misunderstanding of the purpose and process of the
    "open-source"-like* method being used by the Suda On Line.  The database
    we're working with is so large that a large editorial staff would be
    required to vet all entries before they are made publicly accessible, and
    consequently proportionally large funding.  So the process of editing, like
    the process of translating, has to be performed on a volunteer basis, and
    has to be distributed.   Because our anticipated readership includes all
    potential translators and editors, there is every motivation for us to
    provide unvetted material to that readership - provided that we are
    absolutely explicit about the vetting status of entries.  Thus we warn
    readers that entries marked as "draft" are not to be considered reliable
    instances of peer-reviewed scholarship, but only drafts.  This is not an
    abdication of responsibility; indeed, because the entire process of
    composition and editing on SOL is transparent to readers, with each editor
    required to note his or her name and the character of the changes he or she
    has made, SOL's method requires a far greater acceptance of genuine
    scholarly and professional responsibility than is typical in traditional
    scholarly publications, where the readers' reports of professional scholars
    are anonymous and editing for content and style, typesetting, proofreading,
    and printing are also performed entirely anonymously unless the author
    chooses to acknowledge their help in his preface.  Indeed, the only
    person whose evaluative comments are public in a traditional scholarly
    publications setting is the author of a published review of another work,
    who to maintain objectivity cannot have any responsibility for the work
    he reviews.  I say this as someone with a background in scholarly
    publishing (rather than scholarship itself).
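    The record-keeping described above (entries explicitly marked as
    "draft", and every change carrying the editor's name and the character
    of the change) can be sketched roughly as follows. This is only an
    illustration, not SOL's actual software: the class names Entry and
    Revision, the "vetted" status label, and the wording of the warning
    are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    """One publicly visible change record (hypothetical structure)."""
    editor: str  # name of the person who made the change
    note: str    # the character of the change, in the editor's words


@dataclass
class Entry:
    """A translated entry whose vetting status is always explicit."""
    headword: str
    translation: str
    status: str = "draft"  # new entries start unvetted
    history: List[Revision] = field(default_factory=list)

    def revise(self, editor: str, note: str,
               new_translation: Optional[str] = None,
               vetted: bool = False) -> None:
        # No anonymous edits: each change must be signed and described.
        if not editor or not note:
            raise ValueError("a change must name its editor and describe itself")
        if new_translation is not None:
            self.translation = new_translation
        if vetted:
            self.status = "vetted"  # assumed label for a reviewed entry
        self.history.append(Revision(editor, note))

    def reader_warning(self) -> str:
        # Readers of unvetted material are warned explicitly.
        if self.status == "draft":
            return "Draft: not yet vetted; not reliable peer-reviewed scholarship."
        return ""
```

    Keeping the history on the entry itself is what makes the process
    transparent: a reader sees both the current vetting status and who
    changed what.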
    It should be noted, however, that the process being used for SOL, which is
    an annotated translation, would require a number of methodological changes
    to be applied to a scholarly edition of a text with apparatus - translation
    and annotation require very different degrees of attention to philological
    niceties than textual editing does; e.g., an annotator can accept the copy
    text without remarking upon more than the most intriguing possible alternate
    readings.  Distributed, collaborative editing of such a scholarly text would
    require point-by-point assignment of responsibility, analogous to (but
    potentially far more detailed, complete, and informative than) an apparatus
    * I say "'open-source'-like" because we are dealing with a text, not a
    Patrick Rourke

             Date: Sat, 14 Oct 2000 12:07:18 +0100
             From: Randall Pierce <rpierce@jsucc.jsu.edu>
             Subject: e-books
    Any errors in E-books are those of the reader? Are you familiar with the
    Greek concept of hubris? I would hate to see the entire humanist list
    visited by Nemesis (except me, who recognized Nemesis a long time ago,
    and sacrifice something to it, him, her often.) Careful, guys.
    Randall

             Date: Sat, 14 Oct 2000 12:06:22 +0100
             From: "David M. Seaman" <dms8f@etext.lib.Virginia.EDU>
             Subject: Re: 14.0345 XML & proprietary formats
      >>          Date: Fri, 13 Oct 2000 10:51:32 +0100
      >>          From: "Christian Wittern" <wittern@iis.sinica.edu.tw>
      >> Dear Humanists,
      >> The recent announcement from Virginia (14.0318) led me to believe I could
      >> find some fine examples of XML usage, possibly even using the TEI markup on
      >> the announced website.
      >> I was quite disappointed to find out that this is far from being the case.
      >> After much looking around in this vast (but not noisy) digital library, I
      >> could not spot even the smallest example of an XML ebook. The only ebooks I
      >> could find were not in the open standard format XML, but in a format used
      >> by some company in Redmond, whose name I can't remember at the moment,
      >> available for some computers running their own proprietary operating
      >> system.
      >> So, here is my advice to the ebook publishers in Virginia: If you think it
      >> is worth publishing ebooks in XML, or maybe even in the Open Ebook format,
      >> please go ahead and do so!! But please don't announce ebooks in proprietary
      >> formats as a great breakthrough in electronic publishing.
      >> Christian Wittern
      >> Dr. Christian Wittern
      >> Chung-Hwa Institute of Buddhist Studies
      >> 276, Kuang Ming Road, Peitou 112
      >> Taipei, TAIWAN
      >> Tel. +886-2-2892-6111#65, Email chris@ccbs.ntu.edu.tw
    Dear Dr. Wittern:
    	Sorry to disappoint you -- I hope this explanation helps.
    On most of our site -- the English language section, for example, or
    the online Japanese literature -- the TEI source document
    is translated on-the-fly to HTML for web delivery.
    In the new Ebook section, a TEIXLITE file is converted on-the-fly to
    HTML if you click on the choice that says "web version" -- so it is
    accessible to all users regardless of platform.  If you click on the
    "Ebook" choice -- next to the "web version" button -- you get
    the same TEI file converted to OEB, and wrapped up with a CSS
    stylesheet in the Microsoft *.lit ebook format.  This does not happen
    on-the-fly yet.  So, unless you want to try out the new ebook reading
    technology we chose to adopt,  you do not need a Microsoft machine.
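    The single-source approach described above can be sketched roughly as
    follows. This is only an illustration under stated assumptions: the
    sample document, the function names (parse_tei, to_web_html,
    to_ebook_markup), the element mapping, and the stylesheet name
    book.css are all hypothetical, and the second output is a simplified
    XHTML stand-in for OEB, not a *.lit file. It is not the Etext Center's
    actual conversion code.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny TEI-like source document (made-up sample for illustration).
TEI_SOURCE = """<TEI.2>
  <teiHeader><fileDesc><titleStmt><title>Sample Text</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <div1 type="chapter"><head>Chapter One</head>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </div1>
  </body></text>
</TEI.2>"""


def parse_tei(source):
    """Read the title and the (heading, paragraphs) pairs from the source."""
    root = ET.fromstring(source)
    title = root.findtext("teiHeader/fileDesc/titleStmt/title", default="Untitled")
    sections = []
    for div in root.iterfind("text/body/div1"):
        head = div.findtext("head", default="")
        paras = ["".join(p.itertext()) for p in div.iterfind("p")]
        sections.append((head, paras))
    return title, sections


def render_body(sections):
    """Shared body markup, so both outputs come from the same parsed data."""
    parts = []
    for head, paras in sections:
        parts.append(f"<h2>{escape(head)}</h2>")
        parts.extend(f"<p>{escape(p)}</p>" for p in paras)
    return "\n".join(parts)


def to_web_html(title, sections):
    """Output 1: plain HTML for any browser (the "web version")."""
    return (f"<html><head><title>{escape(title)}</title></head>\n"
            f"<body>\n<h1>{escape(title)}</h1>\n{render_body(sections)}\n</body></html>")


def to_ebook_markup(title, sections, stylesheet="book.css"):
    """Output 2: well-formed markup plus a CSS link, as input to an ebook packager."""
    return (f"<html><head><title>{escape(title)}</title>\n"
            f'<link rel="stylesheet" type="text/css" href="{escape(stylesheet)}" /></head>\n'
            f"<body>\n{render_body(sections)}\n</body></html>")


if __name__ == "__main__":
    title, sections = parse_tei(TEI_SOURCE)
    print(to_web_html(title, sections))
    print(to_ebook_markup(title, sections))
```

    Both outputs are derived from one parse of the source, so a correction
    made in the TEI file reaches every delivery format the next time it is
    converted.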
    We chose MS Reader over the other software-based free ebook readers
    mostly because it is XML-based and was easy for us to output to.  We
    create nothing natively in Microsoft's *.lit format -- or in *.html for
    that matter -- they are output formats from a core TEI document.  A
    nice proof of "build once, use many", and a useful argument for TEI and
    	I assume that your irritation is that you cannot download the
    native TEI document (we use TEILITE and more recently TEIXLITE tagging
    in all our locally-created documents).  As on many other digital library
    sites that do the same thing, you get full use of our richer TEI
    tagging when you search the files, or view a dynamically-created Table
    of Contents and choose to look at only a part of a file, but what we
    deliver to the general online library user is HTML output (or *.lit).
    Whenever we have offered a choice in the past between "Get HTML" and
    "Get SGML" (or "GET TEI") with the latter allowing you to download the
    raw teilite code, we get swamped with email saying that our links are
    "broken".  The vast majority of our users, academic or otherwise, have
    no sense of what this choice means -- they click on the GET TEI link and
    get ... well, they get what you and I would know and love as a TEI file,
    but they see gibberish.
    If every web browser were XML-compliant and we could send
    out the TEI and a stylesheet that would be good for us -- lots less
    converting of data on the server side -- but many of our local and
    global users do not have the latest browser.
    If you would like to talk to me off Humanist about our use of TEI I'd be
    delighted to do this -- and our library has an active partnership with
    IATH to produce digital versions of Buddhist materials in our
    collection, so there may be some digital activity here that is
    close to your Center's interest.
    The point does not need to be made to Humanist readers perhaps, but the
    reason I sent on the ebooks announcement to this list in particular was
    to champion TEI as the format to encode data in and to offer another
    "proof of concept" for an interchange to a format other than HTML.  This
    is slowly receiving some attention in the trade e-publishing arena.  Old
    news to us, I guess, although yet another reason to join the TEI
    Consortium (www.tei-c.org).
    David Seaman
    Etext Center, Virginia