9.373 encoding: weights & measures

Humanist (mccarty@phoenix.Princeton.EDU)
Sun, 10 Dec 1995 09:52:13 -0500 (EST)

Humanist Discussion Group, Vol. 9, No. 373.
Center for Electronic Texts in the Humanities (Princeton/Rutgers)

[1] From: Willard McCarty <mccarty@epas.utoronto.ca> (36)
Subject: encoding

Here I would like to raise a matter that is quite independent of the
encoding language (or meta-language) one uses -- at least I think it is. The
question concerns the use of "weights" or degrees of certainty in a tag. I
would like to understand more fully the rationale for using weights.

Certainly the idea that phenomena occur in data to varying degrees of
certainty is hardly new and, I'd suppose, not really debatable. It seems
readily apparent to me that, without the use of weights, encoding for such
phenomena radically falsifies the data, at least in imaginative textual and
visual material, and I'd guess also in music. One is always in the position
of making the binary choice that phenomenon X either occurs in location Y or
it does not. At first it would seem that weights would provide a solution,
but however fine the scale one adopts, attaching a degree of certainty,
presence, or identity rationalizes, and therefore falsifies, the data. So the
problem remains. I would argue that, in any case, unless the weight can be
rigorously computed, always and forever to the satisfaction of anyone who
would treat the data, this problem remains and must be faced.

One answer to this might be that weighting has value for someone who wishes
to record his or her confidence in a judgment -- e.g. "I am 70% certain that
a metaphor of death occurs here". No argument from me, except I wonder what
value doing this might have. I grant that the person who encodes in this way
might well gain from an exhaustive accounting of his or her judgments, but
will anyone else? We get something like this, for example, in a conventional
edition of a literary text, where the editor will decide this and that, then
sometimes record the reasons why, indicate the strength of other readings,
or simply their existence, and so forth. But would all the trouble of encoding
a text magisterially be worth the effort?
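
To make the contrast concrete, the two kinds of tag discussed above might be
sketched as follows. This is only an illustration, independent of any actual
encoding language: the names Tag, certainty and to_binary, the locator
"line 12", and the 0.5 threshold are all hypothetical, chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    """A hypothetical record of one encoding judgment."""
    phenomenon: str          # e.g. "metaphor of death"
    location: str            # hypothetical locator, e.g. "line 12"
    certainty: float = 1.0   # 1.0 is the plain binary claim "X occurs here"

    def __post_init__(self):
        # A weight outside [0, 1] has no meaning as a degree of certainty.
        if not 0.0 <= self.certainty <= 1.0:
            raise ValueError("certainty must lie between 0 and 1")


def to_binary(tag, threshold=0.5):
    """Collapse a weighted judgment into a yes/no choice.

    The threshold is an assumption of this sketch, not a recommendation.
    """
    return tag.certainty >= threshold


# "I am 70% certain that a metaphor of death occurs here"
weighted = Tag("metaphor of death", "line 12", certainty=0.7)

# The first-order binary choice: the phenomenon simply is tagged as present.
binary = Tag("metaphor of death", "line 12")
```

Note that the 0.7 is itself a recorded judgment rather than a computed
quantity, and that any use of it which requires a yes-or-no answer (as in
to_binary) brings back the binary choice.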

As I think I've commented before on Humanist, I find enormous value in
facing the question at the simplest level, that of the first-order binary choice
-- "is phenomenon X here or not?" -- then making my choices and seeing in
the mass of them what reasonable approximations can be made to the textual
realities as I see them. What, then, is the essential difference between
what I do and the use of weights?

Willard McCarty