Humanist Archives: Nov. 8, 2022

        Date: 2022-11-07
        From: Henry Schaffer
        Subject: Over-generalization using corpus linguistics?

I'm researching an area in which there is disagreement as to the meaning(s)
of some words and phrases in historical textual material. I recently read
a published item which digs into two corpora in an attempt to determine the
actual meaning of several phrases.

A term or phrase might have a meaning in general usage - in the newspapers,
diaries, books, legal materials, letters, and other material captured in a
text corpus.

However, it might have a quite different, although perhaps related, meaning
when used in a specialized context such as dealing with legal, statistical,
scientific, military, ..., matters.

One example would be the word "significant". In most cases, as shown by the
first or first several dictionary definitions, it has something to do with
"importance". Corpus linguistics would support this interpretation.

However, in the narrow context of statistics, it means that the p-value
resulting from a statistical test is smaller than some pre-chosen value.
Often a statistically significant result is also an important one. However,
a result can be significant, or even very highly significant, merely
because some rather unimportant effect was explored in a rather large
experiment. This usage would rarely be found in a text corpus. In those
cases, though, claiming importance would be misleading.
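
To make the large-experiment point concrete, here is a rough sketch with
made-up numbers. It assumes a two-sample z-test with equal group sizes and a
known standard deviation; the effect size (0.02 standard deviations), the
0.05 threshold, and the function name are all hypothetical choices for
illustration, not taken from any particular study.

```python
import math


def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a two-sample z-test (equal group sizes, known sd)."""
    # Standard error of the difference between two independent group means.
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


ALPHA = 0.05          # pre-chosen significance threshold (assumed)
SMALL_EFFECT = 0.02   # a practically unimportant difference (assumed)

for n in (100, 10_000, 1_000_000):
    p = two_sided_p_value(SMALL_EFFECT, 1.0, n)
    print(f"n per group = {n:>9,}  p = {p:.3g}  significant: {p < ALPHA}")
```

Under these assumptions the same tiny effect is not significant with 100 or
10,000 observations per group, but becomes significant with a million. The
effect is no more important in the last case; only the sample is larger.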

I'm wondering how often the meaning indicated by the study of a text corpus
is used, intentionally or unintentionally, to misinterpret or distort the
meaning a term has in a specialized context.

--henry schaffer

