5.0452 Query: Word Lists and Copyright (1/44)
Elaine Brennan & Allen Renear (EDITORS@BROWNVM.BITNET)
Fri, 15 Nov 1991 15:29:31 EST
Humanist Discussion Group, Vol. 5, No. 0452. Friday, 15 Nov 1991.
Date: Wed, 13 Nov 1991 10:18 EST
From: Jean Veronis <VERONIS@VASSAR>
Subject: Q: Word lists and Copyright
I have a concrete copyright problem, and I wonder if Humanists have some
thoughts about it.
Years ago I typed in a list of words from a published book, the "Echelle
Dubois-Buyse d'Orthographe Usuelle Francaise" (2nd edition, 1977, OCDL). In its
printed form, the list gives a DIFFICULTY INDEX for spelling for each of 3730
frequent words in French.
As I said, I typed in that list, and added information for each word, such as
PHONETIC TRANSCRIPTION, PART-OF-SPEECH, etc. It occurs to me that this
information could be of some value to other colleagues, but there is obviously
a copyright problem.
My assumption is that what specifically belongs to the "Echelle Dubois-Buyse"
is the association between words and a difficulty index. This is the only
reason why people would want to buy the book, I think. Therefore, I have the
feeling that if I remove the DIFFICULTY INDEX from my data, and I distribute a
file containing for example
    WORD FORM    PHONETIC    PART-OF-SPEECH
I would be safe as far as copyright is concerned.
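As a minimal sketch of the column removal described above, and assuming the
data are kept in a tab-separated file with a header row: the column names
(word, phonetic, pos, difficulty), the function name drop_column, and the
sample rows below are all hypothetical, invented for illustration, and are
not taken from the published list.

```python
import csv
import io


def drop_column(tsv_text, column="difficulty"):
    """Return tab-separated text with one named column removed.

    The default column name "difficulty" is an assumption made for
    this sketch, not the layout of any real file.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [name for name in reader.fieldnames if name != column]

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=kept,
        delimiter="\t",
        extrasaction="ignore",  # silently skip the dropped column
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Invented sample rows; the index values and transcriptions are placeholders.
SAMPLE = (
    "word\tphonetic\tpos\tdifficulty\n"
    "chat\tSa\tnoun\t1\n"
    "maison\tmEzO~\tnoun\t2\n"
)

if __name__ == "__main__":
    print(drop_column(SAMPLE), end="")
```

The sketch only removes the index column. The selection of words is left
unchanged, so it does nothing to settle the question of whether that
selection is itself protected.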
However, I have a slight doubt: people might argue that the SELECTION of those
particular 3730 words has by itself some intellectual value, and is protected
by copyright...
Of course, I could ask the publishers what they think about it, but I think the
problem is more general. It comes up all the time where word lists are concerned.
- Is the list of words in a given dictionary protected by copyright (I mean
just the headwords, with no other information)?
- If I start with a corpus (e.g. the Brown Corpus) and I compute some data (for
example, the frequency list of lemmas), can I distribute this result freely?
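To make the second case concrete, here is a minimal sketch of computing a
lemma frequency list from a tokenized text. It assumes the tokens are already
available and that a lookup table from word form to lemma exists; the function
name lemma_frequencies, the table, and the sample tokens are hypothetical and
do not come from the Brown Corpus or any other real resource.

```python
from collections import Counter


def lemma_frequencies(tokens, lemma_of):
    """Count lemmas in a token sequence, most frequent first.

    lemma_of maps a lower-cased word form to its lemma; forms missing
    from the table are counted under their own lower-cased form.
    """
    counts = Counter(
        lemma_of.get(token.lower(), token.lower()) for token in tokens
    )
    return counts.most_common()


# Hypothetical input: a toy token list and a one-entry lemma table.
TOKENS = ["The", "cats", "sat", "the", "cat"]
LEMMA_OF = {"cats": "cat"}

if __name__ == "__main__":
    for lemma, count in lemma_frequencies(TOKENS, LEMMA_OF):
        print(lemma, count)
```

The output of such a computation contains counts only and none of the running
text, which is what makes its copyright status a separate question from that
of the corpus itself.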
I am sure that this problem has been discussed in appropriate circles, but I am
unaware of any concrete rules governing copyright of materials such as these.