Humanist Discussion Group, Vol. 14, No. 805.
Centre for Computing in the Humanities, King's College London
Date: Fri, 20 Apr 2001 08:39:44 +0100
From: Ken Friedman <firstname.lastname@example.org>
Subject: Access to scholarly and scientific information
Access to scholarly and scientific information was
posted to PhD-Design for a thread proposed by Praveen Nahar.
You are invited to follow the thread by joining PhD-Design.
You may subscribe at the following site:
2001 April 18
Thanks to Praveen Nahar for posting the article on access to journal articles from the Chronicle of Higher Education (Olsen 2001: unpaged) [Reproduced below, exhibit 1]. This debate involves profound issues for scholarly communication in and across all fields.
As a subscriber to The Chronicle of Higher Education, I read this piece last month. The online edition links to the original proposal (Roberts et al. 2001: unpaged) [Reproduced below, exhibit 2], and to the response from Science (Editors [Science] 2001: unpaged) [Reproduced below, exhibit 3].
This debate goes back many years in most fields. It involves the development and shape of scholarly communication in general. It involves the high cost of scholarly journals to libraries and the universities and research centers that maintain them. It involves a host of related issues, including: access to scholarly and scientific literature; the development of knowledge; criteria for tenure and promotion; stability of scholarly and scientific information; management and use of scholarly and scientific information; ensuring the validity and reliability of scholarly and scientific information; and more.
While I do not have the current costs at hand, the university and research libraries of the world now spend something on the order of five billion US dollars a year on journals and scientific subscriptions. What is even more significant is that they also pay for the research that creates the content of the journals to which they subscribe. In addition, they pay for the refereeing and editorial work that examines, prepares, organizes, and creates the content. These costs easily equal or surpass the five billion spent on buying in published form the material that scientists and scholars have already created and edited. In the technology of an earlier day, the demands of printing, publishing, and distribution were such that it made sense for organizations other than universities to undertake them.
Computers and the worldwide web have changed much of this. Today, manuscripts are prepared on computer. This means, for all practical purposes, that authors typeset their own articles, editors proofread, and other kinds of specialists undertake work that would once have been undertaken in a print shop using lock-up type or running hot lead. The new opportunities made possible by the technology of the worldwide web mean that universities are essentially buying the right to use material that they have already paid to produce.
At the same time, it must be noted that publishing itself - aside from the cost of paper and postage - involves many steps provided by the publishing companies, and not by the universities. These are labor intensive, capital intensive, and they cannot be undertaken without funding.
The greatest expense is labor. In the 1970s and 1980s, I was involved in a number of projects that involved different kinds of scholarly documentation and publishing, some of which involved storage and access through the developing electronic media as well as through microfilm and print-on-demand. While there was no worldwide web, there were nascent electronic networks that were part of what then constituted the Internet. I was also deeply involved in reference book development and scholarly or scientific referencing, and it was obvious that meeting the challenges of these fields while making content available required sophisticated and costly solutions.
By the 1990s, much of this had changed, but not all. In 1994, I floated a proposal (Friedman 1994) for a web-based university press that in some regards resembled the preprint servers used in physics and other disciplines, but aimed, instead, at rendering accessible completed documents, out of print materials, and other published items. We had no luck developing what I labeled "European University Press." The labor requirements and costs soon came to seem insurmountable: acquisition, editorial, manuscript preparation, and then the continual need for software management and information structuring of the web site itself all ran into vast sums. This seemed so insurmountable to those whom we invited to join in a feasibility study that we were unable to get enough partners even to study the possibility.
These issues have continued to interest me in different ways (Friedman 1995, 1996, 1998). At every turn, one central problem becomes apparent: relatively few individual scholars are willing to do the work required for developing and maintaining a long-term access system to the scholarly and scientific literature. In fact, relatively few scholars are willing to undertake on a long-term basis the work required for the editorial development of today's scholarly and scientific literature. Editorial positions and referees routinely rotate specifically because people lose interest and energy.
There is a complication here. At the start of our new millennium, the presidents, deans, and department heads that decide on hiring, promotion, tenure, sabbaticals, merit increase and all the rest place great faith in the value of journal publication as a measure of scholarly and scientific production. They do so with good reason. The journal system has survived now for four centuries. A proven system, it has served the interests of science and scholarship in important ways. Even though some universities now credit publication in extended media, many academic administrators continue to demonstrate skepticism toward formats other than the traditional paper journal. This, in turn, gives the scholars and scientists who work at universities a powerful incentive to publish in paper journals and a disincentive to publish in other media.
Several research-centered design schools are exploring the development of online databases and resource centers that would meet some of the criteria of the proposals in the Science debate. So far, none have launched. All face the same kinds of development, funding and management problems we faced in 1994. The past decade has seen sweeping strides in technology, but time, quality, and funding issues remain much the same.
There is much merit in the current proposal (Roberts et al. 2001) and in the response from Science (The Editors 2001). Science makes available an important debate in their online edition, and this debate is available free over the Web. Go to the URL below and follow the instructions.
Those who are interested in getting up to speed on this debate will want to read the book Scholarly Publishing: The Electronic Frontier. In this important contribution, Peek and Newby (1996) present summary views and empirical data to the mid-90s. It is interesting to see how relevant these issues remain, despite the technological progress since this book was completed. This book includes chapters from some of the central figures in the debate, including Rob Kling, Andrew Odlyzko, and Stevan Harnad.
Four other books will fill in important background issues for those who are deeply interested. Daniel Bell's (1999) 1973 classic, The Coming of Post-industrial Society: A Venture in Social Forecasting, sets a background for many of these issues, touching the challenges that will later shape the information society as a whole. The 1999 reprint includes an important new contribution by the author.
Olaisen, Munch-Pedersen, and Wilson (1996) present authors who examine many of the problems - and possible solutions - involved in this challenge. These include structure and classification challenges, search issues, and, most important, the difficulties of information overload.
Compaine and Read (1999) and Lamberton (1996) are anthologies that contain many useful and relevant articles. While these are organized around information policy and information economics respectively, these issues are inextricably linked to the questions in the current debate. It is interesting to note that the first edition of Lamberton's book appeared a quarter century ago - and that many of the issues and challenges remain current enough today that some of the original articles are still relevant.
The challenge to journals offered by Roberts et al. (2001) follows on an immense amount of thinking and action during the past twenty-five years. The birth and growth of the worldwide web during the past decade makes these issues especially timely.
The state of scholarly and scientific communication in design research is improving, but it often remains fragile. This makes this debate particularly relevant to our field.
I'll post my own response to Praveen Nahar's question next week. At this time, I thought it would be helpful to share some of the important background materials, together with the items from Science.
Bell, Daniel. 1999. The Coming of Post-industrial Society. A Venture in Social Forecasting. New York: Basic Books.
Compaine, Benjamin M. and William H. Read. 1999. The Information Resources Policy Handbook. Research for the Information Age. Cambridge, Massachusetts: MIT Press.
The Editors [Science]. 2001. "Science's Response. Is a Government Archive the Best Option?" Science, Vol. 291, No. 5512, Issue 23 Mar 2001: 2318-2319. [Available online at URL <http://www.sciencemag.org/cgi/content/full/291/5512/2318b>. Accessed 2001 March 27.]
Friedman, Ken. 1994. A University Press for the Worldwide Internet. Preliminary Notes. Research proposal of 1994 October 4.
Friedman, Ken. 1995. Books in the Age of On-Line Information: Will We Read More or Fewer Books? Statistical Summary and Preliminary Conclusions. American Association for Higher Education Technology Reports. Washington, D.C.: AAHE.
Friedman, Ken. 1996. "Individual Knowledge in the Information Society." In Information Science: From the Development of the Discipline to Social Interaction. Johan Olaisen, Erland Munch-Pedersen and Patrick Wilson, editors. Oslo: Scandinavian University Press, 245-276.
Friedman, Ken. 1998. "Cities in the Information Age: A Scandinavian Perspective." In The Virtual Workplace. Magid Igbaria and Margaret Tan, eds. Hershey, Pennsylvania: Idea Group Publishing, 144-176.
Lamberton, Donald M. 1996. The Economics of Communication and Information. Cheltenham, UK: Elgar.
Nahar, Praveen. 2001. "Scholars Urge a Boycott of Journals That Won't Release Articles to Free Archives (fwd)." PhD-Design. Date: Wed, 18 Apr 2001 06:53:19 +0530.
Olaisen, Johan, Erland Munch-Pedersen and Patrick Wilson, editors. 1996. Information Science: From the Development of the Discipline to Social Interaction. Oslo: Scandinavian University Press.
Olsen, Florence. 2001. "Scholars Urge a Boycott of Journals That Won't Release Articles to Free Archives." The Chronicle of Higher Education. Online Edition. Monday, March 26, 2001. URL: <http://chronicle.com/free/2001/03/2001032601t.htm>. Accessed 2001 March 27.
Peek, Robin P., and Gregory B. Newby. 1996. Scholarly Publishing. The Electronic Frontier. Cambridge, Massachusetts: MIT Press.
Roberts, Richard J., Harold E. Varmus, Michael Ashburner, Patrick O. Brown, Michael B. Eisen, Chaitan Khosla, Marc Kirschner, Roel Nusse, Matthew Scott, Barbara Wold. 2001. "Building A 'GenBank' of the Published Literature." Science, Vol. 291, No. 5512, Issue 23 Mar 2001: 2318-2319. [Available online at URL <http://www.sciencemag.org/cgi/content/full/291/5512/2318a>. Accessed 2001 March 27.]
Monday, March 26, 2001
Scholars Urge a Boycott of Journals That Won't Release Articles to Free Archives
By FLORENCE OLSEN
Several prominent scholars, including Harold E. Varmus, the former director of the National Institutes of Health, are urging a boycott of scientific and scholarly journals that refuse to make articles accessible online -- free -- soon after their publication.
The scholars also are making a demand that some publishers say is even more challenging: that the publishers place their content in independent repositories on the Web six months after a journal issue has appeared in print.
In an essay in Science, the scholars urge their peers not to submit papers to, write reviews for, or subscribe to journals that ignore the scholars' demand. The scholars argue that it would be relatively simple and inexpensive for journals to participate in open-archives projects, such as the National Library of Medicine's PubMed Central and Stanford University's HighWire Press.
HighWire archives more than 230 journals in biology, physics, and other sciences. The PubMed Central archive, which Mr. Varmus promoted while he was at N.I.H., currently has about a dozen biology journals.
"As scientists," the scholars argue, "we are particularly dependent on ready and unimpeded access to our published literature, the only permanent record of our ideas, discoveries, and research results, upon which future scientific activity and progress are based."
But in an editorial in the same issue, Science's editors say the scholars' proposal would put nonprofit, scholarly publishers at risk because it would "reroute an economically important source of online traffic for journals that offer content and other products on their sites."
Science announced, however, that it would go part way toward meeting scholars' desire for a free digital archive of life-science research. The journal has agreed to make its reports and articles available free on its Web site 12 months after the print issues are published. The journal is published by the American Association for the Advancement of Science, a nonprofit group.
Partly because of the controversy, The Journal of Cell Biology has made its contents free on its Web site, without password or access controls, six months after each issue's publication.
What makes the central-repository idea so distasteful for some journal editors is that articles would have to be converted from each journal's own format to whatever format the repository uses. Reformatting poses technical challenges and usually requires that every article be painstakingly checked for formatting errors.
The Science authors propose the GenBank, for DNA sequences, as their model for a centralized repository of life-science literature. But journal publishers who disagree say the GenBank material poses far fewer formatting problems than scientific literature.
William Wells, the news editor of The Journal of Cell Biology, notes that HighWire has staff members to handle each journal that it publishes. Those people act as liaisons, fixing the myriad translation problems that come up with each print journal.
"One of the questions we have is, Is PubMed Central willing to put in that sort of manpower?" says Mr. Wells. "Are they going to compensate us for the amount of time they'll be calling us up saying, What's this new special character? It's not very interesting stuff to talk about, but it's the practicalities," he says.
"Presumably, a lot of this can be overcome," adds Mr. Wells. "But we don't think it's necessary to overcome it because there's a much simpler way to do it, without having this huge centralized apparatus."
The Journal of Cell Biology issued a statement by Ira Mellman, the editor in chief, saying that centralized repositories would soon be unnecessary: "The ability to search across thousands of servers, as long as those servers do not have access controls, is the very reason that the Web is such a powerful tool."
Copyright 2001 by The Chronicle of Higher Education
Building A "GenBank" of the Published Literature
Richard J. Roberts,* Harold E. Varmus, Michael Ashburner, Patrick O. Brown, Michael B. Eisen, Chaitan Khosla, Marc Kirschner, Roel Nusse, Matthew Scott, Barbara Wold
Since the time of the great library of Alexandria, scholars have recognized the value of central repositories of knowledge. As scientists, we are particularly dependent on ready and unimpeded access to our published literature, the only permanent record of our ideas, discoveries, and research results, upon which future scientific activity and progress are based. The growth of the Internet is changing the way we access this literature, as more scientific journals produce online editions to supplement or replace printed versions. We urge journal publishers, their editors, and all working scientists to join together to create public, electronic archives of the scientific literature, containing complete copies of all published scientific papers.
Anyone who has spent time in a library searching for a key paper, result, or method will immediately see one of the benefits of comprehensive repositories. Those gems of information that are often buried within papers, but are not referred to in the abstract or keywords, will become readily retrievable. You will be able to locate descriptions of methods or find the original data that underlie crucial conclusions. You will be able to trace connections between observations originally scattered among many papers in different journals and databases. However, the value of central archives goes well beyond facilitated searching and retrieval. Bringing all of the scientific literature together in a common format will encourage the development of new, more sophisticated, and valuable ways of using this information, much as GenBank has done for DNA sequences.
Some have argued that central repositories are of no additional value because many journals already make their online contents freely available after some delay through their own Web sites. However, material that is freely accessible, on a controlled basis, one paper at a time, at a journal's Web site differs from material that is freely accessible in a single comprehensive collection. The latter can be efficiently indexed, searched, and linked to, whereas the former cannot. Imagine how much less useful DNA sequences would be if instead of GenBank and other global repositories, we had dozens of smaller sequence collections that could only be accessed one at a time through a genome center's Web site. Only by creating repositories with uniform, explicitly defined, and structured formats, can a dynamic digital archive of life science research literature become possible. Unimpeded access to these archives and open distribution of their contents will enable researchers to take on the challenge of integrating and interconnecting the fantastically rich, but extremely fragmented and chaotic, scientific literature.
To ensure that complete public scientific archives become a fully workable reality, the necessary infrastructure must be constructed. The National Institutes of Health has taken an important step by creating PubMed Central (PMC) (1) with the goal of storing the life sciences literature in digital form and providing free and convenient access, linked to the popular bibliographical database, PubMed. We envision PMC as only the first of many public archives. However, such archives will not realize their potential until they are populated. This requires that journal publishers allow their digital content to be distributed and used through online public archives. Several journals, including the Proceedings of the National Academy of Sciences, the British Medical Journal, Nucleic Acids Research, Molecular Biology of the Cell, and the BioMed Central (2) journals, have already agreed to deposit their content with PMC, following, at most, a short delay after print publication. Publishers now have a wonderful opportunity to reinforce their partnership with the scientific community by supporting extant archives like PMC and by allowing archival material to be freely used and distributed, and we strongly urge them to do so. It would be natural and simple for journals that have already decided to make their back issues freely accessible at their own Web sites to make the same content available in electronic archives. The costs of participating in open archives would be minimal and would be more than offset by the benefits their participation would bring to the scientific community.
Historically, publishers have left the job of archiving to the libraries. Library archives have become more accessible as we have moved from indexed abstract books to rapidly updated online abstract searching tools. Public online archives should be viewed as the logical continuation of this tradition and, thus, as a complement to the publisher's normal activities. For electronic archives to assume this role fully, decades of volumes that currently exist only in printed form will need to be digitized. We do not expect journals to bear the cost of the digital conversion of their printed archives. Indeed, efforts to raise the necessary funds are under way, so that digital conversion of archival volumes can proceed rapidly.
It is important not only that PMC succeed, but also that other institutions be encouraged to provide independent online sites for the distribution and use of the same comprehensive archives. Multiple independent online sites will help ensure ready access for users around the world and will guarantee that no single government or institution can control access to our common scientific heritage. This diversity will also foster innovation in the ways the material in the archives is used.
We feel that if journal editors and publishers were to poll their authors and readers, they would find overwhelming support for such archives. The strength of this support is demonstrated by the growing list of scientists who have signed an open letter (3) advocating free and unrestricted distribution of scientific literature 6 months after publication. We urge our colleagues, especially students and the younger members of the scientific community, to make your views heard. If these efforts are successful, in 10 years, everyone's ability to do science will have been greatly enriched, and we will all wonder how it was possible to work without such archives.
References and Notes
1. www.pubmedcentral.nih.gov
2. www.biomedcentral.com
3. www.publiclibraryofscience.org
R. J. Roberts, New England Biolabs, Beverly, MA 01915, USA. H. E. Varmus, Memorial Sloan-Kettering Cancer Center, New York, NY 10021, USA. M. Ashburner, University of Cambridge, CB2 3EH, UK, and EMBL-European Bioinformatics Institute, Cambridge, CB10 1SD, UK. P. O. Brown, Stanford University School of Medicine, Stanford, CA 94305, USA. M. B. Eisen, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, and University of California, Berkeley, CA 94720, USA. C. Khosla, Stanford University, Stanford, CA 94305, USA. M. Kirschner, Harvard Medical School, Boston, MA 02115, USA. R. Nusse and M. Scott, Stanford University School of Medicine. B. Wold, Biology Division, California Institute of Technology, Pasadena, CA 91125, USA.
*To whom correspondence should be addressed. E-mail: email@example.com
Howard Hughes Medical Institute investigator.
Volume 291, Number 5512, Issue of 23 Mar 2001, pp. 2318-2319. Copyright 2001 by The American Association for the Advancement of Science.
Is a Government Archive the Best Option?
The Editors
Rich Roberts and his colleagues have constructed a thoughtful argument for an online archive of published science. A seamless way of getting access to the scientific literature is an objective many scientists have sought, and the version outlined in the Roberts piece is being pursued with vigor and understandable passion by its advocates. We admire the goal, and suspect that evolutionary forces may be moving us toward it. We have decided to make our own back research reports and articles freely available after 12 months--at our own Web site--later this year.
The specific proposal of Roberts et al. goes further. It urges our readers to sign a petition that "advocates the free and unrestricted distribution of scientific literature 6 months after publication." Actually, the petition does quite a bit more than that. It urges an economic boycott: signers agree not to submit papers to, review for, or subscribe to journals that do not submit to the petition's proposals. To begin a conversation among scholars with a threat of economic boycott is unfortunate.
However, we would rather focus on the qualities that Roberts et al. believe are essential to the archive they advocate. It should include all scientific papers and the content should be in a common format that allows for advanced search capabilities. Content should be free and "open distribution" should be allowed. PubMed Central (PMC) is given as the model of an archive that will meet these criteria. We believe other alternatives exist that can meet most of these goals faster and more effectively without putting nonprofit scholarly publishing at risk.
There already are multiple-journal sites--for example, the nonprofit HighWire Press (HWP), which archives over 230 journals, including biological, physical and interdisciplinary papers. More than 200,000 articles are freely available at this site. By comparison, there are only about a dozen journals at PMC, limited currently to biology.
Advocates of PMC argue that sites in which each journal is archived separately are insufficiently integrated. But searching across multi-journal, full-text repositories is already possible at sites such as HWP. In addition, 60% of this content is in a common format already. Why not begin with the already populated venue and add the integration, rather than the other way around? Why not use taxpayer dollars to promote innovative search technologies that do not require taking control of services provided by the private sector?
The proposition of Roberts et al. raises problems for Science, and for other journals. First, it will reroute an economically important source of online traffic for journals that offer content and other products on their sites. Second, unlimited redistribution of content could lead to misuse of content and loss of quality control. Third, it may expose users to risks historically associated with monopoly suppliers. For example, recently PubMed--on which PMC will depend--unexpectedly failed to process new content for over a month, inconveniencing authors and publishers.
We also wonder whether enough attention has been given to some of the economic issues. Experience shows that demand for scientific papers drops to about 1/10th within 4 to 5 months, but then continues at a low level for years. We plan to track our experience with free back issues carefully, but in the meanwhile, we take little comfort from the assurance that "costs of participation in open archives will be minimal." Subscription and advertising revenue will be at some risk and transferring primary access to someone else's site may expose us to further losses. The value we add--through peer review, perspective and context-setting analysis of research, and good news coverage--requires revenue support from advertising. Moreover, Science supports other activities of AAAS--including science and public policy, kindergarten through 12th-grade education, a career-mentoring Web site for young scientists, and innovative "knowledge environments." These benefit scientists from all fields. Posting our back content on a site that primarily serves biomedical scientists would confer a benefit on one group by taking benefits away from another--creating, in effect, a transfer payment from the sciences in general to biology in particular. That bothers us.
We worry, too, about another group of journals that will be entering a riskier environment. Our association is an umbrella organization, including many specialized scientific societies as affiliates. Their more focused journals must remain viable to ensure continued publishing options in highly specialized fields and for younger scientists. In most cases, academic library subscriptions provide the economic "floor" that guarantees financial sustainability. If papers from specialized journals were to become available on the PMC site, budget-conscious library directors would be tempted to cancel subscriptions. Some of the signers of the petition are scientists who belong to those very societies. Have they considered that their initiative will put PMC in competition with their own journals? When tax-exempt organizations go into competition with commercial entities they must pay unrelated-business income tax. When tax-supported organizations compete with commercial entities and nonprofits, the public has usually raised strong objections.
There are also questions about whether the proposed location for PMC--the National Library of Medicine, part of the National Institutes of Health--is the right one. NIH already sponsors, through its extramural programs, much of the biomedical research PMC will archive. It regulates the conduct of that research, controls much of the training of the next generation of researchers, and archives primary data. It now proposes that the results of the research it funds be given over by publishers and authors to a server subject to its exclusive control. The Congress or the President can eliminate support for certain kinds of science and have done so in the past. Would PMC then be able to archive papers on those subjects? Concentrating this kind of womb-to-tomb control in a single federal agency has risks, and we should ask whether we are entirely comfortable with a state-run, centrally managed economy in biomedicine.
Proponents of this plan include scientists of high reputation: Nobel laureates, leaders of institutions, and others whom we all admire. Nonetheless, we think its potential consequences require careful analysis and policy debate. We at Science are determined to participate in a constructive spirit.
Volume 291, Number 5512, Issue of 23 Mar 2001, pp. 2318-2319. Copyright 2001 by The American Association for the Advancement of Science.
Ken Friedman, Ph.D. Associate Professor of Leadership and Strategic Design Department of Knowledge Management Norwegian School of Management
Visiting Professor Advanced Research Institute School of Art and Design Staffordshire University
+47 22.98.50.00 Telephone +47 184.108.40.206 Telefax
Byvaegen 13 S-24012 Torna Haellestad Sweden
+46 (46) 53.245 Telephone +46 (46) 53.345 Telefax