16.555 Harnad on Lynch on institutional archives

From: Humanist Discussion Group (by way of Willard McCarty) (willard@mccarty.me.uk)
Date: Sun Mar 16 2003 - 03:39:20 EST

                   Humanist Discussion Group, Vol. 16, No. 555.
           Centre for Computing in the Humanities, King's College London
                         Submit to: humanist@princeton.edu

             Date: Sun, 16 Mar 2003 08:36:34 +0000
             From: Stevan Harnad <harnad@ECS.SOTON.AC.UK>
             Subject: Cliff Lynch on Institutional Archives

    Quote/Comments on:

          Clifford A. Lynch: "Institutional Repositories:
          Essential Infrastructure for Scholarship in the Digital Age"

    Cliff Lynch makes many very good points. I disagree with him only on one
    point, but it is a fundamental one, with important practical and
    strategic implications for the immediate future: What is the most pressing
    reason for creating and filling institutional repositories at this
    time? Cliff thinks it is to promote new forms of scholarship whereas
    I think it is to promote refereed research. The new scholarship
    is coming too, and will certainly grow in importance, but the immediate
    rationale for creating and filling institutional repositories is for the
    self-archiving of institutional research output, in order to maximize
    its research impact, by maximizing user access to it, through open access:

    > faculty have been exploring ways in which works of authorship in the new
    > digital medium can enhance teaching and learning and the communication
    > of scholarship

    This is the familiar and valid complaint that the university has not
    been sufficiently supportive of online innovations by faculty, either
    in resourcing them or in rewarding them. This is true, it is indeed a
    problem, and it is no doubt slowing innovation. But it is also being
    remedied, by increasing recognition and support, and by the
    persistence of innovative faculty. It is *not* the reason universities
    need digital repositories urgently at this time, and this is *not* the
    (main) content that will fill them.

    > faculty have exploited the Net as a vehicle for sharing their ideas
    > worldwide, whether these ideas are expressed in relatively familiar
    > forms such as digital versions of traditional journal articles or (less
    > commonly) in entirely new forms...

    This is a combination of the two kinds of content that are at issue
    here. I am putting the primary emphasis on the "familiar forms" rather
    than the new ones (important and valuable though they too are). The
    progress, productivity and funding of scholarly and scientific research
    depend directly on its visibility and accessibility: the degree to which
    it is found, seen, read, used, cited, applied, built upon by other
    researchers. In a word, it all depends on *research impact.* And research
    impact depends on research access. Whatever blocks access blocks impact.

    There are 20,000 peer-reviewed research journals, across all disciplines
    worldwide, publishing 2,000,000 articles annually. Almost all of these
    articles are accessible to researchers (i.e., to their potential users)
    only if their institution can afford the toll-access (subscription,
    license) to the journal in which they were published. And most
    universities cannot afford toll-access to most journals -- even the
    richest can only afford a minority of the 20,000. This means that *all*
    research on the planet is inaccessible to *most* of its potential
    users. And every single case of access-denial is a case of potential
    impact loss. The overwhelming, pressing rationale for institutional
    repositories is accordingly: to put an end to this daily impact loss --
    a legacy of the paper era when the true costs of paper access made it
    unavoidable, but no longer necessary in the online era, when open access
    can be provided by institutions for their own refereed research output.

    It is quite natural for researchers to self-archive their own refereed
    research output in their own institutional archives, giving it away to
    all of its would-be users worldwide for free, in order to maximize its
    research impact, for they have been giving it away free to their
    publishers for the very same reason throughout the paper era: Unlike all
    other authors, researchers have always given away their work, written
    only for impact, not for royalty revenue from toll-income. Hence it is
    only natural that now that it has become possible to do so, they should
    self-archive it in their own institutional archives so as to put an end
    to the needless daily impact loss that is a legacy of the paper era.

    This -- and not new forms of scholarship -- is the immediate, pressing
    rationale for creating and filling institutional repositories at this
    time. And this (refereed research output) is the content with which they
    need to be filled, as soon as possible. With it -- and their newfound
    role as *outgoing* collections of a university's own research output
    instead of *incoming* collections of the output of other universities --
    the institutional archives will also become the repositories for new
    forms of scholarship. But the first and most urgent step is to put an
    end to the needless daily impact loss for peer-reviewed research.

    What about the peer-reviewed journals? Their toll-access mechanism of
    cost-recovery may continue to co-exist with the open-access versions in
    the institutional repositories, with those researchers whose institutions
    can afford it using the former and those who cannot using the latter
    -- or the journals may eventually have to cut costs and downsize to
    the essentials in the online era, which may well prove to be just
    peer-review service-provision alone, with the access, storage and
    distribution offloaded onto the institutional repositories.

    Peer-review only costs about $500 per outgoing paper, whereas
    those institutions who can afford it are paying an average of $2000
    (collectively) per incoming paper in access-tolls -- in exchange for
    the very limited access this provides, restricted to the minority who
    can afford it.

    > faculty are well motivated to rise above the institutional failures to
    > help them disseminate their works

    Indeed they are, in the service of maximizing their research impact and
    putting an end to its needless loss. But maximizing research impact is
    in the interest of their institutions too, as the benefits of research
    impact (research funding, prizes, prestige) are shared by faculty and
    their institutions.

    Let me count the three most obvious ways that the self-archiving of
    institutional research output benefits researchers' institutions:

    (1) Open access to an institution's research output maximizes its
    impact and its rewards, as noted.

    (2) Open access, being reciprocal if practised by other institutions too,
    maximizes faculty access to the research output of *other* institutions,
    generating better-informed and more current research (using the research
    output of others, as you would have them use yours!).

    (3) If/when there is ever an eventual downsizing of peer-reviewed
    journals to the remaining online-age essentials (probably only peer
    review itself), then there is also the prospect of eventual institutional
    windfall savings of up to 75% on serials budgets.
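
    As a rough check on the arithmetic behind (3), the sketch below uses
    only the two per-paper figures quoted above ($500 for peer review,
    $2000 in access-tolls). The function and variable names are
    illustrative assumptions, and the result is an upper bound under those
    figures, not a budget forecast.

```python
# Per-paper figures quoted in the message (US dollars).
PEER_REVIEW_COST_PER_PAPER = 500   # "about $500 per outgoing paper"
TOLL_COST_PER_PAPER = 2000         # "an average of $2000 ... per incoming paper"


def savings_fraction(toll_cost: float, peer_review_cost: float) -> float:
    """Fraction of current toll spending saved if only peer review is paid for."""
    if toll_cost <= 0:
        raise ValueError("toll_cost must be positive")
    return (toll_cost - peer_review_cost) / toll_cost


fraction = savings_fraction(TOLL_COST_PER_PAPER, PEER_REVIEW_COST_PER_PAPER)
print(f"Potential saving per paper: {fraction:.0%}")  # Potential saving per paper: 75%
```

    The "up to" matters: the 75% only materializes if journals really do
    downsize to peer review alone.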

    > a faculty member seeking... broader dissemination and availability of
    > his or her traditional journal articles...faces several time-consuming
    > problems... [F]aculty time is being wasted, and expended ineffectively,
    > on system administration activities and content curation.

    Cliff here means the time-consuming problem of maintaining a website for
    self-archiving one's own research output. An institutional archive
    is certainly a more sensible solution than having each researcher
    maintain his own archive.

    > Institutional repositories can maintain data in addition to authored
    > scholarly works. In this sense, the institutional repository is a
    > complement and a supplement, rather than a substitute, for traditional
    > scholarly publication venues.

    Not only is the institutional archive a supplement rather than a
    substitute when it self-archives data that could not be included with
    the published article, but it is a supplement even when it self-archives
    the article: The self-archived open-access version is a supplement to the
    journal's toll-access version, to maximize its research impact. It is not
    a substitute for journal publication -- and certainly not a substitute
    for peer review -- though it might one day become a substitute for
    toll-access (for those who can afford it: for those who cannot, it
    is already a substitute today!).

    > where the disciplinary practice is ready, institutional repositories can
    > feed disciplinary repositories directly. In cases where the disciplinary
    > culture is more conservative, where scholarly societies or key journals
    > choose to hold back change, institutional repositories can help
    > individual faculty take the lead in initiating shifts in disciplinary
    > practice.

    There is no need -- in the age of OAI-interoperability -- for
    institutional archives to "feed" central disciplinary archives: They
    need only feed OAI metadata harvesters. The institution is the natural
    locus for self-archiving its own research output, for each of
    its disciplines. And it is individual researchers, not disciplines,
    who will overcome the old habits, with the incentive to self-archive
    coming from the discipline-universal benefits of maximizing research
    impact. These benefits are shared by researchers and their institutions,
    not by researchers and their disciplines (which are more of a locus
    for *competing* for impact than for *sharing* it!). And journals are not
    holding back change (and cannot): They are themselves changing with the
    new possibilities the online medium has provided to allow researchers to
    maximize their research impact:

    But it is certainly true that university archives can help faculty take
    the lead by providing the resources and policy that facilitates

    > Institutional repositories can encourage the exploration and adoption of
    > new forms of scholarly communication... This, to me, is perhaps the most
    > important and exciting payoff

    Here is where Cliff and I disagree. Exciting as they are, the new forms
    are not the immediate priority: Open access to the "old forms" is. Then
    the new forms will come too. But first the full research impact of the
    old forms, at last. They will pave the way for the rest.

    > The first potential danger is that institutional repositories are cast
    > as tools of institutional (administrative) strategies to exercise
    > control over what has typically been faculty controlled intellectual
    > work. I believe that any institutional repository approach that requires
    > deposit of faculty or student works and/or uses the institutional
    > repository as a means of asserting control or ownership over these works
    > will likely fail, and probably deserves to fail... This is not to say
    > that policies mandating the deposit of materials that are broadly
    > recognized as part of the institutional record ... are inappropriate.

    I agree completely. The purpose of institutional archives and
    archive-filling policies is not to assert control or ownership over
    faculty research output! It is to maximize its research impact by
    maximizing user access to it.

    Mixing up the open-access agenda with other university dreams about
    generating new revenue streams from faculty intellectual output (software,
    patents, courseware, distance education, electronic publishing) is not
    only wrong-headed, but it risks delaying the real and sizeable benefits
    of open access to refereed research output, turning the institutional
    repository movement into aimless gridlock for some time to come.

    > My second concern is... [that] administrators, librarians, and faculty
    > members wishing to challenge existing systems of scholarly publishing
    > (specifically their economic models and their creation of barriers to
    > access through intellectual property control and licensing arrangements)
    > may try to link their efforts too directly to institutional repositories
    > by imposing inappropriate policy constraints

    I agree. See above. And here is a model for an appropriate policy:

    > it dramatically underestimates the importance of institutional
    > repositories to characterize them as instruments for restructuring the
    > current economics of scholarly publishing

    I agree again. It is not the business of universities to restructure the
    economics of scholarly publishing. It is the business of universities to
    do research, publish their findings, and make sure that those findings are
    put to full use. Maximizing all would-be users' access to them is the
    way to ensure the latter. And that might (but just might) eventually
    have some effects on the economics of refereed journal publication. But
    that would only be a side-effect, not the direct motivation or
    justification at all: That direct motivation and justification is
    to maximize the impact of institutional research output by making it
    open-access -- by self-archiving it in the institutional repository.

    > the institutional repository isn't a journal, or a collection of
    > journals, and should not be managed like one. That's not the point or
    > the purpose of an institutional repository.

    Correct. It is an open-access supplement to toll-access via the journals.

    > Institutional repositories are not a challenge or alternative to
    > disciplinary repositories; rather, they complement them, just as they
    > can complement existing venues of scholarly publication.

    In the era of OAI, institutional and disciplinary archives are equivalent,
    because completely interoperable. However, the shared interest of
    researchers and their institutions in maximizing the impact of their
    research output makes institutional archives a better bet for hastening
    open access, especially as institutions are in a position to modify their
    existing publish/perish policies so as to mandate self-archiving in
    order to maximize research impact.
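
    To illustrate what OAI-interoperability means in practice, here is a
    minimal sketch of a harvester treating an institutional and a
    disciplinary archive identically. It assumes the OAI-PMH 2.0
    ListRecords verb with the oai_dc metadata format; the base URL,
    identifiers and titles are hypothetical, and the responses are inlined
    samples rather than live network calls.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for any compliant archive."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def sample_response(identifier: str, title: str) -> str:
    """Return a trimmed, hypothetical ListRecords response (not a real archive's)."""
    return f"""<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>{identifier}</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>{title}</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest(response_xml: str) -> list[tuple[str, str]]:
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# The same code path serves both kinds of archive (hypothetical identifiers).
institutional = harvest(sample_response("oai:eprints.univ.example:42", "A refereed article"))
disciplinary = harvest(sample_response("oai:physics.example:7", "Another refereed article"))
merged = institutional + disciplinary
for identifier, title in merged:
    print(identifier, "|", title)
```

    Nothing in the harvester depends on whether the records came from an
    institution or a discipline, which is the sense in which the two kinds
    of archive are equivalent.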

    > It is desirable to make this as simple as possible... with a simple and
    > stable submission interface to the institutional repository.

    The simple solution is available already: See the 60+ Eprints.org
    institutional archives http://software.eprints.org/#ep2
    in use for over 2 years and growing:

    The challenging part is not in creating the free self-archiving software,
    nor in making it simple, nor in getting it adopted, but in getting
    the archives filled, which requires a clear, coherent institutional
    self-archiving policy -- with a clear sense of *what* needs to be
    self-archived, *how* and *why*:

    > It's vital that institutions recognize institutional repositories as a
    > serious and long-lasting commitment to the campus community (and to the
    > scholarly world, and the public at large)

    Yes, but *far* more important than this advance long-lasting commitment
    to an empty archive is a coherent policy for getting it filled!

    > An institutional repository can fail over time for many reasons: policy
    > (for example, the institution chooses to stop funding it), management
    > failure or incompetence, or technical problems. Any of these failures
    > can result in the disruption of access...I worry a great deal about what
    > the various impacts and implications of the first few major failures of
    > institutional repositories

    And I worry a great deal about worries about the permanence of empty
    or even non-existent archives, instead of directing all energies and
    resourcefulness to filling the archives! Get the precious intellectual
    eggs into the basket, and their very presence there will be the best
    guarantor that they will be maintained in perpetuum. Worry instead
    about permanence now and all you do is add another item to the
    long list of needless worries that are holding back self-archiving:

    And this is also the point to remind ourselves, again, that
    self-archiving is a *supplement* to, not a *substitute* for journal
    publication. Until and unless there is a transition and downsizing
    from toll-access journal publication to open-access journal publication,
    the primary preservation burden is not on the institutional archives!
    Their burden is merely to provide open access to the research, now, as a
    supplement for those who cannot afford toll-access.

    So stop worrying about archives failing and work instead on getting
    archives filled!

    > Not every higher education institution will need or want to run an
    > institutional repository, though I think ultimately almost every such
    > institution will want to offer some institutional repository services to
    > its community. We will see various forms of consortial or cluster
    > institutional repositories.

    Maybe. But it seems to me that this is only a substantive question if we
    are talking about the industrial strength archive software such as
    DSpace. For the "light" software packages such as Eprints, there is so little
    start-up time and maintenance required that I would think any
    institution that generated research output could and would run its own.
    (Again, there is not enough *content* yet to talk about fancy consortial
    schemes! Let's get the culture of self-archiving rolling before we worry
    about the load being too great for an institution to manage on its own!)

    > Federation of institutional repositories may also subsume the
    > development of arrangements that recognize and facilitate faculty
    > mobility and cross-institutional collaborations.

    This can be managed at the metadata level without any special need to
    "federate" (over and above OAI-interoperability). A metadata tag
    indicating current institutions, and tags indicating prior institutions
    and dates will allow all research to be triangulated upon (for where it
    was done, and when).
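
    A minimal sketch of how such tags might work, assuming each record
    carries a dated list of the author's affiliations. The class and field
    names (Affiliation, PaperRecord, start_year, end_year) and the
    universities are hypothetical, not part of OAI or of any existing
    archive software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Affiliation:
    """One institutional tag with its dates; end_year=None means 'current'."""
    institution: str
    start_year: int
    end_year: Optional[int] = None

    def covers(self, year: int) -> bool:
        """True if the author was at this institution during the given year."""
        return self.start_year <= year and (self.end_year is None or year <= self.end_year)


@dataclass(frozen=True)
class PaperRecord:
    """A paper's metadata, including the author's full affiliation history."""
    title: str
    year: int
    author_affiliations: tuple


def institutions_where_done(paper: PaperRecord) -> list:
    """Return the institution(s) the author belonged to when the paper appeared."""
    return [a.institution for a in paper.author_affiliations if a.covers(paper.year)]


# Hypothetical author who moved institutions in 2000.
history = (
    Affiliation("University X", 1995, 1999),
    Affiliation("University Y", 2000),
)
early = PaperRecord("Early paper", 1998, history)
recent = PaperRecord("Recent paper", 2002, history)

print(institutions_where_done(early))   # ['University X']
print(institutions_where_done(recent))  # ['University Y']
```

    Because the dates travel with the metadata, any harvester can attribute
    each paper to the right institution without the archives themselves
    being federated.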

    > The MIT [free repository] software is not the only option available,
    > although I believe it is the most general-purpose; for example, there
    > is [free repository] software from the University of Southampton in
    > the U.K. <http://www.eprints.org/> designed more specifically for
    > institutional or disciplinary repositories of papers, as opposed to
    > arbitrary digital materials.

    And I have here tried to give the reasons why the pressing challenge now
    is not general-purpose archiving of arbitrary digital materials, but
    the self-archiving of institutional refereed research output, to
    maximize its research impact by maximizing its visibility and
    accessibility, through open access.

    Stevan Harnad

    NOTE: A complete archive of the ongoing discussion of providing open
    access to the peer-reviewed research literature online is available at
    the American Scientist September Forum (98 & 99 & 00 & 01 & 02):


    Discussion can be posted to: september98-forum@amsci-forum.amsci.org

    See also the Budapest Open Access Initiative:

    the BOAI Forum:

    the Free Online Scholarship Movement:

    the SPARC position paper on institutional repositories:

    the OAI site:

    and the free OAI institutional archiving software site:

    Dr Willard McCarty | Senior Lecturer | Centre for Computing in the
    Humanities | King's College London | Strand | London WC2R 2LS || +44 (0)20
    7848-2784 fax: -2980 || willard.mccarty@kcl.ac.uk
