FEATURE
Institutional Repositories: Hidden
Treasures
by Miriam A. Drake, Professor Emerita, Library,
Georgia Institute of Technology
The world's universities, museums, governments, and
other organizations house treasures that have been
hidden in archives, basements, attics, print formats,
and a variety of storage devices. These treasures encompass
scientific, technological, cultural, artistic, and
historical materials generally unavailable to searchers
and the public. Institutional repositories are now
being created to manage, preserve, and maintain the
digital assets, intellectual output, and histories
of institutions. Librarians are taking leadership roles
in planning and building these repositories, fulfilling
their roles as experts in collecting, describing, preserving,
and providing stewardship for documents and digital
information.
Development of institutional repositories has largely
taken place in universities. Three articles describe
the activities of universities1. While the
key articles describing institutional repositories
relate to universities, any organization can adapt
and adopt the concept. Corporations and not-for-profits
may establish repositories to archive and preserve
their institutional histories and administrative documents.
Materials in corporate repositories would most likely
remain proprietary and unavailable to people outside
the company. Not-for-profit organizations may find
repositories useful for relating the histories of the
organizations, raising funds, and creating interest
in the projects and activities of the organizations.
Repositories provide services to faculty, researchers,
and administrators who want to archive research, historic,
and creative materials. The open access and open archives
movement, the need for changes in scholarly communication
to remove barriers to access, and the increasing awareness
that universities and research institutions are losing
valuable digital and print materials have begun driving
the establishment of institutional repositories. Using
open archive models [http://www.openarchives.org],
established metadata standards, and digital rights
management, important new information sources are seeing
the light of day and becoming more generally available.
While the main purposes of institutional repositories
are to bring together and preserve the intellectual
output of a laboratory, department, university, or
other entity, the incentives and commitments to change
the process of scholarly communication have also begun
serving as strong motivators. Computers have been ubiquitous
on campuses since the late 1980s. Students and faculty
are comfortable with the power of online communication.
Faculty teachers and researchers want to archive their
own materials and have them available on personal or
institutional Web sites. These articles, along with
the development of the Internet and more powerful search
engines, have enabled people to think in practical
terms about the establishment of central facilities
for storing, archiving, preserving, and making scholarly
and artistic materials available. Repositories may
be limited to one field, one department, one institution,
or a consortium of several institutions. Collaboration
through a consortium reduces costs for each member
through resource sharing while expanding access to
digital materials.
For universities, repositories are marketing tools
communicating capabilities and quality by showcasing
faculty and student research, public service projects,
and other activities and collections. Repositories
in universities may include preprints and postprints
of journal articles, technical reports, white papers,
research data, theses, dissertations, work in progress,
important print and image collections, teaching and
learning materials, and materials documenting the history
of the institution. Digital university presses, such
as Highwire [http://highwire.Stanford.edu],
University of California eScholarship editions [http://escholarship.cdlib.org/ucpressbooks.html],
the University of Chicago Press, the Chicago Digital
Distribution Center, and BiblioVault [http://cddc.uchicago.edu] are
publishing online and establishing digital archives.
Scholarly societies may establish discipline-based
repositories to preserve the history and literature
of a particular subject area. However, these societies
have a serious dilemma. They publish journals to disseminate
research about their fields. If the societies establish
open access repositories, they could experience reduced
or zero publishing profits, which might in turn affect
their ability to pay overhead expenses and to provide
enhanced member services. The loss of revenue could
place these societies in the position of having to
ask members to pay more of the cost of member services.
The increased demand for scholarly information, especially
in science, will probably increase the pressure on
scholarly societies and universities. Digital publishing,
global networking, more research, and increased communication
among communities of scholars are driving the demand
for broader access. The idea of the invisible college
nurtured by meetings and preprints of journal articles
has been replaced by global, discipline- or project-based
online communities.
Governments and government agencies may use repositories
in the same ways as universities to document work in
progress and the histories of agencies. Some agencies
will find repositories useful for storage and access
to technical reports, white papers, hearings, and other
documents.
Institutional Repository Examples
The DSpace repository project [http://dspace.org] at MIT has received extensive coverage in the news
and literature. The DSpace Web page describes the project
as "a groundbreaking digital institutional repository
that captures, stores, indexes, preserves, and redistributes
the intellectual output of a university's research
faculty in digital formats" [http://dspace.org/introduction/index.html].
The MIT repository contains a variety of research materials
deposited in accordance with the policies developed
by departments and research units at MIT.
DSpace developed open source software with a grant
from Hewlett Packard and created a federation of universities
to work collaboratively on the project. The Federation
includes Cambridge University, Columbia, Cornell, MIT,
Ohio State, University of Rochester, University of
Toronto, and the University of Washington. Research
institutions worldwide may acquire the DSpace software
at no cost, and any institution can adapt it to its
own needs.
The University of California's eScholarship Repository
[http://repositories.cdlib.org], part of the California
Digital Library, offers faculty on the 10 UC campuses
a central facility for the deposit of research or scholarly
output. Individual research centers, departments, and
sponsoring units set the policies for acceptance of
content. Determination of acceptable content is in
the hands of researchers and faculty. The system uses
Berkeley Electronic Press software [http://www.bepress.com] licensed by the University of California.
The developers of the Ohio State University (OSU)
Knowledge Bank [http://www.lib.ohio-state.edu/Kbinfo] plan to include the digital assets and information
services available to the OSU community in the repository.
The library manages the Knowledge Bank as part of its
knowledge management initiative.
In the U.K., the Consortium of University Research
Libraries (CURL) and the Joint Information Systems
Committee (JISC) have established Project SHERPA [http://www.sherpa.ac.uk] to build institutional repositories in U.K. research
universities. CURL's [http://www.curl.ac.uk] mission
is to increase the ability of research universities
to share research for the benefit of research communities.
JISC [http://www.jisc.ac.uk] aims to support teaching,
learning, research, and administration in higher education
through the use of information and communications technology.
The institutional repository projects support the goals
of both organizations and promote collaborative development
and operations.
Repositories and open archives are being established
worldwide. Many institutions use GNU EPrints software
for these projects. The software, developed at the
University of Southampton in England, is free. It creates
an open access archive through author and/or institutional
archives [http://software.eprints.org]. For a list
of projects using the GNU software for author self-archiving,
go to http://www.eprint.org.
Access and Use
Repositories now represent potentially rich sources
of information, data, images, and valuable research
results. The movement is new and the time it takes
to plan, formulate policies, and bring institutional
communities to consensus can make it a slow process.
Each institution defines its own policies dealing with
access to and use of materials in repositories. Not
all materials can be made available freely. Copyrighted
materials may carry a variety of restrictions. Nonexclusive
publisher licenses would increase the availability of these
materials and place the publishers in the open access
arena.
Some publishers permit authors to self-archive. Other
publishers opt for exclusive licenses for a limited
time, while still others will not allow any deviation
from exclusive copyright.
Some materials may be restricted to a small group
of researchers or to people associated with the institution
because they represent work in progress deemed proprietary
or that may entail sponsor restrictions. For example,
a group working on a patentable device or process may
want to share data only with members of the group.
Policies
Librarians both use and create institutional repositories.
In establishing repositories there are a variety of
decisions to make. Policies, systems architecture,
and other elements will depend on institutional context
and the scope and purposes of the repository. Policies
appropriate for an academic institution may not work
in a corporate setting. Not-for-profit organizations
have unique purposes and cultures that will dictate
how their repositories are formed and maintained.
Here are some of the key issues to consider when
developing repositories:
the institutional culture
the scope of the repository
content
access levels
legal aspects
standards
sustainability
funding
Institutional culture depends on how the organization
is structured as well as how much collaboration and
trust exist within an institution. In academic organizations,
faculty belong to departments, disciplines, and research
groups. Academic competition may be fiercer in some
universities than in corporations. In an internally
competitive environment where cooperation and trust
are not nurtured, building a repository will become
more difficult. Faculty will not contribute willingly
to a central repository unless they have been consulted
and trust the process. Faculty need to be convinced
that contributing to a repository will enhance their
reputations in their disciplines and result in wider
dissemination of their work.
Repository advocates must decide early on the purposes
and scope of the repository and communicate them to
all affected parties. The sooner participants can buy
into the process, the better. Will the repository be
central? Distributed? Will it cover only parts or all
of the organization? For some institutions, community-based
repositories will work well. Large and complex institutions
will need consensus on key issues and technical standards.
A repository may be limited to self-archiving by authors
or may include the intellectual output and business
and administrative documents for the whole institution.
Many institutions have treasures known to only a few
people. Repositories provide the means for unearthing
these treasures and bringing them to light.
Decision-making on content can become a contentious
issue. Criteria for deposit into the repository could
come from each community or from a central body with
input from the participants. The DSpace project at
MIT includes articles, reprints, technical reports,
working papers, conference papers, e-theses, data sets,
image files, audio and video files, and reformatted
digital library collections. Policies for the deposit
of content and who may contribute content come from
each MIT community, but the DSpace guidelines specify
that material must be "education-oriented," in digital
format, and produced by an MIT faculty member. The
author/owner agrees to give MIT permission to distribute
and preserve the material. Access policies are determined
by MIT [http://libraries.mit.edu/mit/policies/content.html].
Legal Considerations
Librarians and administrators responsible for operating
and maintaining repositories need to ensure that all
legal requirements are met. These requirements include
appropriate software and content licenses. At MIT,
authors must sign a nonexclusive license granting MIT
permission to deposit, distribute, and preserve repository
materials. Many universities have comprehensive intellectual
property policies setting forth the responsibilities
of faculty and administration. Corporations and not-for-profit
organizations may have formal intellectual property
policies. In some cases, intellectual property issues
may be covered in employment contracts.
If there are limits on distribution of materials
or access levels, the repository software needs to
build in those limits to ensure compliance. Academic
institutions usually opt for open access but may have
to restrict access for some research activities. If
student portfolios are included in the repository,
privacy considerations may limit access.
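One way to picture how repository software might build in such limits is a small access check. The Python sketch below is illustrative only: the three access levels, the Item and User classes, and the can_view function are hypothetical names chosen for this example, not features of any repository package named in this article.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical access levels, from most open to most restricted.
OPEN = "open"                # anyone, including anonymous visitors
INSTITUTION = "institution"  # people affiliated with the institution
GROUP = "group"              # members of one named research group


@dataclass
class Item:
    title: str
    access: str = OPEN
    group: Optional[str] = None  # only meaningful when access == GROUP


@dataclass
class User:
    name: str
    affiliated: bool = False
    groups: frozenset = frozenset()


def can_view(user: Optional[User], item: Item) -> bool:
    """Return True if the user (None means anonymous) may view the item."""
    if item.access == OPEN:
        return True
    if user is None:
        return False
    if item.access == INSTITUTION:
        return user.affiliated
    if item.access == GROUP:
        return item.group in user.groups
    return False  # unknown access level: deny by default


preprint = Item("Preprint of a journal article")
portfolio = Item("Student portfolio", access=INSTITUTION)
patent_data = Item("Data for a patentable device", access=GROUP, group="lab-x")

staff = User("staff member", affiliated=True)
lab_member = User("lab member", affiliated=True, groups=frozenset({"lab-x"}))

print(can_view(None, preprint))            # anonymous reader, open item
print(can_view(None, portfolio))           # anonymous reader, campus-only item
print(can_view(staff, patent_data))        # affiliated, but not in the group
print(can_view(lab_member, patent_data))   # member of the restricted group
```

Denying access whenever the level is unrecognized is a deliberate choice in this sketch: a mislabeled item stays closed until someone corrects it, rather than becoming public by accident.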
Standards
Interoperability requires that repositories employ
standards developed to handle issues associated with
open access. These standards include the Open Archival
Information System (OAIS) Reference Model [http://www.rlg.org/longtermoasis.html],
Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
[http://www.openarchives.org/OAI/openarchivesprotocol.html],
and the Metadata Encoding and Transmission Standard
(METS) [http://www.loc.gov/standards/mets].
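To make the harvesting idea concrete, here is a minimal Python sketch of how a service might request records from a repository and read their titles. The ListRecords verb, the oai_dc metadata prefix, and the XML namespaces come from the OAI-PMH and Dublin Core specifications; the endpoint address repository.example.edu, the function names, and the sample response are assumptions made up for illustration. The sketch parses a canned response rather than contacting a live server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and the Dublin Core element set.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, trimmed to the elements this sketch reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Technical Report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://repository.example.edu/oai"))
print(extract_titles(SAMPLE))
```

Because every compliant repository answers the same request in the same format, a single harvester of this kind could, in principle, gather metadata from many institutions, which is what makes cross-repository searching practical.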
Software is a key element in the construction of
an institutional repository. Guide to Institutional
Repository Software, version 2, published by the Open
Society Institute [http://www.soros.org/openaccess/software], is a valuable tool for selecting software appropriate
to the needs and context of the institution and its
repository.
Other organizations involved in standards and repository
design and operations include the Digital Library Federation
[http://www.dlf.org], Coalition for Networked Information
[http://www.cni.org], OCLC [http://www.OCLC.org], RLG
[http://www.rlg.org], the electronic theses and dissertations
program at Virginia Tech [http://scholar.lib.vt.edu/theses;
http://www.thesis.org/standards/metadata/current.html],
and Creative Commons [http://www.creativecommons.org].
Collaboration
Librarians, archivists, faculty, and information
technology staff have gained increased understanding
of each other's work and learned to work more collaboratively
in recent years. Each group now recognizes and appreciates
the expertise and creativity of the others. The talents
and commitment of time and energy from each group are
essential to the success of a repository project. Creation
and sustainability of a repository heavily depend on
thinking together and learning what others on the team
think so decisions can be made within their working
context.
In simple terms, success in building a repository
involves eight "C" words:
comprehension
collaboration
context
change
caring
commitment
creativity
competence
Comprehension means that all members of the team
must share a common vision and understanding of the
purposes and scope of the repository. Collaboration
involves thinking and working together, with different
people contributing their different talents, working
with others to solve problems, and making important
decisions. Context is each person's world view and
working environment. Each person has a unique mind-set
based on background, education, and experience. Thinking
and working together in a non-threatening atmosphere
helps people integrate other contexts into their own.
Repositories involve change in the way research is
disseminated, preserved, and published. This change
requires faculty to deposit their research results,
data sets, and other materials in the repository, a
new step in the research process. In corporations,
management may require staff to deposit items, such
as strategic plans, marketing plans, and working papers.
Caring motivates the desire to share research results
and joint scholarly endeavors, preserve history, and
provide knowledge and information needed for future
generations to learn. Caring leads to the commitment
to deposit one's scholarly work in the repository,
encouraging others to do likewise by contributing ideas
and energy. Managers show their commitment by understanding
that repositories will grow and require support and
funding in perpetuity.
Creativity involves imagination and the ability to
visualize a new way of doing things. New ideas can
come from anywhere, from individuals or groups
of individuals.
Competence means knowing how to make the repository
work for all its constituents. Librarians and archivists
need to carry their collection development skills and
operational know-how to the repository project. Information
technology staff demonstrate their competencies by
knowing about the software, hardware, networking, and
standards needed to make the repository serve everyone.
Sustainability and Funding
Maintenance and sustainability are key issues that
involve the long-term commitment of money by management.
A repository cannot run by itself. It needs constant
attention. Maintenance needs for content, software, and accessibility
can change. IT staff and librarians need to know the
consequences of changes in hardware, software, and
standards and be able to adjust accordingly.
Librarians need to prepare to handle problems arising
from a faculty member or key person leaving the organization,
or from faculty collaborating with colleagues in another institution
or group of institutions, or with government or industry.
Having clear policies concerning deposit, accessibility,
and other anticipated contingencies will ease the problem-solving
process.
Repositories cannot be sustained without long-term
infusions of funds. Everyone involved in a repository
needs to understand that the project has become part
of their everyday lives and will require attention
and funding in perpetuity. Too often managers in corporations
seem unable to look beyond the quarter's bottom line
and shy away from long term commitments. Their reluctance
to commit funds is exacerbated in an uncertain economy.
Many managers in academe emulate their corporate colleagues
through their reluctance to raise and dedicate enough
money to ensure that the repository is funded at an
appropriate level forever.
Effect on Publishing
Institutional repositories and the open access movement
will affect the publishing business. Each day, it becomes
clearer and clearer that academic institutions, corporations,
and other organizations will no longer pay the prices
charged by scholarly publishers.
Players in the open access movement and builders
of repositories have reacted to high journal prices
by beginning plans to disaggregate the structure of
scholarly publishing, to eliminate or curtail the distance
between author and reader, to disintermediate. Raym
Crow points out that one of the purposes of institutional
repositories is to form a global system of interoperable
repositories that will become centers for scholarly
publishing. "Altering the structure of the scholarly
publishing model will be neither simple nor immediate.
The stakes are high for all the well-entrenched participants
in the system (faculty, librarians, and publishers) and
the inertia of the traditional publishing paradigm
is immense."2
The open access movement is driving changes in how
publishing costs are paid. For example, the Public
Library of Science charges authors for value-added
services (editing, refereeing, marketing, etc.) but
does not charge readers for access. The drivers of
the open access movement are strong. In a world where
journal prices continue to rise while the costs of
information and networking technologies that enable
interoperability continue to drop, recognition of the
benefits of knowledge sharing grows.
Richard Johnson of SPARC made this observation:
The current system of scholarly publication limits,
rather than expands, the readership and availability
of most scholarly research (while also obscuring its
institutional origins)3. People with no
affiliation with research institutions have a difficult
time identifying and finding research information.
Despite the vast amount of U.S. government information
available online, large amounts of scientific and medical
research results are not readily available. Libraries
buy technical reports from the National Technical Information
Service and, until recently, the National Institutes
of Health. The availability of these reports would
increase if they were made part of the Federal Depository
Library Program. Governments at all levels need to
regard dissemination of the information they generate
as a crucial part of technological and economic infrastructures
and essential in a democratic republic.
The open access movement and institutional repositories
could contribute significantly to economic growth by
broadening the market for scholarly publications and
research results, especially in science and medicine.
Lower access costs would broaden usage. Economist Joel
Mokyr found in his studies of knowledge creation and
dissemination that lower access costs brought knowledge
to people who used that knowledge as the basis of invention
and innovation4. He also pointed out that
ideas and knowledge may be expensive to generate, but
inexpensive to use once implemented. The future will
bring greater innovation and technologies through open
access and institutional repositories.
Footnotes
1 Crow, Raym, The Case for Institutional
Repositories: A SPARC Position Paper [http://www.arl.org/SPARC/IR/ir.html],
Scholarly Publishing and Academic Resources Coalition,
2002.
Lynch, Clifford, "Institutional Repositories:
Essential Infrastructure for Scholarship in the
Digital Age," ARL Bimonthly Report 226, February
2003, Association of Research Libraries [http://www.arl.org/newslet/226/ir.html].
Branin, Joseph, "Institutional Repositories," Encyclopedia
of Library and Information Science, Forthcoming
May, 2004 [http://www.dekker.com].
2 Crow, op. cit., pp. 3-4.
3 Johnson, Richard, "Institutional
Repositories: Partnering with Faculty to Enhance
Scholarly Communication." D-Lib Magazine, November,
2002 [http://www.dlib.org/november02/johnson/11johnson.html].
4 Mokyr, Joel, The Gifts of Athena:
Historical Origins of the Knowledge Economy,
Princeton University Press, 2002, pp. 1-27.