Since I always enjoy reading Sue Feldman's insightful reports about
digital library conferences in Information Today, I've longed to
attend one. I just couldn't decide if I should go to the one organized
by the Association for Computing Machinery (ACM) or the Institute of Electrical
and Electronics Engineers (IEEE) Computer Society's Advances in Digital
Libraries conference. My dilemma was solved this summer when these two
prominent associations joined forces to launch the first ACM/IEEE-CS Joint
Conference on Digital Libraries (JCDL 2001), held June 24–28 in Roanoke,
Virginia. The conference program was so rich, colorful, and engaging that
hardly anyone paid attention (except for a passing glance perhaps) to the
Miss Virginia contestants, who convened at the same hotel to prepare for
the following week's pageant.
A Rich Menu of Choices
The topic itself has always been very attractive to me. After all,
I teach a course on digital librarianship, write a column with the same
title for Computers in Libraries, and have created small digital
shelves with my students for the past few years. It added to the attraction
that Edward A. Fox, a computer science professor at Virginia Tech, was
the general chair, and Christine L. Borgman, a professor at UCLA's Department
of Information Studies, was the program chair. They created an exceptionally
well-rounded conference program, with the obvious purpose of being all-inclusive
and still high-quality. The joint nature of JCDL 2001 brought together
the best people from both of the previously competing conferences as attendees
and speakers. There were more than 420 attendees from 20 countries.
It was a special bonus to see and talk to many people whom I hadn't
met before but whose names I knew from conference papers and journal articles.
Even better was running into "long time, no see" acquaintances. Such encounters
add a personal touch to conferences for me.
It was an excellent idea to schedule short papers (15 minutes), long
papers (30 minutes), expert panel sessions, keynote speeches on each day
by key industry figures, poster sessions, demonstrations, pre-conference
tutorials, and post-conference workshops. The social events in the evening
nicely rounded out the daytime programs, and even the substantial breaks
provided good opportunities for mingling and chatting with researchers,
teaching faculty, and deans of computer and information science schools.
And while on the subject, I can't help mentioning that the reasonable $395
conference fee also included a reception, a banquet, and breakfast and
lunch every day—far superior to the rubber-chicken dish served at most
of the information industry conferences. And after this detour about food,
here comes the food for thought.
The Appetizer
To prepare the uninitiated, Fox offered a full-day tutorial that provided
an overview of the practical aspects of digital libraries: definitions;
foundations; and issues, including resource discovery, architectures, de
jure and de facto standards, protocols, interoperability, 3-D interfaces,
search agents, distributed processing, data representation formats, and
social and legal issues. It was a sampling of subjects that the conference
presentations touched upon. Fox is eminently qualified to put together the
full picture from pieces, as he already proved as editor of an excellent
special issue of the Journal of the American Society for Information
Science that focused on digital libraries—years before conferences
were dedicated to the topic.
Dagobert Soergel's full-day tutorial discussed thesauri and ontologies
in digital libraries. I remember him as the thesaurus specialist
when we met in 1976 during my 1-month research stint at the University
of Maryland, and he's now imparting his immense knowledge to the Web environment.
Hussein Suleman's half-day tutorial addressed the issues related to building
interoperable digital libraries—such as the concept of the Open Archives
Initiative—and protocols for metadata harvesting and exchanging. Ian Witten
and David Bainbridge's full-day session on "Building a Digital Library
Using Open-Source Software" used the Greenstone software to demonstrate
the process of creating digital libraries from a variety of document sources.
The biggest advantage of the software is the ease with which you can include
and index disparate document types in a short time and make them searchable.
Currently there are few power-search options—such as proximity and positional
operators—that are important when searching a full-text document, but a
soon-to-be-released version will have such features. Commercial software
in this league has a five-digit price tag.
Other researchers (along with Witten) from New Zealand's University
of Waikato Computer Science Department demonstrated their talent earlier
with both the Phrasier and Kniles software that help users find relevant
documents in full-text repositories that don't have abstracts and subject
headings. Greenstone is yet another part of this team's software arsenal.
The Main Course
The conference itself discussed the issues mentioned above, and then
some, such as the state-of-the-art tools to identify, explore, and classify
the content of audio and video files in digital libraries, in place of or
as a complement to traditional human abstracting and indexing (A&I). The
terabytes of audiovisual information make bibliographic control and the
print media's A&I issues look like child's play. The presentations
proved that such tools are for real and will be available within a few
years.
The value of linguistic research manifested itself in many presentations.
The field got tremendous support from computer technology and now repays
it with compound interest to the computer and information scientists who deal
with automatic classification, indexing, and abstracting. No one embodied
the integration of the dual culture better than Judith Klavans, director
of the Center for Research on Information Access (CRIA) at Columbia University
and a linguist turned information/computer scientist. She made several
short and lucid presentations
about CRIA's grant projects. All of them superbly illustrated how information
technology will help end-users in the long run find their way through the
digital towers of Babel built, for example, from the full-text sources
of U.S. government agencies' regulatory information using the same terms
with quite different meanings. As a bonus, on top of her articulate talk
and keen intellect, she's one of those speakers who remain cool and unfazed
and can substitute words for her slides if, in a Maalox moment, technical
difficulties prevent connecting the speaker's laptop to the audiovisual
system. This happened way too often—representing the only weak point of
the conference. (I couldn't help thinking of the flawless, taken-for-granted
technical arrangement Bill Spence provides at the InfoToday meetings—expertise
that probably could be used to lure Klavans in for a session or two at
InfoToday 2002, since her office is just a short ride away.)
Many presentations fused traditional library science with information
and computer science. William Arms of Cornell University's Computer Science
Department, and the author of the best-selling book Digital Libraries
(which will be out in paperback by the time this issue reaches you),
paid appropriate homage to the pioneers of contemporary metadata research,
such as Gerald Salton's seminal SMART project about natural language searching;
the Cranfield project, which provided an impeccable test and benchmark
suite for indexing-related research; and Henriette Avram's work on the
MARC format—the mother of all metadata research that aims to make interoperability
smoother.
The keynote addresses represented the pillars of the conference. Brewster
Kahle, developer of the WAIS (Wide Area Information Server) system that
pioneered the searching of full-text databases scattered over the Internet,
and the brain behind the Alexa Internet archive, gave a high-octane talk
about the past, present, and future of digital libraries. He showed off
his latest project, the Wayback Machine (http://archive0.alexa.com),
a repository of digital materials about the 2000 presidential election.
It gives a blow-by-blow replay of the events in chronological order as
reported (or rather spun) on the 900 Web sites of the candidates, their
parties, the TV stations, newspapers, and magazines. It's a fascinating
project that will allow the next generation to understand how this noble
tradition of American politics turned into a farce, and is a masterly manifestation
of what digital archives can do.
Pamela Samuelson, professor at both the School of Law and the School of
Information Management and Systems at the University of California, Berkeley, gave a fast-paced
review of the legal issues surrounding digital libraries. She pointed out
that there's currently a publishers' nirvana in which they set the rules
and finance the technology to enforce them—including the inane stipulation
that you can't read the digital version of Alice in Wonderland aloud
from its publisher's Web site. Her talk was given a particular edge as
just the day before, the Supreme Court sided with the freelance writers
in their suit against The New York Times Co. and others. As an author and
(non-practicing) intellectual-property legal scholar myself, I'm very interested
in the reaction of journal publishers and online information providers.
I hope that instead of removing the infringing materials (which would not
solve the problem of past copyright infringements), they work out some
generic agreement that reasonably shares the publisher's pecuniary benefits
with the authors.
In his keynote speech, Clifford Lynch, executive director of the Coalition
for Networked Information, focused on interoperability, or rather, the
lack of it. He pointed out that there's currently no objective way to measure
interoperability, only descriptive and anecdotal approaches, and it's a
gradual measure rather than a binary one. This certainly gave comfort to
many digital library builders who can claim that they're quite interoperable,
much as library automation vendors up until the late 1980s liked
to say that they were mostly MARC compatible. To me that always
sounded like "I'm a little pregnant." In the long run, interoperability
must be answered with a definite yes or no. After all, the key to the success
of digital libraries is not so much in their role as passive repositories,
but as interactive and collaborating archives that complement each other
and lead users from one resource to another by way of, say, citations.
The Desserts
The post-conference workshops represented the desserts. Because I was
unable to reschedule my flights, I missed this part of the conference.
I did, however, catch a glimpse of the dessert cart. The workshops included
such mouthwatering delicacies as visual interfaces to digital libraries;
the technology of browsing applications; information visualization for
digital libraries; and classification crosswalks that mapped the classification
schemes of different systems, featuring Diane Vizine-Goetz, the outstanding
expert from OCLC's Office of Research. Although the topic of Digital Libraries
in Asian Languages would have been Greek to me (or as Hungarians say, "Chinese
to me"), I would have liked to hear Ching-chih Chen, a speaker who
deeply impressed me in the early 1980s with a presentation about her digital
library, The First Emperor of China.
I missed these workshops, but I hope I'll have another chance next year
when JCDL 2002 convenes in Portland, Oregon, under the chairmanship of
Gary Marchionini, another outstanding digital library specialist (whom
I met again in Roanoke after a 20-plus-year hiatus). And who won the Miss
Virginia Pageant? You can see for yourself since there's a digital library
about it (http://www.missva.com), of course.
Péter Jacsó is associate professor of library and information
science at the University of Hawaii's Department of Information and Computer
Sciences. His e-mail address is jacso@hawaii.edu.