FEATURE
Open WorldCat Pilot: A User's Perspective
by Nancy O'Neill
OCLC's Open WorldCat Pilot [http://www.oclc.org/worldcat/pilot/] "is
an initiative that integrates library records into
popular Internet search sites and tests the effectiveness
of the Web in guiding users to library-owned materials.
The goal of the pilot: to make libraries more visible
to Web users and more accessible from the sites where
many people begin to look for information."1
The project aims to "open" WorldCat records to present
and potential library users through the familiar Web
search engines Google and Yahoo! Search. Enabling Web
users to locate materials they need quickly and easily
in libraries near them will promote library use and
reinforce the value of libraries. Ultimately even people
who don't often use libraries may come to consider
libraries as a first source of information. If you
believe in libraries, as OCLC obviously does, what's
not to love about Open WorldCat?
For the Open WorldCat Pilot, OCLC extracted a 2-million-record
subset from the 55 million records in the WorldCat
database. OCLC selected the items most frequently cataloged
by libraries; specifically, they selected records with
one hundred or more libraries listed as holding the
item.2 It is important to note that the
pilot uses "limited fields" of the records. Gary Price
asked in ResourceShelf, "Why doesn't OCLC make subject
headings viewable and hyperlinked?"3 Perhaps
that is a problem the search engines could help solve.
Limiting the fields included could also make it difficult
to distinguish between formats like VHS and DVD.
According to the Open WorldCat Pilot Quick Facts, "WorldCat
records began to display within Google search results
in December 2003 and within Yahoo! search results in
May 2004. Inbound links from Open WorldCat search results
have grown from 39,000 in February 2004 to more than
1 million in the first half of June 2004."4
Approximately 12,000 libraries are participating
in the project, including the academic, public, and
school libraries originally included automatically,
plus state, federal, and special libraries that have
asked to join. Libraries may choose to opt out of the
program by notifying their OCLC regional service provider
or by completing an online form on the FAQ page. Libraries
not part of the pilot but that contribute records or
holdings information to WorldCat may complete an online
form to participate. OCLC cooperatives invite libraries
that do not contribute their cataloging records to
WorldCat to join the pilot by joining the OCLC cooperative.
There are several ways of searching Open WorldCat
items in Google and Yahoo!. Searching works the same
in Google and Yahoo! with two exceptions, which we
discuss below. The most intuitive is using the phrase "find
in a library" plus the title of the item or the subject
to be searched: find in a library: da vinci code.
Alternatively, you can search using the phrase "worldcat
libraries" and the title of the item or the subject
searched: worldcat libraries social ecology.
With enough promotion, the phrase "worldcat libraries" could
someday become a recognizable way for the casual user
to search, as well as a source of brand recognition for OCLC.
The phrase "worldcatlibraries" works equally well,
although Google persists in asking "Did you mean worldcat
libraries?" and Yahoo! returns the search as "Find
in a Library." OCLC also suggests using "wcpa," a
phrase that appears in the URLs of all records retrieved,
e.g., wcpa da vinci code. It's not exactly intuitive
and it does not work in Yahoo!. The remaining search, "site:worldcatlibraries.org
[title]," is described at length below. It too does
not work well in Yahoo!.
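The search forms described above can be summarized in a short sketch. The Python below only assembles the query strings a user would type into Google or Yahoo!. The helper name build_queries and the WEAK_IN_YAHOO set are illustrations, not part of any OCLC or search engine interface, and the Yahoo! note reflects only the behavior reported in this article.

```python
# Illustrative only: build the query strings described in the article.
# build_queries is a hypothetical helper, not an OCLC or search-engine API.

def build_queries(terms: str) -> dict[str, str]:
    """Return the five Open WorldCat query forms for a title or subject."""
    terms = " ".join(terms.lower().split())  # normalize case and spacing
    return {
        "find in a library": f"find in a library: {terms}",
        "worldcat libraries": f"worldcat libraries {terms}",
        "worldcatlibraries": f"worldcatlibraries {terms}",
        "wcpa": f"wcpa {terms}",
        "site": f"site:worldcatlibraries.org {terms}",
    }


# Forms the article reports as not working (or not working well) in Yahoo!.
WEAK_IN_YAHOO = {"wcpa", "site"}

if __name__ == "__main__":
    for name, query in build_queries("Da Vinci Code").items():
        note = " (weak in Yahoo!, per the article)" if name in WEAK_IN_YAHOO else ""
        print(f"{query}{note}")
```

An author name can be appended to the terms in the same way, e.g., build_queries("war in a time of peace and halberstam").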
Ericka McDonald, manager of OCLC WorldCat end-user
services, was extremely helpful in answering questions
about the project. McDonald provided much of the
information that follows.
Authors may be searched by linking the name to a
title or subject, e.g., worldcat libraries murder
in montparnasse and greenwood or worldcat libraries basketball and lee. Google
retrieves WorldCat records for both these searches, but Yahoo! retrieves only
the subject search for basketball and lee. OCLC recommends using the search
syntax "site:worldcatlibraries.org [author name]" to locate author records.
Searching site:worldcatlibraries.org dennis lehane did retrieve several
records for different titles by Dennis Lehane (author of Mystic River and other novels)
in Google, but retrieved only one record, for one of Lehane's works, in Yahoo!. Using the other recommended
searches combined with the author name "Dennis Lehane" located WorldCat records
in both Google and Yahoo!, but for only one title or, in one case, for two
different titles in two different formats. Generally, the "site:" search
in Yahoo! retrieved only one record for any query.
To locate several records for the same item, OCLC
suggests using "site:worldcatlibraries.org [title]," e.g., site:worldcatlibraries.org
war in a time of peace and halberstam. The "site:" search
is designed to retrieve all the records WorldCat has
for a particular title and therefore increases the
chance of finding an item record that lists the user's
local library. This syntax limits the search to records
harvested from OCLC's server. The "site:worldcatlibraries.org" search works
well in Google, but does not work well in Yahoo!. Also, the "site:" search does
not necessarily improve the chance of locating a particular item
record in which the local library appears. Searching worldcat libraries
silas marner in Google retrieves WorldCat records for the book as the first
two retrievals, but neither record listed Santa Monica Public Library as having
the book. Searching site:worldcatlibraries.org silas marner retrieved
69 WorldCat records for various editions of the work, as well as books about
the work. Record 30 on page three of the results listed the Santa Monica Public
Library. Searching site:worldcatlibraries.org silas marner in Yahoo!
retrieved only one WorldCat record.
OCLC plans to use Functional Requirements for Bibliographic
Records (FRBR) to change the records that they make
available for harvesting; this should make the "site:" search
syntax unnecessary. FRBR provides a new way of defining
relationships between bibliographic items, their creators,
and their subjects. It embodies the basic laws of cataloging
and offers ways to further develop and enrich existing
catalogs. According to McDonald, it creates a work
level record that incorporates various versions and
editions of items. The aim is to allow users to locate
whatever version of an item a local library owns, an
essential improvement. The search engines will not
have to support or implement FRBR, just continue to
harvest the content the same way they presently do.
Deb Bendig, manager of Discovery View of WorldCat,
says that OCLC is trying various approaches to employ
FRBR in WorldCat but has not yet set a target date
for its implementation. Additional information about
FRBR is available on the OCLC Research Projects Web
site at http://www.oclc.org/research/projects/frbr/default.htm.
Even though a "site:" search may be successful, it
does not necessarily improve one's chances of finding
the desired item in the local library. I persevered
to locate Santa Monica's record for Silas Marner because
I was testing the system; the casual searcher would
probably give up after one page of records. We'll see
if FRBR solves the problem.
Web users can search for just about anything: books,
magazines or journals, videos, compact discs. Searching
for a specific format, such as DVD, can be frustrating
due to inconsistency in retrieval. "Find in library" did
not retrieve a Wild Strawberries DVD record
in Google, but "worldcat libraries" did: worldcat
libraries wild strawberries dvd. Both find in
library wild strawberries dvd and worldcat libraries
wild strawberries dvd retrieved the WorldCat record
in Yahoo!. I had the same results searching for a Rear
Window DVD: find in
library rear window dvd did not work in Google but did work in Yahoo!,
and worldcat libraries rear window dvd worked in both search engines.
I located the DVDs of Wild Strawberries and Rear Window without
too much difficulty in both Google and Yahoo!, but I could not retrieve a WorldCat
record for a Citizen Kane DVD using either search engine, even though
records in the full WorldCat database indicated it would have met the criteria
of the pilot. The first WorldCat record listed 600 libraries holding the Citizen
Kane DVD, the second record listed 250 libraries with the item, and the
third 177 libraries with the item. So why didn't it appear in my search?
McDonald provided no answer as to why I could not
find the DVD, but she suggested that I search for it
using site:worldcatlibraries.org citizen kane visual
material. That search strategy worked beautifully,
retrieving 18 WorldCat records. The only problem is
that users have to know that the term "visual material" covers
records for VHS or DVD. And who would ever think to
search that way? The question remains: Why can I locate
other DVDs without using an arcane search syntax, but
not this one?
Curiosity about finding almost no DVD holdings for
my own Santa Monica Public Library led to an embarrassing
discovery. Santa Monica Public Library stopped entering
its DVD holdings in WorldCat when it outsourced provision
of DVD records to a DVD vendor. As troubling as that
may be for a Santa Monica Library client, it may reflect
a practice followed by many libraries that outsource
some or all of their cataloging. When Santa Monica
obtained DVD cataloging records from a vendor, instead
of cataloging them in-house, we apparently had no simple
way to upload the vendor records into WorldCat. Santa
Monica's Technical Processing Department explained
that re-entering the records in WorldCat using CatMe
amounts to recreating the entire record, a process
too labor intensive to be cost-effective.
Fortunately, Santa Monica has since resumed in-house
cataloging of audiovisual materials. However, many
libraries outsource at least some of their cataloging
and, if they find that they cannot easily upload the
vendor records, the consumer will be denied the ability
to locate items in popular formats. As libraries
respond to the need to become more cost-effective,
it seems that OCLC, equally cost-conscious, may not
have provided the technology used by many online catalog
vendors to make loading vendor records easy.
I put the question to Cynthia Whitacre, of the OCLC
Cataloging Partners Program, who told me that OCLC
has more than one avenue to work with vendors on supplying
catalog records for easy uploading into WorldCat. Two
of these are Cataloging Partners [http://www.oclc.org/catalogingpartners/partners/default.htm],
a recent program, and PromptCat [http://www.oclc.org/promptcat/about/vendors/],
an established program. Whitacre says that OCLC actively
works with vendors to assure that records can be added
to WorldCat, and I admit that the vendor partner list
[http://www.oclc.org/promptcat/about/vendors/] is pretty
impressive. Coincidentally, Santa Monica's former DVD
vendor recently signed up with OCLC.
According to information on the Open WorldCat Pilot
page, "In most cases, the Open WorldCat pilot will
provide users with detailed library information in
as little as two clicks."5 Using either
Google or Yahoo!, enter a simple search string "find
in a library" plus the title of the item. OCLC says
that the WorldCat record should appear as the first
hit, and it usually does.
Click #1 takes users to a page where they can enter
their ZIP code to locate the nearest library that has
the item. Click #2 retrieves the list of libraries
in or around that ZIP code. If the local library's
catalog is linked, click #3 takes users either to a
library Web site or, in the best cases, directly to
the library's online catalog record for the item searched.
In the sample searches provided by OCLC, a search for
the title Benjamin Franklin: An American Life took
only two clicks to reach the list of libraries near ZIP code 90401,
which included Santa Monica Public Library;
click #3 retrieved my library's online catalog record.
I think I'm in love!
Another sample search provided by OCLC, The Da
Vinci Code, using the same ZIP code, does not
list Santa Monica as one of the nearby libraries
even though Santa Monica actually has multiple copies
of the book. A Newsbreak6 by Searcher editor
Barbara Quint on Open WorldCat led to this article
when she called me to ask why. This particular title
is an anomaly. Santa Monica's record for the book
is not in WorldCat although it should be. Finding
no record for Santa Monica, the user is directed
to libraries in the next closest ZIP codes: Beverly Hills
Public Library, where click #3 goes directly to the
online catalog record, and El Segundo Public Library,
where click #3 takes you to the online catalog search
screen but not to the individual record. Instead,
you must perform the title search again. Of the other
libraries listed, click #3 takes the user to the
El Camino College catalog login page, the Harvard-Westlake
Upper School catalog search page, the Woodbury University
catalog search page, and a broken link for UCLA that
goes nowhere. Obviously OCLC can't change the way
various online catalogs operate, but these varied
entry points are inconsistent and inconvenient.
Most consumers want to go directly to the item record
in a local library catalog. So why the inconsistencies?
According to McDonald, the hotlinked library name should
always take the user to the library OPAC. In most cases,
it drops the user at the home page for the catalog,
where the user has to re-enter the search. In some
cases, the link takes the user to the record for the
item in the OPAC. The ability to form this "deep link" depends
on a couple of factors: (1) how the library has configured
the link in its FirstSearch Administrative module and
(2) whether the OPAC supports deep linking. OCLC is
trying to get libraries to configure deep links in
FirstSearch. To do this, many libraries will need to
turn on this capability in their OPACs or ask the local
system vendor to do it. This is a high priority for
OCLC, and it is working closely with member libraries
and local system vendors to improve these links.
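The two outcomes McDonald describes, a deep link to the item record or a drop at the catalog home page, can be pictured with a small sketch. Everything in it is hypothetical: the URL template, its {isbn} placeholder, the example.org addresses, and the function name are invented for illustration and do not represent actual FirstSearch settings or any vendor's OPAC.

```python
# Hypothetical sketch of the deep-link decision described above.
# The URL template, its {isbn} placeholder, and the example.org
# addresses are invented for illustration; they are not FirstSearch settings.
from typing import Optional
from urllib.parse import quote


def catalog_link(home_url: str, deep_template: Optional[str], isbn: str) -> str:
    """Return a deep link when a template is configured, else the catalog home page."""
    if deep_template and "{isbn}" in deep_template:
        # Library configured a template and the OPAC accepts item-level URLs.
        return deep_template.replace("{isbn}", quote(isbn))
    # No usable template: the user lands on the catalog and must search again.
    return home_url


if __name__ == "__main__":
    home = "https://catalog.example.org/"
    template = "https://catalog.example.org/search?isbn={isbn}"
    print(catalog_link(home, template, "0000000000"))  # placeholder ISBN
    print(catalog_link(home, None, "0000000000"))
```

The fallback branch corresponds to the experience of re-entering a search on the catalog screen.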
Migell Acosta, Santa Monica's Principal Librarian
for Information Management and an OCLC Members Council
Delegate, set up Santa Monica's linkage. He believes
that click #3 would take users directly to the item
record in most online catalogs, if OCLC provided better
directions to libraries on setting up the Open WorldCat
Pilot linkage. Gale Group's InfoTrac offers linkage
to library catalogs and provides excellent examples
of how to set up that linkage. Acosta emphasized that
OCLC is working to make the technical end easy for
libraries, with pilots being created to work out problems.
OK, I'm convinced.
But as a consumer I still find it annoying to land
on a library catalog search screen and have to re-enter
a search. Even more frustrating is landing on the library
Web site, then struggling to locate the online catalog
link, before having to re-enter the search. If Open
WorldCat frustrates customers, it certainly won't help
attract them to libraries.
Unfortunately, the record retrieved for The Da
Vinci Code did not suggest the Los Angeles Public
Library, even though Santa Monica's 90401 ZIP code
is surrounded by neighboring Los Angeles Library
branches. According to the Pilot FAQs, "Initially
OCLC is using the postal code for the street address
associated with each OCLC institution symbol....
The postal code entered by the user does not have
to exactly match the library postal code; concentric
radiuses of geographic proximity are employed to
locate libraries near the postal code. These radiuses
are 20 kilometers (12 miles), 50 kilometers (31 miles),
100 kilometers (62 miles), "region" and "worldwide." If
at least 10 libraries are not found within the radius,
the search expands out to the next radius."7 The
Central Los Angeles Public Library, the institution
associated with the OCLC institution code, is about
15 miles from Santa Monica with a 90071 ZIP code;
that could explain the omission. Nevertheless LAPL
branches are much closer to Santa Monica than Beverly
Hills or El Segundo, and those branches will all
have copies of The Da Vinci Code.
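The expanding-radius rule quoted from the Pilot FAQ can be sketched in a few lines of Python. This is only one reading of the FAQ text under stated assumptions, not OCLC's implementation: the Library record, the in_region flag standing in for the FAQ's "region," and the precomputed distances are all hypothetical.

```python
# A sketch of the expanding-radius lookup quoted from the Pilot FAQ.
# Assumptions (not from OCLC): the Library record, the in_region flag,
# and precomputed distances are all hypothetical.
from dataclasses import dataclass

RADII_KM = [20, 50, 100]   # radii named in the FAQ
MIN_LIBRARIES = 10         # expand until at least this many are found


@dataclass
class Library:
    name: str
    distance_km: float      # distance from the user's postal code
    in_region: bool = True  # hypothetical stand-in for the FAQ's "region"


def nearby_libraries(holders: list[Library]) -> tuple[str, list[Library]]:
    """Return the first tier holding at least MIN_LIBRARIES, and its libraries."""
    ranked = sorted(holders, key=lambda lib: lib.distance_km)
    for radius in RADII_KM:
        found = [lib for lib in ranked if lib.distance_km <= radius]
        if len(found) >= MIN_LIBRARIES:
            return f"{radius} km", found
    regional = [lib for lib in ranked if lib.in_region]
    if len(regional) >= MIN_LIBRARIES:
        return "region", regional
    return "worldwide", ranked


if __name__ == "__main__":
    close = [Library(f"Library {i}", 2.0 * i) for i in range(1, 11)]  # 2 to 20 km
    central = Library("Main library 15 miles away", 15 * 1.609)       # about 24 km
    tier, found = nearby_libraries(close + [central])
    print(tier, [lib.name for lib in found])
```

Under these assumptions, a main library about 15 miles (roughly 24 kilometers) away is left out whenever ten holding libraries fall inside the first 20-kilometer radius, because the search never expands to the next tier. Since distance is measured to the address tied to the institution symbol, closer branches would not change the result.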
In another inconsistency, searching The Da Vinci
Code in the 90024 ZIP code for West Los Angeles
retrieves a Los Angeles Public Library record, but
searching the same title for the 90025 ZIP code for
West Los Angeles does not retrieve Los Angeles Public
Library entries. Interestingly enough, the West Los
Angeles Regional Branch of the Los Angeles Public
Library is actually located in the 90025 ZIP code.
Local users may know that the ZIP codes are adjacent
and recognize the location of the Regional Branch.
On the other hand, L.A. is a big county with
lots of ZIP codes. I doubt explanations of the esoteric
algorithm for linking ZIP codes and libraries will
do much to relieve user frustration. McDonald assured
me that OCLC has received feedback on this problem
and is working on the ZIP code recognition program.
I should mention that search results may also retrieve
libraries that are nearby but are not open to everyone.
If the searcher can't take out the item located at,
for example, a local university, how satisfied will
the user be? And how likely to try Open WorldCat
a second time?
Search results can be inconsistent as well. The following
search syntaxes usually worked: "find in a library," "worldcat
libraries," "worldcatlibraries," and "wcpa." (Most
consumers would not use "wcpa" unless directed to do
so, but a few might notice that it forms part of the
http://www.worldcatlibraries.org/wcpa/ URL in all WorldCat
records.) In an informal and unscientific test, I used
all four approaches for a variety of materials in both
Google and Yahoo!. I tried to search as a library user
rather than as a librarian, so I sometimes avoided
adding terms I thought would produce better results.
Rather than use the prescribed search "find in a library:
[title]," I shortened it to "find in library [title]." Most
users have been repeatedly chastised by search engines
for using common words like "a"; not many users will
think to add a colon after "find in a library." I searched
a variety of document formats, although I expected
results would be less satisfactory because of the variety
of ways libraries catalog periodicals and non-print
materials. Still, users search for both print and non-print
materials, so the project should encompass all formats.
The generally successful searches for DVDs came as
a pleasant surprise.
My Unscientific Tests
Here are the ground rules I followed when searching:
I used the same 14 searches in Google and Yahoo!.
I used the same Santa Monica Public Library ZIP code (90401).
I checked that each title searched has a record in WorldCat and that the record lists more than 100 libraries holding the title.
I checked that Santa Monica Public Library has a record for the item in WorldCat and that the record is one with more than 100 libraries listed.
I used titles most likely to be purchased by many libraries (with two exceptions).
I refreshed the screen between each search in each search engine.
I looked at only the first page of the results.
I did search for two books that were not in the Santa
Monica Library collection but recently had been requested
by Santa Monica clients. Clients searching for more
esoteric titles will find WorldCat's extensive listings
extremely useful. These clients tend to be a bit more
sophisticated about library collections. They usually
know if their local library might not have such materials
and will turn to a Web search.
For a complete breakdown of the 14 separate searches
I did and to see how Google and Yahoo! results compared,
go to this URL on the Information Today, Inc. Web site:
https://www.infotoday.com/searcher/nov04.oneill.shtml.
A caveat provided by the WorldCat Pilot page applies: "Please
note that Web search-engine content is dynamic, so
your results may vary."8
The search results were mostly satisfactory. The
most glaring problem is the fact that the record(s)
retrieved are not always those that show the holdings
of the local library. Obviously, libraries enter records
for various iterations of an item, and those items
become different records. Searching Open WorldCat retrieves
only a few records for an item, not all of them. OCLC
recognizes this as a problem.
McDonald also introduced me to the search syntax
mentioned above that retrieves WorldCat records for
several different iterations of a title. In Google,
search "site:worldcatlibraries.org [title]." Average Google
users are unlikely to use this search syntax unless
given specific directions, and it may retrieve more
records than the user wishes to check. Searching The
Da Vinci Code sample in Google as site:worldcatlibraries.org
da vinci code produces two pages
limited to WorldCat records that include three records
for the book (one a Spanish translation); five records
for the audio book; and five records for books about
the book. Santa Monica Public Library is listed on
the record for the Spanish language version and on
two of the records for the book as a subject. A title
search for David Halberstam's War in a Time of Peace as
site:worldcatlibraries.org war in a time of peace retrieves
about 185 WorldCat records that are variations on the
title. Adding the author's name produces a perfect
search. Searching site:worldcatlibraries.org war
in a time of peace and halberstam retrieves only
two records, both WorldCat records for the book. One
of the records lists the Santa Monica Public Library
as having the item.
According to McDonald, OCLC received such positive
feedback on the pilot, originally scheduled to end
in June, that it will extend it into the fall. However,
she added that OCLC is already working on details for
transitioning Open WorldCat from a pilot into a permanent
membership benefit.
Chip Nilges, OCLC Director of Content Services, elaborated
on Open WorldCat's future. OCLC's time frame for going
into production is October/November. The pricing model
will be part of a library's subscription to WorldCat
on FirstSearch. Nilges explained that they consider
it another way to access WorldCat: OCLC supports access
via Z39.50, FirstSearch, and now a variety of open
Web partners. OCLC intends to make members' collections
visible and available to information seekers, from
library portals and on the open Web. Does the pricing
model mean that clients of libraries that subscribe
to FirstSearch, but do not make it available to the
public, will have access to Open WorldCat through the
search engine partners? "If your library has subscription
access to WorldCat on FirstSearch, its holdings will
display in the search engine partners. We're treating
this as a feature of FS WorldCat and will fund it through
standard price increases for that service, just as
we do other enhancements. Of course, all of this is
new to us, as well, so we'll need to keep an eye on
traffic and other expenses over time," says Nilges.
Hats Off
Overall, the Google searches were more successful
than Yahoo! searches. The "site:worldcatlibraries.org" search retrieved
only one WorldCat record in Yahoo!, and the "wcpa" search
doesn't work. But Yahoo! does retrieve WorldCat records
when the appropriate search syntax is omitted. (See
the "Life Without Open WorldCat" sidebar on page 55.)
In the long term, OCLC wants to increase the amount
of content available; increase the number of partners
by including other search engines, booksellers, and
sites dedicated to books; enable interlibrary loan
requests through remote user authentication; and develop
new user statistics and configuration tools for libraries.
The Open WorldCat project might advance faster if the
search engines would put more effort into dealing with
non-Web content. Both Google9 and Yahoo!10 seem
interested in opening up new content avenues, and both
have the research and development staff to deal with
mechanisms for better searching of original non-Web
content. But neither seems willing to take one obvious
step and open up a home page tab for library material.
Such a tab might reach beyond OCLC to open access movement
sources, government archive collections, bibliographic
indexing and abstracting services, and so on.
Grumble as we may, OCLC's Open WorldCat Pilot has
the potential to achieve its goals and more. It may
not yet have earned a standing ovation for its performance,
but let's give a rousing cheer for the initiative, a
special "hats off" to Google and Yahoo! as our new
library partners, and encourage OCLC to move
from pilot to permanent.
Life Without Open WorldCat, Or "Real People" Searching
In the December 2, 2003, ResourceShelf,
Gary Price posed this question: "Where will
a typical Open WorldCat record appear on a
results page based on an average user query
(2.4 words)?"
Good question. I experimented in both Google
and Yahoo! by searching some of the items I used
to test the Open WorldCat pilot, but without
using the suggested search syntax. (Caveat: These
results are based on one search per item. Since
search engine rankings change continually, results
will probably vary from search to search.) A
search for da vinci code located no WorldCat
records in the first 20 pages of Google results.
But wait! The same search in Yahoo! showed the
WorldCat record as the fifth item on the first
page of results. Searching atkins for life (a
title suggested by the WorldCat pilot) produced
no WorldCat record in 20 pages of Yahoo! results,
and the Google search fared no better.
OK, how about a DVD search? Searching wild
strawberries dvd in Google located no WorldCat
records in 20 pages of results, but Yahoo! provided
the WorldCat record as number 68 on the fourth
page of results. A patient searcher might get
that far ... maybe. The DVD for Rear Window searched
as rear window dvd turned up as a WorldCat record
in Yahoo! on page nine as record number 173.
No luck in 20 pages of Google results.
Two journals searched as new england journal
of medicine and architectural digest produced
the same mixed results. Yahoo! returned the WorldCat
record for New England Journal of Medicine as
the 31st item on page two of the results, but
Google returned no WorldCat record in 20 pages
of results. The Architectural Digest search found
no WorldCat record in 20 pages of Google results,
but Yahoo! returned the WorldCat record as number
68 on page four of the results.
It's not impossible to locate a WorldCat record
in Yahoo! with "an average user query (2.4 words)," but
most searchers will simply not persist past the
first three pages of results, if that. What we
want are search results that pop to the top on
every search even when the user doesn't use a
special syntax.
Now when were Google and Yahoo! Search going
to put up that "LIBRARY" tab on their home pages
again?
Footnotes
1 Open WorldCat Pilot: Using WorldCat to increase the visibility of libraries on the Web [http://www.oclc.org/worldcat/pilot/].
2 Ibid.
3 Price, Gary, "Web Search Yahoo!," NewsBreaks: Two Million Open Worldcat Records Hit the Yahoo! Database, ResourceShelf, Wednesday, July 7, 2004 [http://www.resourceshelf.com/2004_07_01_resourceshelf_archive.html].
4 Quick facts about the Open WorldCat pilot [http://www.oclc.org/worldcat/pilot/facts/default.htm].
5 How the Open WorldCat pilot works [http://www.oclc.org/worldcat/pilot/how/default.htm].
6 Quint, Barbara, "Yahoo! Search Joins OCLC Open WorldCat Project," InfoToday Newsbreaks, July 6, 2004 [https://www.infotoday.com/newsbreaks/nb040706-2.shtml].
7 Open WorldCat Pilot: Frequently Asked Questions [http://www.oclc.org/worldcat/pilot/faq/default.htm#link11].
8 Open WorldCat Pilot: How It Works [http://www.oclc.org/worldcat/pilot/how/default.htm].
9 Zeitchik, Steven, "Google looks to add book content," Publishers Weekly, November 2003, p. 3. InfoTrac OneFile Gale Group Databases. Santa Monica Public Lib., CA 11 August 2004 [http://www.infotrac.galegroup.com].
10 Webb, Cynthia, "Yahoo! Search will roll out Content Acquisition Program," The America's Intelligence Wire, 3 March 2004. InfoTrac OneFile Gale Group Databases. Santa Monica Public Lib., CA 11 August 2004 [http://www.infotrac.galegroup.com].