FEATURE
The Truth About Federated Searching
Source: WebFeat (http://www.webfeat.org)
Federated searching is a hot topic that seems to be gaining
traction in libraries everywhere. As with many technologies
that are rapidly adopted, there are some misconceptions
about what it can do. WebFeat, a provider of federated
search technology to more than 900 public, academic, and
corporate libraries, including more than half of the top
10 U.S. public libraries, has compiled this list of the
five most commonly repeated misconceptions about federated
searching.
1. Federated search engines leave no stone unturned.
Reality: Not all federated search engines
can search all databases, although most can search
Z39.50 and free databases. But many vendors that claim
to offer federated search engines cannot currently
search all licensed databases for both walk-up and
remote users. Why? Authentication. It's very difficult
to manage authentication for subscription databases,
particularly for remote users. Before buying, ask vendors
to demonstrate that they can search all of your library's
databases using your library's own authentication,
both locally and remotely.
2. De-dupe really works.
Reality: For federated search engines, true
de-duplication is virtually impossible. In order to
de-dupe, the engine would have to download all search
results and compare them. The limiting factor is not
federated search engine technology, but the way databases
return results: 10 or 20 records at a time. A single
search might produce 100,000 hits. If it takes 5 seconds
to download 20 hits, downloading all of them from just
one database would take nearly 7 hours. And the same citation
may appear in different places in results sets from
different databases. So to completely de-dupe search
results, it's necessary to download all results from
all databases. Vendors that claim to do true de-duping
usually are just de-duping the first results set returned
by the search.
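
The arithmetic behind this point can be sketched in a few lines of Python. The figures (100,000 hits, 20 records per page, 5 seconds per page) come from the example above. The function names, the one-page-after-another download model, and the title-plus-year matching rule are assumptions made for illustration, not a description of any vendor's product, and the sample titles are invented.

```python
import math


def full_download_hours(total_hits, page_size, seconds_per_page):
    """Hours needed to fetch every results page, one page after another."""
    pages = math.ceil(total_hits / page_size)
    return pages * seconds_per_page / 3600


def dedupe_first_pages(pages_by_database):
    """De-dupe only the first results page that each database returned.

    Citations are matched on a normalized (title, year) key. That rule is
    an assumption for this sketch; real matching rules vary.
    """
    seen = set()
    unique = []
    for database, citations in pages_by_database.items():
        for citation in citations:
            title = " ".join(citation["title"].lower().split())
            key = (title, citation["year"])
            if key not in seen:
                seen.add(key)
                unique.append({**citation, "database": database})
    return unique


# Invented sample data: one first page from each of two databases.
first_pages = {
    "Database A": [
        {"title": "Library Portals Today", "year": 2004},
        {"title": "Remote Authentication", "year": 2003},
    ],
    "Database B": [
        {"title": "library  portals today", "year": 2004},
        {"title": "Search Translators", "year": 2005},
    ],
}

print(f"{full_download_hours(100_000, 20, 5):.1f} hours per database")
print(len(dedupe_first_pages(first_pages)), "unique citations on the first pages")
```

The second function removes the duplicate that shows up on both first pages, but a copy of the same citation sitting on page 2 or page 200 of either database would never be seen, which is the limitation described above.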
3. Relevancy rankings are totally relevant.
Reality: It's impossible to perform a relevancy
ranking that's totally relevant. A relevancy ranking
basically counts the occurrence of words being searched
in a citation. Based on this frequency of occurrence,
items will be moved closer to the top or farther down
the results list. Here's the problem: When attempting
to relevancy-rank citations, the only words you have
to work with are those that appear in the citation.
Often, the search word doesn't even appear. The abstract
and full-text data, as well as the indexing that content
providers use to relevancy-rank their content, are
unavailable to federated search engines. The content
providers have the full article and indexing to work
with; federated search engines have only the citation.
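
A minimal sketch of the word-counting approach described above, written in Python. It assumes the ranking is nothing more than a count of query words in the citation text; the function names and the sample titles are invented for illustration, and real engines may weight fields differently.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def citation_score(query, citation):
    """Count how often the query words occur in the citation text."""
    words = tokenize(citation)
    return sum(words.count(term) for term in tokenize(query))


def rank(query, citations):
    """Order citations by descending score; ties keep their original order."""
    return sorted(citations, key=lambda c: citation_score(query, c), reverse=True)


# Invented citation titles, used only to show the behavior.
citations = [
    "Cross-database retrieval in academic libraries",
    "Federated search: a federated approach to library portals",
    "Metasearch tools for public libraries",
]

for citation in rank("federated search", citations):
    print(citation_score("federated search", citation), citation)
```

The first and third titles score zero because neither search word appears in them, even though the articles behind them could be entirely on topic. With only the citation to count words in, the ranking cannot tell.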
4. Federated searching is software.
Reality: It certainly is software, but it's
best consumed as a service. A federated search engine
searches databases that update and change an average
of 2 to 3 times per year. This means that a system
accessing 100 databases is subject to between 200 and
300 updates per year, almost one per day! Subscribing
to a federated searching service instead of installing
software eliminates the need for libraries to update
translators almost daily so they can avoid disruptions
in service. (Translators convert search queries into
something that can be understood by the database that's
being searched.) Without frequent updates to these
translators, entire databases can become periodically
unavailable for searching. It's unacceptable for a
database subscription that could cost a library $10,000
or more per year to be offline for any amount of time.
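
The update estimate is simple multiplication, shown here as a short Python sketch. The 100 databases and the 2 to 3 changes per year are the figures quoted above; the function name is made up for the example.

```python
def translator_updates_per_year(databases, changes_per_database):
    """Total translator updates needed in a year across all databases."""
    return databases * changes_per_database


for changes in (2, 3):
    total = translator_updates_per_year(100, changes)
    print(f"{changes} changes per database: {total} updates a year, "
          f"{total / 365:.2f} per day")
```

At 100 databases this works out to between 0.55 and 0.82 updates per day, and the load grows in step with the number of databases a library licenses.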
5. We don't make your search engine. We make your search engine better.
Reality: You can't get better results with
a federated search engine than you can with the native
database search. The same content is being searched,
and a federated engine does not enhance the native
database's search interface. All federated search does
is translate a search into something the native database's
engine can understand. But it's restricted to the capabilities
of the native database's search function. A federated
search can't do a three-term search with Boolean operators
in a native database whose interface doesn't support
it. Federated searching cannot improve on the native
databases' search capabilities. It can only use them.
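
The idea of a translator that is bound by the native search can be sketched as follows. Everything here is hypothetical: the NativeSearch description, its fields, and the two sample databases are invented for illustration and do not reflect any real product or database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NativeSearch:
    """Hypothetical description of what one database's own search accepts."""
    name: str
    max_terms: int
    and_operator: str = ""  # empty string means no Boolean AND support


def translate(terms, target):
    """Turn a list of search terms into the target database's query syntax.

    Raises ValueError when the native search cannot express the request.
    """
    if len(terms) > target.max_terms:
        raise ValueError(f"{target.name} accepts at most {target.max_terms} term(s)")
    if len(terms) > 1 and not target.and_operator:
        raise ValueError(f"{target.name} has no Boolean AND operator")
    return f" {target.and_operator} ".join(terms)


full = NativeSearch("Database A", max_terms=10, and_operator="AND")
limited = NativeSearch("Database B", max_terms=2, and_operator="AND")
terms = ["libraries", "authentication", "remote"]

print(translate(terms, full))
try:
    translate(terms, limited)
except ValueError as error:
    print("Cannot translate:", error)
```

The translator can rephrase a three-term search for a database that accepts one, but for a database that stops at two terms it can only report that the request cannot be expressed.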
Paula J. Hane is Information Today, Inc.'s news bureau chief
and editor of NewsBreaks. Her e-mail address is phane@infotoday.com.