FEATURE
Tribunes and Tribulation: The Top 100 Newspaper Archives (or Lack Thereof)
By Larry Krumenaker
Publisher,
Hermograph Press
The newspaper industry has finally
figured out a way to join the digital world...and not lose print subscribers.
How? Make it more cost effective to get the paper delivered than to use an
archive online!
Let me prove my thesis.
I publish the book Net.Journal Directory, the Catalog of Periodicals Archived
on the World Wide Web [http://www.hermograph.com/njd] and
its online version, Net.Journal Finder [http://www.hermograph.com/njf]. Net.Journal was
created when a hobby of collecting inexpensive archive locations mushroomed
into this book in the mid-1990s. I had become used to having LexisNexis or
Dialog at my beck and call when I worked in broadcasting or corporate libraries.
Information withdrawal is a painful experience, brought on when personal
subscriptions to either service stretch beyond the limits of one's wallet.
(Notice that I said wallet, not credit card; credit card pay-per-view
services and even simple Web access were not even on the infohighway radar
screen then.) As a budding writer, I wanted to do research in periodical
backfiles inexpensively. I began to record archive Web addresses (translation:
gophers, telnets, BBS, not World Wide Web yet).
And if you want to search newspapers, which ones are you most likely to rummage
around in? Other than for local stories, you will probably want to search the
largest papers, the so-called "papers of record." The Top 100 papers, in terms
of circulation, is a list published annually by Editor and Publisher.
The most recent one available for this article is a bit dated, 2001. Still,
the list doesn't change much from year to year, especially on the low end,
low defined as nearer number 100 than number 1.
A couple of Top 100 factoids. First, for the purposes of this article, we
will only look at 99. The E&P list also includes Investor's Business
Daily. Not likely to see Garfield or horoscopes, local news, or much wire
service copy there. It's not the regular guy kind of paper, so I dropped it
from this investigation, though I'll still refer to the Top 100. All the papers
listed are published 7 days a week, except two, which are 6-day papers. All
are in the English language except for one in Spanish.
This winter rumors started flying among the e-mail mailing lists used by
information professionals about archives being pulled off the big services,
specifically Dialog and LexisNexis. On a mission of investigation from Searcher's
editor, I got to work.
Dear Reader, remember that this is all subject to change between my work
on the article and your reading it. My research used an excerpt from Net.Journal's database
of nearly 30,000 titles. I also rechecked all the papers' own Web archives.
The trend towards fee-based archives continues...but there were some surprises
and some real bargains.
The Big Players
It's interesting to look back at Net.Journal #1 and see that virtually nobody
listed in that 1997 book is still online! Despite some tumbles, both Dialog
and LexisNexis (a.k.a. DialogWeb and LexisNexis, respectively) remain online
and still have newspapers; indeed, the Nexis side of LexisNexis began as a
newspaper archive. NewsNet, IQ, Knowledge Index, BRS (and what WAS the name
of their consumer service again?), all gone. Newspaper archives online? Hardly.
You can still find newspapers in Dialog, LexisNexis, and now also Gale Group's
InfoTrac Web, ProQuest, and Factiva (see
Table 1 on pp. 30-31). Out of the top 100, 90 appear on LexisNexis,
but Factiva has 90 as well. Factiva is the successor to Dow Jones News/Retrieval,
another major newspaper archive from Net.Journal Directory 1, and Dow
Jones Interactive. Many of Factiva's papers come from its partnership with
ProQuest (once UMI, the great newspaper microfilmer). On ProQuest's own service,
ProQuest Direct, you will find 55 of the Top 100. Dialog has 52 in its own
full-text files and 19 more in File 781, the ProQuest file on Dialog. New to
the newspaper archive business comes InfoTrac Web, the Gale Group library service,
with 47 titles.
Note that we are only considering full-text archives. We've deliberately
left out selected full-text files (SFT), like Business Dateline. That eliminated
other periodical archive files on LexisNexis and Dialog and files with papers
found on OCLC's FirstSearch service. Though we have striven for completeness,
NOTHING is ever completely full text any more. You aren't going to get all
the articles anyway, but at least you probably will get all kinds of news areas
in so-called full-text archives (politics, business, science, general
news, etc.), whereas in the SFT files, you'll get just business or some
other highly filtered selection.
A casual examination of Table 1 doesn't indicate any particular advantage
in terms of coverage dates for any service compared to the newspapers' own
Web archives. Sometimes a paper's Web site has a deeper, longer archive, sometimes
not. It does show a lot of papers don't have an archive at all!
The Middle Players
There is one Web-only newspaper archive service and several pretenders. One
of my favorite Web sites ever is NewsLibrary. Originally a Knight-Ridder service
that collected all K-R's newspapers, it made a one-price-fits-all archive service
and became one of the earliest, reasonably priced, pay-per-view periodical
sites in history. At (usually) $2.95 a pop, it's no better or worse price-wise
than the $3 cash-and-carry on LexisNexis, identical to most full-text Dialog
files, and Factiva's prices as well. (Marketing to the library niche, ProQuest
and InfoTrac offer all-you-can-eat services at a too-high-for-mere-humans cost,
with no way to compare them on this scale.) There are always around
100 titles, though the titles have changed from time to time and aren't always
Knight-Ridder pubs either, and you can search them individually or collectively
for free, check the abstracts and citations, and then pay for what you want.
NewsLibrary is now the property of NewsBank but otherwise remains unchanged,
and I hope it will stay that way. NewsLibrary has 61 of the top 100 papers
and, based on ease of use and cost, constitutes a good competitor to the Big
Ones. Many newspapers use NewsLibrary as their archive operator and don't actually
have their own long-term archives on their own Web sites. (This explains the
gaps in the last column of Table 1.) But look first to the NewsLibrary column
before you give up on a Web-found archive.
There are at least three pseudo-services out there. One is RealCities, a
Knight-Ridder property, which would be another NewsLibrary (it charges the
same rates) but you can't search the papers en masse. It's really more a Web
hosting service for papers. URLs in the sidebar with an "/mld/" in them
are RealCities papers. Another such service is the ProQuest Archiver. Again,
about three dozen newspapers use this host service, and you can't search these
papers en masse either, nor are they all the same price. The archive links
always go to a URL with "pqasb" in the address. Finally, there's one that's
just frankly poorly done, by the Web design firm called Alliance Alert. Most
of the "state" dot-com services in my listings go to them. They are poorly
designed. For example, the archive information page on nj.com for the Star-Ledger isn't
linked to any other page; the newspaper librarian had to tell me where it was.
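The URL clues above lend themselves to a quick check. Here is a minimal sketch in Python, assuming only the two substrings named in the text ("/mld/" for RealCities, "pqasb" for the ProQuest Archiver). The function name and the sample URLs are hypothetical, made up for illustration, and a real archive link could of course change its pattern at any time.

```python
def guess_archive_host(url: str) -> str:
    """Guess which hosting service sits behind a newspaper archive URL.

    Relies only on the two substrings mentioned in the article; anything
    else is reported as "unknown" rather than guessed at.
    """
    lowered = url.lower()
    if "/mld/" in lowered:
        return "RealCities"
    if "pqasb" in lowered:
        return "ProQuest Archiver"
    return "unknown"


# Hypothetical URLs, for illustration only.
for sample in (
    "http://www.example.com/mld/somepaper/archives/",
    "http://pqasb.example.com/somepaper/search.html",
    "http://www.example.com/archives/",
):
    print(sample, "->", guess_archive_host(sample))
```

Neither hosting service lets you search its papers en masse, so a match here mainly tells you to expect one paper at a time.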
One other service that has more than half of the Top 100 newspapers...and
many more isn't listed here. The Financial Times of London operates
a service called the World Press Monitor, at roughly $14 a month. It contains
500 papers and magazines from all around the world. But most of the papers
from the U.S. are available only in selected full text, primarily because these
come from the Knight-Ridder Business news wire. Still, if I had to choose two
low-cost periodical services for my personal credit card, the World Press Monitor
and NewsLibrary would be my choices for the general consumer.
The Individual Players
Table 2 on pp. 33-34 lists
the newspapers themselves, grouping them into four colored bands by circulation
size. Chances are, if you are a newspaper searcher on LexisNexis or Dialog,
you already use one or more of those in the top band, USA Today, The Wall
Street Journal, or The New York Times, papers with more than 1 million
papers sold every day. When you do a comprehensive national search, you probably
search the next band, the 500,000 to a million circulators, and maybe some
of the third band, the 250,000 and up group.
In a physical newspaper library, there's the current news and the "morgue," where
papers go for storage and future research. Online, there are similar depositories,
actually three of them.
The first one is simple: Are today's articles viewable? A simple click on
a headline answers that question. Most papers allow you to see the current
issue's stories and at no cost. Most of the "no, you can't" papers are the
big ones in the top 12, in fact half of them. There are only five others in
the remaining 87. In some cases, you must sign up and register before you can
read articles, but again usually for no money at all. If all you need is current
news (though often it is 24 hours old, since not all have "breaking news" sections),
just about any U.S. newspaper on the Web will do. (To find practically any
newspaper or broadcast news source in the U.S. and outside, nothing
beats the inimitable http://newslink.org
as a starting point.)
Where does news go between today's life and the future morgue? Paper purgatory!
That backlogged stack on the morgue librarian's desk has an online equivalent.
Here, you'll find yesterday's news, and often more for a short period, usually
7 days. Some Web sites go for as much as 2 or 3 months (and sometimes that's
the only archive on the Web site!); others apparently consider yesterday's
news not worth knowing. Most often these transition zones between hot stuff
and cold clippings are cost-free. Generally, if you register for today's news,
you register for last week's, too.
Finally, then, there is the long-term archive and the central part of this
search. As noted above, more than half use NewsLibrary...but there are different
flavors of that as well. Some archive articles are paid by the piece; other
archives charge you for access per unit of time. Whenever there is a "T" in
a Table 2 column, this means you spend something like $5.95 for 24 hours of
access. Sometimes you can download as much as you want; other sites have an upper
limit, say, 10 articles. Naturally, like print subscriptions, you pay less
per unit if you buy in larger quantities. I've listed the Large Economy Size
rate in the next column. Have a large limit on your MasterCard? You'd better!
Rates go into the hundreds of dollars, up to about $2,000. If you're a business,
school, or library, you sometimes have to set up site licenses, no credit card
allowed. (Whenever you see NL, this means the NewsLibrary range of charges: $2.95 per
article at the low end, maxing out at 1,000 articles for $1,99.)
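To see when a timed pass beats paying by the piece, a little arithmetic helps. The sketch below is an illustration, not any site's actual price list: it assumes the ballpark figures from the text ($2.95 per article, $5.95 for 24 hours, a cap of 10 articles per pass), and it assumes you can fetch everything you need within the paid windows. The function names are hypothetical.

```python
import math


def per_piece_cost(articles: int, price_each: float = 2.95) -> float:
    """Total cost when every article is bought individually."""
    return round(articles * price_each, 2)


def timed_pass_cost(articles: int, pass_price: float = 5.95,
                    cap_per_pass: int = 10) -> float:
    """Total cost using 24-hour passes, each limited to cap_per_pass articles."""
    if articles <= 0:
        return 0.0
    passes = math.ceil(articles / cap_per_pass)
    return round(passes * pass_price, 2)


# Compare the two pricing flavors for a few research sessions.
for n in (1, 2, 3, 10, 12):
    piece, timed = per_piece_cost(n), timed_pass_cost(n)
    cheaper = "per piece" if piece < timed else "timed pass"
    print(f"{n:>2} articles: per piece ${piece:.2f}, timed ${timed:.2f} -> {cheaper}")
```

With these assumed prices, the timed pass pulls ahead as soon as you need a third article in one sitting.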
What gives each Web site, well, character is the various ways it goes about
setting up the archive (or setting YOU up). You'll find extra pricing and archive
information and some of the oddities listed in Comments. For example, some
sites have two long-term archives, sometimes both free, sometimes with one
charging a fee. Sometimes the site tells you there's a limit to the purgatory
archive, but then you do a keyword search and get articles from several years ago free,
articles the long-term archive would charge you for. A nice trick to know. Some Web
archives are browser unfriendly, others are so slow it would be quicker to
run to the store and get a copy of the paper edition.
Another tip for those on very tight budgets: for a hot story (and definitions
of hot change from paper to paper depending upon local connections), you
may find that a newspaper has linked to earlier stories as background material.
In a sense, the reporters have done the archive checking for you. Often stories
tagged and linked to a current story carry no charges, though the same stories
would cost if you retrieved them on your own from the newspaper's archive.
Some newspapers go heavily into paper recycling and some have found an online
equivalent. For example, after 7 days, the Chicago Sun-Times Web archive
is history, literally. Others of this ilk might go as long as 60 days before
recycling the electrons. For these, there truly is no choice; if you want a
story from an earlier edition, you may have to look deep into a stack of yellowing
paper out in the garage or head for the university library's microfilm reader.
Paper or Digital?
"They" say that everything is free on the Web. Not quite. Some newspapers
are definitely far from free. The Wall Street Journal charges you for the print
edition, then an electronic subsidy on top of that, plus possibly a
single article cost. But there are some bargains, even freebies, out there
amongst the newspaper archives.
What defines a bargain? With most archives roughly comparable in price, certainly
at the pay-per-view range, a bargain depends on the size of the archive and
how online costs compare to offline costs. Print edition subscriptions also
vary greatly in amount. Annual costs can range from a few Jacksons to several
Franklins. You can investigate possible effects on your budget by comparing
the cost of however many articles you need in a year (or month) with the annual
subscription.
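That comparison boils down to a break-even count: how many archive articles a year cost as much as the print subscription. A minimal sketch, with a hypothetical function name and a made-up $150 subscription set against the $2.95 pay-per-view rate mentioned earlier:

```python
def break_even_articles(annual_subscription: float,
                        price_per_article: float) -> float:
    """Articles per year at which pay-per-view spending equals the print subscription."""
    if price_per_article <= 0:
        # A free archive never catches up with the subscription price.
        return float("inf")
    return annual_subscription / price_per_article


# Hypothetical subscription price, for illustration only.
count = break_even_articles(150.00, 2.95)
print(f"Break-even at about {count:.1f} articles a year")
```

If your research needs fall well below that count, the archive is the cheaper route; well above it, the delivered paper starts to look like the bargain.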
But clearly, free is often better than fee, and free is better for 10 years
of archive than for one. So I've created Krumenaker's Newspaper Bargain Index.
I've divided the print edition annual cost by the non-discounted single article
cost (or the cost for 24 hours when access is by time) and multiplied by the
range of the archive. As you can guess, free archives are the best: a cost
divided by zero is infinity, therefore free archives have an infinite value!
Some are more infinite than others, so the NBI can get very metaphysical: infinity
+ 10 is better than infinity + 2. Sites with no long-term archive have zero
value, mathematically or otherwise. In between, the higher the NBI, the more
value in the archive.
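For the spreadsheet-averse, the index as described can be sketched in a few lines of Python. This is one reading of the description above, not the calculation behind the tables: it assumes the archive range is measured in years, treats a free archive as infinity, and breaks ties among the infinities by archive range. The class, the function names, and every paper and price below are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Archive:
    paper: str
    annual_print_cost: float  # print subscription, dollars per year
    unit_cost: float          # single-article or 24-hour price; 0 means free
    archive_years: float      # span of the long-term archive; 0 means none


def nbi(a: Archive) -> float:
    """Bargain index: (print cost / archive unit cost) * archive range."""
    if a.archive_years <= 0:
        return 0.0            # no long-term archive, no value
    if a.unit_cost == 0:
        return math.inf       # free archive: division by zero, read as infinity
    return a.annual_print_cost / a.unit_cost * a.archive_years


def rank_key(a: Archive) -> tuple:
    # Ties at infinity are broken by archive range, so a free 10-year
    # archive outranks a free 2-year one.
    return (nbi(a), a.archive_years)


# Hypothetical papers and prices, for illustration only.
papers = [
    Archive("Paper A", 120.0, 3.00, 5),
    Archive("Paper B", 120.0, 0.0, 2),
    Archive("Paper C", 120.0, 0.0, 10),
    Archive("Paper D", 120.0, 3.00, 0),
]
ranked = sorted(papers, key=rank_key, reverse=True)
for p in ranked:
    print(p.paper, nbi(p))
```

Sorting on the pair (index, range) is just one way to express "infinity + 10 beats infinity + 2" without asking floating-point arithmetic to tell two infinities apart.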
Of the freebies, the best bargain is the St. Petersburg Times. It's
totally free and you go back 16 years! Why they do this, I don't know, but St.
Pete's is my Mecca when I search newspapers now. In second place comes
the Greensburg (PA) Tribune Review (near Pittsburgh), but it would help
a lot more if it didn't have such a mix-up of two archives. Other valuable
sites, with up to 7 years of free archiving, include the San Francisco Chronicle (#11
in the top 100), the Milwaukee Journal Sentinel, Seattle Times, Jacksonville
Times-Union (what is it with Florida papers?), and the Las Vegas
Review-Journal.
Among the fee-based archives, the top of the heap mathematically is low-ranked
(#86) Salt Lake Tribune, with a long and inexpensive inventory. But
don't forget the three archives that go back to the late 1970s: #5 Washington
Post, #8 Newsday, and #14 Boston Globe. The first two have 2-week-long
free archives.
Who's the worst bargain among the fee-based archives? These would be the Tacoma
News Tribune, Toledo Blade, Palm Beach Post, San Diego
Union-Tribune, and the New York Post. Why? Because the price of
the archive comes close to matching the price of the subscription...and some
of these have very small archives. They may be good papers, they just don't
have cost-effective archives.
Thirty
My recommendation: if you are just looking for U.S. newspapers, and nothing else,
or have a limited budget, go for the FT.COM and NewsLibrary services, keep St.
Petersburg in your bookmark list, get an annual subscription to your local
big paper, and buy a long-term discount archive rate at the Washington Post.
If you have to search other kinds of files as well, LexisNexis has most of
the Top 100 papers, as do Dialog and Factiva, and all three have the means to search
them simultaneously with other kinds of data. There aren't many other choices.
If your institutional budget has the bucks, one of the other periodical warehouse
services, e.g., ProQuest or InfoTrac, will give you a wider choice of periodicals
than just going Web, but little of the non-periodical data universe. If you
need today's news (or the recent week's), the Web's the best bargain around,
for sure, and you can check the news anywhere, from anywhere.