Information Today
Volume 18, Issue 4 — April 2001
IT Interview
Moreover Tackles Indexing News on the Web
CEO Nick Denton talks about extraction technology and the ‘new information food chain’
by Paula J. Hane

Moreover recently won Search Engine Watch’s Best Specialty Search award for its ability to track thousands of rapidly changing online sources. Within just 2 years of its founding, Moreover has established itself as a leading provider of dynamic Web news headlines. Nick Denton, the company’s CEO and co-founder, divides his time between Moreover’s main offices in San Francisco and London, but took time out to talk to me about the organization’s technology, business model, and products as well as industry trends. 

Q You founded Moreover in 1999 after leaving a job as the Silicon Valley reporter for the Financial Times (FT). What gave you the idea for this new venture? 

A My initial idea came out of frustration as a journalist who, like any other information professional, was overwhelmed by news, commentary, and analysis. This was particularly true for anyone covering technology and the Internet. By 1999, I already had a huge bookmark file. It was just unmanageable. My initial idea for Moreover was one of those "wouldn’t-it-be-cool" ideas. Wouldn’t it be cool if there were just one place I could go to filter through news, commentary, and discussion-group material—and save all the time, confusion, and hassle involved in navigating the Web? I got together with two of my high school mates—a database developer and an architect—to work on the problem. We ultimately decided, rather than create the mother of all news sites, to take that Web-extraction functionality and use it to power other news sites, search engines, and business intranets and extranets.

Q And it was important to your concept that you were aggregating headlines and linking to content, not syndicating the content, so you didn’t need to license it.

A We never really looked at the issue of content licensing. We saw this as an indexing problem. There are great companies, like Inktomi and Google, that index the entire Web. The frustration, for a business user in particular, is looking for time-sensitive, high-value information. There was no way with the existing search or directory technologies to look for this information. The average search engine visits the average site about once every 15 days to index it. Moreover goes to high-frequency sites once every 15 minutes. That difference allows us to take information hot off the Web and deliver it to customers’ intranets. We have come up against companies like Factiva and Lexis-Nexis, which license full-text content. Our biggest point of contrast is that we index while they license. It has some disadvantages, but many advantages as well.

Q You are only indexing current information—there’s no archive. What about ongoing coverage and access to archival material?

A We are a current-awareness tool, so our zone of operation is from 15 minutes ago to a couple of months [ago]. Increasingly, as the Web gets more robust and as publishers’ sites become more systematic in the way they treat links, we do go further back in time. The Web becomes more feasible as a near-archive kind of medium. For instance, News.com maintains constant links going back several years, so any link we extract from News.com will stay live for months and years. It’s unlikely we’ll be a 10-year archive, but we can move in that direction.

Q When you say the links are live, do you mean that on the Moreover.com news portal, a user could still find links to those stories? But that doesn’t apply to your other models of feeding headlines to other sites?

A Not necessarily. For instance, Moreover supplies its Web-intelligence database to Inktomi. Inktomi runs searches across our database that go back in time. We’re doing more technology deals like that. We made our name with news feeds, but we find that more and more companies—portals, search engines, software platform companies, corporate portal developers, and even other information companies that we initially saw as potential competitors—have come to us to license Moreover’s Web-intelligence database for distribution within their own environments. They are taking an entire copy of our database. In that regard, we’re operating a little bit like COMTEX. That’s now a growing part of our business.

Q I think COMTEX provides more premium-priced content, though—not freely available Web content. 

A Actually, if you look at the source mix, the typical COMTEX customer takes content that is pretty heavily weighted toward press releases and second-tier U.S. newspapers, whereas, with the economics of Web indexing, Moreover has sources from The New York Times, the San Francisco Chronicle, News.com, ZDNet, CMP, discussion boards, the Financial Times, and [Information Today, Inc.]. If you look at the quality of our sources, I don’t think any of the licensed providers can match up to that. We don’t have the constraints of license agreements. It’s that range of sources and editorial quality that led us to do the deals that we have with the Financial Times and The Economist, for example, which are two of the most editorially choosy publications on this planet. They chose to use Moreover, rather than a more traditional aggregator.

Q Let’s talk about the sites that you link to. Does Moreover include password-protected or subscription sites? What about sites that register users?

A In order to actually follow through on a link from a headline, you’d have to be a registered user. We do index The Wall Street Journal, but you have to be a subscriber to look at the article. From The Wall Street Journal’s point of view, we’re delivering them potential subscribers. The key to Moreover is that the publishers like us because they get additional traffic and additional registered users or subscribers. In fact, we get many publishers hassling us—not complaining that we’re taking their headlines, but that we haven’t included them. In the Web-publishing world, if you’re not Yahoo!, you struggle to build up a critical mass of traffic. Being present within a network like Moreover’s—being on intranets and extranets and search engines and news portals—can be a pretty powerful driver of traffic. For some publications, we provide in excess of 70 percent of their inbound referrals. 

Moreover does not cover every single source in the world. We are the leader in providing aggregated Web sources. A large corporation looking to track the full range of competitive intelligence relating to its industry sector would need Factiva or Nexis for high-quality off-line sources and would need Moreover running alongside. The time when you could get a subscription newsletter, read The Wall Street Journal on the way to work, check Reuters during the day, and expect to be on the ball is gone. There is a new information food chain out there. 

If you’re in the information industry, you need [www.infotoday.com], and you need Chris Sherman, and you need Search Engine Watch. Those sources are available on the Web and not in a traditional news database. These are the sources that Moreover provides. Even mainstream technology news sources like News.com, CMP, or ZDNet typically don’t exist within traditional databases. But, if you’re covering technology or are in the technology sector, these are sources you need to track.

Q In addition to covering daily Web sources, are you including weekly newsmagazines or monthly Web sources?

A We include all high-quality Web sources, no matter how frequently they are updated. We integrate McKinsey Quarterly, which is available on the Web but is not included in many traditional databases. We do everything from near real time to quarterlies. If it has a URL, we can index it. 

Q Do you provide a list of sources? I don’t recall seeing one.

A We have a list that we offer to customers. If we don’t have a source, and a customer wants it, then we’ll have it within a day. One of the strongest values for our enterprise customers is the ability to add sources to order. So, in addition to Newsweek, Business Week, Fortune, Forbes, The Economist, and FT, we tell customers, "Give us a list of 100 URLs that you want included in this Web-intelligence database." Compared to how slowly sources are added to traditional databases, this gives us a strong differentiator. 

Q So, for most sites, your headlines just link to them and they like the traffic. But I saw something about publishers paying for links. What is this arrangement?

A If you look on the front page of the news portal of Moreover.com, there’s a featured-publisher list. There are some publishers who pay and whom we promote by placement. There are some publishers that we promote to sites that get free headlines from us. These are sources like Advertising Age, CNBC, The Economist—high-quality sources.

Q So, there are a number of access points to your service. There’s free searching of headlines on your news portal site; there are headlines by e-mail for an individual such as me; there are free headlines available to feed into Web sites; and then there are premium headline services available for a fee, which provide additional content.

A Our primary business model is licensing the high-end Moreover database to corporate customers and to search engines, portals, and other technology companies. It includes more content than a Webmaster could take away for free. The Moreover.com news portal showcases this higher-end content, but you can’t take it away to your site. So, we offer teasers. You can look around, take some free headlines, but when you get into pharmaceuticals, finance, the semiconductor industry, then we charge on an access basis. We charge a base rate and about 10 cents per view to enterprises. 

Q Let’s talk about a new competitive intelligence (CI) product you’ve rolled out recently, called the Business Intelligence Solution. I’ve recently reported on other new CI services: Hoover’s Intelligence Monitor and Northern Light’s Rival Eye. With so many options now available, what is distinctive in what you offer?

A We launched our flagship Business Intelligence Solution in December 2000 and now have more than 55 customers, including British Telecom and McGraw-Hill. Our key distinction is that we have the leading technology for indexing dynamic content on the Web. Northern Light does two things really well. They are very good at indexing the Web—like Inktomi and Google. They are expert at indexing static sites. And, in addition, their Special Collection is like a low-end Nexis or Factiva. We provide a dynamic indexing solution that these other sites don’t provide. 

Q This leads nicely into my questions about your proprietary technology for this extraction and indexing. When I was on your site recently, I spotted a little notice that said, "Powered by WebTop." WebTop is a division of the former Dialog—now Bright Station—and the product is the former k-check—now known as WebCheck—that they developed. So, how much is WebTop doing, and what is your proprietary technology doing?

A We work with several search engine companies, including WebTop and others, as well as categorization engines such as Autonomy. The value that Moreover brings in each case is the quality of the data set over which these technologies run. Moreover’s core expertise is in extracting information from the Web; parsing HTML and XML data sources; and in focusing on high-value, time-sensitive information. A traditional search engine will index an entire page and will often get confused by the navigation-bar material. Moreover homes in on the hotspots, on the headlines, using clues in the HTML. Moreover’s extraction technology exploits that to turn layout information into meaningful information. Once we’ve extracted the information, we add metadata to it—information about the industry sector, source, time posted, stock ticker symbols, etc. The rules for this are "human assisted." We create a rich, dynamic database of the Web—with headlines, URLs, categorization—that a search technology like Inktomi or WebTop can run over. Those search technologies are very good, but the results can only be as good as the data that is fed to them. On the search note, you can see an example of Moreover-powered searching at its finest by going to FT.com. 
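[Editor's illustration: The extraction step described above, which skips navigation material, homes in on headline links using clues in the HTML, and then attaches metadata, might be sketched roughly as follows. This is not Moreover's actual technology. The class name, the "nav" class/id heuristic, the minimum word count, and the keyword rules standing in for the "human assisted" categorization are all assumptions made for the example.]

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical keyword rules standing in for "human assisted" categorization.
CATEGORY_RULES = {
    "semiconductors": ("chip", "semiconductor"),
    "search": ("search engine", "index"),
}

# Container tags whose class/id is checked for a navigation hint (assumption).
CONTAINER_TAGS = {"div", "ul", "table", "td", "nav"}


def categorize(text):
    """Return every category whose keywords appear in the headline text."""
    lowered = text.lower()
    return [cat for cat, words in CATEGORY_RULES.items()
            if any(word in lowered for word in words)]


@dataclass
class Headline:
    text: str
    url: str
    source: str
    categories: list = field(default_factory=list)


class HeadlineExtractor(HTMLParser):
    """Collects link text that looks like a headline, skipping navigation."""

    def __init__(self, base_url, source, min_words=4):
        super().__init__()
        self.base_url = base_url
        self.source = source
        self.min_words = min_words
        self.headlines = []
        self._stack = []        # one flag per open container: was it navigation?
        self._nav_depth = 0     # > 0 while inside any navigation container
        self._href = None       # href of the link currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in CONTAINER_TAGS:
            label = f"{attrs.get('class') or ''} {attrs.get('id') or ''}".lower()
            is_nav = tag == "nav" or "nav" in label
            self._stack.append(is_nav)
            if is_nav:
                self._nav_depth += 1
        elif tag == "a" and self._nav_depth == 0 and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in CONTAINER_TAGS and self._stack:
            if self._stack.pop():
                self._nav_depth -= 1
        elif tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            # Very short link text ("More", "Home") is unlikely to be a headline.
            if len(text.split()) >= self.min_words:
                self.headlines.append(Headline(
                    text, urljoin(self.base_url, self._href),
                    self.source, categorize(text)))
            self._href = None


sample = """
<div id="nav"><a href="/">Home</a></div>
<div class="stories">
  <a href="/s/42">Search engine firms race to index news sites</a>
</div>
"""
extractor = HeadlineExtractor("http://example.com/", "Example News")
extractor.feed(sample)
for item in extractor.headlines:
    print(item)
```

[A real system would need per-source rules and tolerance for malformed markup. The point here is only that layout clues can separate headlines from page furniture before any search engine sees the data.]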

Q Do you have plans to apply your technology to other areas besides news headlines?

A Focusing now on current news is the easiest way to demonstrate the power of Moreover’s extraction technology. We will be moving into market research. For example, an increasing number of investment banks are putting market research on their sites for free. Indexing intranets for customers—internal information—is another area for us. Customers really want all information in one place and don’t care what’s internal and external. They just want information to make better decisions. 

The other big area that we’re focused on now is Web discussion groups and Web logs, which are included in our Business Intelligence Solution. Even more than an information medium, the Web is a medium for insight, commentary, and opinion. What often scares people about the Web is all the buzz, the gossip, the commentary—the stockholder on the discussion board complaining about the company or the disgruntled ex-employee—but this is also the part of the Web that can deliver great value by providing an early-warning system for opportunity or impending crisis. The highest value of Web intelligence is tracking this kind of information.

Q I have a question about companies who subscribe to your Business Intelligence Solution or to the new vertical CI solution you just announced for the telecom industry. How are companies handling headlines from Moreover that they want to keep and access? Are they dropping the content into databases on their intranets, or what? 

A They can select and store individual items. We have an add-on tool called NewsBlogger that works with the Moreover database. It allows an intranet manager to select, comment on, and store individual headlines. We output our data in about 14 different formats, including WAP, JavaScript, and five flavors of XML, so Webmasters can choose the format they want for integration within their portal or intranet. Since Moreover is XML-based, we can work with multiple search engines and portal builders. We exploit the simplicity of the Web in order to reduce implementation time.
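[Editor's illustration: Delivering one set of headlines in several formats, as described above, could look something like the sketch below, which renders the same records as XML and as a JavaScript snippet. The element names, the variable name, and the record fields are invented for the example and are not Moreover's actual feed schema.]

```python
import json
import xml.etree.ElementTree as ET


def to_xml(headlines):
    """Render headline records as a small XML document (hypothetical schema)."""
    root = ET.Element("headlines")
    for record in headlines:
        article = ET.SubElement(root, "article")
        ET.SubElement(article, "headline_text").text = record["text"]
        ET.SubElement(article, "url").text = record["url"]
        ET.SubElement(article, "source").text = record["source"]
    return ET.tostring(root, encoding="unicode")


def to_javascript(headlines, var_name="headlines"):
    """Render the same records as a JavaScript variable assignment."""
    return f"var {var_name} = {json.dumps(headlines)};"


records = [
    {"text": "Telecom carriers trim spending plans",
     "url": "http://example.com/story/7",
     "source": "Example News"},
]
print(to_xml(records))
print(to_javascript(records))
```

[Keeping one simple record structure and many thin renderers is one way to read the remark about exploiting the simplicity of the Web: adding a new output format means adding a small function rather than changing the database.]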

Q What other vertical solutions are you planning for Business Intelligence Solution?

A Other sectors we are planning include finance, technology, and pharmaceuticals. We already serve clients in the finance area, but this will be packaging the product for this vertical market.

Q Some trends that I see emerging from our conversation seem to be that partnering is important, that specialty search engines are key to good results for Web searches, and that XML is a powerful tool for information handling.

A Yes, and the importance of Web sources. Until Moreover, there hasn’t been a good way to get good Web intelligence in front of business users in a digestible form. One caveat about XML: It’s powerful and useful only if it’s kept simple. 

Q It sounds like the company is doing great and you’re partnering with good companies. I know you have venture capital (VC) funding. What’s your company’s financial health? Are you profitable?

A We’re nearly profitable, and have enough capital raised to carry us until then. A company needs to be stable, well-financed, and well-backed, with a key technology—like our extraction technology. We also have stability in having a company like Reuters as a venture backer, in addition to other top-tier VCs. 

[For more information about Moreover, visit http://www.moreover.com or call 415/989-0600.] 
 

Paula J. Hane, co-editor with Barbara Quint for NewsBreaks, is contributing editor of Information Today, a former reference librarian, and a longtime online searcher. Her e-mail address is phane@infotoday.com.

© 2001 Information Today, Inc.