DEPARTMENTS 
                              Internet Search Engine Update 
                              by Greg R. Notess 
                              Reference Librarian, Montana State University
                           
                         
                            Internet Search Engine Update goes up on the Web
                            at http://www.onlinemag.net as soon as it is written,
                            approximately one month before the print issue mails
                            to subscribers.
                         
                        The big change in the search engine space this time is 
                        Yahoo!'s launch of its own search engine. Instead of relying 
                        on Google to provide the bulk of its search results, Yahoo! 
                        now has its own Inktomi-based search engine database. 
                        Many of the other changes and announcements appear to 
                        be a reaction to it. 
                        AltaVista and AlltheWeb, although now owned by Yahoo!, 
                          continue to have their own unique databases and their 
                          own separate search features. Although their image, news, 
                          and media databases have been combined, their main Web 
                          page databases remain separate and continue to be 
                          updated. Yahoo! claims that eventually all will share 
                          the same Yahoo! database, a move originally supposed 
                          to have happened late in 2003, but it has not yet 
                          occurred. The good news is that even when (and if) these 
                          engines do have a common database, the advanced search 
                          features at each are supposed to remain available.  
                        Ask Jeeves has announced that it will 
                          be purchasing Interactive Search Holdings, a company 
                          that includes the MyWay.com, My Search, My Web Search, 
                          iWon, and Excite search sites. These sites currently 
                          use either Google results or offer results from several 
                          search engines. Once the purchase is completed, these 
                          sites are likely to be switched over to Ask Jeeves/Teoma 
                          search results. Ask Jeeves has also suspended its Index 
                          Express paid submission service for large sites (over 
                          1,000 URLs), although it continues to accept paid submissions 
                          from smaller sites. It still has no free submission, 
                          although its Teoma search engine (which also provides 
                          the bulk of results for Ask Jeeves) has been quite successful 
                          at finding most sites even without it. 
                         
                        Gigablast continues to add new features. 
                          The latest addition is a link to archived pages for 
                          each of the URLs in a results list, in addition to 
                          the cached copy of that page. The new links are 
                          labeled "older copies" and link directly 
                          to the Internet Archive's Wayback Machine. Gigablast 
                          has also added a related-concept section called Giga Bits, 
                          which displays at the top of the search results page. 
                          One more small change: Gigablast now has stop words. 
                          Very common words, such as "a," "is," 
                          and "the," will not be searched, although Gigablast 
                          does display a message noting which terms have been 
                          ignored. To force such words to be searched, put a 
                          "+" in front of them or include them within quotation 
                          marks as part of a phrase search.  
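As a rough illustration of the stop-word behavior just described, here is a minimal Python sketch. It is not Gigablast's code: the function name `split_query` is hypothetical, and the stop list is assumed to hold only the three words named above, since the full list is not given here.

```python
import re

# Assumption: only the three stop words named in the column.
STOP_WORDS = {"a", "is", "the"}

def split_query(query):
    """Return (searched, ignored) token lists for a query string."""
    searched, ignored = [], []
    # A token is either a quoted phrase or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        if token.startswith('"'):
            searched.append(token)        # phrases keep their stop words
        elif token.startswith("+") and len(token) > 1:
            searched.append(token[1:])    # "+" forces the term to be searched
        elif token.lower() in STOP_WORDS:
            ignored.append(token)         # would be reported as ignored
        else:
            searched.append(token)
    return searched, ignored

searched, ignored = split_query('the history of +a "to be or not to be"')
print("searched:", searched)
print("ignored:", ignored)
```

In this sketch the leading "the" is dropped, while "+a" and the quoted phrase are both kept.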
                        Google announced an expanded Web database 
                          (from 3.3 billion to 4.285 billion pages) and an enlarged 
                          image database (doubled to about 880 million images). 
                          Google's last announced increase, to 3.3 billion, came 
                          a few days after AlltheWeb announced a larger number 
                          than Google's previous one; similarly, this Google 
                          announcement came on the same day that Yahoo! announced 
                          it was dropping Google and launching its own database. 
                          That said, Google's database growth is still significant, 
                          even though on several actual searches it does not seem 
                          to find much more than it did before the announcement. 
                          On a few searches, Yahoo! finds even more results than 
                          Google. Yet for most searches I tested, 
                          Google still retrieves more results than the others. 
                          In other Google developments, the Google news alerts 
                          have expanded beyond English, to French, German, Italian, 
                          and Spanish. Google Labs has a new experiment providing 
                          access to the Froogle shopping engine via wireless devices 
                          such as mobile phones and PDAs. Beyond its automatic 
                          stemming, Google also sometimes searches for English 
                          synonyms of query words. Use a "+" in front 
                          of each term to turn off the automatic stemming and 
                          synonym searching. And lastly, the site: search now 
                          works by itself and no longer requires the addition 
                          of another search term.  
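For searchers who build queries in a script, the "+" tip above can be applied mechanically. The following Python sketch is only an illustration: the helper name `exact_terms` is hypothetical, and treating any token that contains a colon as a field operator (such as site:) is a simplifying assumption.

```python
import re

def exact_terms(query):
    """Prefix each bare term with "+" so it is searched exactly as typed."""
    rewritten = []
    # A token is either a quoted phrase or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        # Leave phrases, already-prefixed terms, OR, and field operators alone.
        if token[0] in '+-"' or ":" in token or token == "OR":
            rewritten.append(token)
        else:
            rewritten.append("+" + token)
    return " ".join(rewritten)

print(exact_terms("library catalogs site:example.org"))
```

The domain example.org is a placeholder. A query consisting of site:example.org alone passes through unchanged, which matches the note above that site: now works by itself.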
                        Lycos is changing again. Its new interface 
                          is designed around the idea of social networking, a 
                          recent hot topic on the Internet. While much of the 
                          content on the older site remains, including search, 
                          the focus is now more on personal publishing with blogs 
                          and home pages, along with searching for people and 
                          groups. At this point, the general Web search remains 
                          and continues to be powered by the same database used 
                          by AlltheWeb. Lycos-owned HotBot continues to be focused 
                          just on search and offers access to the HotBot (Inktomi), 
                          Lycos (AlltheWeb), Ask Jeeves, and Google databases. 
                          Additionally, Lycos launched the HotBot Desktop, a browser-based 
                          search toolbar that enables Web, individual hard drive, 
                          and RSS feed searching.  
                        MSN Search no longer includes results 
                          from LookSmart. In the past, directory listings from 
                          LookSmart came before the Inktomi search engine results. 
                          Now, MSN Search uses only the Inktomi results. It has 
                          also launched its own toolbar. MSN continues to experiment 
                          with various beta versions of its own search engine, 
                          and now that Yahoo! has launched its own version, many 
                          expect to see a new MSN search engine sometime in 2004. 
                         
                        Yahoo! launched 
                          a new search engine database [http://search.yahoo.com] 
                          that no longer uses Google results. Instead, it appears 
                          to have results based in part on an Inktomi-based database, 
                          but the results differ from other Inktomi-based search 
                          engines, including MSN Search and HotBot. One major 
                          plus is that cached copies of pages continue to be 
                          available, making this the first major search 
                          engine beyond Google to include this useful feature. 
                          In a similar way, HTML versions of PDF and other file 
                          types are also available. According to Yahoo!, only 
                          the first 500 KB of a document is indexed, which is 
                          better than Google's 101 KB but still short of the 
                          full-document indexing available at AlltheWeb. The new 
                          Yahoo! search appears to handle full Boolean searching 
                          using AND, OR, NOT, and parentheses for nesting. It also 
                          supports field searches similar to Google's, such as intitle: 
                          and inurl:, along with site:, link:, hostname:, and 
                          url:. The image database continues to pull from Google 
                          at this point. In addition, Yahoo! announced its new 
                          Content Acquisition Program, which is designed to help 
                          both noncommercial and commercial content providers 
                          to get more Web resources into the Yahoo! database. 
                          On the noncommercial side, Yahoo! is working with sites 
                          like the Library of Congress, NPR, Project Gutenberg, 
                          and UCLA's Cuneiform Digital Library Initiative to make 
                          sure their content is included in the database. Inclusion 
                          is not supposed to change relevance ranking, but it 
                          may help move more material from the invisible Web into 
                          the database. 
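The Boolean operators and field prefixes mentioned above for the new Yahoo! search can be combined into nested queries. The Python sketch below assembles such a query string. The helper names (`field`, `group`, `exclude`) are hypothetical, the search terms and example.com are placeholders, and the exact way NOT combines with other operators is an assumption rather than documented Yahoo! syntax.

```python
# Field prefixes named in the column for the new Yahoo! search.
FIELDS = {"intitle", "inurl", "site", "link", "hostname", "url"}

def field(name, value):
    """Return a field-restricted term such as intitle:library."""
    if name not in FIELDS:
        raise ValueError(f"unknown field prefix: {name}")
    return f"{name}:{value}"

def group(operator, *terms):
    """Join terms with AND or OR and wrap them in parentheses for nesting."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR")
    return "(" + f" {operator} ".join(terms) + ")"

def exclude(term):
    """Prefix a term with NOT (the placement of NOT is an assumption)."""
    return f"NOT {term}"

query = " ".join([
    group("OR", "cuneiform", "sumerian"),
    "AND",
    field("intitle", "library"),
    exclude(field("site", "example.com")),
])
print(query)
```

Restricting `field` to the prefixes listed in the text catches typos early, although the engine may well accept prefixes that are not listed here.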
                         
                        Greg R. Notess (greg@notess.com; www.notess.com) 
                        is a reference librarian at Montana State University and 
                        founder of SearchEngineShowdown.com.  
                         
                        Comments? Email the editor at marydee@infotoday.com.  