ONLINE, May 2000
Copyright © 2000 Information Today, Inc.
What do Ally McBeal, chicken soup, and the structure of DNA have in common?
Answer: You can find information about them on the Internet by using a search engine.
As of January 2000, many major search engines had databases containing pages in the hundreds of millions. With so many pages being indexed, searchers can now find almost anything--some of which is actually useful research--in almost any imaginable field. But sometimes more is not better. At times, you may wish you could sit down to a simplified search engine that finds only on-topic information. Now you can, by using a specialized search engine.
Think of a mainstream search engine as a Super Wal-Mart, and a specialized search engine as the specialty shop down the street. A specialized search engine focuses on a specific subject, a geographic region, or a certain type of computer file format. As such, specialized search engines tend to index fewer Web pages; but in the process, they also weed out information that's not useful for a particular topic. As a result, on-target pages will more likely be found at the top of your results page.
Another difference between large search engines and specialized search engines is human interaction. Many specialized search engines employ subject specialists who actually gather, rank, and annotate each link. So, not only are entries weeded in order to be subject-specific, but those weeded entries are winnowed even further so that only useful information is left.
All submissions to Achoo go through a review process. According to Adam Kruszynski, a Web site administrator at Achoo, a junior site administrator examines every submission. Descriptions are edited, and entries are placed under the proper Medical Subject Heading (MeSH). If a site is difficult to classify, it is referred to a consultant for classification. This process ensures that all entries are medically related, have medical content, are regularly maintained, and are ethically sound.
There are three ways to search Achoo: browsing, basic search, and power search. Achoo's directory is based on the National Library of Medicine's MeSH heading system. This makes browsing relatively easy for an experienced health searcher, since linked information is appropriately classified.
Basic searching offers many advanced features, including phrase searching and All or Any searches. Each Basic search can be limited to full record, URL, title, description, or keywords. Advanced searching (available by clicking the search folder tab at the top of the main page, and then clicking on "Search Achoo") allows AND/OR Boolean searching, phrase searching, and site name and description searching. Searches can also be limited to geographic location.
You can also limit to what Achoo calls "Site Content Qualification Filters." This feature allows very specific content limits, like companies, conferences, newsgroups, or peer-reviewed information. Each Achoo record lists a title, a brief summary, and a country designation.
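The Any, All, and phrase options that appear in so many of these search forms can be pictured with a short sketch. This is only an illustration of the general idea, not Achoo's actual code: the function name `matches` and its `mode` parameter are made up for this example, and real engines also handle punctuation, stemming, and field limits.

```python
def matches(text, query, mode="all"):
    """Return True if `text` satisfies `query` under the given mode.

    Hypothetical helper for illustration only.
    mode is one of "any", "all", or "phrase".
    Words are split on whitespace; punctuation is not stripped.
    """
    words = text.lower().split()
    terms = query.lower().split()
    if not terms:
        return False
    if mode == "any":
        # At least one query word appears in the text.
        return any(term in words for term in terms)
    if mode == "all":
        # Every query word appears somewhere in the text.
        return all(term in words for term in terms)
    if mode == "phrase":
        # The query words appear next to each other, in order.
        n = len(terms)
        return any(words[i:i + n] == terms for i in range(len(words) - n + 1))
    raise ValueError(f"unknown mode: {mode}")


summary = "Heart disease clinical trials and patient resources"
print(matches(summary, "heart trials", "all"))       # True
print(matches(summary, "heart trials", "phrase"))    # False
print(matches(summary, "heart surgery", "any"))      # True
```

An All search narrows results, an Any search broadens them, and a phrase search is the strictest of the three, which is why it is worth checking which one a search box uses by default.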
A team of medical catalogers reviews each entry in the reviewed Web site directory. Each site listed is ranked using a three-star to five-star system. The ranking system grades for content (1-20 points), ease of use (1-10 points), layout (1-10 points), and level of appeal (1-10 points).
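To see how the four point ranges add up, here is a minimal sketch that totals a reviewer's scores. The field names and the `total_score` function are assumptions made for this example. The article does not say how point totals are converted to stars, so the sketch stops at the total (50 points at most).

```python
# Point ranges as described in the text: (minimum, maximum) per criterion.
# The dictionary keys are hypothetical names chosen for this sketch.
RANGES = {
    "content": (1, 20),
    "ease_of_use": (1, 10),
    "layout": (1, 10),
    "appeal": (1, 10),
}


def total_score(scores):
    """Validate each criterion against its range and return the sum."""
    for name, (low, high) in RANGES.items():
        value = scores[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return sum(scores[name] for name in RANGES)


review = {"content": 18, "ease_of_use": 8, "layout": 7, "appeal": 9}
print(total_score(review))  # 42 (out of a possible 50)
```

Note that content carries twice the weight of any other criterion, so a site with strong information but a plain layout can still score well.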
You can also search by entering information into a text box, which returns results from the reviewed Web site database and the Internet. You can enter one or more words into the search box, which will bring results in the following order:
Records appearing in the reviewed database are given a "HON Code of Conduct" icon. Web sites are included in the HON database only if the site meets certain qualifications, such as: information is provided by a qualified health professional; information supports the relationship between patient and physician; modification dates, authors, and contact information are clearly displayed.
Records not found in the HON database are provided by MARVIN, a robot that searches the Internet using a 12,000-word medical dictionary. Each word in the dictionary is weighted according to its relevance and specificity to the medical field. When an appropriate Web site is found using this weighting, the site is auto-indexed using the dictionary.
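The general idea of scoring a page against a weighted dictionary can be sketched in a few lines. This is not MARVIN's actual algorithm, which the article does not describe in detail: the four-word dictionary, its weights, and the function names below are all invented for illustration.

```python
import re

# Hypothetical stand-in for a weighted medical dictionary.
# More specific medical terms get higher weights.
MEDICAL_WEIGHTS = {
    "diabetes": 3.0,
    "insulin": 2.5,
    "treatment": 1.5,
    "patient": 1.0,
}


def relevance(text, weights=MEDICAL_WEIGHTS):
    """Average dictionary weight per word; 0.0 for off-topic text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(weights.get(token, 0.0) for token in tokens) / len(tokens)


def index_terms(text, weights=MEDICAL_WEIGHTS):
    """Dictionary words found in the text, usable as index keywords."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(token for token in tokens if token in weights)


page = "Insulin treatment for diabetes"
print(relevance(page))               # 1.75
print(index_terms(page))             # ['diabetes', 'insulin', 'treatment']
print(relevance("chicken soup recipe"))  # 0.0
```

A robot working this way could keep only pages whose score clears some threshold, then use the matched dictionary words as the page's keywords.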
MedHunt allows Basic and Advanced searching. The Basic search feature allows Any, All, and Adjacent word searching, and can limit to geographic location and to only records found in the HON database. The Advanced search feature adds three more search boxes, and allows a guided Boolean AND/OR linking of the boxes. Each record includes a relevance score, the title and URL of the site, a listing of linked keywords, a brief description, the HON Code icon (if ranked), and a country and language designation.
Medical Matrix is the smallest of the search engines in this article, weighing in at only 4,481 records (as of November 29, 1999). Those records go through a multi-step ranking process by an editorial board, made up mostly of physicians and health workers. They rank each site in terms of quality, peer review, full content, multimedia features, and unrestricted access. Records are given a star rating of one to five based on the ranking process.
There are two search options available for Medical Matrix users: a search box and a directory. The search box is very simple, with no help pointers. The directory is loosely arranged by MeSH category headings, and seems more useful than the search box.
Once you've searched using either the search box or the directory, you'll be presented with a list of results. Each record provides a title, a partial summary of the site, the star rating of the site, and a link to "Details." The "Details" link provides more information: title, description, classification (MeSH headings), rating, contact, URL, keywords, date entered, and last updated.
Searchers using Medical Matrix have to go through a free registration process before using the search engine for the first time.
There are three ways to search FindLaw. To browse, simply choose an appropriate topic from the directory on the main page, or choose "For Lawyers," "For Students," "For the Public," or "For Business" to narrow your browse search to a specific target group.
To search the FindLaw directory, enter a search using the search box at the top of the page. Entering more than one word defaults to a phrase search. Other options, such as AND, OR, NOT, NEAR, and wildcard searching, are also available. Records found by browsing or searching FindLaw's directory display the title, URL, a short summary, and a clickable See Also subject listing.
To search LawCrawler, enter a word or phrase in the search box. LawCrawler allows AND, OR, NOT, and NEAR Boolean operators. One can also limit to specific databases using LawCrawler. "World Wide Sites with Legal Information" is the default database, which searches using AltaVista and FindLaw's intelligent operators. Other databases include: Legal News, Legal Dictionary, Law Reviews, Mailing List Archives, U.S. Constitution, U.S. Code, Supreme Court Opinions, and All Federal Circuits. Results found using LawCrawler include a title, a brief summary, the site's URL, document size (in bytes), and a revision date.
The second way to search ILRG is to use the enhanced search feature. This feature adds an assisted AND/OR feature, and allows URL searching and keyword exclusion. The third way to search ILRG is to browse the directory. The directory, called the Annotated Index of Features, is available on ILRG's main page. Records found using the first three search methods include a link to the site and, in some instances, a brief summary.
The search feature is rather powerful. At first glance, the search box (located on the main page) appears very simple, allowing only Any and All limits. However, one can also search using more powerful options such as +/-, AND/OR, phrase searching, limiting to URL and mailto, and wildcard searching.
Results include a title, a user rating and the number of votes for that user rating, a summary of the site, a link to user comments, the number of user comments provided, and a "More Like This" link. Users can rate each site on a 1-10 scale--these ratings are then averaged and displayed on the results page. You can also leave a User Comment for each site.
Searching is performed through a search box or the directory. The directory can be accessed on the main page by clicking on the appropriate subject field. This search will then list subheadings in each category and list the number of pages found under each heading. Searching via the search box produces similar results.
Records found in BioCrawler include a title, the URL, a relevancy percentage, a summary taken from the site, and page size in bytes. The number of links on the page, the number of citations to the page (links from other Web sites to the page), and a numerical rank are also given.
Records in BioCrawler are ranked according to links found on a page. Top-ranked pages are those that are linked to, or cited, by many other pages. Koesters also indicated that pages that are linked to those top-ranked pages are also highly ranked.
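Link-based ranking of this kind can be illustrated with a simplified PageRank-style calculation. To be clear, this is a swapped-in textbook technique, not BioCrawler's confirmed method: the `link_rank` function, the damping value, and the tiny five-page link graph are all assumptions made for the example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by incoming links, PageRank-style.

    `links` maps each page to the list of pages it links to.
    A page scores higher when many pages link to it, and when
    the pages linking to it are themselves highly scored.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: three pages cite "hub", and "hub" cites "d".
graph = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["d"], "d": []}
ranks = link_rank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, "hub" outranks the pages that merely cite it, and "d" also ranks highly because its single incoming link comes from a well-cited page.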
Biolinks can be searched using a basic or an advanced search box, or by browsing its directory. The basic search feature allows simple keyword searching in either the "entire database" or the "indexed database only." The service doesn't offer any description of differences between the two databases--but when doing a search, you'll find more sites using the "indexed database only" option. One can also choose between All and Any of the words entered in the search box.
In the Advanced Search mode, one can choose a more specific category (Meetings, Medical Sites, or Associations & Societies); narrow to "indexed, spidered, and entire databases" (again not explained); and narrow to full entries, title, URL, keyword, or page contents. Records found using these search methods provide a title, URL, brief description, keywords, and a contact address.
The directory can be browsed from the main page. You can choose a main topic, like Journals, Meetings, or Software, or choose one of the sub-topics also listed beneath each topic. Records found using the directory provide a linked title to the site.
Basic searching allows you to enter keywords into a search box and narrow to audio, video, or images. The Advanced Search page allows +/- searching and offers an Everywhere/Web Site/Share selection. Web Site finds files on the Web. The Share function enables Scour's Media Agent software, which is available for free download. Media Agent searches publicly accessible servers and allows you to download shared files from these servers. Scour warns, however, that the software is only as reliable as the source. A disconnected or crashed server that has the multimedia file you're looking for will not be accessible. Additionally, Scour denies any accountability for users who download any unapproved copyrighted material. According to Scour, "Just remember, we're not responsible for what kind of material you download--we just carve up the pie."
The Advanced feature also allows you to narrow to a specific file type/format, with choices like movie trailers, downloadable music, RealMedia, MP3, Liquid Audio, and Shockwave Animation.
Each item found includes a file name, dimensions (in pixels), file size (in kilobytes), the URL, and a thumbnail (for images and videos). Audio files found using Scour also include file format and a date.
Searchers have many options in ditto. Search features include +/-, AND/OR, NOT, AND NOT, single and multiple character wildcards, filename, URL, and title searching. These features can be used in both the Basic and Advanced search areas. The Basic search box provides a text box and a Go button. The Advanced Search box also allows limiting to gif/jpeg files, file size (small, medium, or large), width, height, color depth (black/white, 8-bit, or 24-bit), and images added in the last week, last month, or the last six months.
Results appear as thumbnail images. A Details link is included that provides picture attributes such as page title, photo size, width, and height. (Editor's Note: For more information on ditto.com, including its recent court battle over copyright issues, see Paula Berinstein's THE BIG PICTURE, "Image Search Engines and Copyright," November 1999 ONLINE, p. 91.)
Searching is very straightforward. The Basic search allows you to enter a word or words into a search box, and limit by audio, video, and/or live media formats. The Advanced search offers more options, including Any/All limits and field limiting (all fields, title, author, copyright, directory).
Results list format, file size and length, title, author and copyright information, URL, and a user rating. The rating system is based on popularity, and files can be rated anywhere from Sweet! (highly rated) to Stinky! (poorly rated). Streambox also provides a "Send Clip" option that enables you to send a file to a friend.
SearchEngineGuide.Com: The Guide to Search Engines, Portals, and Directories, located at http://searchengineguide.com/, is another good place to find specialized search engines. SearchEngineGuide currently indexes 2,243 search engines (as of November 30, 1999). All search engines featured on this list are divided into subject directories. Each entry provides a brief summary.
InvisibleWeb.Com: The Search Engine of Search Engines (http://invisibleweb.com/), created by Intelliseek, is another helpful search engine site. This Web site is one of a few sites devoted to finding information on the "invisible Web" (searchable information resources whose contents cannot be indexed by traditional search engines). Many search engines fall into the "invisible Web," since their index of links is stored in a database rather than on static Web pages. This site also lists all links in a subject directory.
Chris Sherman, the Web Search guide for About.com, maintains a site at http://websearch.about.com/. Aside from a wealth of resource links and current commentary on hot Web-search topics, Chris has compiled a long list of categories ranging from Careers and Jobs to Web Site Promotion. Within many of these categories you'll find links to corresponding specialized search engines.
David King (david@kclibrary.org) is Information Technology Librarian at Kansas City Public Library.
Comments? Email letters to the Editor at editor@infotoday.com.