ONLINE, March 2001
Copyright © 2001 Information Today, Inc.
Imagine a search result page containing links to hundreds of related items, regardless of location, media format, and language. (Well, OK, maybe not language quite yet.) Your initial item is cross-referenced via hypertext links to other materials using descriptors such as:
Full text materials <free or as document delivery>
Author searches <of books and/or articles>
Journal article subject searches (by controlled subject terms or 'find similar' techniques)
OPAC book catalog searches
Citation analyses
ISI related records (citation cluster analysis)
Movies
Images
WWW sites clustered by subject <Inference Find>, media type <SearchLight>, or semantic analysis <Oingo>
Visualizations (subject cartographies and concept lines)
Raw datasets (census info, GIS data)
This OpenURL syntax allows for interoperability by providing a simple and consistent way to identify where any item is found and how any item is described. In their technical explanation of the OpenURL, Herbert Van de Sompel, Patrick Hochstenbach, and Oren Beit-Arie ("OpenURL Syntax Description"; http://sfxit.exlibris-usa.com/openurl/openurl.html) describe the OpenURL as an HTTP GET request built from an origin description and an object description, with the object description carried in a global-identifier zone, an object-metadata zone, and a local-identifier zone. SFX is a software product developed as a PhD project by Van de Sompel at the University of Ghent and acquired by Ex Libris, a developer of integrated library systems. It facilitates a fully interlinked environment for scholarly information using context-sensitive linking techniques. OpenURL is the generic, public syntax used by SFX.
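As a rough illustration only, such a GET request could be assembled programmatically. The resolver address, origin string, and parameter names below are assumptions chosen for this sketch, not values taken from the syntax document:

from urllib.parse import urlencode

def build_openurl(base_url, origin, global_id=None, metadata=None, local_id=None):
    """Assemble an OpenURL-style query from an origin description plus the
    global-identifier, object-metadata, and local-identifier zones."""
    params = [("sid", origin)]                # origin description (assumed key name)
    if global_id:
        params.append(("id", global_id))      # global-identifier zone (e.g. a DOI)
    if metadata:
        params.extend(metadata.items())       # object-metadata zone (citation fields)
    if local_id:
        params.append(("pid", local_id))      # local-identifier zone (assumed key name)
    return f"{base_url}?{urlencode(params)}"

print(build_openurl(
    "http://resolver.example.edu/menu",       # assumed local resolver address
    origin="EBSCO:MFA",                       # assumed origin string
    global_id="doi:123-45-7654",
    metadata={"issn": "1234-5678", "volume": "1", "issue": "2", "spage": "13"},
))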
Known resolver/known item > local resolver > host machine > item delivery

Known Item Example OpenURL:
http://www.pointerjournals.yale.edu?id=DOI%123-45-7654
The look-up table within the resolver "pointerjournals" would say that any DOI (Digital Object Identifier) starting with 123 should be sent to a specific full-text server (such as JSTOR). Examples of complex decision processes might include identifying the one "appropriate copy" among a set of possible versions, weighing such considerations as whether to deliver the PDF or the enveloped HTML copy, or the copy from a standalone host as opposed to an aggregator.
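A hedged sketch of such a look-up table, with invented DOI prefixes, host names, and preference orders, might look like this:

# Mapping from DOI prefix to candidate full-text hosts (illustrative values only).
PREFIX_TO_HOSTS = {
    "123": ["jstor", "aggregator-a"],
    "456": ["publisher-site"],
}

# Institutional preferences used to pick the one "appropriate copy".
FORMAT_PREFERENCE = ["pdf", "html"]
HOST_PREFERENCE = ["jstor", "publisher-site", "aggregator-a"]   # standalone hosts before aggregators

def resolve(doi, available):
    """Return (host, format) for the most appropriate copy of a DOI.
    `available` maps host name -> set of formats that host can deliver."""
    prefix = doi.split(":", 1)[-1].split("-", 1)[0]     # "doi:123-45-7654" -> "123"
    candidates = PREFIX_TO_HOSTS.get(prefix, [])
    for host in sorted(candidates, key=HOST_PREFERENCE.index):
        for fmt in FORMAT_PREFERENCE:
            if fmt in available.get(host, set()):
                return host, fmt
    raise LookupError("no appropriate copy found for " + doi)

print(resolve("doi:123-45-7654", {"jstor": {"pdf"}, "aggregator-a": {"html"}}))   # -> ('jstor', 'pdf')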
Material Type | RDF Format                                              | Value Elements
Books         | Dublin Core                                             | LC Subject Headings; Name Authority
Visual Images | VRA Core                                                |
GIS Data      | FGDC (Content Standard for Digital Geospatial Metadata) |
Two examples of metadata content layout would be:
Journal Citation Metadata Descriptor Example OpenURL:
http://www.pointerjourmet.yale.edu?author=smith, joyce&issn=1234-5678&title=lost_and_found
or
http://www.pointerjourmet.yale.edu?issn=1234-5678&date=1999&volume=1&issue=2&spage=13
Image Item Metadata Descriptor Example OpenURL:
http://www.pointermetaimage.yale.edu?creator=smith joyce&topic=hat&topic=blue
The identified local resolver would use the available metadata fields to determine which local metadata index or indexes should be searched. A search against each local metadata repository would find matches to the elements.
Known resolver/known data elements > local resolver (determine appropriate indexes) > search index machine(s) > local/remote resolver to find item hosts > host machine > item delivery
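As a sketch of this routing step (the index names and field rules below are invented for illustration and are not drawn from SFX), a resolver could inspect which metadata fields are present in the OpenURL:

from urllib.parse import urlparse, parse_qs

# Each rule pairs a set of required fields with the local index to search (assumed names).
ROUTING_RULES = [
    ({"issn"}, "journal-citation-index"),
    ({"creator", "topic"}, "image-metadata-index"),
    ({"author", "title"}, "opac-book-catalog"),
]

def choose_indexes(openurl):
    """Return the local indexes whose required fields all appear in the OpenURL."""
    fields = set(parse_qs(urlparse(openurl).query))
    return [index for required, index in ROUTING_RULES if required <= fields]

print(choose_indexes(
    "http://www.pointerjourmet.yale.edu?issn=1234-5678&date=1999&volume=1&issue=2&spage=13"))
# -> ['journal-citation-index']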
In some cases, the metadata (and possibly additional associated metadata from a local server) will be captured and searched against other remote indexes (e.g. the Internet Movie Database or HotBot).
Known resolver/known data elements > local metadata repository (capture metadata) > local resolver (determine appropriate indexes) > search index machine(s) > local/remote resolver to find item hosts > host machine > item delivery
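One speculative way to picture that capture-and-re-search step, using a placeholder remote search interface rather than the real query syntax of the Internet Movie Database or HotBot:

from urllib.parse import urlencode

def remote_search_url(captured, remote_base):
    """Collapse captured metadata values into a keyword query for a remote index."""
    terms = " ".join(str(value) for value in captured.values())
    return remote_base + "?" + urlencode({"query": terms})

captured = {"creator": "smith joyce", "topic": "hat"}      # metadata captured from the local repository
print(remote_search_url(captured, "http://remote-index.example.com/search"))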
The search and retrieval of information across these complex and interacting indexes, resolvers, and host machines can occur over a variety of infrastructures. You can search across only local machines with a standard set of possible communication protocols. Alternatively, an information network can provide searching across a hybrid of local and remote servers with only the OpenURL as the standardizing agent. Some search engines will provide direct links to items while others may only provide additional metadata leading to other network resources before finding your final item(s). The following are examples of possible search engines that might be utilized:
The process would include the following steps:
First, determine the important concepts, their synonyms, and the appropriate relational operators between the terms. This critical thinking step is the most important intellectual activity in the entire search process in terms of both precision and scalable resource utilization. The use of well-considered limitations to narrow the initial search to relevant materials will save a great deal of computer processing and reader review time. In addition to simple limitations such as media type, language, and year, other possibilities include discipline hierarchies, peer-reviewed material, and relevance ranking.
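A simple sketch, independent of any particular search engine, of how the concepts, synonyms, and limits might be combined into one Boolean query string:

def build_query(concepts, limits):
    """concepts: a list of synonym lists, ORed within a concept and ANDed across concepts.
    limits: field limits such as media type, language, or year."""
    groups = ["(" + " OR ".join(synonyms) + ")" for synonyms in concepts]
    limit_clauses = [field + "=" + value for field, value in limits.items()]
    return " AND ".join(groups + limit_clauses)

print(build_query(
    [["population", "census"], ["Arkansas"]],
    {"language": "English", "year": "1999", "peer-reviewed": "yes"},
))
# -> (population OR census) AND (Arkansas) AND language=English AND year=1999 AND peer-reviewed=yes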
A challenge to this approach will be the integration of interactive feedback processing into this scenario; perhaps that will be best accomplished by simply allowing users to make real-time connections to certain hosts during the process. For example, clicking on the term "ISI" in the first search screen in this article would connect directly to ISI, rather than clicking on the "223" to see the actual citations.
Imagine a search agent that would start with the metadata from your journal citation (e.g. Dublin Core-identified subject term elements "population-U.S.-Arkansas") and link this item to all related records in a GIS data repository index using the corresponding FGDC elements ("Arkansas-census-population").
This ability to search for metadata terms across variant indexing schema provides for powerful linkage opportunities. The ability to map subject headings across different subject thesauri or hierarchies would provide even more powerful ways to perform sophisticated interdisciplinary searches.
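A hypothetical crosswalk, with an invented mapping table used purely to show the idea rather than any published schema mapping:

# Invented mapping from a Dublin Core subject heading to FGDC-style keywords.
DC_TO_FGDC = {
    "population-U.S.-Arkansas": ["Arkansas", "census", "population"],
}

def crosswalk(dc_subject):
    """Translate a Dublin Core subject term into FGDC place/theme keywords."""
    if dc_subject in DC_TO_FGDC:
        return DC_TO_FGDC[dc_subject]
    # Fall back to splitting the compound heading into individual keywords.
    return [part for part in dc_subject.replace("-", " ").split() if part]

print(crosswalk("population-U.S.-Arkansas"))   # -> ['Arkansas', 'census', 'population']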
Some advanced search and analysis engines are now able to create visual maps of the concepts within large sets of data. In these enhanced search option cases, you would more likely link directly to the remote search engine for interactive searching, and then return to the Related Links page for further exploration when you were finished using the specialized remote search interface.
For example, once the appropriate hosts have been located, the following is run:
IF item = "123" and desired format = "HTML" THEN host "XX" sends "item123h"
IF item = "123" and desired format = "XML" THEN host "YY" sends "item123x"
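The same rules can be expressed as a small Python sketch; the host names and routing table remain the article's hypothetical examples:

# Routing table built from the IF/THEN rules above (hypothetical hosts "XX" and "YY").
ROUTES = {
    ("123", "HTML"): ("XX", "item123h"),
    ("123", "XML"):  ("YY", "item123x"),
}

def dispatch(item, desired_format):
    """Return (host, item identifier) for the requested item in the desired delivery format."""
    try:
        return ROUTES[(item, desired_format)]
    except KeyError:
        raise LookupError("no host configured for item " + item + " in " + desired_format)

print(dispatch("123", "HTML"))   # -> ('XX', 'item123h')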
The ARC search service (http://arc.cs.odu.edu/) uses the OAI conventions to search an interdisciplinary set of distributed, refereed Eprint Archives. The coverage includes areas of Physics, Mathematics, and Computer Science, and limited coverage of the Cognitive Sciences (Psychology, Neuroscience, Behavioral Biology, Linguistics).
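For illustration, a metadata harvest against such an archive might be issued as an OAI ListRecords request over HTTP; the base URL below is a placeholder rather than a real archive address:

from urllib.parse import urlencode
from urllib.request import urlopen

def list_records(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Issue an OAI ListRecords request and return the raw XML response."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    with urlopen(base_url + "?" + urlencode(params)) as response:
        return response.read().decode("utf-8")

# Example call against a placeholder endpoint:
# xml = list_records("http://archive.example.org/oai", set_spec="physics")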
Van de Sompel, Herbert and Patrick Hochstenbach. "Reference Linking in a Hybrid Library Environment, Part 3: Generalizing the SFX Solution in the 'SFX@Ghent and SFX@LANL' Experiment." D-Lib Magazine 5, No. 10 (October 1999) (http://www.dlib.org/dlib/october99/van_de_sompel/10van_de_sompel.html).
A full project description of SFX is online (http://www.sfxit.com/sfx2.html).
"Digital Library Projects: Focus on Improving Access to Information Users." Session at Special Libraries Association Global 2000 Conference, October 16-19, 2000, Brighton, U.K. (http://dli.grainger.uiuc.edu/sla2000/).
This session included a number of speakers providing updates on linking technologies and trials. Especially related is the presentation, "The SFX-Framework & the OpenURL" by Herbert Van de Sompel. The excellent graphics and real-world examples from his current project are posted at the University of Illinois at Urbana-Champaign's Web site (http://dli.grainger.uiuc.edu/sla2000/sla2000_hvds/sld008.htm). There are plans to link the SFX approach with the materials within the engineering testbed at the UIUC Library.
For a good overview of smart agent technologies in libraries, see "Library Agents: Library Applications of Intelligent Software Agents" by Gerry McKiernan, Curator, CyberStacks, Iowa State University, Ames, IA (http://www.public.iastate.edu/~CYBERSTACKS/Agents.htm).
David Stern (david.e.stern@yale.edu) is director of science libraries at Yale University.