Computers in Libraries
Vol. 44 No. 5 — June 2024

AI CORNER

From A&I to AI: Elsevier Integrates AI Into Scopus
by Barbie E. Keiser

Many libraries subscribe to Scopus, Elsevier’s abstract and index database. It includes peer-reviewed research literature (book series, journals, trade journals, and conference proceedings) covering 330 disciplines and has 19.6 million author profiles. Scopus AI builds on the strengths of the vetted Scopus content, giving researchers a new way to navigate almost 30,000 peer-reviewed journals from more than 7,000 publishers in 105 countries. According to Elsevier, Scopus AI streamlines the research process, offering comprehensive, traceable, and up-to-date data from which researchers can analyze trends and gather insights efficiently.

In August 2023, Elsevier soft-launched an alpha version of Scopus AI that allowed users to pose natural language queries instead of having to construct complex searches using Boolean operators and keywords. Some 15,000 researchers tested the tool, which combined the capabilities of generative AI (GenAI) with Scopus content and data. By January 2024, Scopus AI was ready for launch.

The difference between a traditional Scopus search and Scopus AI is evident on the homepage, which encourages users to “Explore new topics and discover relevant references from 2013,” providing search examples to help users frame a question using natural language rather than the customary keyword search. The goal is to deliver a snapshot of any given research area. Each topic summary generated by Scopus AI is accompanied by five to eight references drawn from the Scopus database. Elsevier is considering a longer timeline, since the current 10-year time frame may be fine for students but not for researchers.

Creating summaries using plain language helps the public and students decipher the often-dense text of scientific papers. Synopses that employ less jargon make the materials comprehensible to people with different levels of expertise, leading to a broader readership. Accessible language also helps scientists speed-read the literature.

FEATURES OF SCOPUS AI

Using the Scopus database as its source minimizes the risk of misinformation, aka hallucinations, which is common in content generated by large language models (LLMs) trained on open resources. Advanced prompt engineering and use of curated recent data minimize risks of false, AI-generated information and ensure responses are based on current, trusted knowledge. Using Scopus AI, researchers get references from Scopus that support each of the statements that appear in the summary, grounding the results returned by the semantic search. No source predates 2013, although the works those sources cite may. In addition to the topic summaries, Scopus AI takes advantage of linked data to offer users features to enhance their research:

  • Expanded summary explores other references and lists additional sub-topics/sections, such as Key principles, Differences, Applications, and Challenges. RAG-Fusion (Retrieval Augmented Generation-Fusion) creates variations of the researcher’s initial query to predict the next question. It produces a more detailed summary, again noting the source material used to generate each statement.
  • Concept map presents a mind map to help users understand relationships among literature results, main concepts and ideas, and subtopics, offering additional terms to employ in successive queries to delve deeper into the matter. Scopus AI uses keywords from research abstracts to generate a concept map for each query, allowing users to “visualize links between research concepts, discover connections between topics and discover untapped frontiers to explore. The tool visually maps search results, offering a comprehensive overview that allows researchers to navigate complex relationships easily. The visualization takes the keywords of the abstracts and provides a bird’s-eye view of the topic space. It shows how concepts fit together and allows you to explore new vocabulary associated with a particular subject” (elsevier.libguides.com/Scopus/ScopusAI).
  • Go deeper suggests questions to help researchers explore different aspects of a topic, refocus or follow up on their search strategy, or take a different perspective, broadening their understanding of a topic.
  • Topic experts draws on the author profiles in Scopus to identify the most prolific writers in a subject—top researchers linked to a query—and generate a summary of their publications and experience, explaining why each was selected.
  • Foundational papers leverages the citation network in the Scopus Knowledge Graph to identify which papers are commonly cited and to generate a summary. It lists the highest-impact Scopus papers on any topic, pinpointing the seminal works.
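
The Expanded summary feature above relies on RAG-Fusion, which issues several variations of one query and merges the results. As the technique is commonly described, the merge step uses reciprocal rank fusion. The sketch below illustrates that general idea only; the article does not say how Scopus AI implements it, and the function name, the document IDs, and the constant k=60 (a conventional default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by many query variations rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists for three variations of one research question.
rankings = [
    ["A", "B", "C"],
    ["B", "A", "D"],
    ["B", "C", "A"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)  # ['B', 'A', 'C', 'D']
```

Document B comes first because it ranks at or near the top for every variation, while D, found by only one variation, comes last.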

Searchers can frame a query conversationally, as a question or a statement, or by entering keywords. The Scopus AI vector search “interprets the intent and meaning of the query. Then it will look at the last five years of Scopus abstracts to find the ones that best answer the question. Then the Large Language Model, using very strict prompt engineering, generates a response that is based on trusted Scopus knowledge, with references so you know where everything is sourced from” (elsevier.libguides.com/Scopus/ScopusAI). Taking the relevant abstracts from the Scopus database, the system uses the LLM to synthesize the material and make it more understandable, generating a comprehensive summary based on that information. The summary, composed of a series of statements, links each claim to the academic abstract on which it is based.
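
To make the vector-search step concrete, here is a minimal sketch of retrieving the abstracts closest in meaning to a query. It assumes that the query and each abstract have already been turned into embedding vectors; the three-number vectors, the abstract IDs, and the function names are invented for illustration and are not Elsevier's data or code.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, abstract_vecs, k=2):
    """Return the IDs of the k abstracts most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in abstract_vecs.items()
    ]
    # Most similar first; ties broken by ID to keep output stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real models produce hundreds of dimensions.
abstracts = {
    "abs-1": [0.9, 0.1, 0.0],
    "abs-2": [0.1, 0.9, 0.1],
    "abs-3": [0.7, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
nearest = top_k(query, abstracts, k=2)
print(nearest)  # ['abs-1', 'abs-3']
```

In a pipeline of the kind the quotation describes, the retrieved abstracts would then be handed to the LLM as the only material it may summarize and cite.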

Data sources for training the model are the metadata and abstracts of 94 million Scopus documents. While preprints are available in Scopus, they are not used by Scopus AI. (Some weighting is applied to ensure that newer papers that haven’t had a chance to garner citations are not disregarded.)

It’s challenging for scholars to keep abreast of all research published in their field and navigate to the most significant, relevant materials for a project. Scopus AI’s stated goal is to offer a discovery tool to help scholars find relevant and reliable information as they pursue new research areas and identify opportunities for expanding research. Also helpful is the ability to export references to Mendeley, RefWorks, Zotero, and EndNote.

EARLY CAREER RESEARCHERS

Elsevier sees Scopus AI as valuable to early-career academics who may be researching an unfamiliar topic and have not learned the vocabulary as well as seasoned academics have. Also, these researchers may not be familiar with seminal works or experts in the field with whom to collaborate. Here, too, Scopus AI can assist.

Being able to structure a query using natural language as opposed to keywords one may not know is advantageous. Doctoral candidates seeking topics for their dissertations can turn to Scopus AI to help them identify gaps in the research. Additionally, these individuals want to incorporate into their research projects what is valued by institutions of higher learning. Today, that means cross-disciplinary/interdisciplinary research and global collaborative efforts. Scopus AI is designed to identify global experts and help bridge gaps between disciplines, enhancing and accelerating access to knowledge.

In academia, a good deal of hesitancy concerning GenAI exists. Beyond drawing on the trusted content of the Scopus database and ensuring that summaries display verifiable references to the document abstracts used in summarization, Scopus AI indicates how it is working at every step so users can see how responses were reached.

Scopus AI always provides the references it used to build topic summaries, offering a degree of transparency so users can trust the integrity of its answers. Even “rigorously” vetted content can have problems, however, as was the case when Elsevier retracted seven papers by a pair of Japanese physicians (science.org/content/article/whistleblowers-flagged-300-scientific-papers-for-retraction-many-journals-ghosted-them).

Elsevier stresses its responsible and transparent use of AI on its website and in nearly every press release and demo of Scopus AI. The publisher’s five responsible AI principles reflect its approach to AI regarding the impact of its solutions on people, bias, transparency (explaining how its AI solutions work), accountability through human oversight, and respect for privacy and “robust data governance” (elsevier.com/products/scopus/scopus-ai). Users should view these principles alongside the publisher’s policy on text and data mining (elsevier.com/about/policies-and-standards/text-and-data-mining/license) and the use of GenAI and AI-assisted technologies in writing for Elsevier (elsevier.com/about/policies-and-standards/the-use-of-generative-ai-and-ai-assisted-technologies-in-writing-for-elsevier).

Here are some of the additional steps Elsevier established to ensure responsible use of AI, most of which are also discussed in this webinar: webinars.elsevier.com/elsevier/Scopus-AI-Navigating-essential-practices-in-responsible-Gen-AI-Session-2

  • Assessment of algorithmic impact and technologies employed using an approach developed by the Canadian government and endorsed by the Ada Lovelace Institute.
  • Directly pointing to the document abstract for each claim made. To do this, Scopus AI uses the mini-LM vector model, relying primarily on cosine similarity. Responses are tested against resources such as Quora’s Insincere Questions Classification to deliberately remove biased responses.
  • Reflection layer that informs the context-aware response to present a direct response to queries when the system is confident and infer responses when the source material doesn’t explicitly state a relevant response, but there is sufficient information so that it can be inferred with medium confidence. No response occurs when there is no direct or indirect statement in any abstract responsive to the query.
  • Internal human oversight is supplemented by solicited user feedback, used to adjust for poorly performing queries, and by an independent content selection and advisory board.
  • Agreement with Microsoft that no user queries are stored or used to train or improve the ChatGPT LLM.
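
The reflection layer in the list above can be pictured as a three-way decision: answer directly, answer by inference, or decline. The sketch below is a hypothetical illustration of that logic. The 0-to-1 support score, both threshold values, and all names are assumptions; the article does not describe how Scopus AI measures confidence.

```python
from enum import Enum


class ResponseMode(Enum):
    DIRECT = "direct"            # an abstract states the answer explicitly
    INFERRED = "inferred"        # answer can be inferred with medium confidence
    NO_RESPONSE = "no_response"  # nothing in the abstracts supports an answer


def choose_mode(best_support, direct_threshold=0.8, infer_threshold=0.5):
    """Pick a response mode from the strongest support score found.

    best_support is a hypothetical 0-to-1 measure of how well the
    retrieved abstracts support an answer; the thresholds are
    illustrative values, not published Scopus AI settings.
    """
    if not 0.0 <= best_support <= 1.0:
        raise ValueError("best_support must be between 0 and 1")
    if best_support >= direct_threshold:
        return ResponseMode.DIRECT
    if best_support >= infer_threshold:
        return ResponseMode.INFERRED
    return ResponseMode.NO_RESPONSE


for score in (0.92, 0.61, 0.20):
    print(score, choose_mode(score).value)
```

Declining to answer when support is weak is what keeps a tool of this kind from filling gaps in the literature with invented statements.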

Since Scopus AI’s design stage, the research community has contributed ideas and feedback on the tool. Thousands of researchers worldwide participated in rigorous testing, and its engaged user community is continuing to shape its future.

When asked about future plans, an Elsevier spokesperson indicated an interest in quality improvement, which includes bringing in more insights for author profiles, and said the company is looking into how to support filters in Scopus AI. Elsevier is considering the use of permalinks, allowing researchers to revisit previous conversations with Scopus AI, and enabling a “save chat histories” option. At the moment, Scopus AI is English-only, but Elsevier is actively exploring how to support multiple languages.

Scopus AI is designed to advance the user’s understanding of the subject matter. It offers enriching insights, transforming the way scholars conduct research, and forging new pathways for resource discovery and analysis. As librarians introduce Scopus AI to their communities, they may want to highlight some of the issues with AI in general and Scopus AI in particular, such as the difference in the start date of the source materials. It may make no difference for undergraduates, but researchers will need reminding that the traditional Scopus database is but one tab to the left.

Barbie E. Keiser (barbiekeiser@gmail.com) is an information resources management (IRM) consultant located in the metro Washington, D.C., area.

Comments? Email Marydee Ojala (marydee@xmission.com), editor, Online Searcher