VOICES OF THE SEARCHERS
The Perils and Power of NOT in Prompting Chatbots
by Marydee Ojala
Most librarians are familiar with the basic Boolean commands: AND, OR, and NOT. These work excellently for searching traditional library databases. When teaching others about search techniques, we rely on examples and Venn diagrams to demonstrate the power of Boolean search. I have always felt we should stress the perils of the NOT command, particularly in resources containing the full text of lengthy documents.
It’s easy for novice searchers to think that since they are interested in only one concept within a search, NOT’ing out something they perceive as irrelevant is a welcome way to retrieve the best results. This is not always true. A trivial example gets the point across. You want information about being allergic to dogs. You don’t want to know about being allergic to cats. So you “cleverly” enter allergies AND (dogs NOT cats). This automatically excludes a highly relevant document that includes the sentence “Unlike allergies to cats, allergies to dogs are caused by Can f proteins.”
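For readers who like to see the trap in action, here is a minimal sketch in Python. It assumes a deliberately simplified matcher that treats each document as a set of lowercase words; the function names and the second sample document are hypothetical and do not represent how any particular database works.

```python
import re

def words(text):
    """Return the set of lowercase words found in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def matches(text):
    """Evaluate allergies AND (dogs NOT cats) against one document."""
    w = words(text)
    return "allergies" in w and ("dogs" in w and "cats" not in w)

docs = [
    # The highly relevant sentence from the example above.
    "Unlike allergies to cats, allergies to dogs are caused by Can f proteins.",
    # A hypothetical second document that never mentions cats.
    "Allergies to dogs can be managed with regular cleaning.",
]

# Only the second document survives; the first is lost to NOT cats.
hits = [d for d in docs if matches(d)]
for hit in hits:
    print(hit)
```

The first document is about dog allergies, yet a single passing mention of cats is enough to remove it from the results.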
The NOT command does work well in traditional databases when you’re building sets. You can then NOT out a set containing information already viewed, eliminating redundancy and retaining relevant information. That exemplifies the power of NOT.
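Set building of this kind can be sketched in a few lines of Python, where NOT corresponds to set difference. The record numbers below are invented for illustration and stand in for the saved sets a traditional database would hold.

```python
# Hypothetical record IDs standing in for saved search sets.
set_1 = {101, 102, 103, 104, 105}  # results of the current search
set_2 = {102, 104}                 # records already viewed

# "set_1 NOT set_2": keep only the records not yet seen.
unseen = set_1 - set_2
print(sorted(unseen))  # [101, 103, 105]
```

Because the exclusion applies to whole records already judged, rather than to a word that might appear in passing, nothing relevant is lost.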
Web search engines give the illusion of using Boolean logic as they offer possibilities for using AND (the plus sign in Google), OR, and NOT (the minus sign). The major problem is that these operators don’t work properly. This is particularly true with NOT. You would expect dogs -dogs to retrieve no hits. It actually retrieves millions. Behind the scenes, machine learning algorithms construct equivalencies that equate dogs with dog, puppies, puppy, canine, and a few more terms. It takes a lot of NOT’ing out before you get to zero hits, if you ever do. The power of NOT is largely dissipated by web search engines’ refusal to take Boolean commands seriously. Their attitude (if search engines can have an attitude) is, “I’m in charge here, not you. Command all you like, but I won’t listen.”
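One way to picture why dogs minus dogs still returns hits is the toy model below, written in Python. It is only a sketch under a loud assumption: that the engine expands the wanted term into its equivalents but applies the exclusion to the literal word alone. Real search engines do not publish their logic, and the table of equivalents, the function names, and the sample documents are all hypothetical.

```python
import re

# Assumed equivalence table, based on the terms listed above.
EQUIVALENTS = {"dogs": {"dogs", "dog", "puppies", "puppy", "canine"}}

def words(text):
    """Return the set of lowercase words found in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search(docs, include, exclude):
    """Match any equivalent of `include`; drop docs with the literal `exclude`."""
    wanted = EQUIVALENTS.get(include, {include})
    return [d for d in docs if words(d) & wanted and exclude not in words(d)]

docs = [
    "Dogs need daily exercise.",
    "Our puppy chewed the sofa.",
    "A canine unit arrived.",
    "Cats sleep all day.",
]

# A strict Boolean engine would return nothing for dogs NOT dogs.
hits = search(docs, "dogs", "dogs")
for hit in hits:
    print(hit)
```

Under this assumption, the puppy and canine documents still come back, and each equivalent would have to be NOT’ed out separately before the result count reached zero.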
In the new world of chatbots and generative AI (GenAI), the Boolean NOT loses even more of its power. AI-based searching does not understand the AND, OR, and NOT of Boolean logic. It simply sees them as words. A search for dogs NOT cats, to GenAI chatbots, is three words, not a command. The chatbot looks to its language model to determine, analytically, where these three words appear and how often they appear in proximity to each other and to related words (which it also determines algorithmically). It can then return results based on its predictions about the three words. In Google’s Gemini, for example, the first sentence in the results was, “Dogs and cats are both wonderful companions, but they do have some key differences.” It then went on to give a lot of information about both animals.
Using GenAI to create images reveals other instances where the perils of negativity prevail. This goes a step beyond NOT as a Boolean command to the general lack of understanding that sometimes results from translating text into images. Recently, I was working with Microsoft’s Copilot to create some images of a bookless library. It really struggled with the concept, creating a futuristic library but with books on bookshelves. I explained that bookless meant without books. As with NOT, Copilot did not understand the concept of “without,” and books were still there. It did add two people wearing VR headsets. GenAI is touted as promoting conversational search, but when it comes to concepts associated with NOT’ing out words or concepts, it’s a frustrating conversation from the human’s point of view.
Online searchers and search educators are confronted with unlearning the rigors of Boolean except when searching or explaining library databases. Whether searching the web or prompting a GenAI chatbot, the perils of using NOT, along with similar words such as without or phrases like don’t include, far outweigh considering NOT a power tool.