DATABASE REVIEW
Claude and the Pursuit of AI Safety
by Mick O'Leary
Claude
SYNOPSIS
Claude (anthropic.com/product; sign up at claude.ai) is a generative AI chatbot from Anthropic that has sophisticated safety features built into its analytical systems. It doesn’t use multimodal content or live search, instead providing highly capable and reliable text-based information management.
Nov. 30, 2022, marked the start of an epochal, 6-month burst of technological advance. On that date, OpenAI released ChatGPT, a generative AI chatbot that inaugurated a frenzied period of startling and unprecedented AI innovation. ChatGPT’s extraordinary capabilities captured the attention of the tech community and beyond: Within 2 months, it had more than 100 million users. In March 2023, OpenAI brought out GPT-4, which was significantly more powerful than its predecessor. In May 2023, Microsoft released the new Bing, a version of its Bing search engine that utilizes GPT-4. That same month, Google put its chatbot, Bard, into general release and integrated generative AI into its search engine, branded as the Search Generative Experience. (Bard is now called Gemini.)
The power of these new AI products launched waves of speculation about whether we’re heading toward an AI Eden or an AI apocalypse. The latter scenario was fueled by the disturbing tendency of the new chatbots to “hallucinate,” whether that means making a simple mistake or creating whole batches of made-up “information.”
In March 2023, there was yet another announcement, this one from a then-lesser-known AI company named Anthropic, about its chatbot, Claude. Unlike the grand visions from the big-name companies, Anthropic had a very different sales pitch: It is the “safe” AI that will do what it’s told and most definitely will not take over the world.
ANTHROPIC’S SAFETY HERITAGE
Anthropic was founded in 2021 by several high-ranking OpenAI employees who, according to multiple reports, were dissatisfied with the company’s safety stance. (Much of Anthropic’s company information was obtained from external sources because the Anthropic site, oddly, has almost none of it, including officers, directors, organizational structure, and history. Claude was no help on this, declining to answer. However, ChatGPT and Bard were quite informative.) For a small startup, Anthropic has been very successful at fundraising, including more than $2 billion from Google and a $4 billion commitment from Amazon.
What Anthropic does promote and describe in great detail is its commitment to safety. Its tagline is “AI research and products that put safety at the frontier.” Anthropic is an “AI safety and research company” that builds “reliable, interpretable, and steerable AI systems.” All of this sounds nice, but Anthropic does have a concrete expression of its philosophy: It is a public benefit corporation, which means that it is obligated to balance profit maximization with the pursuit of its stated public benefits.
Anthropic explains that the principal tool for achieving its safety goals is Constitutional AI, a distinctive component of Claude’s training regimen. At its core is a “constitution”: a set of 77 safety principles that shape Claude’s responses. The principles are drawn from the Universal Declaration of Human Rights and from industry conduct codes, including Anthropic’s own. Claude is required to be mindful of freedom, human rights, honesty, impartiality, legality, respect, and consideration for non-Western perspectives.
AI’S PRACTICAL WORKHORSE
Anthropic presents Claude as an AI assistant—a highly capable digital workhorse for a range of practical workplace content management tasks, including document analysis, interactive dialogues, research, and workflow automation. Claude’s place in the AI product universe is also defined by what it doesn’t do:
- First, it isn’t multimodal. It doesn’t handle video, audio, or images. Thus, it’s not suitable for creative, artistic, and other tasks that use multimodal content.
- Second, it’s not connected to the live web; it draws solely upon its trained content. Thus, it’s not suitable for those many uses that need up-to-the-minute information.
Claude has three versions:
- Claude Instant is a simplified and less-expensive version that’s suitable for less-demanding applications.
- Claude 2.0 is more powerful and is better at complex reasoning tasks—and is much more expensive.
- Claude 2.1 is like Claude 2.0, but is less subject to hallucinations.
Claude provides APIs for commercial applications and is available as a free public chatbot, using version 2.1, at claude.ai.
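For developers, a call to Claude is brief. Below is a minimal sketch using Anthropic’s official Python SDK; the model name, token limit, and prompt are illustrative only, and the snippet assumes an API key has been set in the ANTHROPIC_API_KEY environment variable.

```python
# Minimal sketch of a Claude API call via Anthropic's Python SDK.
# Assumes `pip install anthropic` and an ANTHROPIC_API_KEY environment
# variable; the model name, token cap, and prompt are illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-2.1",   # the version behind the free public chatbot
    max_tokens=300,       # cap on the length of the reply
    messages=[
        {
            "role": "user",
            "content": "Summarize the key ideas behind Constitutional AI "
                       "in roughly 100 words.",
        }
    ],
)

print(message.content[0].text)  # the assistant's text reply
```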
CLAUDE AT WORK
I submitted many dozens of prompts at claude.ai, concentrating broadly on reference queries rather than on creative or personal themes. I avoided prompts that depended on recent information for which, as mentioned, Claude is not designed. (Please note that this is a tiny, idiosyncratic sample that attempts to capture only a small piece of Claude’s capabilities.)
Accuracy
I found Claude to be highly accurate with prompts about which I have personal knowledge or that I otherwise cross-checked. I observed no errors or hallucinations. Its answers are typically concise, highly pertinent to the prompt, and detailed. Claude is accomplished at condensing larger documents, skillfully extracting key points into a summary of a designated word length. For big document tasks, Claude has an exceptionally large context window that can accept approximately 150,000 words, or about 500 pages, of input.
Composition
Claude uses a basic, facts-only writing style, with excellent grammar and usage and without conversational or imaginative flourishes. It often uses a straightforward dot-point format. Narrative responses are admirably organized, coherent, and clear. Claude can compose creatively, but I didn’t pursue this because it seems to fall outside of its principal uses.
Timeliness
As mentioned, Claude does not search the live web. As of January 2024, its training data is current only through 2022.
Safety
Aspirations for safety pervade the entire Anthropic project. Its methods, though, seem to boil down to not answering questions that will upset someone, break laws, or otherwise get it into trouble (not that this is a bad thing!). I deliberately submitted a number of edgy prompts and found that Claude politely declined to give me information on how to hijack a car, the best shoplifting techniques, or how to forge a check. It did describe how to home-brew beer. On value- or opinion-based topics, Claude often gives a fence-sitting response: “I do not have a definite view on ____, but there are reasonable arguments on both sides.” On partisan topics, Claude sometimes was mildly progressive.
WHO WANTS SAFE AI?
We are now a year and a half into the Great AI Revolution, and things have settled down a bit. Generative AI has neither saved nor destroyed the world. Instead, a steady wave of AI-driven apps is spreading through all sectors and markets, from giant watershed science projects to consumer gadgets. Nevertheless, serious and legitimate concerns about AI safety and reliability continue.
If Claude continues on its present track as a safe and accomplished information handler that doesn’t need multimodal content or live web search, it will clearly demarcate itself from ChatGPT, Gemini, and other chatbots that offer those features. Claude’s clients may not need every bell and whistle, but they may be pleased with a powerful and sophisticated AI chatbot that, in Constitutional AI, has a distinctive and conscientiously designed method for thwarting illegal, malicious, and incorrect responses.