

Why artificial intelligence can lead to odd results
In Lewis Carroll's second Alice novel, Through the Looking-Glass, Alice enters a strange world by climbing through a mirror. There she finds that everything is not quite as it seems, and some things are very odd indeed.
OpenAI is an artificial intelligence research and development organisation co-founded by Elon Musk. Its stated mission is ambitious: to safely build artificial general intelligence that benefits all of humanity. In June it released the third iteration of an artificial intelligence tool that mimics human-written text. The tool is called GPT-3. GPT-3 has been the subject of significant hype in the tech community, not least because the textual outputs it produces look as though they were written by humans. But on digging a little deeper, users will find that everything is not quite as it seems, and some things are very odd indeed.
At its heart, GPT-3 is a language prediction tool. It can take a small language input and predict the text that will follow. The user supplies an input and GPT-3 returns an output, called a 'completion', based on that text. In creating the completion, it draws on an extraordinarily large dataset of text gathered by web crawlers across the internet, breaking down language into fundamental building blocks and examining the patterns of relationships between those building blocks. As one commentator described it, GPT-3 basically ground up the internet into a slurry. GPT-3 then reconstructs those relationships in new patterns to form a completion that matches the user's input. It is constantly working to provide a new string of text which flows from the user's input and any completions it has already generated. Its patterning strengths enable it to progressively output matching strings of text which build on each other, replicating the cadence of polished text and luring the reader into a lulling rhythm.
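To make the prompt-and-completion mechanic concrete, here is a minimal sketch of what a request to GPT-3 looked like through OpenAI's original Python client. The prompt text, engine name and sampling parameters are illustrative assumptions only, not a recommendation for production use.

```python
# A minimal sketch of requesting a completion from GPT-3, assuming the
# 2020-era OpenAI Python client (pip install openai) and a valid API key.
# The prompt, engine name and parameters below are illustrative only.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, supplied by OpenAI with a licence

prompt = "The parties agree that the Borrower shall repay the Facility"

response = openai.Completion.create(
    engine="davinci",   # the base GPT-3 model exposed at the time
    prompt=prompt,      # the user's input text
    max_tokens=60,      # how much text to generate after the prompt
    temperature=0.7,    # higher values give more varied completions
)

# GPT-3 returns one or more candidate completions; print the first one
# appended to the prompt, which is how the 'lulling rhythm' emerges.
print(prompt + response["choices"][0]["text"])
```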
At Ashurst Advance Digital we have been working with artificial intelligence tools for over five years, and we are only too familiar with a key flaw of natural language processing tools: they have no idea what they are talking about.
Natural language processing is effectively pattern recognition on a mass scale. GPT-3's language model has 175 billion parameters, meaning the patterns are deeper, longer and richer than ever. But pattern recognition can only emulate, not replicate, language understanding. Left to run for long enough, GPT-3 produces outputs that start to suffer from a kind of cognitive dissonance. A study of GPT-3 run by Nabla, an organisation specialising in AI-enhanced healthcare, provides an example of a short exchange in which the tool is unable to keep track of time when booking an appointment.
There are many other examples on the internet where GPT-3's outputs initially seem sound but don't quite work out. A favourite is a YouTube video in which GPT-3 is asked to provide a recipe. There are several false starts in which the text produced refers to rabbits but is not a recipe in any traditional sense. The user then makes a more specific request for a chocolate chip cookie recipe. The test kitchen produced a passable, albeit very flat, cookie combining a leathery and liquid texture which would not win any baking prizes.
As another example of the challenges faced by natural language processing, GPT-3 can be fooled simply by asking it invalid questions.
More insidiously, Nabla has shown that GPT-3's pattern recognition capabilities can produce outputs which are wildly inappropriate, as in the case where it encouraged a user to commit suicide.
"The Provision Of Legal Services Is High Risk As A Client's Losses Resulting From An Error In Legal Advice Or Typo In A Contract Can Exceed The Cost Of The Legal Fees By Orders Of Magnitude"
There is no doubt that GPT-3 can create text that appears useful. However, at present, reliance on its outputs is misguided at best and dangerous at worst. Unfortunately, this means that relying on GPT-3 to provide factual answers in a legal context appears inappropriately risky.
The provision of legal services is high risk, as a client's losses resulting from an error in legal advice or a typo in a contract can exceed the legal fees by orders of magnitude. In my experience as a legal technologist, an unreliable technology is very difficult to work with. Two key drivers for legal technology adoption are time savings and risk mitigation. An unreliable tool does not save any time if its outputs need to be validated from first principles each time it is used. Likewise, a tool which appears to be functioning well but regularly generates spurious results may lead to increased rather than decreased risk.
As legal professionals, does this mean all is lost? Is there any use to which we can put an unreliable or misleading technology tool?

Leveraging the unreliable in legal service delivery
As lawyers, we often like to think of ourselves as practising black letter law. We provide incisive commercial advice to our clients written in plain language founded on well-established legal precedent, rules and regulation. We would rather not think of ourselves as creative when drafting a loan facility for hundreds of millions of dollars or reviewing a portfolio of contracts checking for change of control provisions in an M&A transaction.
GPT-3 can produce text which is literate, learned and beautifully written. Given an opening line, it could write a sonnet in the style of Shakespeare or John Donne. Although it may not seem like it, lawyers can also be highly creative, particularly when acting as advocate. GPT-3 should be perfectly capable of writing an argument for a given position, adapting the style of a famed barrister or even the user's own style given sufficient source material. The suggestion is not that the lawyer would use the output in its raw state, but rather that they would use it as a starting point for their own work, much as a senior lawyer asks a graduate or trainee to prepare a first draft. It may turn out that the best use for GPT-3-style text completion is as a creative muse, to inspire lawyers and to help them become the best that they can be.
One enterprising developer used GPT-3 to create titles for his posts on Hacker News that immediately went viral. Tellingly, the best titles did not come from completions, but were rather titles he thought of himself after reviewing GPT-3's completions. In his own words: "It felt as if we were working together – GPT-3 as a writer, and I as an editor." The developer goes on to quote Peter Thiel:
“We have let ourselves become enchanted by big data only because we exoticize technology. We’re impressed with small feats accomplished by computers alone, but we ignore big achievements from complementarity because the human contribution makes them less uncanny. Watson, Deep Blue, and ever-better machine learning algorithms are cool. But the most valuable companies in the future won’t ask what problems can be solved with computers alone. Instead, they’ll ask: how can computers help humans solve hard problems?”
Thus, technology is not the solution but the enabler of the solution. We believe this strongly within Ashurst Advance Digital, and our partnership with lawyers within the firm to leverage technology to deliver better solutions to clients forms the core of our value proposition.
So what do we make of GPT-3 and its place within the legal industry? All hope is not lost. Ashurst Advance Digital makes extensive use of document automation to create first drafts of complex legal agreements. We use automation to help us accelerate difficult drafting work, and we can see GPT-3 adding significant value in this and similar areas. We have applied for a licence with OpenAI and we look forward to exploring how GPT-3 might draw on a corpus of legal contracts to prepare a first draft of an agreement, or on case law to formulate an initial legal argument for our legal professionals.
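As an illustration of how such an experiment might be wired up, the sketch below builds a simple few-shot prompt from precedent clauses and asks GPT-3 to draft the next one, using the same 2020-era client shown earlier. The precedent clauses, engine name and parameters are hypothetical illustrations; this is not Ashurst's actual method, merely one plausible starting point.

```python
# A hedged sketch of few-shot first-draft clause generation, assuming the
# 2020-era OpenAI Python client. The precedent clauses, engine and parameters
# are hypothetical illustrations, not Ashurst Advance Digital's actual approach.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# A handful of precedent clauses shows the model the pattern to continue.
precedents = (
    "Clause: Change of Control\n"
    "Text: If the Borrower undergoes a change of control, the Lender may "
    "cancel the Facility and declare all outstanding amounts immediately due.\n\n"
    "Clause: Governing Law\n"
    "Text: This Agreement is governed by the laws of England and Wales.\n\n"
)

# Ask for a first draft of a new clause by extending the pattern.
prompt = precedents + "Clause: Confidentiality\nText:"

response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=80,
    temperature=0.5,
    stop=["\nClause:"],  # stop before the model starts inventing further clauses
)

# The raw completion is only a starting point: a lawyer reviews, corrects
# and rewrites it, exactly as with a trainee's first draft.
draft_clause = response["choices"][0]["text"].strip()
print(draft_clause)
```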
Journeying through the looking glass of natural language processing and predictive completions may sometimes lead us to odd results, but we hope in time it will also help us produce truly creative and persuasive advocacy.