Many organisations are still approaching AI governance as a rules exercise, but static policies rarely map cleanly to how legal professionals actually work. Instead of blanket rules or a list of “dos and don’ts”, legal teams need better judgement frameworks. This is the foundation that will help lawyers understand when AI can accelerate work, when oversight needs to increase, and how to evaluate risk based on the task at hand.
A simple misunderstanding can create a serious liability
Some AI uses are pretty straightforward – spell check, for example, doesn’t need to be checked and then triple-checked to make sure it didn’t make a mistake. By contrast, any AI usage that's going to affect legal outcomes requires a different approach.
It’s helpful here to think of the concept of “outcome determinative work”: any legal task where an error or omission would materially change the client’s legal position, rights, remedies, or exposure.
AI – particularly generative AI – can easily create these missteps for practitioners who don’t have a good understanding of how AI works and how it can impact their work product.
For instance, a lawyer might misstate a legal rule or case holding because they relied on AI output without checking it themselves. Specifically, they might rely on a summary rather than reading the original source, and then make a decision based on that summary – but it turns out the summary removed some nuance, or swapped the parties. These mistakes – both of which are known to happen – are clearly substantive and could undermine the case.
Building the “judgement” muscle
Lawyers don’t need more bright-line rules or vague policies to address this challenge. They need to develop their judgement to the point where they can sense when they should dig deeper, which AI tools they can trust more than others because of their architecture, and when they are likely to get good results versus questionable ones.
For example, the latest court decisions are not going to be in the training data for generative AI tools. These tools, it’s worth remembering, work on the statistical relationships between words as they appear in the text the tools were trained on. As a result, there is little information density, and there are few associations, around the latest cases for the AI to fall back on. If a legal professional feeds a recent ruling to an AI tool and asks it to summarise the ruling using Retrieval-Augmented Generation (RAG), the results will be decent. But if they ask questions or try to generate new content on the assumption that the information is already there, the results will be poor. A legal professional’s “antenna” should go up in this scenario, alerting them that generative AI will only be useful for some tasks around recent rulings.
By contrast, generative AI will be well prepared for a task involving an important ruling from five or ten years ago that has been much discussed by legal bloggers or in published articles. Those blogs and articles on the Internet feed the training data set and increase the information density around that ruling. Understanding what you should expect and what you can expect from an AI tool is nearly half the battle when it comes to building up judgement for how to engage with AI.
Additionally, legal professionals need to draw upon the judgement they’ve accumulated from their specific practice area to understand when some of these tools have gone off in the wrong direction.
For instance, the core elements of a contract include offer, acceptance and consideration. In law, consideration means something of value exchanged between the parties, which is often money – but it is very easy to think that consideration means “to think about something”. With this semantic overlap, the sentences that AI produces aren’t obviously wrong when the everyday meaning is used, but a lawyer who is paying attention should recognise the error. Put another way, a lawyer who has built up enough of a judgement muscle ought to sense when material the AI produces is grammatically correct, but entirely wrong for the task at hand.
The rules of engagement
A simple tool that can help firms ensure their professionals are honing their judgement around AI usage is an “AI engagement” matrix.
Think of this as a two-by-two grid, with the level of risk on one axis and the level of curation on the other. Activities with a low level of risk and a high level of curation, such as using spell check for a work email? Those fall into the “safe zone” of the grid, where low engagement is acceptable. Activities with a high level of risk and a low level of curation, such as using the free version of ChatGPT to generate an M&A agreement? That’s squarely in the danger zone and requires more engagement.
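For firms that want to write the matrix down, here is a minimal sketch of the idea in Python. It is an illustration, not a prescribed tool. The names `Level` and `required_engagement` are hypothetical. The “moderate” result for the two mixed cases is an assumption, because only the safe zone and the danger zone are described above.

```python
from enum import Enum


class Level(Enum):
    LOW = "low"
    HIGH = "high"


def required_engagement(risk: Level, curation: Level) -> str:
    """Map a task's risk and a tool's curation level to an engagement level."""
    if risk is Level.LOW and curation is Level.HIGH:
        return "low"   # the "safe zone": e.g. spell check on a work email
    if risk is Level.HIGH and curation is Level.LOW:
        return "high"  # the "danger zone": e.g. a free chatbot drafting an M&A agreement
    # Assumption: the two mixed cases are not defined in the text above,
    # so "moderate" is only a placeholder for a firm's own decision.
    return "moderate"
```

A firm would still need to decide for itself how to rate the risk of a task and the curation of a tool. The sketch only makes the lookup explicit.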
The more built-in friction there is around using the AI, the better. Introducing friction into the work process forces lawyers to engage. When lawyers engage, they think more deeply and pay attention, which makes their tool use safer. They get the fast and smooth responses that AI can provide, but they’re still using their “legal mind” to do the deep work.
What are some key ways legal organisations can introduce some of this friction and create pathways towards judgement design?
For starters, lawyers should keep the document that they're working on and their AI-generated source entirely separate from one another. More importantly, if they bring something from the AI-generated source into their document, they should comment on it and justify why they have used it.
An embedded comment works fine here, something to the effect of “I know that this material that I incorporated from AI is correct because I have ten years’ experience in this practice area and it aligns with what I’ve experienced firsthand” or “I know that this material is right because I cross-referenced it against primary source X and primary source Y.”
The operative word in these examples is because. That word shows that careful inspection took place. If the legal professional can't say why the material works – if they can’t provide a “because” – then they don't really understand it.
Commenting reinforces the why and strengthens judgement. That built-in level of friction – that extra step – makes lawyers truly think about the material they’re potentially adopting rather than just rubber-stamping whatever AI gives them.
It also adds accountability and auditability so that if a partner or somebody else at the firm goes through a document to look at what another individual has done, they can say, “Okay, I can see that the junior associate used generative AI here, but she thought about it. She was able to justify why she was willing to adopt this position, and there's a record of what was done and why”.
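The commenting practice described above could be captured as a simple record, sketched below in Python. This is a hedged illustration only: the `AdoptionNote` class, its field names and the three-word threshold are all hypothetical. The check confirms only that a comment contains the word “because” followed by some explanation. It cannot judge whether the reasoning is sound, which remains a job for the reviewing lawyer.

```python
import re
from dataclasses import dataclass


@dataclass
class AdoptionNote:
    """A record of one passage brought from an AI-generated source into a document."""
    passage: str        # the text that was adopted
    tool: str           # the AI tool that produced it
    justification: str  # the lawyer's embedded comment

    def has_because(self) -> bool:
        """True if the comment contains 'because' followed by at least three words."""
        match = re.search(
            r"\bbecause\b(.*)", self.justification, flags=re.IGNORECASE | re.DOTALL
        )
        # Assumption: three or more words after "because" counts as an explanation.
        return match is not None and len(match.group(1).split()) >= 3
```

A note whose comment reads “I know that this material is right because I cross-referenced it against primary source X and primary source Y” would pass the check, while “Looks fine to me” would not.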
Making judgement design a reality
Taking this one step further, firms should aim to make discussing work processes standard when a junior associate is turning in work.
When senior partners ask how the work was done, it gives them a chance to see what the thought process was, what tools the junior associate used, and what goals they had. It becomes more of a mentoring discussion about the work and also gives room for concerns to come up naturally without sounding accusatory.
With this approach in place, partners can feel confident putting their stamp of approval on an associate's work – and the associate can feel confident that they have applied their judgement in the work that they're doing.
Policies mark boundaries, but boundaries don't build judgement. The firms that will actually reduce AI risk while capturing its competitive advantage are those that treat judgement as a professional capacity – one that has to be cultivated through deliberate practice, honest conversation, and the humility to recognise what AI simply cannot do well. That's not a policy problem. It's a culture problem, and culture is built one decision and one discussion at a time.