Thank you for the insightful piece—it really made me reflect. It brought to mind a scenario: Counsel relies on AI to generate case law and submits it as part of testimony. Then, the judge uses the same AI to interpret the results and finds them to be inaccurate. You're absolutely right—AI, at this stage, is still a language model, not true general intelligence.
Ultimately, I believe that whatever you submit is your work—accurate or not. I’d be extremely frustrated if my attorney relied on AI-generated content without proper review, only for it to lead to an unfavorable ruling. There’s no question AI has tremendous potential, but using it without verifying or understanding the output is a significant risk.
I work in business valuation and have to be very cautious myself. I primarily use AI to streamline redundant tasks, but when it comes to core analysis and conclusions, the responsibility—and liability—remains entirely with me.
So true. I admit I had no idea until I read your piece and started doing more investigation. I also found this LinkedIn post (and its comments) about the number of cases continuing to increase despite media reporting on the misuse of GenAI in courts: https://www.linkedin.com/posts/amy-salyzyn-a86264222_may-16-2025-update-at-httpslnkdinebzb5zwj-activity-7328530979856289792-Sptx?utm_source=share&utm_medium=member_desktop&rcm=ACoAABQj5aMBokw7F5sOOs4SesqKA84iL_wdKZQ
Thanks so much for sharing these with us.
It's not just the hallucination problem; there's also a longer-term, more dynamic problem: over-reliance on LLMs may erode lawyers' ability to do the work themselves, which in turn will mean they're not as good at working with the LLMs and monitoring their output. I was talking recently with a lawyer who has a lot of experience working with AI in practice, and he gave the example of drafting a conflicts waiver. A commercial-grade AI system can do a not-bad job preparing a first cut at a waiver, but an experienced lawyer still needs to be able to review it and tune it up. A lawyer with preexisting, old-school, not-AI-derived experience can do that. A lawyer who doesn't have that foundation will not be able to work as effectively with the technology and will end up putting too much trust in it, at least at this stage in its development.
Thanks for the article. Very interesting and also concerning.
I just read the paper discussing verification drift; it seems the author is not generalising verification drift to technology as a whole, but only to GenAI.
My understanding of verification drift is this: the emphasis is on users who are aware of GenAI's limitations (hallucination) and understand the need to verify the outputs. Despite this, given GenAI's tone, users find the output convincing and drift away from verifying it, hence 'verification drift'.
In another piece by the same author, I read that he believes various factors contribute to verification drift. He says those who use GenAI don't use it just once a day; they use it frequently. The burden of verifying the outputs every time, combined with how credible those outputs sound, means that despite knowing they should verify them, users sometimes decide not to.
He also notes in his study that the burden of verification is considerable, as evidenced by his experiment, in which he sometimes had to spend a few hours verifying AI-generated content.
Thanks - this is very helpful. I haven't dug that deeply into the literature on this yet, but it struck me as a reasonable explanation for something that otherwise seems inexplicable, i.e. that the rate of submitting briefs with hallucinated citations seems to be going up, not down, despite a drumbeat of reporting on it. (I don't think a week goes by that Law360 doesn't have another report of a sanctions order entered by a judge relating to fake citations generated by AI.) I'm not a human factors engineer by any stretch, but I do follow the seemingly endless conversation about automation dependency in aviation - hence the reference to children of the magenta line. Based on what I know from that context about increasing familiarity with sophisticated autoflight systems contributing to an erosion of situational awareness by pilots, it seems plausible that something similar is going on as lawyers increase the integration of AI into their workflow.
Brad, while I’m relieved to learn of this lawyer discipline, I think sanctioning lawyers for AI breaches is pretty easy compared to sanctioning lawyers who, for example, misrepresent the constitutionality of certain of President Trump’s Executive orders or mislead the courts as to the Administration’s compliance with court orders. These are bigger fish to fry—and discipline—in my view.
The birthright citizenship case heard in Seattle comes to mind, where Judge Coughenour found Executive Order #14160 to be “a blatantly unconstitutional order” and could not understand “how a member of the bar could state unequivocally that this is a constitutional order.” Yet I am unaware of any discipline initiated against the DOJ attorney(s) attempting to make that case. And what about the continuing misdirection proffered by DOJ attorneys in the Abrego Garcia case? At least the lawyer foolishly working a maritime law case gave an honest answer when queried by the court.
This piece also reminded me of the first day of pleading and procedure class back in law school (fifty-plus years ago). The case to be discussed was Pennoyer v. Neff, and the assignment had been posted two days earlier. My friend Frank was the first to be called upon and, having not read the case (or even the Gilbert syllabus), foolishly and hilariously tried to charm his way through an answer. The prof was surprisingly kind to Frank, while the rest of us were absorbing at least two lessons: (1) always be prepared; and (2) at least be honest in your response.
Finally, I could not help but chuckle over some of the terms identified in your piece. Maybe my favorite is “verification drift.” Why do we lawyers so often try to soften the blow and/or over-explain something that is pretty obvious even to a high school sophomore? “Verification drift” is a disingenuous way to describe simple laziness and, in the case of attorneys, breach of duty. It would be easier and more to the point to just say “check your citations—it’s your duty as an officer of the court.” Just saying.
I agree that these fake-citation cases are much easier for discipline. I've always been frustrated by the extreme reluctance of judges to impose Rule 11 sanctions in anything but the most flagrant cases. Some of the Sidney Powell cases, including the 10th Cir. case, are the rare exceptions that prove the rule.
Maybe it comes from having colleagues who are trained as social psychologists, or maybe it's a result of teaching a business ethics class from a moral psychology point of view, but I find it helpful to have an explanation in terms of a psychological mechanism rather than simply stigmatizing it as laziness. Put differently, an interesting question would be why lawyers give in to laziness when they should know the consequences of submitting fake citations to a court are very severe. The idea of being lulled into a false sense of security by increasingly familiar technology is a reasonable explanation. There's a lot of human behavior that is a real puzzle if we assume people act rationally, but which makes sense if we look for patterned irrational behavior. The post-Kahneman & Tversky literature is very helpful here, but I didn't want to make the post even longer. I thought the Australian study was helpful in explaining what would otherwise be difficult-to-explain errors.