Vals AI’s latest Legal Research Benchmarking Report compares Alexi, Counsel Stack, Midpage, and ChatGPT against a lawyer control group, revealing that AI tools now rival, and often exceed, human accuracy.
Vals AI has released its highly anticipated Legal Research Benchmarking Report, extending its Vals Legal AI Report (VLAIR) series and testing how well artificial intelligence performs on real-world legal research tasks.
The study evaluated Alexi, Counsel Stack, Midpage, and ChatGPT across 200 U.S. legal research questions, measuring performance against a lawyer baseline. Notably absent, however, were the industry’s largest players: Harvey, CoCounsel, vLex, and LexisNexis, a gap that raises questions about transparency among market leaders.
Despite their absence, the results suggest AI is closing in on, and in many cases surpassing, traditional lawyer research performance. All four AI products outperformed the lawyer baseline across three key metrics: accuracy, authoritativeness, and appropriateness. Overall scores ranged from 74% to 78%, compared with 69% for the human lawyers.
Counsel Stack emerged as the top performer across all criteria, followed closely by Alexi and Midpage. Even ChatGPT, a general-purpose AI without access to proprietary legal databases, outperformed the human control group.
Vals AI found that both legal-specific and generalist AI tools can produce highly accurate answers, with little difference between them on factual correctness. The key differentiator is citations and sourcing, where the specialized legal platforms held a modest lead thanks to their access to vetted legal databases.
However, the study notes that this gap may soon narrow as OpenAI’s “Deep Research” feature, announced earlier this year, becomes widely adopted. “Sources and citations are the differentiators for legal AI, for now,” the report states.
While AI excelled in single-jurisdiction questions, it faltered on complex, multi-jurisdictional matters such as 50-state surveys. In those cases, both humans and machines struggled to deliver comprehensive, authoritative responses.
Still, AI outperformed the lawyer baseline on 75% of all questions, often by more than 30 percentage points. “Where lawyers may struggle in correctly and completely answering legal research questions, AI can provide significant support,” the report concludes.
The VLAIR team emphasized that no participating product was optimized for the dataset and that all responses were independently evaluated by lawyers and law librarians for accuracy, authority, and clarity.
As legal AI adoption accelerates, the findings reinforce a broader shift underway in the profession: AI is no longer just a support tool—it’s becoming a research partner.


