GenAI Archives - Thomson Reuters Institute

CoCounsel Legal Monthly Insider

jeffrey.mccoy@thomsonreuters.com — Thu, 26 Feb 2026 01:53:01 +0000

Accelerating Legal Workflows with Agentic AI

Building on the excitement of Thomson Reuters announcing that more than one million professionals have chosen CoCounsel, the company’s professional-grade AI, this month’s CoCounsel Legal release brings a wave of new capabilities to streamline research, drafting, and document review for legal professionals. These updates reflect the principles driving our roadmap: Agentic AI grounded in deep legal expertise, rooted in your own knowledge and workflows, and built to elevate the way modern legal teams operate.

Agentic AI Grounded in Deep Legal Expertise

Westlaw Advantage Canada with Deep Research

Westlaw Advantage applies agentic AI to trusted authoritative content, acting like an expert researcher to help you quickly move from research to strategy. At the heart of Westlaw Advantage is Deep Research, the legal industry’s first professional-grade agentic AI research capability. By combining advanced AI with Westlaw’s unmatched content library, it streamlines complex research in English or French, delivering comprehensive coverage while significantly reducing manual research time. Westlaw Advantage Canada is more than just a research solution; it’s a strategic element of CoCounsel, our AI technology, available to every Canadian legal professional.

Westlaw Advantage Canada Deep Research Report

Deep Research in Practical Law US

Deep Research in Practical Law brings a smarter, more intuitive approach to using all the know-how resources. It feels less like searching Practical Law and more like working alongside a Practical Law editor who has instant command of every resource and tool at your disposal. Powered by agentic AI and grounded in Practical Law’s trusted and up-to-date content, it automatically plans research steps, pulls the most relevant guidance and templates, and delivers a clear, well-supported research report. Deep Research reviews multiple Practical Law resources, synthesizes the key guidance, and iteratively explores further until it has exhausted its work, eliminating tedious, manual, multi-step workflows and helping you move from “how to?” to “here’s how” faster. And, with easy access to the underlying sources for verification, you can move confidently to the next phase of your matter.

Deep Research in Practical Law

Deep Research Verification Tools for US

Verifying AI-generated research shouldn’t slow attorneys down, and now it doesn’t have to. CoCounsel Legal’s built-in verification tools surface clear statements alongside their supporting sources, use AI to assess how well those sources align, and apply adversarial review to highlight potential gaps or missing context. By bringing together supporting evidence, counter perspectives, and direct connections to Key Numbers and KeyCite, attorneys can quickly validate research, strengthen their arguments, and stay confidently in control without losing momentum.

Deep Research Verification Tools

Rooted in Your Knowledge

Draft a New Agreement Using Your Precedent Document (Beta)

This AI-powered capability gives transactional attorneys the ability to create comprehensive first drafts in minutes. Upload a trusted precedent document, describe specific needs, and receive a complete, multi-page draft that follows the firm’s structure and style. This transforms hours of drafting work into minutes, delivering relevant drafts that reflect firm standards and allowing attorneys to focus more on strategy.

Draft a new agreement using your precedent document

Create a Lease Agreement Abstract

First released in the U.S. and now available in the UK, Canada, and Australia, this capability efficiently produces concise, tabular summaries of lease agreements using a user-specified template. Upload lease documents and receive structured summaries in Markdown or Microsoft Word format, complete with instant citations to relevant contract sections for enhanced accuracy and rapid verification.

Create a Lease Agreement Abstract

Benchmark Document Against Standard

Now live in the UK, Canada, and Australia, as well as the U.S., this tool lets users assess a negotiated document by comparing it against a user-uploaded “ideal contract” to highlight missing clauses or key deviations. It enhances alignment and risk management by identifying gaps and offering practical recommendations that help streamline negotiations.

Benchmark Document Against Standard

Syncly Box and Dropbox Integrations

CoCounsel Legal now integrates with Dropbox and Box through Syncly, enabling secure document import. This simplifies workflows by bringing documents stored across systems into one continuous workflow for analysis, research, and drafting.

Syncly Box and Dropbox integrations

Customise a Practical Law Agreement

Now available in the UK, Canada, and Australia alongside the US, this workflow enables legal professionals to tailor a Practical Law standard document according to specified instructions and provisions of a term sheet. By incorporating deal-specific details into template agreements and utilizing expertly drafted Practical Law content as a foundation, it significantly streamlines the drafting process.

Customise a Practical Law Agreement

Built for how you work

Tabular Analysis

Now available in the U.S., with the UK and Canada coming soon, Tabular Analysis transforms high-volume document review by allowing users to process up to 10,000 documents and 100 questions in a dynamic, filterable table. With the ability to modify reviews in progress and run multiple tables simultaneously, it offers unmatched scalability and efficiency, enabling legal professionals to focus on strategic, high-value tasks.

Tabular analysis

Find Practical Law Drafting Language – UK

Quickly find drafting language from trusted content using a simple prompt within Practical Law Search & Summarise or CoCounsel. This capability streamlines the drafting process by helping users locate specific, verifiable drafting language powered by Practical Law’s expert-written content.

Find Practical Law Drafting Language – UK

Outline Case File

This litigation-focused skill for the US examines entire case records to identify relevant documents, extract key facts, and create an outline connecting facts to claims. It significantly reduces review time, helping professionals quickly get up to speed and identify strengths, weaknesses, and gaps in the factual record.

Outline Case File

Explore these new CoCounsel Legal features today

Sign in to CoCounsel Legal today to enhance the speed and effectiveness of your research, document analysis and drafting. Or, explore training options at the .

To keep up to date on new enhancements, sign up for the .

CoCounsel Monthly Insider: Sharpening Your Competitive Edge

jeffrey.mccoy@thomsonreuters.com — Wed, 17 Sep 2025 20:22:59 +0000

Driven by our commitment to our customers, each month, Thomson Reuters is delivering enhancements to CoCounsel Legal and additional solutions, making them more intuitive, customizable, and effortless to use. In this September edition, we spotlight the latest updates, featuring major upgrades and subtle refinements, designed to boost efficiency and support the delivery of exceptional, high-quality work.

Redesigned drafting capabilities unify CoCounsel tools, content, expertise, and workflows

Informed by customer feedback, we’ve reimagined the legal drafting experience – making it more intuitive, intelligent, and seamlessly integrated. The drafting capabilities in CoCounsel fuse users’ institutional knowledge with trusted Thomson Reuters content and AI-powered technology to expedite the legal drafting process. The redesigned homepage puts everything users need right at their fingertips – CoCounsel Chat, skills, and powerful litigation and document analysis tools – all in one clean, intuitive space. Eliminating the need to jump between tabs or hunt for resources, it’s now an even smoother, faster experience that lets users stay focused, work smarter, and get more done with less friction.

Drafting homepage

Live Draft brings the ability to summarize and modify a document using natural language in Word. Live Draft also delivers contextual awareness of the document and understanding of the content and structure, so every suggestion and edit is tailored to the content. This helps to further reduce time spent producing a final draft, by delivering more accurate, relevant suggested changes.

Live Draft

Append Authorities enables users to combine all cited documents into a single file suitable for court use, reducing the risk of errors and increasing efficiency. Every cited document is linked for verification purposes, and a hyperlinked table of contents is included.

Append Authorities

Region settings customizes CoCounsel tools for global legal professionals

The new region settings capability enables users to select their geographic preference from U.S., UK, Australia or Canada. Based on the selected region, region settings will automatically adjust tools and prompts in the CoCounsel Library making the work product more relevant. Users can now automatically tailor their documents using specific regional requirements, including for the UK and Australia, British English spelling variations, legal terminology, grammar, and content formats. Similarly, this will be coming soon for Canadian English. Additionally, CoCounsel Library is now available in the UK, Canada and Australia.

HighQ integrates CoCounsel AI for intuitive, conversational client data access

With CoCounsel’s Search a Database skill embedded within HighQ, this customer-driven development allows clients the ability to pose queries regarding their data and receive summarized, highly relevant answers. Sourced from pre-approved content within their site, clients can quickly review summaries, generate reports, and make informed decisions without waiting for manual responses.

Legal Tracker adds AI-powered capabilities

Legal Tracker’s new AI features help users manage legal spend more efficiently. The AI-powered PDF-to-LEDES converter and invoice review speed up invoice evaluation and ensure accurate billing. An AI assistant also streamlines reporting and reduces manual data handling.

Legal Tracker

These transformative features reinforce our commitment to empowering legal professionals with the tools and solutions they need to excel. or to see firsthand how they elevate work to new heights.

To stay abreast of newly added features, monthly releases, and more, please sign up for the .

The AI Implementation Gap Must Be Closed

jeffrey.mccoy@thomsonreuters.com — Mon, 15 Sep 2025 19:33:07 +0000

Law firms have shown they are very bullish on AI, and rightly so. When it comes to the core elements of the legal workflow – researching case law, poring over documents to find the needles in the haystack, and drafting standardized documents like contracts, policies, and discovery requests – the agentic and generative AI (GenAI) solutions available today are helping firms cover more ground faster and more comprehensively than ever before possible.

Nearly half (47%) of law firm respondents from the Future of Professionals Report say their firms are already experiencing at least one type of benefit from AI adoption and 80% expect AI to fundamentally alter the course of their business over the next five years. Chief among those is time savings. On average, law firm professionals expect to free up 190 hours per year by using AI. At current average hourly rates, that works out to approximately $18,000 in savings per professional, per year – or a total of $20 billion for the U.S. legal industry.

Perception vs. Reality

For all the enthusiasm that exists for AI’s potential, however, there is a large gap emerging between law firms’ AI aspirations and their real-world AI strategies. Even though the majority of law firms expect AI to drive transformational change in the future and nearly half are experiencing some benefits now, far fewer (29%) expect to see high or transformational levels of change this year. When pressed further on what their firms are doing today to leverage AI, nearly one-third (32%) of law firm respondents say they feel their firms are moving too slowly on AI adoption, and just 22% say their firms have a visible AI strategy in place.

This gap between future ideals and current realities is a phenomenon known as “the GenAI paradox,” which occurs when businesses race to invest in AI pilot projects and buy new solutions, but struggle when it comes to implementing them and integrating them into everyday workflows. Versions of this struggle are playing out in virtually every industry right now as professionals come to grips with the fact that true transformation is not as simple as plugging in an off-the-shelf AI tool. It requires a clear strategy, a carefully planned roadmap, targeted integration of professional-grade AI solutions, and a commitment at all levels for the long haul. A firm cannot afford to sit on the sidelines any longer – it is imperative to have an AI strategy.

Key Steps to True AI Transformation

Over the course of our partnerships helping some of the world’s largest law firms not only access new AI capabilities, but , we’ve found four levers that all firms need to engage to ensure the success of their AI initiatives.

AI Tools Without an AI Strategy will Never Reach Their Potential: Among the 22% of law firms that currently have a visible AI strategy in place, 71% are already experiencing a clear return on investment from AI. By contrast, for those firms that do not have a clear AI strategy in place, just 18% are experiencing a return on investment. That means law firms with a visible AI strategy are almost four times more likely to experience benefits compared to firms without any significant plans for AI adoption.
AI Leadership Comes from the Top: Law firms helmed by leaders who lead by example when introducing change, firms that have added new governance roles, and those that are actively investing in AI are consistently seeing more benefits than those that don’t. For AI to truly add value, it needs to be implemented firm-wide, and that kind of sweeping change can only come with leadership support, clear goals and objectives, and widespread adoption.
Operations is Where the Hard Work Happens: Firm-wide AI integration is impossible without first understanding the need to change and reimagining workflows. To extract maximum value, AI-powered tools must be built directly into existing systems and processes. That requires making transformative changes to underlying business models, including how firms price, staff, and deliver legal work, and how they adapt related workflows and processes, while adding new roles and skills to support their operations.
End-User Adoption: The best AI technology and most well-thought-out strategy in the world will not mean anything if no one uses it. When users within law firms understand AI and feel empowerment, ownership, and accountability for its use, their law firms see results not only in terms of higher levels of AI adoption but in the additional benefits and ROI that they gain as well. Firms need to make sure they are educating staff, making tools readily available, and allowing time for a learning curve to take root.

While the detailed strategies and specific paths to AI implementation will vary from firm to firm, there are a handful of universal truths that apply to all. Foremost is the commitment to address the AI revolution for what it really is – a monumental transformation in the way legal work is conducted on par with the introduction of the personal computer, the internet, and the smartphone. It is not enough just to buy the latest greatest widget. Firms that want to extract real value from AI need to think hard about how it will affect everything they do and start addressing those changes now to unlock the full potential of the technology to transform their firm.

The Must-Have Skill for First Year Legal Associates: Adaptability

jeffrey.mccoy@thomsonreuters.com — Wed, 27 Aug 2025 15:35:43 +0000

As summer associate programs come to an end, another new generation of first year law firm associates is entering the professional workforce. While this historic rite of passage has deep roots in the rich tradition of apprenticeship and professional development that has sustained the legal profession for centuries, first-year associates face an unprecedented challenge: succeeding in an industry transforming at breakneck speed.

The most important skill new lawyers will need to thrive isn’t encyclopedic knowledge of case law, razor-sharp logic, or the ability to tolerate long hours – it’s adaptability. But what does that actually mean for a first-year associate starting their first job at a law firm? It means approaching the role as a continuous learner rather than someone who has “arrived” with their JD in hand.

Building An Adaptability Toolkit

Developing adaptability isn’t about becoming a generalist who can do everything – it’s about becoming someone who can quickly learn what each situation requires. Start by cultivating core habits and building your adaptability toolkit by:

Embracing technology: View legal technology as a collaborator, not a threat. The associates who thrive will be those who become fluent in leveraging the latest tools to produce higher-quality work more efficiently. According to a recent study, 28% of law firms are already using generative AI (GenAI) in their practices and 93% say GenAI will be a central part of their organization’s workflow within the next five years. So next time your firm introduces new research platforms or AI tools, volunteer to be part of the beta test rather than waiting for mandatory training. This positions you as someone who embraces innovation rather than fears it.

Seeking feedback on efficiency, not just accuracy: An adaptable associate might ask: “I completed this document review in eight hours – what would help me provide the same quality analysis in six hours next time?” This mindset shifts from time-based to value-based thinking, positioning new lawyers for success in an evolving market – and AI can help make this a reality by completing select tasks that take valuable time.

Participating in new experiences early in your career: Volunteer for cross-practice projects, client secondments, or firm innovation initiatives. The broader your exposure to different ways of practicing law, the more adaptable you’ll be when change inevitably happens. For example, if you’re a corporate associate, volunteer to help the litigation team with a discovery project – you’ll learn how the contracts you draft might be evaluated in disputes.

For first-year associates, proving their value in this new world will require more than just hard work. You should put your energy toward higher-value tasks that put an emphasis on soft skills like adaptability, creativity, leadership, curiosity, and tech fluency. The work is no longer just about the volume of output produced; it is about leveraging all available resources – including technology – to achieve successful outcomes.

Lead Change, Don’t Just Follow It

To that end, adaptability remains the thing that will quickly separate the future leaders from the rest of the pack of first-year associates entering the legal workforce. While versatility as a skillset is not as easy to quantify as billable hours, it is a far more accurate measure of how well new attorneys are able to recognize new opportunities, embrace creative problem-solving, and adopt new approaches focused on achieving the best end result. This is the factor that will help law firms navigate the next decade of transformation – and the individuals prioritizing adaptability early in their careers will set themselves up as key drivers of the firm’s overall success.

Clients are demanding more value, transparency, and efficiency. Legal technology – from AI-assisted research tools to automated contract analysis – is redefining how law firms deliver services.

Today’s most effective legal professionals are those willing to rethink entrenched practices, test new technologies, and collaborate in ways that break down traditional silos. Adaptability in this context means not just keeping up with change but leading it. Start building your adaptability toolkit today, and you’ll be positioned to lead tomorrow’s legal innovations rather than scramble to catch up with them.

Forecasting the Future of the Law Firm

jeffrey.mccoy@thomsonreuters.com — Wed, 05 Mar 2025 19:55:59 +0000

In the ever-evolving landscape of the legal industry, law firm leaders find themselves at a critical juncture, facing unprecedented challenges and opportunities. As we engage in conversations with leaders from global law firms, AmLaw 200 firms, and major independent law firms, a common thread emerges: the pressing need for decisive action in an environment of rapid change.

At the forefront of this transformation is the rise of generative AI (GenAI), which promises to reshape the very foundations of legal practice. This technological revolution is not just another trend; it’s poised to become the most influential force shaping law firms over the next five years, surpassing even economic factors and geopolitical instability in its potential impact. Looking at the implications of GenAI on the law firm business model, it becomes clear that the time for passive observation has passed. In today’s legal industry, there is simply no room for bystanders.

white paper looks at the current environment and forecasts the next three waves of how AI-driven technologies will reshape the legal industry.

Wave 1: Optimization of legal workflows

Law firms are increasingly pressured to adopt AI to reduce costs as their clients embrace these technologies, leading to shifts in cost structures and hiring practices. Despite these changes, the demand for legal services continues to grow as clients face business disruptions due to AI, prompting the need for new legal support. As AI adoption becomes more widespread, it is expected to significantly impact pricing strategies and workforce dynamics in the legal industry.

Wave 2: Legal market disruption and law firm re-engineering

Law firms are adopting more technology and project management strategies, which leads to fewer lawyers being hired and a re-engineering of business models to stay competitive. Legal departments are keeping more work in-house, but smaller firms can leverage AI to handle larger, complex tasks. The competitive landscape is evolving, with middle-market firms feeling pressure as routine work becomes productized and new AI-enabled services enter the market.

Wave 3: Disruption of legal services landscape and AI winners emerge

The use of AI in the legal industry will lead to a shift in how clients interact with law firms, with top-tier firms focusing on high-stakes and complex matters, while smaller firms move up the value chain with AI-enabled solutions. New AI-powered delivery models and self-serve legal products will transform the way legal services are bought and delivered, potentially leading to consolidation in the middle of the market. Ultimately, AI will have a profound impact on the law firm of the future, but it will work best when it complements, rather than substitutes for, legal professionals.

It’s crucial for law firm leaders to recognize that the emergence of AI and GenAI signifies a real and fundamental shift in the legal landscape, impacting how legal work is done. As these technologies promise to transform law firm operations, firms already grappling with pricing, talent, and competition must proactively manage AI adoption.

By addressing the interconnected challenges of client communication, talent acquisition, and AI-driven service pricing, firms can navigate the coming changes and avoid being left behind in this technological revolution.

Download your full copy of white paper.

Beauty Is in the AI of the Beholder

jeffrey.mccoy@thomsonreuters.com — Wed, 26 Feb 2025 17:04:29 +0000

“Welcome to the era of the AI superlative. While the first two years of generative artificial intelligence (GenAI) development were an all-out sprint to create new models, establish proof-of-concept solutions, and define optimal use cases, the next phase to deliver increased efficiency and better work product to clients in the AI lifecycle will be dominated by marketing as well.”

Raghu Ramanathan, president of Legal Professionals at Thomson Reuters, opened with these statements and shared his view on industry benchmarks in an article on Above the Law titled .

He noted that as more companies develop AI solutions and start-ups seek capital investment, customers will look for benchmarks to evaluate these tools. Thomson Reuters does see value in benchmarking; however, Ramanathan added that benchmarks must measure products the way they’re designed to be used and should focus on results customers care about.

“The challenge is that one-dimensional metrics do not offer a reliable representation of the real value of GenAI in the legal research process,” stated Ramanathan. “No LLM-based legal research products in the market today provide answers with 100% accuracy, so users must engage in a two-step process of 1) getting the answer and 2) checking the answer for accuracy.”

Chief Legal Operations Officer Meredith Williams-Range from Gibson, Dunn & Crutcher LLP discussed how they are using and seeing results from AI-enabled resources. “There is a widespread misperception around how law firms are using AI and how we conduct legal research. We are not bringing in AI and saying: ‘Go do all the research and write a brief,’ and then replacing all of our junior associates with automated results. We’re using AI-enabled tools that are integrated directly into the research and drafting tools we were using already, and, as a result, we’re getting deeper, more nuanced, and more comprehensive insights faster. We have highly trained professionals doing sophisticated information analysis and reporting, augmented by technology.”

Read the full article on , and as Ramanathan concludes, “the value of legal AI – of any technological innovation for that matter – is in how it gets used in the real world and how well all the different components come together to help lawyers do their jobs more effectively.”

Exploring AI’s Influence on the Legal Profession: Insights from Frost Brown Todd

jeffrey.mccoy@thomsonreuters.com — Thu, 20 Feb 2025 17:01:09 +0000

The legal profession is no stranger to change, though that change and its impact on the industry may be viewed very differently. But the rapid evolution of technology, particularly artificial intelligence (AI), is presenting a unique set of opportunities and challenges.

Raghu Ramanathan, president of Legal Professionals at Thomson Reuters, recently hosted his inaugural podcast episode with Cindy Thurston Bare, chief data and innovation officer, and Kayla Kotila, senior knowledge & research services manager, from Frost Brown Todd.

Highlights from the conversation include:

Adoption and Use Cases: Frost Brown Todd has approximately 250 legal professionals actively integrating generative AI into their daily operations with impressive results. This adoption is not limited to specific tasks; instead, AI is being woven into existing workflows, signaling a fundamental shift in how legal work is conducted.

Measuring Success and Adoption: Success is measured by client satisfaction and efficiency improvements, and the firm uses qualitative feedback and quantitative methods, like A/B testing, to assess AI’s impact. Adoption is widespread across different demographics, with varying use cases depending on experience levels. Junior lawyers are leveraging AI to streamline certain tasks, while partners are using it to enhance their recall and decision-making processes.
Client-Centric Innovation and Collaboration: One key measure of success for technological implementation lies in its ability to enhance client service. From expediting document review processes to uncovering critical insights for litigation, AI enables the firm to deliver faster, more accurate, and ultimately more valuable services. The importance of open communication with clients regarding their AI initiatives is key. Sharing success stories, addressing concerns, and exploring potential applications collaboratively fosters trust and ensures that AI implementation aligns with client needs and expectations that benefit both parties.

Technology continues to move forward and law firms must prioritize a culture of innovation, continuous learning, and client-centricity to thrive in an increasingly complex and competitive environment. The future of law is not about replacing lawyers with machines but rather empowering them with the tools and knowledge to deliver exceptional legal service.

As part of the Clarity podcast series from the Thomson Reuters Institute, Ramanathan will speak with customers, industry experts and colleagues, bringing perspectives from legal leaders and subject matter experts shaping the industry. The conversations aim to highlight the innovations driving the legal profession as well as the people and organizations implementing new technologies and approaches to maintain a competitive edge in the rapidly changing market.

You can listen to the full conversation on either or .

Thomson Reuters Best Practices for Benchmarking AI for Legal Research

jeffrey.mccoy@thomsonreuters.com — Wed, 12 Feb 2025 15:38:20 +0000

At Thomson Reuters, we do an enormous amount of AI testing in our efforts to improve our customers’ ability to move through legal work faster and more effectively. We’ve noticed an increase in interest in AI testing generally, and in benchmarking AI applications for legal research specifically. We’ve learned a lot in our thousands of hours of AI testing, and we offer the following best practices for those considering an updated or differentiated approach to testing or benchmarking AI for legal research.

1. Test for the results you care about most.

This would seem obvious, but we’ve seen a lot of confusion about it, and if we could only make one recommendation, this would be it. It’s foundational for all other recommendations.

If you cared most about determining how long it takes to drive from one place to another, you wouldn’t just measure highway time, you’d measure total door-to-door time. If you cared most about car maintenance costs, you wouldn’t just measure the cost and frequency of brake repairs and maintenance.

With the use of AI for legal research, there are no LLMs nor any LLM-based solutions that offer 100% accuracy. Because of that, all answers generated by large language models or LLM-based solutions, even if they use Retrieval Augmented Generation (RAG), must be independently verified.

Some assume verification is a simple matter of checking the sources cited in an AI answer, but this is incorrect. We’ve seen plenty of examples where an AI-generated answer is wrong, and the cited sources simply corroborate the wrong answer. Verification requires using additional tools (like a citator, statute annotations, etc.) to ensure the answer is correct.

This means every time an AI-generated answer is used for research, there is a three-step process the researcher must engage in: (1) review the answer, (2) review the cited material from the answer, (3) use traditional research tools to make sure the answer and cited material are correct.

When we talk with researchers about research generally and this process specifically, what they care about most is (a) getting to a correct answer or understanding of the relevant law, and (b) the time it takes to get to that correct answer or understanding.

Because of this, the two most important measures are:

Percentage of times using this three-step process the user can get to the right answer, and
Time it takes to complete all three steps

Surprisingly, the percentage of errors in the step 1 answer can have very little impact on either the percentage of correct answers the researcher reaches using all three steps or the time to complete those steps (unless errors are excessive), as long as citations and links to primary law are good and those primary sources are current and easily verified. Focusing on step 1 is like trying to figure out door-to-door times by measuring only highway speeds. It’s not very useful.

For instance, which of the following systems would you rather use?

System where the initial AI answer is 92% accurate, but verification, on average, takes 18 minutes, and post-verification accuracy is 97%, or
System where the initial AI answer is 89% accurate, but verification, on average, takes 10 minutes, and post-verification accuracy is 99.9%

It’s a clear choice, but there is often a misplaced focus on measurement of the first step in the process to the exclusion of steps two and three. Measure what you care about most.
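To make the comparison above concrete, here is a minimal sketch that ranks systems on the two end-to-end measures rather than on step 1 alone. The `BenchmarkResult` class and the tie-breaking rule in `preferred` are illustrative assumptions, not part of any product or published methodology; the numbers are the two hypothetical systems from the example.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    system: str
    initial_accuracy: float  # accuracy of the AI answer alone (step 1)
    verify_minutes: float    # average verification time for the system
    final_accuracy: float    # accuracy after the researcher has verified


def preferred(results):
    """Pick the system with the highest post-verification accuracy,
    breaking ties by the shorter verification time (an assumed rule).
    Step 1 accuracy is recorded but deliberately not used for ranking."""
    return max(results, key=lambda r: (r.final_accuracy, -r.verify_minutes))


results = [
    BenchmarkResult("System 1", 0.92, 18.0, 0.97),
    BenchmarkResult("System 2", 0.89, 10.0, 0.999),
]
best = preferred(results)
print(best.system)
```

With these figures the second system is selected even though its initial answer is less accurate, because it wins on both measures that matter to the researcher.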

2. Use realistic, representative questions in your testing.

Presumably you want to evaluate AI for the typical legal research you or your organization does. For instance, if you look at the research your organization does and find the questions are roughly 20% simple questions, 60% medium complexity, and 20% very complex or difficult, and that roughly half are questions about IP law and half are about federal civil procedure, then a benchmark testing 90% simple questions about criminal law would not be very helpful to you.

At Thomson Reuters, we model our testing based on the real-world questions we see from our customers every month. For your own testing, focus on the question types that best represent the researchers you’re focused on.

Testing mostly simple questions with clear-cut answers is easiest, but if those questions don’t represent what your users do most (they don’t represent most AI usage in Westlaw well), then the results are not particularly helpful. Similarly, overly complex, extremely difficult, nuanced, or trick questions can be useful for testing the limits of a system, but they tend not to be very helpful for most real-world decision making.
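One way to turn a usage profile like the one described in this section into a test plan is to allocate questions in proportion to it. The sketch below is only an illustration: `allocate_questions` is a hypothetical helper, and it assumes complexity and practice area are independent, which may not hold for your organization.

```python
def allocate_questions(total, complexity_mix, topic_mix):
    """Split `total` questions across every (complexity, topic) pair.

    Assumes complexity and topic are independent, so each cell's share is
    the product of the two proportions. Rounding can leave the sum slightly
    off `total` for awkward sizes, so check the sum before using the plan.
    """
    return {
        (complexity, topic): round(total * c_share * t_share)
        for complexity, c_share in complexity_mix.items()
        for topic, t_share in topic_mix.items()
    }


# The example mix from this section: 20/60/20 by complexity, half IP,
# half federal civil procedure.
complexity_mix = {"simple": 0.2, "medium": 0.6, "complex": 0.2}
topic_mix = {"IP": 0.5, "federal civil procedure": 0.5}

plan = allocate_questions(200, complexity_mix, topic_mix)
for cell, count in plan.items():
    print(cell, count)
```

A benchmark whose cell counts look very different from such a plan is measuring something other than your typical research.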

3. Test a lot of questions.

In our own testing, we’ve found that testing small sets of questions is rarely representative of actual performance with a larger set. Large language models can generate different responses each time, even with identical inputs. Additionally, if responses are long and complex, graders may disagree, even when judging identical responses. For just a quick general sense of direction, it’s fine to test with a sample of questions as small as 100 or so, but for comparing algorithms/LLMs against each other, we strongly recommend checking the results as you grade and testing until the measure of interest stabilizes. For example, if you are running a comparison between two systems to see which is preferred, you would test until the rate at which one system is preferred over the other stops changing dramatically with each new batch of questions. Another guide to the number of questions you should test is the confidence level and interval you want (see next section).
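The "test until the measure stabilizes" idea can be sketched as follows. This is a rough illustration under assumed settings: the `window` and `tolerance` values are arbitrary placeholders, and each batch is simply a list of booleans recording whether one system was preferred on a question.

```python
def running_preference_rates(batches):
    """Cumulative share of questions on which system A was preferred,
    recomputed after each new batch of graded questions."""
    rates, wins, total = [], 0, 0
    for batch in batches:
        wins += sum(batch)
        total += len(batch)
        rates.append(wins / total)
    return rates


def has_stabilized(rates, window=3, tolerance=0.01):
    """True when the cumulative rate moved by no more than `tolerance`
    over the last `window` batches. Both thresholds are assumptions."""
    if len(rates) < window + 1:
        return False
    recent = rates[-(window + 1):]
    return max(recent) - min(recent) <= tolerance


# Toy data: True means system A was preferred on that question.
steady = [[True] * 6 + [False] * 4 for _ in range(5)]
swinging = [[True] * 10, [False] * 10]

print(has_stabilized(running_preference_rates(steady)))
print(has_stabilized(running_preference_rates(swinging)))
```

In practice you would append each newly graded batch and keep testing while `has_stabilized` still returns `False`.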

4. Calculate and report confidence levels and intervals.

Even with a relatively large set of questions, measurements of accuracy are only so precise. When using these measurements to make decisions, it’s important to understand how precise the measurement is, often expressed as a confidence level and confidence interval. You can think of confidence intervals and levels like the margin of error in surveys. They let you know how reliable or repeatable the measurement is expected to be.

For instance, if you tested AI accuracy with 200 questions and then ran the test again with the same questions and answers but different evaluators, or with the same evaluators but a different random, representative sample of 200 questions, would you expect the exact same result? Typically, you wouldn’t. You’d expect the result to fall within a certain range, so it’s important to report that range along with the results so decision makers understand which differences between algorithms/LLMs are meaningful and which are not. The proper way to report this is with confidence intervals and levels. Using standard assumptions, when measuring an error rate of 10% from a sample of only 100 questions, you can be about 95% confident that the true error rate is between 4.1% and 15.9%. That range is the confidence interval, 95% is the confidence level, and the “+/- 5.9%” is the margin of error. If you measure an error rate of 10% from a sample of 500 questions, the 95% confidence interval would be between 7.4% and 12.6%, or 10% +/- 2.6%.
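The figures in this section are consistent with the standard normal-approximation interval for a proportion. The sketch below assumes that formula with z = 1.96 for a 95% confidence level; the article does not say which method was used, and other methods (for example the Wilson interval) give somewhat different ranges for small samples.

```python
import math


def margin_of_error(error_rate, sample_size, z=1.96):
    """Half-width of the normal-approximation interval for a proportion.
    z=1.96 corresponds to a 95% confidence level."""
    return z * math.sqrt(error_rate * (1 - error_rate) / sample_size)


def confidence_interval(error_rate, sample_size, z=1.96):
    """Lower and upper bounds, clipped to the valid range 0..1."""
    moe = margin_of_error(error_rate, sample_size, z)
    return max(0.0, error_rate - moe), min(1.0, error_rate + moe)


for n in (100, 500):
    low, high = confidence_interval(0.10, n)
    print(f"n={n}: {low:.1%} to {high:.1%}")
```

Running this for a 10% error rate gives roughly 4.1% to 15.9% at 100 questions and 7.4% to 12.6% at 500 questions, matching the ranges quoted above.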

The basic calculation used to estimate a confidence interval assumes a perfect means of detecting the outcome you are trying to measure. If there is some uncertainty in that detection, e.g., if two independent evaluators disagree about the outcome some percentage of the time, then the margin of error increases. A grading process or measurement that’s unreliable ~5% of the time might increase the margin of error from 5.9% to 7.3% in our example above with 100 questions. It’s important to note that there are various methods for calculating standard error, and these examples make simplifying assumptions that likely underestimate the confidence intervals observed in practice.
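The article does not give the formula behind the 7.3% figure. One simple model that happens to reproduce it treats grading unreliability as a second, independent source of binomial variance added to the sampling variance. That model is an assumption made here for illustration, not a statement of the authors’ method.

```python
import math


def noisy_margin_of_error(error_rate, grader_error_rate, sample_size, z=1.96):
    """Margin of error when grading itself is unreliable.

    Assumption: grader mistakes are independent of sampling noise, so the
    two binomial variances simply add. Real grading errors may be
    correlated with question difficulty, which this sketch ignores.
    """
    sampling_var = error_rate * (1 - error_rate) / sample_size
    grading_var = grader_error_rate * (1 - grader_error_rate) / sample_size
    return z * math.sqrt(sampling_var + grading_var)


perfect = noisy_margin_of_error(0.10, 0.00, 100)
noisy = noisy_margin_of_error(0.10, 0.05, 100)
print(f"perfect grading: +/- {perfect:.1%}")
print(f"5% unreliable grading: +/- {noisy:.1%}")
```

Under this assumption, a 10% error rate measured on 100 questions has a margin of about 5.9% with perfect grading and about 7.3% when grading is unreliable 5% of the time.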

5. Use a combination of automated and manual evaluation efforts.

Having human evaluators pore through lengthy answers to complex questions can be difficult and time-consuming. Ideally, we would just have AI evaluate the accuracy and quality of answers generated by AI. This is sometimes referred to as LLM as judge. But in the same way that AI makes mistakes when generating an answer, it can also make mistakes when evaluating the quality of an answer against a gold-standard answer written by a human. In our experience, modern LLMs are pretty good at evaluating AI-generated answers against gold-standard answers when answers are clear and relatively short. With length and complexity, we’ve found the LLM as judge approach to be very unreliable.

For instance, research has shown that LLMs tend to struggle when evaluating responses to complex and challenging questions like those requiring expert knowledge, reasoning, and math.

Since most test sets will contain a sample of simple/easy/clear questions and answers, it makes sense to use AI for automated evaluation of these, then use human evaluators for the rest, at least until AI improves to the point where more can be automated.

6. For human grading, use two separate human evaluators for each answer, and have a third (ideally more experienced) evaluator to resolve conflicts.

For assessments like these, inconsistency between evaluators can be a real issue. In our own testing, we’ve found attorneys evaluating AI-generated answers for more complex legal research questions can disagree about the accuracy or quality of answers about 25% of the time, which makes single-grader evaluation unreliable. To improve reliability, we have two evaluators separately grade each answer, and where there are conflicts, we have a third, more experienced evaluator resolve the conflict.
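The two-grader workflow with a tie-breaking third evaluator can be sketched as below. The function name, the grade labels, and the `adjudicate` callback are hypothetical; in a real process the callback would be a request to the senior evaluator rather than a lookup table.

```python
def resolve_grades(grades, adjudicate):
    """Combine two independent grades per answer.

    grades: list of (answer_id, grade_a, grade_b) tuples.
    adjudicate: callable invoked only on conflicts; it stands in for the
    third, more experienced evaluator.
    Returns the final grade per answer and the share of answers on which
    the first two graders disagreed.
    """
    final, conflicts = {}, 0
    for answer_id, grade_a, grade_b in grades:
        if grade_a == grade_b:
            final[answer_id] = grade_a
        else:
            conflicts += 1
            final[answer_id] = adjudicate(answer_id, grade_a, grade_b)
    disagreement_rate = conflicts / len(grades) if grades else 0.0
    return final, disagreement_rate


grades = [
    ("q1", "correct", "correct"),
    ("q2", "correct", "incorrect"),
    ("q3", "incorrect", "incorrect"),
    ("q4", "incorrect", "correct"),
]
senior_decisions = {"q2": "incorrect", "q4": "correct"}  # toy data

final, rate = resolve_grades(grades, lambda aid, a, b: senior_decisions[aid])
print(final, rate)
```

Tracking the disagreement rate alongside the final grades also shows how noisy the grading is, which feeds into the margin of error discussed in section 4.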

7. When answers are wrong, investigate to see if the gold-standard answer might be wrong.

In the same way people make mistakes in evaluating answers, they can also make mistakes in coming up with the gold-standard answer for testing. In our experience, we’ve found some instances where the AI-generated answer was evaluated as incorrect when compared to the gold-standard answer, but when we dug into it further, it turned out the AI was correct and the person who put together the gold-standard answer was wrong. Sometimes AI makes mistakes and sometimes humans make mistakes – you should check both.

8. If evaluating multiple algorithms/LLMs/solutions, make sure the evaluators are blind to which algorithm/LLM/solution the answer was generated by.

In our evaluations we try to avoid human bias in grading. Sometimes an evaluator has had bad experiences or great experiences with a certain product or LLM in the past, and we don’t want them to bring that bias to the current evaluation, so when evaluating different solutions, we first strip away anything that would identify the source of the solution, so results are not biased by past positive or negative experiences.
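A minimal sketch of blinding, assuming each answer is stored as a dictionary with `system`, `question_id`, and `text` fields (hypothetical field names). It replaces the source with an opaque identifier and shuffles the order; it does not scrub product names that appear inside the answer text itself, which would need a separate step.

```python
import random


def blind_answers(answers, seed=0):
    """Return (blinded answers for graders, private key for unblinding).

    The seed only makes the shuffle repeatable for this example.
    """
    rng = random.Random(seed)
    shuffled = list(answers)
    rng.shuffle(shuffled)  # avoid a fixed system order within the grading queue

    blinded, key = [], {}
    for index, answer in enumerate(shuffled):
        blind_id = f"answer-{index:04d}"
        key[blind_id] = answer["system"]  # kept away from the graders
        blinded.append({
            "blind_id": blind_id,
            "question_id": answer["question_id"],
            "text": answer["text"],  # NOTE: the text is not scrubbed here
        })
    return blinded, key


answers = [
    {"system": "System 1", "question_id": "q1", "text": "Answer text one."},
    {"system": "System 2", "question_id": "q1", "text": "Answer text two."},
]
blinded, key = blind_answers(answers)
print(blinded)
```

Graders see only the `blinded` list; the `key` mapping is used after grading to attribute scores back to each system.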

9. Grade the value of answers in addition to making a binary determination of whether the answer has an error.

What’s right or wrong in an answer can vary enormously in terms of positive value and negative impact. For instance, consider the following answers:

A. Answer is correct in every way but is short and high level. It just gives a basic description of the legal issue as it relates to the question but doesn’t provide any references to primary or secondary law for verification, nor any nuance regarding exceptions or other considerations.

B. Answer is lengthy and nuanced, addressing multiple aspects of the question and discussing important exceptions that might apply, and it provides references with citations and links for verification. It’s correct in every way except that the date in one of the citations is incorrect, but that’s easily verified and corrected when clicking the link from the citation.

C. Answer is incorrect in every way and all its linked references point to primary law that simply corroborates the wrong answer.

If the evaluation is simply a binary view of the number of answers that contain an error, then answer A looks good and answers B and C look equally bad. In reality, answer C is far worse and more harmful than answer B, and Answer B is likely much more valuable to the researcher than answer A.

In our evaluations, we’re looking for answer attributes that are helpful to researchers, like depth of the answer and quality of the references, and we don’t just evaluate errors in a binary way. We consider answers that are totally wrong to be far worse than answers with erroneous statements in otherwise correct and helpful answers. Similarly, we assess erroneous statements based on whether they address the core question or are tangential to it, and whether they’re contradicted elsewhere in the answer or easily verified with the linked references. We’d like to eradicate all errors, of course, but some are more harmful than others.
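To illustrate the difference between binary and graded evaluation, here is a toy rubric applied to answers A, B, and C from this section. The 0 to 2 scales, the penalty weights, and the scores assigned to each answer are all hypothetical values chosen for illustration; they are not the rubric used in the authors’ evaluations.

```python
# Hypothetical penalties: a minor, easily verified error costs little,
# while an error on the core question outweighs any other merit.
ERROR_PENALTY = {"none": 0, "minor": 1, "major": 6}


def has_error(worst_error):
    """The binary view: does the answer contain any error at all?"""
    return worst_error != "none"


def answer_value(depth, reference_quality, worst_error):
    """The graded view: depth and reference_quality are scored 0 to 2."""
    return depth + reference_quality - ERROR_PENALTY[worst_error]


answers = {
    "A": {"depth": 1, "reference_quality": 0, "worst_error": "none"},
    "B": {"depth": 2, "reference_quality": 2, "worst_error": "minor"},
    "C": {"depth": 1, "reference_quality": 0, "worst_error": "major"},
}

binary_view = {name: has_error(a["worst_error"]) for name, a in answers.items()}
ranking = sorted(answers, key=lambda n: answer_value(**answers[n]), reverse=True)
print(binary_view)
print(ranking)
```

The binary view lumps B and C together as containing an error, while the graded view ranks B above A and puts C last, which is the ordering argued for above.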

10. Look for errors beyond gold-standard answers.

Often LLMs generate answers with information beyond the scope of a gold-standard answer. For instance, the gold-standard answer might say the answer should state that the answer to the question is no, and it should explain that with X, Y, and Z, and it should specifically cite to cases A & B and statute C.

The LLM-generated answer might state the answer is no and explain X, Y, and Z with references to A, B, and C, but it might also add a few statements about exceptions or related issues or an additional case or statute. Sometimes these additional statements are incorrect, even when everything else is correct. So, if an LLM-as-judge or human evaluator only looks at the gold-standard answer to see if the AI-generated answer is correct, that evaluation can miss errors in the additional material. This means evaluators need to do independent research beyond simply looking at the gold-standard answers to determine if an answer has an error.

11. Consider testing reliability.

LLMs often have some randomness built into them. Many have a temperature setting that can be used to minimize or eliminate this, making answers more consistent when asking the same question multiple times.

But some LLMs are better at this than others, and some integrated solutions that use LLMs in conjunction with other techniques, like RAG, deliberately don’t set temperature low, in order to allow for more creativity in answers.

For big decisions you might be making, consider testing reliability by running the same question 20 times and seeing if any of the answers are substantially worse than the other answers to the same question.
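A repeated-run check like the one described above might look like this once each of the 20 answers has been graded. The 1 to 5 quality scale and the `max_drop` threshold are assumptions for illustration; "substantially worse" is a judgment call that a single number only approximates.

```python
import statistics


def flag_weak_runs(scores, max_drop=1.0):
    """Return the indexes of runs whose score falls more than `max_drop`
    below the median score for the same question."""
    typical = statistics.median(scores)
    return [i for i, score in enumerate(scores) if typical - score > max_drop]


# Toy data: graded quality of 20 answers to the same question.
scores = [4, 4, 5, 4, 4, 4, 5, 4, 4, 4, 4, 2, 4, 5, 4, 4, 4, 4, 4, 4]
weak = flag_weak_runs(scores)
print(weak)
```

Any flagged run is worth reading in full, since a system that is usually good but occasionally much worse carries a risk that an average score hides.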

The above are our best practices and learnings from our extensive experience with AI, Gen AI and LLMs over the past 30 years. At Thomson Reuters, we put the customer at the heart of each decision we make, and we are transparent that, at the point of use, all our AI-generated answers must be checked by a human.

As we work through testing our AI products, our teams do not follow each of these steps for every test we do. Sometimes we prioritize speed over accuracy of testing, or vice versa, but we make sure we clearly understand the trade-off in prioritizing some of these steps and communicate it to our teams. The bigger and more important the decision we’re trying to make, the more of these steps we follow.

This is a guest post from Mike Dahn, head of Westlaw Product, and Dasha Herrmannova, senior applied scientist, from Thomson Reuters.

The $28 Billion Rise of Alternative Legal Services Providers and a Looming Bifurcation in the Legal Market

jeffrey.mccoy@thomsonreuters.com — Mon, 03 Feb 2025 00:05:11 +0000

The legal landscape is changing, and a new report reveals just how significant the shift is becoming. According to the Alternative Legal Services Providers 2025 Report, released by Thomson Reuters, the ALSP market has ballooned to an estimated $28.5 billion, fueled by an 18% compound annual growth rate from 2021 to 2023. This growth signifies a powerful trend: the increasing adoption and reliance on ALSPs by both corporate legal departments and traditional law firms.

One of the most intriguing aspects of this growth is the role of technology, particularly generative AI (GenAI). The report found that a significant percentage of both law firms (35%) and corporate legal departments (40%) find ALSPs leveraging GenAI to be more appealing.

However, the rise of ALSPs and the adoption of AI is also creating a division within the legal market. The report points to a bifurcation emerging between forward-looking legal entities embracing alternative delivery models and those clinging to traditional practices. This divide is especially notable as many corporate legal departments are signaling a decrease in spending with law firms hesitant to adapt to these evolving client expectations.

“The legal industry is going through significant transformation, driven by the adoption of GenAI technology,” stated Laura Clayton McDonnell, president of Corporates at Thomson Reuters. “As legal departments become more sophisticated in their use of technology, they will increasingly expect their law firms and alternative legal service providers to deliver tech-enabled services that meet their evolving needs, driving a wave of innovation and efficiency across the entire legal industry.”

This doesn’t mean traditional law firms are or will be obsolete. Many are successfully integrating ALSPs into their workflows, recognizing the value of their specialized expertise and cost-effectiveness. The key takeaway is clear: adaptability is critical. Those who embrace innovation, whether through incorporating ALSPs, leveraging AI, or adopting other technological advancements, are better positioned for success in this evolving landscape.

The legal market is at a crossroads. The future belongs to those who are willing to adapt and evolve, embracing new technologies and service models to deliver greater value and efficiency to their clients. The Alternative Legal Services Providers 2025 Report serves as a roadmap, highlighting the trends shaping the future of the legal profession.
