Make a Model Context Protocol (MCP) server for Case Law API #88
Replies: 15 comments 15 replies
-
This is cool. Two quick thoughts: I assume organizations that are making their databases available aren't doing so for free (or at least they aren't always doing so for free). Is there a method here for compensating databases that are utilized? @rachlllg, have you been following this and do you have thoughts?
-
Another request for this just posted yesterday: freelawproject/courtlistener#6057. It notes that there are three unofficial CourtListener MCP servers now. I haven't checked them out at all.
-
I agree that the AI tools market is evolving at an incredible pace. We're seeing new capabilities and applications emerge constantly. This rapid evolution means that even the major players in the AI-legal space, like Harvey, Westlaw, and Casetext, will likely need to adapt and evolve their product offerings continuously. As the underlying AI models become more powerful and AI-assisted workflows become more refined, the legal tech landscape will continue to shift.

It's also worth noting that many general-purpose AI tools can be adapted for ad-hoc legal workflows with relatively minor adjustments. This flexibility lowers the barrier to entry for legal tech innovation and opens up many possibilities.

In this context, a CourtListener MCP server could be a very strategic move. While its shelf life might be limited as the market matures, its immediate value would be in significantly increasing the exposure and usage of the CourtListener API. By making it easier for developers and researchers to experiment with AI-legal workflows, it could foster a vibrant ecosystem of innovation around CourtListener's data and services.
-
Looks like another MCP is probably here: https://github.com/beshkenadze/us-legal-tools (I haven't checked it out yet.)
-
Hey! I'm the author of https://github.com/beshkenadze/us-legal-tools and https://github.com/beshkenadze/eyecite-js. Both projects are under active development, so I wouldn't consider them production-ready yet due to ongoing changes in their architecture. Also, because some of the APIs lack proper OpenAPI v3 documentation, there may be occasional inaccuracies in how the official APIs have been translated into OpenAPI v3. Feel free to reach out if you have any questions about the projects!
-
Got one more request for this today in my call with https://github.com/freelawproject/crm/issues/932
-
Found an MCP server for Brazilian law among the list of community MCP servers.
-
And another MCP implementation is here: https://github.com/khizar-anjum/courtlistener-mcp. We'd better get on this. :)
-
Another request today.
-
We have begun designing our MCP. If anybody here has any input they'd like to share as we do this (things we might not think of or lessons learned), please chime in! We'll be posting a contract position to build this for CL once we've got the design figured out.
-
People in law firms and on in-house teams would appreciate instructions on how to use a CL MCP with Microsoft Copilot, which many of those folks have (and may be limited to).

Consider having both technical and non-technical docs: technical docs to help the IT team implement it and to address the usual security-type concerns; non-technical docs to explain to the legal folks why they should care, how to ask IT to set it up, and how to use it once it is set up. See, e.g., this article on using MCP with the Copilot Studio agent builder.

This is likely to be a pain in the you-know-what, so I wouldn't prioritize it for initial launch!
-
Adding lessons learned from a CourtListener MCP I'm currently building (not released yet), focused specifically on citation validation and hallucination detection, and from several USPTO MCPs, that are directly relevant to the design conversation here. I presented on MCP in legal practice at an ILTA webinar late last year, including a live demo of a citation validator built on @blakeox's courtlistener-mcp and a 100+ page guide on setting up MCP servers in legal environments, so this is something I've thought about at length. I built on prior community work from @JamesANZ and @blakeox, whose repos were valuable source material.

I reduced the tool count to 6 and use docstrings and return responses to guide the LLM to the same workflow without a lengthy system prompt: 6 focused tools vs. the 33 in the original, which matters when every token of system prompt is competing with your context window. My implementation isn't public yet but will be released soon.

**Microsoft Copilot Studio requires Streamable HTTP, not STDIO**

@anseljh is right that this is painful. Copilot Studio only connects to MCP servers over HTTPS with the Streamable HTTP transport. It cannot talk to a locally running STDIO process the way Claude Desktop can. This means you need a publicly reachable web server: Docker plus a public URL, a cloud deployment, an MCP gateway, or an MCP proxy.

Dev tunnels (VS Code tunnels, ngrok, etc.) can expose a local server publicly, but managing one tunnel per user is operationally complex and fragile, and it introduces real security concerns. Each tunnel is a publicly reachable endpoint exposing access to a user's CourtListener API key and potentially their local environment, with no enterprise auth layer, audit logging, or centralized revocation. Not realistic, and not acceptable, for a law firm.

An MCP gateway (such as Cloudflare's, or self-hosted via LiteLLM Proxy, which includes a full MCP gateway) is the cleanest enterprise path. It handles HTTPS termination; manages MCP access by key, team, and organization (directly solving the per-user API key problem); bridges all three transports, so a STDIO server can be exposed to Copilot over Streamable HTTP without rewriting it; and provides the audit logging that dev tunnels lack entirely. The catch is that properly documenting a CourtListener MCP deployment then stops being about how to use the MCP and becomes a guide to setting up and operating an MCP gateway, a meaningfully different and more complex problem that will be out of reach for most legal teams without dedicated IT support. It should be a first-class design consideration from day one, not an afterthought.

**The citation-lookup throttle makes shared multi-tenant deployments impractical**

This is the bigger structural issue. The citation-lookup API has an additional throttle of 60 valid citations per minute per API key, and CourtListener API keys are issued for individual use. In a firm deploying a shared MCP endpoint used by 20 attorneys, all requests share one key and one rate limit, which collapses immediately under any real workload. The only architecturally sound workaround is one API key per user, each routed through their own server instance. With dev tunnels that means one tunnel per user; with an MCP gateway it means per-user key injection configured at the gateway level. Either way, this is genuinely out of reach for non-technical deployments and arguably defeats the point of a shared enterprise tool. Any official FLP MCP design should address this explicitly, either with per-user key auth at the gateway level or with clear documentation that citation-lookup at scale requires individual deployments.

**Context window flooding is real, and progressive disclosure with API-level field filtering is the answer**

@anya2975 is exactly right that agents will enthusiastically call tools multiple times and pile up results. The pattern that works is progressive disclosure: return minimal identifying fields in search results, then let the agent request full detail on a specific record only when needed. My USPTO Patent File Wrapper MCP does this with configurable field tiers enforced at the API request level, not post-processing. Certain heavy fields are excluded below the full tier. For citation validation specifically, this matters because a brief can contain dozens of citations.

Happy to share more once we release, and happy to answer questions in the meantime.
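To make the field-tier idea concrete, here is a minimal sketch of filtering enforced at the API request level rather than in post-processing. The tier names, field names, and the `fields` query parameter are illustrative assumptions, not the actual USPTO or CourtListener schema; check them against the real API docs.

```python
# Sketch of progressive disclosure via API-level field filtering.
# ASSUMPTION: tier names and field names below are illustrative only.

FIELD_TIERS = {
    "summary": ["id", "caseName", "citation", "dateFiled", "court"],
    "detail": ["id", "caseName", "citation", "dateFiled", "court",
               "status", "judges", "docketNumber"],
    "full": [],  # empty list means no filter: return every field
}

def build_search_params(query: str, tier: str = "summary",
                        page_size: int = 5) -> dict:
    """Build query params so filtering happens in the API request itself
    and heavy fields never reach the model's context window."""
    params = {"q": query, "page_size": page_size}
    fields = FIELD_TIERS[tier]
    if fields:
        params["fields"] = ",".join(fields)
    return params
```

The design point is that the server never downloads the heavy fields in the first place, so no amount of enthusiastic tool calling by the agent can flood the context with them.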
-
+1. The biggest resource to optimize for in an MCP server is context. This means we cannot follow the API convention of returning everything that matches the request. The results of each tool call need to be informative enough that the model can move forward, but precise enough that they don't overfill the context window. @john-walkoe points out the right strategy here: API-level field filtering.

On the subject of hosting the MCP server, Smithery is one option to consider. It functions as a registry and discovery platform for MCPs and stays current with the MCP specification. Users can connect their favorite MCP-compatible client to deployed servers without FLP needing extensive DevOps configuration. However, Smithery's free hosting tier has been discontinued, so this would require a paid plan. If the MCP server requires API key management, such as a CourtListener API key, that would fall on the end user; that is what I did when I hosted my MCP server on Smithery.
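One simple way to keep tool results "informative but precise" is to cap what each call returns and tell the model how to narrow further, instead of paging everything into context. A minimal sketch; the payload shape here is my own invention, not an MCP convention:

```python
def format_tool_result(results: list[dict], limit: int = 5) -> dict:
    """Cap a search tool's payload and steer the model toward narrowing
    the query rather than paging through every match."""
    shown = results[:limit]
    remaining = len(results) - len(shown)
    payload = {"results": shown, "count_shown": len(shown)}
    if remaining > 0:
        payload["note"] = (
            f"{remaining} more matches were omitted. Narrow the query, or "
            "fetch one record by id instead of requesting the next page."
        )
    return payload
```

The "note" field doubles as in-band guidance: because the model reads tool output, a short instruction in the result often works as well as a longer system prompt.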
-
A thought from an FLP board member, @nadahlberg:
-
Following up on my earlier comment: the CourtListener citation validation MCP I mentioned is now public: https://github.com/john-walkoe/courtlistener_citations_mcp

It's focused specifically on citation validation and hallucination detection rather than general case law search. The Mata v. Avianca brief (the canonical AI hallucination case) was used as the primary test; the demo video in the README shows it in action. A few things that came out of building it may be relevant to FLP's design:

Also submitted to the Docker MCP Catalog today (PR #1517, pending review), which would make it one-command deployable for firms already running Docker Desktop.

On @nadahlberg's usage communication point: fully agree. Silent throttle failures are particularly confusing with citation-lookup, since the limit counts valid citations, not requests. A brief with 60 real cases burns the entire minute's quota in one call, with no feedback to the user about why it paused.

It's MIT licensed; FLP is welcome to use it or any parts of it as reference or foundation for the MCP you're designing. If it does inform the work, a mention would be appreciated but not required. Happy to answer questions or share anything that might be useful for FLP's design.
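The "one call silently burns the minute's quota" failure mode can be guarded against client-side. A hypothetical sketch, assuming the documented limit of 60 valid citations per minute per key; the class and batching policy are mine, not part of any released MCP:

```python
import time
from collections import deque

class CitationBudget:
    """Track how many valid citations this API key has spent in the
    trailing 60 seconds, so throttling is explicit instead of silent."""

    def __init__(self, per_minute: int = 60, clock=time.monotonic):
        self.per_minute = per_minute
        self.clock = clock          # injectable for testing
        self.events = deque()       # timestamps of validated citations

    def _prune(self):
        cutoff = self.clock() - 60.0
        while self.events and self.events[0] < cutoff:
            self.events.popleft()

    def available(self) -> int:
        self._prune()
        return self.per_minute - len(self.events)

    def record(self, valid_count: int):
        """Call after each lookup with the number of *valid* citations
        it contained, since that is what the throttle counts."""
        now = self.clock()
        for _ in range(valid_count):
            self.events.append(now)

def plan_batches(citations: list[str], budget: CitationBudget) -> list[list[str]]:
    """Split a brief's citations into batches sized to the current budget,
    so the server can report 'pausing for quota' instead of failing."""
    size = max(budget.available(), 1)
    return [citations[i:i + size] for i in range(0, len(citations), size)]
```

With this, the MCP tool can return an explicit "validated 5 of 65 citations; resuming in 42s" message, which addresses the silent-failure confusion described above.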

-
IDEA:
Anthropic created the Model Context Protocol (MCP), released in November 2024, which makes it easy to connect LLMs to data sources and tools. It's been gaining traction: Google just announced it will integrate it into Gemini, with OpenAI pledging the same weeks earlier. (https://techcrunch.com/2025/04/09/google-says-itll-embrace-anthropics-standard-for-connecting-ai-models-to-data/)
I think it's time for the Free Law Project to make an MCP server to connect your case law API to LLMs like Gemini.
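For a sense of scale: at its core, an MCP server just answers JSON-RPC requests such as `tools/list`, describing what the model is allowed to call. A hypothetical sketch of how a CourtListener server might advertise a single search tool; the tool name and schema are illustrative, not an FLP design:

```python
import json

def tools_list_response(request_id: int) -> str:
    """JSON-RPC 2.0 response to MCP's tools/list method, advertising one
    hypothetical case law search tool to the connected LLM client."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "tools": [{
                "name": "search_case_law",
                "description": "Search CourtListener opinions by keyword.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            }]
        },
    })
```

In practice one would build this with an MCP SDK rather than raw JSON-RPC, but the point is that the surface area is small: describe the tools, execute them, return text the model can read.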
Rationale:
I think once frontier LLM models have direct access to comprehensive case law databases, the subscription costs of professional case law platforms will be much harder to sell. The value of the Free Law Project would become more apparent, and upon experiencing the benefits of being able to use their preferred frontier model with FLP's case law database, people (nonprofits, legal professionals, citizens) might see now as the opportune time to provide more resources and support to the FLP so that developments in AI can be fully leveraged to our collective benefit.
I think it's just a matter of time until people start building their own personal AI agents by uploading the databases, textbooks, and other sources of information relevant to their work to Google Drive, with Gemini then treating the designated folder as one giant knowledge base. Eventually that repository will hold not just the files you upload, but also the data you generate while using the AI agent. People will build their experts by adding databases and tools, and by giving direct instruction through discussing and exploring subjects and materials. I gather one can already do many of these things using Drive and Google Cloud: the Free Law Project's case law database could be uploaded to Drive and turned into a RAG database in Google Cloud that Gemini can use via the Gemini API.
Gemini 2.5 Pro Deep Research (released yesterday) is very impressive and useful. Can you imagine how amazing it would be to connect it to a case law database optimized for AI LLMs? That's the future I see coming, and it doesn't have to cost $700 a month.