Our client delivers innovative software solutions to the global legal industry, where speed and insight matter, but confidentiality, jurisdictional control, and regulatory compliance are non-negotiable.
They had already demonstrated the potential of AI through an early proof of concept that used large language models to summarise legal documentation, extract key topics, identify actors and role players, and allow users to explore case material conversationally. The system dramatically reduced the manual effort involved in building context around complex cases and trials.
But as the client prepared to move from proof of concept to production, critical challenges emerged. The original implementation relied on OpenAI, routing sensitive legal data through the United States, restricting model choice, and creating compliance and cost concerns for customers operating across Europe, Canada, the Middle East, and Asia-Pacific.
To take this solution to market, the client needed an architecture that could be trusted at scale. The BBD team combined large language models (LLMs), generative AI, AWS cloud services, Retrieval Augmented Generation (RAG) and a model abstraction layer to remove platform constraints and support long-term evolution across regions.
Objectives
- Replace OpenAI with a secure, regionally compliant AI foundation
- Re-engineer the solution using RAG as the foundation to ensure AI outputs were grounded only in approved legal documents
- Maintain strict data sovereignty across multiple geographic regions
- Enable flexibility in model selection and execution to manage cost, performance and platform limitations
- Introduce a model abstraction layer to improve cost visibility, reduce dependency on a single AI platform, and future-proof the solution
- Move the solution into production incrementally, informed by real customer use
- Ensure the solution was architected to be GDPR-aware by design, not merely by policy
- Deliver real, incremental value to legal professionals while reducing operational and compliance risk
Benefits
A key strength of the solution lies in its flexibility. The client can:
- Select the most appropriate model for each AI task
- Adopt improved foundation models without re-architecture or dependency on a single AI platform
- Continue evolving the platform without introducing new compliance risk
The architecture also delivers:
- Reduced compliance and data exposure risk through per-document retrieval and the absence of a global knowledge store
- More predictable performance and cost as usage scales, driven by constrained retrieval and reuse of document embeddings
- Increased trust in AI-generated insights through grounded, document-based responses and reduced hallucination risk
- Reduced risk from platform limitations by decoupling AI capabilities from any single managed service
- More effective prediction and management of AI costs, through per-prompt tracking and improved observability once the LiteLLM abstraction layer was in place
As generative AI capabilities advance, the platform is positioned to evolve in step, without sacrificing the controls required in regulated legal environments.
Overview and approach
Rather than creating a broad or persistent AI knowledge base, BBD implemented a deliberately constrained RAG architecture. Workspaces act only as a selection and access boundary; vector embeddings remain scoped per document and are never shared across workspaces, cases or customers.
Legal documents are processed and prepared for analysis within defined workspaces. When a user asks a question, the platform retrieves only the most relevant content from the specific documents the user has explicitly selected, and supplies that information to the language model to generate a response.
This approach ensures that:
- Responses are grounded in source material, not general model knowledge
- No information is retrieved across cases, projects or customers
- Sensitive data is never reused or exposed outside its intended context
- The risk of hallucination is significantly reduced
By design, the solution trades unbounded intelligence for precision, transparency and trust, aligning closely with the expectations of legal professionals and regulated environments.
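To make that boundary concrete, the sketch below models a workspace purely as a selection scope over per-document indexes. This is a minimal illustration only; the names (`Workspace`, `DocumentIndex`) and structure are assumptions, not the client's actual data model.

```python
# Illustrative data model only: names and structure are assumptions,
# not the client's actual implementation.
from dataclasses import dataclass, field

@dataclass
class DocumentIndex:
    """Embeddings scoped to a single document; never merged or shared."""
    document_id: str
    chunks: list[str] = field(default_factory=list)
    vectors: list[list[float]] = field(default_factory=list)

@dataclass
class Workspace:
    """A selection and access boundary only, not a knowledge store."""
    workspace_id: str
    indexes: dict[str, DocumentIndex] = field(default_factory=dict)

    def retrieval_scope(self, selected_ids: list[str]) -> list[DocumentIndex]:
        # Retrieval only ever sees the documents the user explicitly selected;
        # there is no global or cross-workspace index to fall back on.
        return [self.indexes[d] for d in selected_ids if d in self.indexes]
```

Because retrieval can only be constructed from an explicit selection, cross-case or cross-customer leakage is prevented structurally rather than by policy.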
Secure, region-aware rollout
Trial and court case information is naturally highly sensitive, and data sovereignty regulations vary significantly across the regions the client serves. Canadian and European data cannot transit through the United States, while additional regional constraints apply in the Middle East and Asia-Pacific.
To address this, the platform was built on AWS Bedrock, ensuring that all AI execution and data processing remain within AWS-managed, regionally controlled infrastructure.
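As a minimal illustration of region pinning, the sketch below binds a Bedrock runtime client to a single AWS region so that every model invocation, and the document content it carries, stays within that jurisdiction. The region and model ID are examples for illustration; the case study does not disclose the production configuration.

```python
# Sketch only: region and model ID are illustrative assumptions.
import json
import boto3

# One client per regulated region; EU data is processed only by EU endpoints.
bedrock_eu = boto3.client("bedrock-runtime", region_name="eu-central-1")

def invoke_in_region(prompt: str) -> str:
    """Send a prompt to a model hosted entirely within the pinned region."""
    response = bedrock_eu.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    return json.loads(response["body"].read())["content"][0]["text"]
```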
The rollout followed a phased approach. The first production deployment focused on Europe, with Canadian customers federated into that environment as an MVP. Additional regions were subsequently established for the Middle East and Asia-Pacific, including Singapore and Hong Kong. This allowed the client to validate the solution with real customers, gather feedback, and refine functionality continuously before expanding further.
Within the first phase, the platform delivered:
- Intelligent document summarisation
- Topic and entity extraction across complex legal material
- Conversational Q&A grounded strictly in retrieved document content
- Secure, compliant AI capabilities across multiple jurisdictions
Technical overview
At the core of the solution is a Retrieval Augmented Generation (RAG) pipeline designed specifically for regulated, multi-region environments.
Documents uploaded into the platform are processed, normalised and split into optimised text chunks. Each chunk is embedded and stored for reuse, improving performance and cost efficiency while maintaining strict scope boundaries.
Rather than storing embeddings in a shared or global index, vector embeddings are scoped per document, not per tenant or platform. There is no cross-project or cross-customer retrieval. This architectural decision significantly reduces GDPR exposure, limits data gravity, and lowers operational complexity.
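A simplified version of that ingestion step might look like the following: a minimal sketch assuming fixed-size overlapping chunks, an in-memory embedding cache and the Amazon Titan embedding model, none of which the case study specifies. It reuses the region-pinned `bedrock_eu` client from the earlier sketch.

```python
# Sketch only: chunk size, overlap, cache and embedding model are assumptions.
import hashlib
import json

_embedding_cache: dict[str, list[float]] = {}  # keyed by chunk hash for reuse

def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    # Overlapping windows reduce the chance of splitting a clause mid-thought.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed_chunk(chunk: str) -> list[float]:
    key = hashlib.sha256(chunk.encode()).hexdigest()
    if key not in _embedding_cache:  # embed once, reuse across later queries
        resp = bedrock_eu.invoke_model(
            modelId="amazon.titan-embed-text-v2:0",
            body=json.dumps({"inputText": chunk}),
        )
        _embedding_cache[key] = json.loads(resp["body"].read())["embedding"]
    return _embedding_cache[key]
```

Caching embeddings per chunk is what allows the same document to serve many queries without being re-embedded, which is where the performance and cost gains come from.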
At query time, the user’s question is embedded using the same model as the document content, and a similarity search is performed only against the selected documents. The most relevant content is then injected into a guarded prompt, ensuring the language model generates responses grounded exclusively in retrieved source material.
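Continuing the illustrative sketches above, the query path can be expressed as a cosine-similarity search restricted to the user's selected documents, followed by a guarded prompt. The prompt wording and `top_k` value are assumptions, not the production template; `DocumentIndex`, `embed_chunk` and `invoke_in_region` come from the earlier sketches.

```python
# Sketch only: prompt wording and top_k are illustrative assumptions.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def answer(question: str, selected: list[DocumentIndex], top_k: int = 5) -> str:
    q_vec = embed_chunk(question)  # same embedding model as the documents
    scored = [
        (cosine(q_vec, vec), chunk)
        for doc in selected  # only the explicitly selected documents
        for chunk, vec in zip(doc.chunks, doc.vectors)
    ]
    context = "\n\n".join(chunk for _, chunk in sorted(scored, reverse=True)[:top_k])
    guarded_prompt = (
        "Answer using ONLY the excerpts below. If the answer is not present, "
        "say that it cannot be found in the selected documents.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )
    return invoke_in_region(guarded_prompt)  # region-pinned call from earlier
```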
All model execution is handled via AWS Bedrock within regionally controlled infrastructure. As the platform evolved, an abstraction layer was introduced between the platform and the underlying models, including a LiteLLM-based operator which decoupled the solution from the limitations of any single managed AI service. This layer enables different foundation models to be used for different tasks such as summarisation, extraction or conversational analysis, while providing improved cost visibility through per-prompt tracking. It avoids long-term vendor lock-in and allows the platform to evolve as models and providers continue to mature.
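As a simple illustration of what such a layer enables, the sketch below uses LiteLLM's `completion` API to route different tasks to different Bedrock-hosted models behind one call site, with `completion_cost` providing the per-prompt cost figure. The task-to-model mapping is hypothetical; the case study does not name the production models.

```python
# Sketch only: the task-to-model mapping is a hypothetical example.
from litellm import completion, completion_cost

TASK_MODELS = {
    "summarise": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    "extract":   "bedrock/anthropic.claude-3-haiku-20240307-v1:0",
    "chat":      "bedrock/meta.llama3-70b-instruct-v1:0",
}

def run_task(task: str, prompt: str) -> tuple[str, float]:
    # One uniform call site: swapping models or providers means changing a
    # model string, not re-architecting the platform.
    response = completion(
        model=TASK_MODELS[task],
        messages=[{"role": "user", "content": prompt}],
    )
    cost = completion_cost(completion_response=response)  # per-prompt cost
    return response.choices[0].message.content, cost
```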
This abstraction layer addressed real-world production constraints encountered during scale-up, and has since been open-sourced to support broader adoption.
Responses are generated synchronously and treated as advisory, reinforcing the system’s role as a decision-support tool rather than an authoritative source.
The outcome
By introducing RAG as the foundation of the platform, BBD helped the client move confidently from proof of concept to a production-ready, globally deployable solution.
The result is an AI-enabled legal platform that:
- Respects data sovereignty by design
- Delivers explainable, grounded AI responses
- Scales across regions without increasing compliance risk
- Provides meaningful productivity gains to legal professionals
In an industry where trust is paramount, this project demonstrates how modern technologies can unlock real value when engineered responsibly, not opportunistically.
Impact of BBD’s partnership
BBD’s partnership enabled the client to move a promising AI concept into production with confidence. Beyond replacing a single technology provider, the engagement established a scalable, GDPR-aware RAG architecture, strengthened by a model abstraction layer that removes platform constraints and supports innovation today while remaining adaptable for the future.
Together, BBD and the client delivered a solution that balances innovation with responsibility, proving that powerful AI and regulatory trust do not have to be at odds when they are designed that way.