Launch of new Legal Search Model Noxtua Voyage Embed
Xayn launches new Legal Embedding Model with Stanford spin-off Voyage AI and German legal database dejure.org
International collaboration with combined AI and legal expertise
Noxtua Voyage Embed outperforms OpenAI’s most powerful search model on average by a factor of 2 on legal text benchmarks
Noxtua Voyage Embed can be used to build custom AI search solutions
BERLIN/STANFORD, September 20, 2024 – The Oxford legal AI spin-off Xayn, the Stanford spin-off Voyage AI, and the German legal database dejure.org are jointly launching Noxtua Voyage Embed, a fine-tuned legal search embedding model specialized in European and German law. The search model was trained on a dataset of around 20 billion tokens of legal texts from dejure.org and from Noxtua, Europe's first sovereign Legal AI, which Xayn developed with legal expertise from the international law firm CMS. The newly released legal search model can process a context length of up to 32k tokens, which makes it well suited to legal texts, as these tend to be lengthy.
To validate its quality, Noxtua Voyage Embed (voyage-law-2-xayn) was tested on several legal text benchmarks. It scored more than 25 points (or 1.7 times) higher than OpenAI's most powerful search model (text-embedding-3-large) in search accuracy and 2.2 times higher in ranking quality, while using a 3x smaller dimensionality (3x smaller storage and latency in vector-based search).
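The link between dimensionality and storage can be illustrated with a short back-of-the-envelope calculation. The sketch below is not part of the benchmark; it assumes 3072 dimensions for text-embedding-3-large (its publicly documented default), 1024 dimensions for Noxtua Voyage Embed (inferred from the "3x smaller" figure), 32-bit floats, and one vector per document, with 2 million documents chosen purely for illustration.

```python
# Rough storage estimate for a vector index.
# All figures below are illustrative assumptions, not published numbers.

BYTES_PER_FLOAT32 = 4


def index_size_bytes(num_vectors: int, dimensions: int,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Raw size of the stored vectors, ignoring index overhead."""
    return num_vectors * dimensions * bytes_per_value


NUM_DOCUMENTS = 2_000_000   # hypothetical corpus size, one vector per document
LARGE_DIM = 3072            # assumed: text-embedding-3-large default
SMALL_DIM = 1024            # assumed: inferred from "3x smaller dimensionality"

large_bytes = index_size_bytes(NUM_DOCUMENTS, LARGE_DIM)
small_bytes = index_size_bytes(NUM_DOCUMENTS, SMALL_DIM)
ratio = large_bytes / small_bytes

print(f"3072-dim index: {large_bytes / 1e9:.2f} GB")
print(f"1024-dim index: {small_bytes / 1e9:.2f} GB")
print(f"Storage ratio:  {ratio:.1f}x")
```

Because a brute-force similarity search touches every stored value once per query, the same ratio gives a first approximation of the difference in per-query compute.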
This international partnership combines deep AI expertise based on the latest AI research with in-depth legal specialization:
Voyage AI is a leading developer of customized embedding models and LLM retrieval/search infrastructure and has already gained extensive experience in building custom legal embeddings through its work with Harvey.AI, a US legal AI platform.
dejure.org is one of the most used legal services in Germany with around 15 million hits per month and a database with more than 2 million court decisions and the most relevant laws in practice. For over twenty years, dejure.org has stood for the consistent combination of legal expertise with the integration and networking of legal information and technology.
Berlin-based Xayn offers the perfect combination of deep AI and legal insights with Europe's first sovereign Legal AI Noxtua, which it developed with the internationally renowned law firm CMS to provide lawyers with a legally competent and compliant AI assistant. Hosted in the EU, Noxtua meets not only GDPR requirements but also high legal standards of professional confidentiality and privacy.
Embedding models are used to build retrieval-augmented generation (RAG) systems, which are needed to create efficient and reliable AI search solutions. They complement more traditional strategies such as keyword search by finding items based on their semantic meaning. Noxtua Voyage Embed can be used to build custom AI search solutions focused on legal topics, as it has been trained on high-quality legal documents and is therefore specialized in the legal domain.
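As a minimal sketch of how semantic search differs from keyword search, the following example ranks two documents by cosine similarity to a query. The three-dimensional vectors are hand-made placeholders standing in for the output of an embedding model; they are not produced by Noxtua Voyage Embed, and the texts and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_overlap(query, text):
    """Number of lowercase whitespace-separated tokens shared by both strings."""
    return len(set(query.lower().split()) & set(text.lower().split()))


# Toy vectors (assumption): a real system would obtain these from an embedding model.
DOCUMENTS = {
    "Ending a rental agreement requires written notice.": [0.9, 0.1, 0.0],
    "Corporate income is taxed at the federal level.": [0.0, 0.2, 0.9],
}
QUERY = "lease termination"
QUERY_VECTOR = [0.8, 0.2, 0.1]


def rank(query_vector, documents):
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_vector, vector), text)
              for text, vector in documents.items()]
    return sorted(scored, reverse=True)


ranking = rank(QUERY_VECTOR, DOCUMENTS)
best_score, best_text = ranking[0]

print(f"Best match: {best_text} (similarity {best_score:.2f})")
print(f"Shared keywords with query: {keyword_overlap(QUERY, best_text)}")
```

The query shares no words with the top-ranked document, so a pure keyword match would miss it, while the vector comparison still ranks it first. In a RAG system, the top-ranked passages would then be passed to a language model as context.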
Enabling the creation of custom legal AI search solutions
“We are excited to launch Noxtua Voyage Embed together with Voyage AI and dejure.org, as this will drive the development of specialized and high-quality AI solutions. It will also allow us to further refine Noxtua, Europe's first sovereign legal AI, to support lawyers in their daily work. We see this international collaboration between established experts in their specific fields as another confirmation of our quest to promote independent AI with European values that meets the highest standards of competence and compliance. It’s the perfect 'source global, host local' approach,” highlights Dr. Leif-Nissen Lundbæk (CEO & Co-Founder Xayn).
“This collaboration between Voyage AI, Xayn, and dejure.org combines excellent AI and legal knowledge with two research-based AI startups from Oxford and Stanford, two of the most prestigious universities in the world, and one of the most important databases for legal documents in Germany. At Voyage, we focus on developing new techniques so that models can capture the nuances of specialized texts similarly to domain experts. The legal domain is very interesting and challenging for AI development because it is such a rule-based system that relies on accuracy and very precise use of language. Therefore, it is exciting to work on AI solutions that require deep technical and domain-specific knowledge, while also being practical,” states Tengyu Ma (CEO & Co-Founder Voyage AI, Assistant Professor at Stanford University).
“Searching for and in legal sources such as judgments and laws and the consolidation and evaluation of this information are essential components of legal work. To do this, lawyers need reliable and efficient solutions that they can trust. However, conventional AI offerings are usually trained on general data and therefore struggle immensely to achieve reliable results in such a highly specialized field as law. That's why we're excited to partner with Xayn and Voyage AI on this new and powerful legal search model. Noxtua Voyage Embed enables experts to develop customized AI applications on top of it that help legal professionals find relevant information faster and easier,” says Oliver García (CEO dejure.org).
Noxtua Voyage Embed as part of Europe’s first sovereign Legal AI Noxtua
The new Noxtua Voyage Embed search model is now available via API and can be used to build custom Legal AI search solutions.
Noxtua Voyage Embed is part of Xayn's larger sovereign Legal AI offering. This offering also includes Noxtua Legal Copilot, which lawyers can use to analyze, review, translate, and (re)write legal documents while protecting their clients' privacy. Europe's first sovereign legal AI is legally competent, having been trained on legal texts labeled by experts. In addition to being GDPR-compliant, Noxtua is hosted in the EU and meets the high requirements of professional secrecy (e.g. Section 203 German Criminal Code, Section 43e German Federal Code for Lawyers).
More Information: www.xayn.com/noxtua
_________________________________
About Xayn
The AI startup Xayn develops Noxtua, Europe’s first sovereign Legal AI. Noxtua helps lawyers analyze, review, and draft legal documents in seconds while being legally competent and compliant. Hosted in the EU, Noxtua meets high standards of confidentiality in handling client data, professional secrecy (e.g. Section 203 German Criminal Code, Section 43e German Federal Code for Lawyers), and European data protection. The Legal Copilot is powered by the specialized AI models Noxtua Large Language Model and Noxtua Voyage Embed. The models are trained with high-quality legal data provided by the Legal AI Alliance which the tech startup Xayn initiated with the international business law firm CMS. This makes Noxtua the secure, independent, and specialized European Legal AI.
The Berlin-based AI company Xayn was born out of research at Oxford University and Imperial College London by Dr Leif-Nissen Lundbæk and Professor Michael Huth. Founded in 2017, Xayn retains its academic roots, with approx. 30% of its workforce holding PhDs. The startup has received investment funding of EUR 19.5 million from Global Brain Corporation, KDDI Open Innovation Fund, Earlybird, and Dominik Schiener.
More information: www.xayn.com
About Voyage AI
Voyage AI is a leading developer of customized embedding models and LLM retrieval infrastructure. The Stanford spin-off is led by professor Tengyu Ma and includes fellow Stanford professors Christopher Manning, Christopher Ré, and Fei-Fei Li as academic advisors. Voyage has assembled a world-class AI research team that has developed novel techniques that enable embeddings to better capture the nuances of specialized text in the same way as domain experts.
For more information, please visit www.voyageai.com
About dejure.org
dejure.org is a free legal information portal. With around 15 million hits per month, it is one of the most widely used and established legal services and databases in Germany. Founded in 2000, the platform includes a legal database with around 300 laws. Despite the small number of laws, the selection covers around 90% of typical research requirements. In addition, dejure.org contains a case law database with more than 2 million court decisions, which is expanded daily. The portal links this information with secondary sources and, in addition to search and research functions, provides powerful tools for evaluation, monitoring, etc.
For more information, please visit www.dejure.org
Media Contact
Dr Clara Herdeanu
VP Communications & Public Affairs
press@xayn.com
+49 174 4758 847