NOXTUA VOYAGE EMBED is an embedding model fine-tuned on legal documents that Xayn shared with Voyage. It delivers an average retrieval-quality improvement of 25.3% over OpenAI's text-embedding-3-large.

NOXTUA VOYAGE EMBED is a model customized for retrieval tasks on legal documents that Xayn shared with Voyage. The model is based on voyage-multilingual-2, Voyage's latest and most powerful embedding model tailored to multilingual retrieval, and is fine-tuned on EU_GER_Xayn_DeJure_Laws_Decisions, a proprietary dataset provided by Xayn. The context length is 32K tokens and the embedding dimension is 1024. The rest of this document describes the evaluation results of NOXTUA VOYAGE EMBED against baseline models, including voyage-multilingual-2 and OpenAI's text-embedding-3-large.

Evaluation Datasets

EU_GER_Xayn_DeJure_Laws_Decisions is an extensive collection of German and EU law books as well as law cases, comprising decisions from various courts across Germany and the EU courts. The dataset is meticulously organized, with each case document containing essential metadata such as the date of decision, court hierarchy, and a summary of the key facts and outcomes. The cases cover a broad range of legal domains, including civil, criminal, administrative, and labor law, ensuring a comprehensive representation of the German and EU legal system.

The dataset contains more than 20B tokens in total. Voyage generated 47k queries from the documents in the dataset to form an evaluation dataset named xayn-syn-pairs-eval.

Example of a pair in xayn-syn-pairs:

Query

“Welche rechtlichen Konsequenzen hat das Verschweigen von Krankheitshistorien bei der Lebensversicherung?”

Relevant Doc

“==STAMMDATEN==

LG Hamburg, 04.10.1990 - 327 O 125/90

==LEITSATZ==

1. Verschweigt der Versicherte bei Antragstellung einen chronischen und über Jahre medikamentös behandelten Bluthochdruck, ist der Versicherer nach § 123 BGB zur Anfechtung des Versicherungsvertrags berechtigt (hier: 52 Jahre alter Mann mit langjährigem Verdacht auf Koronarinsuffizienz, verbunden mit Hypertonie und Fettstoffwechselstörungen).

2. War der Versicherte wegen dieser Krankheit in einem Zeitraum von neun Jahren mehr als 50mal in ärztlicher Behandlung, läßt dies den Schluß zu, daß er die Angaben über die Vorerkrankungen deshalb unterließ, weil er befürchtete, der Versicherer werde anderenfalls den beantragten Lebensversicherungsvertrag nicht abschließen.”

Evaluation Results

We compare NOXTUA VOYAGE EMBED against the following embedding models on xayn-syn-pairs-eval:

text-embedding-3-large, OpenAI embedding model
voyage-law-2, Voyage AI embedding model optimized for legal retrieval quality
voyage-multilingual-2, Voyage AI embedding model optimized for multilingual retrieval quality

Given a query, we retrieve the top 100 documents ranked by cosine similarity. We report NDCG@10 and Recall@100. Both are standard metrics for retrieval quality; higher is better. The table below presents the results.
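The retrieval and scoring procedure described above can be sketched roughly as follows. This is a minimal illustration, not the evaluation code used for this report: it assumes binary relevance (each document is either relevant to the query or not), uses the common log2 discount for NDCG, and all function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_vec, doc_vecs, k=100):
    """Return indices of the k documents most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

def recall_at_k(ranked_ids, relevant_ids, k=100):
    """Fraction of the relevant documents that appear in the top k."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k under binary relevance (assumption: gain is 1 or 0)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    # Ideal ranking places all relevant documents first.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example with 2-dimensional vectors (the real model uses 1024 dimensions).
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranked = retrieve_top_k(query, docs, k=100)
print(ranked, recall_at_k(ranked, {1}), ndcg_at_k(ranked, {1}))
```

In a full evaluation, these per-query scores would be averaged over all queries in the evaluation set.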

NOXTUA VOYAGE EMBED significantly outperforms the other embedding models, achieving an average improvement of 25.3% over OpenAI text-embedding-3-large. Compared with voyage-multilingual-2 and voyage-law-2, NOXTUA VOYAGE EMBED achieves a 10.7% improvement in Recall@100, which validates the effectiveness of fine-tuning.
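For readers who want to reproduce this kind of comparison on their own data, a relative improvement can be computed as sketched below. The report does not state how the average was aggregated, so averaging the per-metric relative gains is an assumption here, and the scores in the example are purely hypothetical placeholders, not results from this evaluation.

```python
def relative_improvement(baseline, candidate):
    """Relative gain of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0

def average_relative_improvement(baseline_scores, candidate_scores):
    """Mean of per-metric relative gains (assumed aggregation method)."""
    gains = [
        relative_improvement(baseline_scores[metric], candidate_scores[metric])
        for metric in baseline_scores
    ]
    return sum(gains) / len(gains)

# Hypothetical scores for illustration only; not taken from the report.
baseline = {"NDCG@10": 0.50, "Recall@100": 0.80}
candidate = {"NDCG@10": 0.60, "Recall@100": 0.88}
print(round(average_relative_improvement(baseline, candidate), 1))
```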

In addition, we evaluate NOXTUA VOYAGE EMBED and the baselines on several common public legal retrieval benchmarks, such as legal_summarization, legalbench_consumer_contracts_qa, GerDaLIRSmall, and LegalQuAD. NOXTUA VOYAGE EMBED significantly outperforms text-embedding-3-large and the other Voyage models on these datasets as well.

NOXTUA VOYAGE EMBED Benchmarking Report
