Legal Information Retrieval and Entailment Using Transformer-based Approaches

Rev Socionetwork Strateg. 2024;18(1):101-121. doi: 10.1007/s12626-023-00153-z. Epub 2024 Jan 11.

Abstract

The challenge of information overload in the legal domain increases every day. The COLIEE competition has created four challenge tasks that are intended to encourage the development of systems and methods to alleviate some of that pressure: a case law retrieval (Task 1) and entailment (Task 2), and a statute law retrieval (Task 3) and entailment (Task 4). Here we describe our methods for Task 1 and Task 4. In Task 1, we used a sentence-transformer model to create a numeric representation for each case paragraph. We then created a histogram of the similarities between a query case and a candidate case. The histogram is used to build a binary classifier that decides whether a candidate case should be noticed or not. In Task 4, our approach relies on fine-tuning a pre-trained DeBERTa large language model (LLM) trained on SNLI and MultiNLI datasets. Our method for Task 4 was ranked third among eight participating teams in the COLIEE 2023 competition. For Task 4, We also compared the performance of the DeBERTa model with those of a knowledge distillation model and ensemble methods including Random Forest and Voting.

Keywords: COLIEE 2023; Legal information entailment; Legal information retrieval; Transformer-based legal information extraction.