Weighted-Attribute Triplet Hashing for Large-Scale Similar Judicial Case Matching

Comput Intell Neurosci. 2021 Apr 16;2021:6650962. doi: 10.1155/2021/6650962. eCollection 2021.

Abstract

Similar judicial case matching aims to accurately select, from multiple candidates, the judicial document that is most similar to a target document. Its core is calculating the similarity between two case fact documents. With similar judicial case matching techniques, legal professionals can promptly find and assess similar cases in a candidate set, and these techniques can also benefit the development of judicial systems. However, judicial case documents are not only long but also structurally complex. Meanwhile, the variety of judicial cases is increasing rapidly; thus, it is difficult to find the document most similar to the target document in a large corpus. In this study, we present a novel similar judicial case matching model that obtains the weights of judicial feature attributes through hash learning and achieves fast similarity matching using binary codes. The proposed model extracts the judicial feature attribute vector using the bidirectional encoder representations from transformers (BERT) model and subsequently obtains the weighted judicial feature attributes by learning the hash function. We further impose triplet constraints to ensure that the similarity of judicial case data is well preserved when projected into the Hamming space. Comprehensive experimental results on public datasets show that the proposed method achieves superior performance in similar judicial case matching and is suitable for large-scale matching.
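
The pipeline described above (weighted attributes, binary codes, triplet constraint, Hamming-space matching) can be illustrated with a small sketch. This is not the authors' implementation: the feature vectors stand in for BERT-derived attribute vectors, and the attribute weights, projection matrix, margin, and all function names are hypothetical values chosen for illustration. In the paper these quantities would be learned; here they are fixed by hand.

```python
from typing import List, Sequence

def hash_code(features: Sequence[float],
              weights: Sequence[float],
              projection: Sequence[Sequence[float]]) -> List[int]:
    """Weight each attribute, project it, and threshold at zero to get bits."""
    weighted = [f * w for f, w in zip(features, weights)]
    return [1 if sum(p * x for p, x in zip(row, weighted)) > 0 else 0
            for row in projection]

def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def triplet_hinge(anchor: Sequence[int], positive: Sequence[int],
                  negative: Sequence[int], margin: int = 2) -> int:
    """Zero when the positive is closer than the negative by at least `margin`."""
    return max(0, margin + hamming(anchor, positive) - hamming(anchor, negative))

def rank_candidates(query: Sequence[int],
                    candidates: Sequence[Sequence[int]]) -> List[int]:
    """Candidate indices sorted by increasing Hamming distance to the query."""
    return sorted(range(len(candidates)),
                  key=lambda i: hamming(query, candidates[i]))

# Hypothetical attribute weights and a 4-bit projection (assumed, not learned).
WEIGHTS = [2.0, 1.0, 0.5]
PROJECTION = [[1, -1, 0], [0, 1, -1], [1, 0, 1], [-1, 1, 1]]

# Toy stand-ins for attribute vectors of a target, a similar, and a dissimilar case.
anchor = hash_code([0.9, 0.1, 0.3], WEIGHTS, PROJECTION)
positive = hash_code([0.8, 0.2, 0.5], WEIGHTS, PROJECTION)
negative = hash_code([0.1, 0.9, 0.2], WEIGHTS, PROJECTION)

loss = triplet_hinge(anchor, positive, negative)
ranking = rank_candidates(anchor, [negative, positive])
```

Because matching reduces to counting differing bits, ranking a large candidate set needs only cheap integer operations, which is the property that makes binary codes attractive for large corpora.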

MeSH terms

  • Algorithms*