Weighted-Attribute Triplet Hashing for Large-Scale Similar Judicial Case Matching

Jiamin Li; Xingbo Liu; Xiushan Nie; Lele Ma; Peng Li; Kai Zhang; Yilong Yin

doi:10.1155/2021/6650962

Comput Intell Neurosci. 2021 Apr 16;2021:6650962. doi: 10.1155/2021/6650962. eCollection 2021.

Authors

Jiamin Li¹, Xingbo Liu¹, Xiushan Nie², Lele Ma¹, Peng Li³, Kai Zhang³, Yilong Yin¹

Affiliations

¹ School of Software, Shandong University, Jinan, China.
² School of Computer Science and Technology, Shandong Jianzhu University, Jinan, China.
³ Shandong Liju Robot Technology Co., Ltd, Yantai, China.

Abstract

Similar judicial case matching aims to accurately select, from multiple candidates, the judicial document that is most similar to a target document. Its core task is to calculate the similarity between the fact descriptions of two case documents. With similar judicial case matching techniques, legal professionals can promptly find and assess similar cases in a candidate set, and these techniques can also benefit the development of judicial systems. However, judicial case documents are both long and structurally complex. Meanwhile, the number and variety of judicial cases are increasing rapidly, so it is difficult to find the document most similar to the target document in a large corpus. In this study, we present a novel similar judicial case matching model that learns the weights of judicial feature attributes through hash learning and achieves fast similarity matching using binary codes. The proposed model extracts a vector of judicial feature attributes using the bidirectional encoder representations from transformers (BERT) model and then obtains the weighted judicial feature attributes by learning the hash function. We further impose triplet constraints to ensure that the similarity of judicial case data is well preserved when the data are projected into the Hamming space. Comprehensive experimental results on public datasets show that the proposed method achieves superior performance on the task of similar judicial case matching and is suitable for large-scale similar judicial case matching.
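
The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea it describes: weighted attribute vectors are mapped to binary codes, a triplet constraint is checked in the Hamming space, and candidates are ranked by Hamming distance. It is not the authors' implementation. The function names, the linear sign-based hash, the hinge form of the triplet constraint, and the `margin` parameter are all assumptions made for illustration. In the actual model the attribute vectors would come from BERT and the weights and projection would be learned; here they are supplied by hand.

```python
from typing import List, Sequence


def hash_code(features: Sequence[float],
              weights: Sequence[float],
              projection: Sequence[Sequence[float]]) -> List[int]:
    """Map an attribute vector to a {-1, +1} code (hypothetical linear hash).

    Each attribute is scaled by its weight, then each row of `projection`
    produces one bit via the sign of a dot product.
    """
    weighted = [f * w for f, w in zip(features, weights)]
    code = []
    for row in projection:  # one row per output bit
        activation = sum(r * x for r, x in zip(row, weighted))
        code.append(1 if activation >= 0 else -1)
    return code


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def triplet_hinge(anchor: Sequence[int],
                  positive: Sequence[int],
                  negative: Sequence[int],
                  margin: int) -> int:
    """Hinge-style triplet penalty in Hamming space (assumed form).

    The penalty is zero when the anchor is closer to the positive than to
    the negative by at least `margin` bits.
    """
    return max(0, hamming(anchor, positive) - hamming(anchor, negative) + margin)


def rank_candidates(query: Sequence[int],
                    candidates: Sequence[Sequence[int]]) -> List[int]:
    """Return candidate indices ordered from most to least similar."""
    distances = [hamming(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: distances[i])


if __name__ == "__main__":
    # Toy stand-ins for BERT attribute vectors (illustrative values only).
    weights = [1.0, 0.5, 2.0]
    projection = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, -1, 1]]
    target = hash_code([0.9, -0.2, 0.4], weights, projection)
    candidates = [
        hash_code([0.8, -0.1, 0.5], weights, projection),
        hash_code([-0.7, 0.3, -0.6], weights, projection),
    ]
    print(rank_candidates(target, candidates))
```

Comparing short binary codes requires only counting differing bits, which is why hashing approaches of this kind are attractive for retrieval over large corpora.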

MeSH terms

Algorithms*