Masakazu Kanazawa, Atsushi Ito, Kazuyuki Yamasawa, Takehiko Kasahara, Yuya Kiryu and Fubito Toyama
Method to Predict Confidential Words in Japanese Judicial Precedents Using Neural Networks With Part-of-Speech Tags
Cognitive Infocommunications involve a combination of informatics and telecommunications. In the future, infocommunication is expected to become more intelligent and life supportive. Privacy is one of the most critical concerns in infocommunications. Encryption is a well-recognized technology that ensures privacy; however, it is not easy to completely hide personal information. One technique to protect privacy is by finding confidential words in a file or a website and changing them into meaningless words. In this paper, we investigate a technology used to hide confidential words taken from judicial precedents. In the Japanese judicial field, details of most precedents are not made available to the public on the Japanese court web pages to protect the persons involved. To ensure privacy, confidential words, such as personal names, are replaced by other meaningless words. This operation takes time and effort because it is done manually. Therefore, it is desirable to automatically predict confidential words. We proposed a method for predicting confidential words in Japanese judicial precedents by using part-of-speech (POS) tagging with neural networks. As a result, we obtained 88% accuracy improvement over a previous model. In this paper, we describe the mechanism of our proposed model and the prediction results using perplexity. Then, we evaluated how our proposed model was useful for the actual precedents by using recall and precision. As a result, our proposed model could detect confidential words in certain Japanese precedents.