Word Segmentation for Chinese Judicial Documents

Linxia Yao, Jidong Ge, Chuanyi Li, Yuan Yao, Zhenhao Li, Jin Zeng, Bin Luo, Victor Chang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Word segmentation is an integral step in many knowledge discovery applications. However, existing word segmentation methods run into problems when applied to Chinese judicial documents: (1) they rely on large-scale labeled data, which is typically unavailable for judicial documents, and (2) judicial documents have their own language features and writing formats. In this paper, a word segmentation method is proposed for Chinese judicial documents. The proposed method consists of two steps: (1) automatically generating labeled data to serve as legal dictionaries, and (2) applying a hybrid multi-layer neural network that incorporates the legal dictionaries to perform word segmentation. Experiments conducted on a dataset of Chinese judicial documents show that the proposed model achieves better results than existing methods.
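The abstract does not specify how the legal dictionaries are fed into the network, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes character-level sequence labeling with BMES tags and per-character binary flags that record whether an n-gram starting or ending at that character appears in the dictionary. The function names `dictionary_features` and `words_to_bmes`, the `max_len` parameter, and the sample lexicon are all hypothetical.

```python
from typing import List, Set


def dictionary_features(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[List[int]]:
    """Build per-character dictionary flags (an assumed feature scheme).

    For every n in 2..max_len, each character gets two flags:
    whether the n-gram starting at it is in the lexicon, and
    whether the n-gram ending at it is in the lexicon.
    """
    features = []
    for i in range(len(sentence)):
        row = []
        for n in range(2, max_len + 1):
            # Guard the bounds so truncated slices never match shorter entries.
            starts = i + n <= len(sentence) and sentence[i:i + n] in lexicon
            ends = i - n + 1 >= 0 and sentence[i - n + 1:i + 1] in lexicon
            row.extend([int(starts), int(ends)])
        features.append(row)
    return features


def words_to_bmes(words: List[str]) -> List[str]:
    """Convert a segmented sentence into character-level BMES tags."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


# Hypothetical legal lexicon and sentence, for illustration only.
lexicon = {"被告人", "盗窃罪"}
sentence = "被告人犯盗窃罪"

feats = dictionary_features(sentence, lexicon)
tags = words_to_bmes(["被告人", "犯", "盗窃罪"])
```

In a setup like this, the flag vectors in `feats` would typically be concatenated with character embeddings before the sequence model, and `tags` would be the training targets. Whether the paper combines them this way is not stated in the abstract.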

Original language: English
Title of host publication: Data Science - 5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019, Proceedings
Editors: Xiaohui Cheng, Weipeng Jing, Xianhua Song, Zeguang Lu
Publisher: Springer-Verlag
Pages: 466-478
Number of pages: 13
ISBN (Print): 9789811501173
DOIs: https://doi.org/10.1007/978-981-15-0118-0_36
Publication status: Published - 13 Sep 2019
Event: 5th International Conference of Pioneer Computer Scientists, Engineers and Educators - Guilin, China
Duration: 20 Sep 2019 → 23 Sep 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1058
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 5th International Conference of Pioneer Computer Scientists, Engineers and Educators
Abbreviated title: ICPCSEE 2019
Country: China
City: Guilin
Period: 20/09/19 → 23/09/19


  • Cite this

    Yao, L., Ge, J., Li, C., Yao, Y., Li, Z., Zeng, J., Luo, B., & Chang, V. (2019). Word Segmentation for Chinese Judicial Documents. In X. Cheng, W. Jing, X. Song, & Z. Lu (Eds.), Data Science - 5th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2019, Proceedings (pp. 466-478). (Communications in Computer and Information Science; Vol. 1058). Springer-Verlag. https://doi.org/10.1007/978-981-15-0118-0_36