The OPT submission to the Shared Task of the 2016 Conference on Computational Natural Language Learning (CoNLL) implements a ‘classic’ pipeline architecture, combining binary classification of (candidate) explicit connectives, heuristic rules for non-explicit discourse relations, ranking and ‘editing’ of syntactic constituents for argument identification, and an ensemble of classifiers to assign discourse senses. With an end-to-end performance of 27.77 F1 on the English ‘blind’ test data, our system advances the previous state of the art (Wang & Lan, 2015) by close to four F1 points, with particularly good results for the argument identification sub-tasks. OPT results appear more competitive on the new ‘blind’ test data than on the ‘test’ and ‘development’ sections of the Penn Discourse Treebank (PDTB; Prasad et al., 2008), which may indicate reduced over-fitting to specific properties of the venerable Wall Street Journal (WSJ) text underlying the PDTB.
|Publication status|Published - 11 Aug 2016|
|Event|20th Conference on Computational Natural Language Learning - Berlin, Germany|
|Duration|11 Aug 2016 → 12 Aug 2016|