Abstract
This paper uses an amino acid location-based sequence encoding as a feature extraction technique to identify single-chain antibody molecules that bind to the B-lymphocyte stimulator (BLyS) antigen. The data were manually extracted from the text of European patent EP2275449B1, then cleaned and prepared for the machine learning models. For each individual descriptor (Membrane and Soluble), Logistic Regression, KNN, KSVM, and Random Forest achieved accuracy, precision, and recall above 80%. Naïve Bayes scored much lower on accuracy and recall, though not on precision. Achieving promising accuracy from such a small dataset has significant implications for the drug discovery process, including considerable savings in time and resources.
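The abstract does not spell out the encoding itself, so the following is only a minimal sketch of one common way a location-based encoding can be built: each residue is one-hot encoded at its position in the chain, giving a fixed-length feature vector that a classifier can consume. The function name `encode_by_position`, the 20-letter alphabet ordering, the zero padding, and the `max_len` parameter are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical sketch: position-wise one-hot encoding of an amino acid sequence.
# The alphabet order, padding scheme, and function name are illustrative assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_by_position(sequence, max_len):
    """Return a flat 0/1 vector of length max_len * 20.

    Position p of the sequence owns the block [p * 20, (p + 1) * 20).
    Sequences longer than max_len are truncated; shorter ones are
    zero-padded, and unknown symbols leave their block all zeros.
    """
    n_symbols = len(AMINO_ACIDS)
    features = [0] * (max_len * n_symbols)
    for pos, residue in enumerate(sequence[:max_len].upper()):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            features[pos * n_symbols + idx] = 1
    return features


if __name__ == "__main__":
    vec = encode_by_position("ACD", max_len=5)
    print(len(vec), sum(vec))
```

Vectors built this way all have the same length, so they could be stacked into a feature matrix and passed to any of the classifiers named in the abstract.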
Original language | English |
---|---|
Title of host publication | Proceedings of the 10th International Conference on Information Communication and Management |
Pages | 20–24 |
Number of pages | 5 |
DOIs | |
Publication status | Published - 24 Sept 2020 |
Externally published | Yes |
Event | 10th International Conference on Information Communication and Management, Paris, France; Duration: 12 Aug 2020 → 14 Aug 2020; Conference number: 10 |
Conference
Conference | 10th International Conference on Information Communication and Management |
---|---|
Country/Territory | France |
City | Paris |
Period | 12/08/20 → 14/08/20 |