A novel feature selection method for text classification using association rules and clustering

Navid Sheydaei, Mohamad Saraee, Azar Shahgholian

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.
Original language: English
Pages (from-to): -
Journal: Journal of Information Science
DOIs: 10.1177/0165551514550143
Publication status: Published - 2015

Fingerprint

Association rules
Feature extraction
Classifiers
Labels
efficiency
performance

Cite this

@article{4856402eb9ff4f1792fffc4c2036a1ff,
title = "A novel feature selection method for text classification using association rules and clustering",
abstract = "Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.",
author = "Navid Sheydaei and Mohamad Saraee and Azar Shahgholian",
year = "2015",
doi = "10.1177/0165551514550143",
language = "English",
journal = "Journal of Information Science",
issn = "1741-6485",
publisher = "SAGE Publications Ltd",
}

A novel feature selection method for text classification using association rules and clustering. / Sheydaei, Navid; Saraee, Mohamad; Shahgholian, Azar.

In: Journal of Information Science, 2015.

TY  - JOUR
T1  - A novel feature selection method for text classification using association rules and clustering
AU  - Sheydaei, Navid
AU  - Saraee, Mohamad
AU  - Shahgholian, Azar
PY  - 2015
Y1  - 2015
AB  - Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.
U2  - 10.1177/0165551514550143
DO  - 10.1177/0165551514550143
M3  - Article
JO  - Journal of Information Science
JF  - Journal of Information Science
SN  - 1741-6485
ER  -