Arabic Text Categorization Using k-nearest neighbour, Decision Trees (C4.5) and Rocchio Classifier: A Comparative Study

Adel Hamdan Mohammad, Omar Almomani, Tariq Alwada'n

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Text classification is an important research area in information retrieval. Many studies address text classification for English, but comparatively few use Arabic data sets. This research applies three well-known classification algorithms: k-nearest neighbour (k-NN), C4.5, and Rocchio. The algorithms are applied to an in-house collected Arabic data set consisting of 1400 documents belonging to 8 categories. Results show that the precision and recall values obtained with the Rocchio classifier and k-NN are better than those obtained with C4.5. The study is a comparative evaluation of these algorithms, and it uses a fixed number of documents per category in both the training and testing phases.
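As a rough illustration of two of the classifiers named in the abstract and of the precision and recall measures used to compare them, the following minimal sketch may help. It is not the authors' implementation. The abstract does not state the term weighting, similarity measure, or value of k, so raw term counts, cosine similarity, and k=3 are assumptions. The Rocchio variant shown is the simple centroid-based form, and the tiny English token "documents" and the labels `sport` and `economy` are hypothetical placeholders for the Arabic data set. C4.5 is omitted to keep the sketch short.

```python
import math
from collections import Counter


def vectorize(text):
    # Assumption: raw term counts over whitespace tokens (no stemming, no TF-IDF).
    return Counter(text.split())


def cosine(a, b):
    # Cosine similarity between two sparse term-weight mappings.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rocchio_train(docs):
    # Centroid-based Rocchio: one prototype per category, the mean of its document vectors.
    sums, counts = {}, Counter()
    for text, label in docs:
        sums.setdefault(label, Counter()).update(vectorize(text))
        counts[label] += 1
    return {label: {t: v / counts[label] for t, v in vec.items()}
            for label, vec in sums.items()}


def rocchio_predict(centroids, text):
    # Assign the category whose centroid is most similar to the document.
    vec = vectorize(text)
    return max(sorted(centroids), key=lambda label: cosine(vec, centroids[label]))


def knn_predict(docs, text, k=3):
    # Majority vote among the k most similar training documents.
    vec = vectorize(text)
    ranked = sorted(docs, key=lambda d: cosine(vec, vectorize(d[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall(true_labels, predicted, label):
    # Per-category precision and recall from true/false positives and false negatives.
    pairs = list(zip(true_labels, predicted))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: the same number of training documents per category.
train = [
    ("goal match team", "sport"),
    ("team player goal", "sport"),
    ("match player win", "sport"),
    ("bank market price", "economy"),
    ("price trade bank", "economy"),
    ("market trade stock", "economy"),
]
test_docs = ["goal team win", "bank stock price"]
test_labels = ["sport", "economy"]

centroids = rocchio_train(train)
rocchio_preds = [rocchio_predict(centroids, d) for d in test_docs]
knn_preds = [knn_predict(train, d, k=3) for d in test_docs]

print("Rocchio:", rocchio_preds, precision_recall(test_labels, rocchio_preds, "sport"))
print("k-NN:   ", knn_preds, precision_recall(test_labels, knn_preds, "sport"))
```

On a real corpus, the same precision and recall computation would be repeated for each of the 8 categories and for each classifier, which is the kind of per-category comparison the abstract describes.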
    Original language: English
    Journal: International Journal of Current Engineering and Technology
    Volume: 6
    Issue number: 2
    Early online date: 10 Mar 2016
    Publication status: Published - 30 Apr 2016
