Sentiment Classification of Hate Speech Against Islam on Twitter Platform Using Multinomial Naïve Bayes
Abstract
Twitter is widely used by public figures, politicians, celebrities, and organizations to communicate with the public. However, the
platform's freedom of speech is often misused, leading to conflicts such as hate speech, especially against Islam. This study aims to
develop a text classification system for detecting hate speech against Islam and to evaluate the performance of Multinomial Naïve Bayes
(MNB) in this task. The data were obtained by crawling Twitter and passed through several pre-processing steps: cleaning,
case folding, tokenizing, stop-word removal, and stemming. The processed data were then transformed with a Bag of Words model to compute
word frequencies, which served as input to MNB. The first test compared different training-to-test data ratios while varying the alpha hyperparameter
between its minimum and maximum values. The second test used k-fold cross-validation for model validation. The results showed
the highest accuracy of 85% at a 90:10 training-to-test data ratio with the maximum alpha value. Using 10-fold cross-validation, the
model achieved an average accuracy of 79.09%, with the highest accuracy of 85.05% in the 4th iteration. This study demonstrates that
the training/test data ratio, alpha parameter, and cross-validation influence MNB's performance in classifying hate speech.
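The pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The stop-word list, the cleaning patterns, the toy tweets, and the names preprocess, MultinomialNB, and k_fold_accuracy are all assumptions made for illustration. Stemming is left out because it depends on the language of the tweets. Here alpha is the additive smoothing constant applied to the word counts, and it must be greater than zero.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the study's actual list is not given.
STOP_WORDS = {"the", "a", "is", "are", "and", "of", "to"}


def preprocess(text):
    """Cleaning, case folding, tokenizing, stop-word removal (no stemming)."""
    # Cleaning: drop URLs, mentions, and hashtags (assumed patterns).
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    # Case folding and tokenizing on runs of letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class MultinomialNB:
    """Multinomial Naive Bayes over bag-of-words counts, smoothed by alpha."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # must be > 0 to avoid log(0)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for d in docs for t in d}
        # Bag of words: per-class word frequency tables.
        self.counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        n_per_class = Counter(labels)
        self.log_prior = {
            c: math.log(n_per_class[c] / len(docs)) for c in self.classes
        }
        return self

    def predict(self, tokens):
        def score(c):
            denom = self.totals[c] + self.alpha * len(self.vocab)
            # Words never seen in training are ignored.
            return self.log_prior[c] + sum(
                math.log((self.counts[c][t] + self.alpha) / denom)
                for t in tokens
                if t in self.vocab
            )

        return max(self.classes, key=score)


def k_fold_accuracy(docs, labels, k=10, alpha=1.0):
    """Per-fold accuracy; fold i holds out items whose index % k == i.

    Assumes len(docs) >= k so that no fold is empty.
    """
    accuracies = []
    for i in range(k):
        train = [j for j in range(len(docs)) if j % k != i]
        test = [j for j in range(len(docs)) if j % k == i]
        model = MultinomialNB(alpha).fit(
            [docs[j] for j in train], [labels[j] for j in train]
        )
        correct = sum(model.predict(docs[j]) == labels[j] for j in test)
        accuracies.append(correct / len(test))
    return accuracies


# Invented toy data, only to exercise the code.
toy_texts = [
    "awful insult attack",
    "insult and attack",
    "kind greeting thanks",
    "thanks for the greeting",
]
toy_labels = ["hate", "hate", "neutral", "neutral"]
toy_docs = [preprocess(t) for t in toy_texts]
toy_model = MultinomialNB(alpha=1.0).fit(toy_docs, toy_labels)
```

Calling k_fold_accuracy(toy_docs, toy_labels, k=2) returns one accuracy per fold, and averaging those values gives the kind of figure reported for 10-fold cross-validation. Re-running with different alpha values or a different split mirrors the first test described above, on toy data only.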
License
Copyright (c) 2024 Zul Iflah Al Juhaeda, Muhammad Faisal, Suhartono (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.