Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings
In recent years, social media networks have made it easy for users to communicate and share their opinions and views, resulting in massive amounts of user-generated content. This content may contain text that is hateful toward large groups or specific ind...
Main Authors: | Samar Al-Saqqa, Arafat Awajan, Bassam Hammo |
---|---|
Format: | Article |
Language: | English |
Published: | Growing Science, 2025-01-01 |
Series: | International Journal of Data and Network Science |
Online Access: | https://www.growingscience.com/ijds/Vol9/ijdns_2024_152.pdf |
_version_ | 1839622730981310464 |
---|---|
author | Samar Al-Saqqa; Arafat Awajan; Bassam Hammo |
author_facet | Samar Al-Saqqa; Arafat Awajan; Bassam Hammo |
author_sort | Samar Al-Saqqa |
collection | DOAJ |
description |
In recent years, social media networks have made it easy for users to communicate and share their opinions and views, resulting in massive amounts of user-generated content. This content may contain text that is hateful toward large groups or specific individuals. Most website policies therefore require automatic hate speech detection, and early automatic detection or filtering of such content is critical in online social networks, especially as the volume of user-generated content grows. This paper presents a model to improve hate speech detection performance using deep learning models with two types of word embeddings. The first type is Arabic models based on Word2Vec, including AraVec and Mazajak. The second is BERT-based embeddings from three pre-trained models: ARABERT, MARBERT, and CAMeLBERT. Models are assessed with common text classification metrics: precision, recall, accuracy, and F1 score. The experimental results show that fine-tuned Arabic BERT models outperform Word2Vec-based models, and that MARBERT outperforms both ARABERT and CAMeLBERT across all deep learning architectures, highlighting its superior ability to classify Arabic text. Additionally, BLSTM models show the highest performance on ARABERT, MARBERT, and CAMeLBERT, achieving an accuracy of 0.9945 with MARBERT. |
format | Article |
id | doaj-art-c1c91be081c342e9a40a7f9c63ffb1b3 |
institution | Matheson Library |
issn | 2561-8148 2561-8156 |
language | English |
publishDate | 2025-01-01 |
publisher | Growing Science |
record_format | Article |
series | International Journal of Data and Network Science |
spelling | doaj-art-c1c91be081c342e9a40a7f9c63ffb1b3; 2025-07-21T21:11:11Z; eng; Growing Science; International Journal of Data and Network Science; 2561-8148; 2561-8156; 2025-01-01; 9; 3; 587; 600; 10.5267/j.ijdns.2024.8.008; Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings; Samar Al-Saqqa; Arafat Awajan; Bassam Hammo; https://www.growingscience.com/ijds/Vol9/ijdns_2024_152.pdf |
spellingShingle | Samar Al-Saqqa; Arafat Awajan; Bassam Hammo; Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings; International Journal of Data and Network Science |
title | Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings |
title_full | Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings |
title_fullStr | Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings |
title_full_unstemmed | Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings |
title_short | Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings |
title_sort | hate speech detection in arabic social networks using deep learning and fine tuned embeddings |
url | https://www.growingscience.com/ijds/Vol9/ijdns_2024_152.pdf |
work_keys_str_mv | AT samaralsaqqa hatespeechdetectioninarabicsocialnetworksusingdeeplearningandfinetunedembeddings AT arafatawajan hatespeechdetectioninarabicsocialnetworksusingdeeplearningandfinetunedembeddings AT bassamhammo hatespeechdetectioninarabicsocialnetworksusingdeeplearningandfinetunedembeddings |
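
The description in this record names precision, recall, accuracy, and F1 score as the evaluation metrics. As a reading aid, the sketch below shows how those four values are commonly computed for a binary hate / not-hate classifier from confusion-matrix counts. The `Confusion` class, the `classification_metrics` function, and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for a binary classifier (hypothetical helper)."""
    tp: int  # hateful texts correctly flagged
    fp: int  # non-hateful texts wrongly flagged
    fn: int  # hateful texts that were missed
    tn: int  # non-hateful texts correctly passed


def classification_metrics(c: Confusion) -> dict:
    """Return precision, recall, accuracy, and F1 for the positive class."""
    total = c.tp + c.fp + c.fn + c.tn
    # Guard each ratio against an empty denominator.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    accuracy = (c.tp + c.tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Illustrative counts only (not data from the article).
example = Confusion(tp=90, fp=10, fn=30, tn=70)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can look high when one class dominates a dataset, which is one reason precision, recall, and F1 are usually reported alongside it.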