Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings

In recent years, opinions and communication can be easily expressed through social media networks that have allowed users to communicate and share their opinions and views, resulting in massive user-generated content. This content may contain text that is hateful to large groups or specific ind...

Bibliographic Details
Main Authors:	Samar Al-Saqqa, Arafat Awajan, Bassam Hammo
Format:	Article
Language:	English
Published:	Growing Science 2025-01-01
Series:	International Journal of Data and Network Science
Online Access:	https://www.growingscience.com/ijds/Vol9/ijdns_2024_152.pdf

_version_	1839622730981310464
author	Samar Al-Saqqa, Arafat Awajan, Bassam Hammo
author_sort	Samar Al-Saqqa
collection	DOAJ
description	Social media networks let users communicate and share their opinions and views easily, resulting in massive volumes of user-generated content. This content may contain text that is hateful toward large groups or specific individuals. Most website policies therefore require automatic hate speech detection, and early automatic detection or filtering of such content is critical in online social networks, especially as the volume of user-generated content grows. This paper presents a model intended to improve hate speech detection using deep learning models with two types of word embeddings. The first type comprises Arabic models based on Word2Vec, including AraVec and Mazajak. The second comprises BERT-based embeddings from three pre-trained models: ARABERT, MARBERT, and CAMeLBERT. Models are assessed with common text classification metrics: precision, recall, accuracy, and F1 score. The experimental results show that fine-tuned Arabic BERT models outperform Word2Vec-based models, and that MARBERT outperforms both ARABERT and CAMeLBERT across all deep learning architectures, highlighting its superior ability to classify Arabic text. Additionally, BLSTM models show the highest performance on ARABERT, MARBERT, and CAMeLBERT, achieving an accuracy of 0.9945 with MARBERT.
format	Article
id	doaj-art-c1c91be081c342e9a40a7f9c63ffb1b3
institution	Matheson Library
issn	2561-8148 2561-8156
language	English
publishDate	2025-01-01
publisher	Growing Science
record_format	Article
series	International Journal of Data and Network Science
spelling	doaj-art-c1c91be081c342e9a40a7f9c63ffb1b3 | 2025-07-21T21:11:11Z | eng | Growing Science | International Journal of Data and Network Science | 2561-8148, 2561-8156 | 2025-01-01 | vol. 9, no. 3, pp. 587-600 | 10.5267/j.ijdns.2024.8.008 | https://www.growingscience.com/ijds/Vol9/ijdns_2024_152.pdf
title	Hate speech detection in Arabic social networks using deep learning and fine-tuned embeddings
url	https://www.growingscience.com/ijds/Vol9/ijdns_2024_152.pdf

Similar Items