Yudhis Tri Hardianza (1), Asfan Muqtadir (2), Andik Adi Suryanto (3)
General Background: Cyberbullying detection in Indonesian social media has become increasingly important due to rapid digital communication growth and complex informal language usage. Specific Background: Automated identification remains challenging because Indonesian online discourse frequently contains slang, ambiguity, sarcasm, and class imbalance, limiting the capability of conventional statistical and earlier deep learning approaches. Knowledge Gap: Prior studies have emphasized traditional classifiers and encoder-based Transformers such as IndoBERT, while generative text-to-text architectures like T5 and their comparison with hybrid feature fusion strategies remain underexplored in Indonesian-language corpora. Aims: This study systematically compares three modeling scenarios—T5 Base, Hybrid (T5 + TF-IDF), and Enhanced (T5 + TF-IDF + sentiment)—to evaluate their performance in detecting cyberbullying from 20,000 Indonesian social media comments with naturally imbalanced distribution. Results: Experimental findings show that T5 Base achieves the highest test Accuracy (0.8325) and Macro F1-Score (0.8329), while Hybrid and Enhanced models yield slightly lower yet competitive performance. The results indicate that contextual semantic representations learned by T5 sufficiently capture explicit and implicit abusive expressions, and additional statistical and sentiment features do not yield superior classification outcomes. Novelty: This research provides empirical evidence that a standalone text-to-text Transformer architecture can outperform hybrid feature fusion strategies in Indonesian cyberbullying detection under limited training data conditions. Implications: The findings support the adoption of end-to-end Transformer-based models for scalable, robust, and linguistically adaptive monitoring systems in low-resource social media environments.
Highlights:
The standalone text-to-text architecture produced the strongest test-set metrics among all evaluated scenarios.
Integration of statistical weighting and sentiment signals did not surpass the semantic-only configuration.
Stable generalization was maintained despite limited training allocation and naturally imbalanced data distribution.
Keywords: Text Classification, Social Media Analysis, Transformer Models, Indonesian Language, Cyberbullying Detection.
Technological advancements have greatly changed how people communicate, interact, and access information through increasingly accessible online platforms. The growing availability of the internet, including in Indonesia, has shifted modern communication patterns, which now rely heavily on social media in daily life. The fast, real-time nature of these interactions brings various benefits but also introduces new risks that require serious attention, especially regarding security and users' psychological well-being [1], [2]. One adverse effect of greater online activity is the emergence and growth of harmful behaviors such as cyberbullying, which can happen anywhere at any time and is frequently made worse by the anonymity of internet users. Reports indicate that cyberbullying incidents on social media have been rising in recent years [3].
Platforms such as YouTube, Instagram, and TikTok are highly popular among teenagers and adults in Indonesia. Indonesia is among the countries with the highest rates of social media use worldwide, which raises the likelihood of encountering hostile conduct and derogatory remarks online [4]. The online communication environment is dynamic, characterized by colloquial language, intense emotional responses, and constantly changing slang. Because the meaning of an utterance is frequently implicit and dependent on social or linguistic context, these conditions pose significant hurdles for automated systems that aim to identify cyberbullying. Research shows that cyberbullying significantly affects victims' mental health and has been linked to long-term psychological effects, including an increased risk of depression, anxiety, and suicidal ideation [5].
Identifying aggressive activity on social media as early as possible is essential to better protect users. Beyond its social and psychological dimensions, the problem needs an efficient and flexible computational method for monitoring and automatically detecting harmful content. Within the context of Natural Language Processing (NLP), the Indonesian language has unique and complex linguistic characteristics. The dominant use of informal language on social media, including continuously evolving slang, local terms, casual writing styles, spelling variations, and code-switching between Indonesian and English, poses a major challenge in cyberbullying detection [6].
Traditional NLP models based on rules or simple statistical approaches often struggle to capture contextual and pragmatic meanings of non-standard texts, leading to a risk of failing to identify implicit offensive intentions in digital speech [7], [8]. Previous studies show that conventional approaches like Neural Networks (NN), Support Vector Machines (SVM), and K-Nearest Neighbor (K-NN) with TF-IDF feature extraction are still widely used in detecting hate speech and cyberbullying. In one study, a configuration based on TF-IDF and K-NN achieved an accuracy of 86.88%, with a precision of 88.27%, recall of 86.88%, and an F1 score of 86.50%, indicating fairly good performance but still limited in understanding informal language context [9], [10].
According to other research, deep learning models like Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN) can improve the detection of harmful speech. An accuracy of 0.90, precision of 0.89, recall of 0.90, specificity of 0.91, and an F1 score of 0.89 were reported for a deep learning-based detection system [11], [12]. Transformer architecture, which includes pre-trained models like BERT, RoBERTa, and mBERT, is a hallmark of recent advances in NLP [13], [14]. Transformer models greatly enhance performance in classifying hate speech and cyberbullying, according to international studies. BERT has been shown to detect damaging speech with up to 96.8% accuracy in the Indonesian language setting [15], [16].
Additionally, studies conducted in Indonesia demonstrate that IndoBERT outperforms traditional deep learning models. In the reported evaluation, CNN reached an accuracy of 80.69%, LSTM 80.67%, CNN-LSTM 81.18%, and CNN-LSTM-IndoBERT the highest at 82.05% [17]. Nevertheless, these studies concentrate on traditional classification techniques and have not fully assessed generative text-to-text models like T5. By converting all language processing tasks into a text-based input-output format, the T5 (Text-to-Text Transfer Transformer) model offers a different paradigm in natural language processing. T5 has the potential to enhance cyberbullying detection, particularly for informal and unstructured social media messages, thanks to a pre-training approach that is robust to noisy data and its capacity to capture broader contextual relationships [18]. Cyberbullying detection nonetheless remains a critical challenge in mitigating digital threats, as many existing models, particularly for Indonesian-language corpora, have not fully utilized the flexibility of text-to-text architectures and still show inconsistent performance [19], [20].
This study uses a comparative experimental method to assess the efficacy of single and hybrid models in identifying cyberbullying on Indonesian social media. With an emphasis on robustness to non-standard linguistic variation and class imbalance, the methodological framework compares three primary scenarios: T5 Base, Hybrid (T5 + TF-IDF), and Enhanced (T5 + TF-IDF + Sentiment). As illustrated in Figure 1, the research process consists of dataset preparation, preprocessing, feature extraction, feature fusion, model training, and performance evaluation.
Figure 1. Workflow of the proposed cyberbullying detection, integrating semantic features from T5, statistical features from TF-IDF, and sentiment information through a late fusion strategy.
This study uses a dataset of 20,000 social media comments divided into two categories, bullying and non-bullying. To address the lack of large-scale labeled data in the Indonesian language, the dataset was translated from an English-language corpus into Indonesian using automatic machine translation. To preserve the inherent features of social media language, such as slang, abbreviations, spelling mistakes, and informal writing styles, the data are kept in their raw textual form. The class distribution is naturally imbalanced, with non-bullying comments making up the majority, mirroring real-life interactions on social media. Stratified sampling is used to split the data, allocating 15% for training, 15% for validation, and 70% for testing. The large test allocation allows for a thorough assessment of the model's capacity for generalization.
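The stratified 15/15/70 split described above can be illustrated with a minimal sketch. The function name `stratified_split`, the random seed, and the toy 70/30 class ratio are assumptions made for illustration only; the study does not report its exact class ratio or splitting code.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.15, 0.15, 0.70), seed=42):
    """Return (train, val, test) index lists that preserve class proportions.

    `fractions` holds the train and validation shares; whatever remains in
    each class after those two cuts goes to the test split.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])
    return train, val, test


# Toy labels only: 0 = non-bullying (majority), 1 = bullying (minority).
labels = [0] * 1400 + [1] * 600
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 300 300 1400
```

Splitting each class separately keeps the minority class represented at the same rate in all three subsets, which matters when the training share is as small as 15%.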
To preserve the original linguistic characteristics of social media texts, only minimal text preprocessing is carried out. In this phase, URLs and unnecessary non-alphabetic characters are removed and basic text normalization is applied. Each modeling pathway has its own tokenization. The Transformer-based pathway uses the SentencePiece-based T5 tokenizer, which does not rely on explicit word segmentation and is well suited to morphological diversity and informal language. The statistical pathway uses word-level tokenization to support TF-IDF feature extraction. The feature representation is constructed from three primary components: semantic features derived from the T5-base model as a contextual encoder, statistical features extracted through TF-IDF with unigram and bigram configurations limited to 5,000 dimensions, and sentiment features obtained from an IndoBERT-based sentiment classifier to incorporate affective information.
Features are integrated through a late fusion technique, which combines the probabilistic outputs of each modeling pathway at the decision level. The final prediction probability in the Hybrid and Enhanced models is computed as a weighted mixture of T5 and TF-IDF outputs, with weights of 0.7 and 0.3, respectively, established via initial validation tests. All Transformer-based models are trained with the AdamW optimizer and cross-entropy loss as the objective function, using a learning rate of 1×10⁻⁵, a batch size of 16, a weight decay of 0.01, and a maximum of 50 training epochs. Model performance is evaluated using Accuracy, Precision, Recall, and Macro F1-Score, with particular emphasis on Macro F1-Score due to its robustness in handling class imbalance. Three experimental scenarios (T5 Base, Hybrid, and Enhanced) are designed, allowing for a systematic analysis of the contribution of semantic, statistical, and sentiment-based features in cyberbullying detection.
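As an illustration of the minimal preprocessing and word-level tokenization for the statistical pathway, the sketch below removes URLs and non-alphabetic characters and then builds unigrams and bigrams. The regular expressions, the lowercasing step, and the helper names `clean_comment` and `word_ngrams` are assumptions; the study does not publish its exact cleaning rules. TF-IDF weighting and the 5,000-dimension cap are not shown here.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_ALPHA_PATTERN = re.compile(r"[^a-z\s]")


def clean_comment(text):
    """Strip URLs and non-alphabetic characters, then collapse whitespace."""
    text = text.lower()  # assumption: lowercasing stands in for "basic normalization"
    text = URL_PATTERN.sub(" ", text)
    text = NON_ALPHA_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def word_ngrams(text, max_n=2):
    """Whitespace-tokenize and return all word n-grams from 1 up to max_n."""
    tokens = text.split()
    grams = list(tokens)
    for n in range(2, max_n + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


cleaned = clean_comment("Keren bgt videonya!!! cek https://example.com 123")
print(cleaned)               # keren bgt videonya cek
print(word_ngrams(cleaned))
```

Slang such as "bgt" is deliberately left untouched, in line with the stated goal of preserving informal language rather than normalizing it away.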
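The decision-level weighting can be sketched as follows, using the reported weights of 0.7 (T5) and 0.3 (TF-IDF). The function names and the 0.5 decision threshold are assumptions, and the sketch covers only the two-pathway mixture; how the sentiment output enters the Enhanced model is not specified in the text and is therefore not modeled here.

```python
def late_fusion(p_t5, p_tfidf, w_t5=0.7, w_tfidf=0.3):
    """Weighted average of the bullying probabilities from two pathways."""
    if abs(w_t5 + w_tfidf - 1.0) > 1e-9:
        raise ValueError("fusion weights should sum to 1")
    return w_t5 * p_t5 + w_tfidf * p_tfidf


def predict_label(p_fused, threshold=0.5):
    """Map a fused probability to a class: 1 = bullying, 0 = non-bullying."""
    return 1 if p_fused >= threshold else 0  # threshold is an assumed default


# Example: T5 is fairly confident, the TF-IDF classifier is not.
p = late_fusion(p_t5=0.80, p_tfidf=0.40)
print(round(p, 2), predict_label(p))  # 0.68 1
```

Because T5 carries most of the weight, the TF-IDF pathway can only flip a decision when the T5 probability is already close to the threshold.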
1. Experimental Results
This section presents the results of the three modeling scenarios: T5 Base, Hybrid (T5 + TF-IDF), and Enhanced (T5 + TF-IDF + Sentiment). Each model was evaluated using Accuracy and Macro F1-Score on the training, validation, and test splits. The test split comprises 70% of the total data by design, in order to assess how well the models generalize to unseen data.
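For reference, the two reported metrics can be computed for a binary task as in the sketch below. The toy label vectors are invented for illustration and are unrelated to the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1, so each class counts equally."""
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example: six non-bullying (0) and four bullying (1) comments.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
print(round(accuracy(y_true, y_pred), 4), round(macro_f1(y_true, y_pred), 4))  # 0.8 0.7917
```

In this toy case Macro F1 falls below Accuracy because the minority class has the lower per-class F1 and is weighted equally with the majority class, which is why the metric is emphasized under class imbalance.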
Figure 2. Accuracy and F1-Score Comparison Across Model and Data Splits.
As summarized in Figure 2, all three models performed well, with test Macro F1-Scores higher than 0.83. The T5 Base model performed best, with an Accuracy of 0.8325 and a Macro F1-Score of 0.8329. The Hybrid and Enhanced models achieved slightly lower but still competitive results, showing that the additional features neither substantially degrade performance nor improve on T5 alone. The small gap between validation and test performance across all models suggests that training was stable and that the models did not overfit substantially, even though the training set was small.
2. Comparative Performance Analysis
Figure 3. Test-set confusion matrices for the (A) Base, (B) Hybrid, and (C) Enhanced models.
Figure 3 shows the test-set confusion matrices; the discussion here focuses on the T5 Base model. Most cyberbullying cases are predicted correctly (true positives) and false negatives are few, which means the model identifies most cyberbullying cases. This matches the high Macro F1-Score in Figure 2 and shows that the model works well even under a demanding test setting. The remaining errors come mostly from false positives, along with a few false negatives. These errors typically involve ambiguous or sarcastic messages, where the intent to insult is not obvious and depends on context. This suggests that while the T5 Base model detects direct cases of cyberbullying well, it still struggles with subtler cases where the line between offensive and non-offensive content is unclear.
3. Generalization Analysis
Figure 4. Comparison of training loss trajectories for the Base, Hybrid, and Enhanced models across 50 training epochs.
Figure 4 shows how the training and validation losses change over the course of training. The loss decreases gradually and stabilizes by the 50th epoch. The validation loss, which decreases as training proceeds, closely follows the trajectory of the training loss. Because the two curves do not diverge substantially, the model is learning effectively and the training process is stable. This pattern indicates that, even with a small training set, early stopping effectively prevents the model from overfitting the training data. The consistency between the training and validation losses indicates that the model remains reliable throughout training and performs well on unseen data.
The experimental results indicate that increasing model complexity does not necessarily lead to improved performance in cyberbullying detection. Despite the integration of additional statistical (TF-IDF) and sentiment-based features, both the Hybrid and Enhanced models failed to outperform the T5 Base model on the test set. This finding suggests that the contextual semantic representations learned by the T5 model are sufficiently expressive to capture both explicit and implicit forms of cyberbullying in Indonesian social media texts. Similar findings from earlier research indicate that large pre-trained Transformer models show diminishing returns when augmented with additional hand-crafted features, especially when semantic embeddings already hold extensive contextual and pragmatic information [18]. This result supports the growing consensus that end-to-end Transformer-based models can successfully capture lexical, syntactic, and semantic information without the need for extra feature engineering.
From a representational standpoint, the T5 Base model's robust performance can be explained by its capacity to produce contextualized embeddings that naturally encode important lexical signals normally captured by conventional TF-IDF features. As a text-to-text Transformer, T5 dynamically models word meaning based on surrounding context, which helps it interpret ambiguous expressions, sarcasm, and indirect aggression, all of which are common in Indonesian social media discourse. This is consistent with earlier findings in Indonesian text classification research, which demonstrate that Transformer-based systems like IndoBERT and T5 frequently outperform classical feature-based approaches, especially in tasks involving semantic nuance and informal language [21], [22]. Therefore, when powerful pre-trained representations are already used, adding TF-IDF features may introduce redundancy rather than complementary information, which supports the principle of model parsimony.
Nevertheless, the Hybrid and Enhanced models still demonstrate competitive and robust performance, particularly in detecting overtly offensive or emotionally charged language. In these cases, TF-IDF features and sentiment signals help reinforce explicit lexical patterns and affective polarity, which have been shown to be strong indicators of cyberbullying behavior in prior research [23], [24]. However, the marginal gains achieved by feature fusion are constrained by the increased optimization complexity introduced by late fusion strategies, especially under limited training data conditions. Similar difficulties have been reported in multi-feature fusion research, where, if not properly calibrated, integrating disparate representations can impede convergence and lower generalization performance [21], [25].
This study's evaluation approach, which uses a sizable test set that makes up 70% of the entire data, is a significant contribution. The T5 Base model demonstrates the efficacy of transfer learning for low-resource and linguistically challenging contexts like Indonesian, even though it was trained on only 15% of the dataset. Given Indonesia's high social media participation (about 77% of the population), where language use is highly dynamic, informal, and context-dependent [26], this conclusion is especially pertinent. The findings support previous research showing that pre-trained language models are suitable for real-world social media monitoring tasks, where linguistic variance is significant and labeled data are hard to come by [27], [28].
Finally, the broader societal relevance of this work must be considered in light of the documented negative impacts of cyberbullying on psychological well-being, academic performance, and social relationships. Prior studies consistently report strong associations between cyberbullying exposure and anxiety, depression, social withdrawal, and reduced school engagement among adolescents [23], [29], [30]. In the Indonesian context, these risks are further amplified by linguistic ambiguity, widespread use of abbreviations, regional languages, and sarcasm, all of which complicate automatic detection [31], [32], [33]. The strong performance of the T5 Base model in this study suggests that advanced semantic modeling offers a promising direction for developing scalable, culturally adaptive cyberbullying detection systems capable of supporting prevention and intervention efforts in highly diverse digital environments.
This study demonstrates that advanced semantic representations derived from pre-trained Transformer models are highly effective for cyberbullying detection in Indonesian social media contexts. The experimental results show that, even when trained on a relatively small share of the dataset, the T5 Base model consistently exhibits good generalization performance, and that increasing model complexity through feature fusion does not always result in performance improvements. These results show that both explicit and implicit types of cyberbullying, such as those conveyed through informal language, ambiguity, and indirect expressions, can be captured by T5's contextual embeddings. The study thus lends support to the idea that end-to-end semantic modeling can match or surpass hybrid systems that depend on extra statistical or hand-crafted features.
The resilience of the T5 Base model has significant implications for cyberbullying identification in low-resource and linguistically varied countries like Indonesia, both practically and socially. The results indicate that using semantically driven models may offer a scalable and dependable solution for real-world monitoring and intervention systems, given the high prevalence of social media use and the established detrimental effects of cyberbullying on mental health, academic engagement, and social well-being. Future research is encouraged to explore domain adaptation, cross-lingual learning, and explainable AI techniques to further enhance model transparency and cultural sensitivity, thereby supporting more effective and ethically responsible cyberbullying mitigation strategies.
[1] R. Chudal et al., “Victimization by Traditional Bullying and Cyberbullying and the Combination of These Among Adolescents in 13 European and Asian Countries,” European Child and Adolescent Psychiatry, vol. 31, no. 9, pp. 1391–1404, Sep. 2022, doi: 10.1007/s00787-021-01779-6.
[2] N. Cholifah, N. F. Nuzula, N. Zahra, and G. L. Perdani, “Strategi Untuk Menangani dan Mencegah Cyberbullying di Media Sosial: Studi Literatur,” in Social, Humanities, and Educational Studies (SHEs): Conference Series, vol. 7, Jul. 2024, pp. 1369–1375. [Online]. Available: https://jurnal.uns.ac.id/shes
[3] S. T. Lokkeberg, C. Ihlebak, G. Brottveit, and L. Del Busso, “Digital Violence and Abuse: A Scoping Review of Adverse Experiences Within Adolescent Intimate Partner Relationships,” Trauma, Violence, and Abuse, Jul. 2024, doi: 10.1177/15248380231201816.
[4] Asosiasi Penyelenggara Jasa Internet Indonesia, Laporan Survei Internet Indonesia 2023. Jakarta: APJII, 2023.
[5] C. Maurya, T. Muhammad, P. Dhillon, and P. Maurya, “The Effects of Cyberbullying Victimization on Depression and Suicidal Ideation Among Adolescents and Young Adults: A Three-Year Cohort Study From India,” BMC Psychiatry, vol. 22, no. 1, 2022, doi: 10.1186/s12888-022-04238-x.
[6] E. W. Pamungkas, D. Galih, P. Putri, and A. Fatmawati, “Hate Speech Detection in Bahasa Indonesia: Challenges and Opportunities,” 2023. [Online]. Available: https://www.statista.com/statistics/242606/
[7] S. Unnava and S. R. Parasana, “A Study of Cyberbullying Detection and Classification Techniques: A Machine Learning Approach,” Engineering, Technology and Applied Science Research, vol. 14, no. 4, pp. 15607–15613, Aug. 2024, doi: 10.48084/etasr.7621.
[8] O. Weller, K. Seppi, and M. Gardner, “When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning,” arXiv:2205.08124, May 2022. [Online]. Available: http://arxiv.org/abs/2205.08124
[9] A. D. Fikri, F. Utaminingrum, and E. G. E. Setyawan, “Sistem Pendeteksi Kekerasan di Ruang Publik Menggunakan Metode 3D Convolutional Neural Network,” Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer (J-PTIIK), 2024. [Online]. Available: http://j-ptiik.ub.ac.id
[10] A. Ulinuha, E. Majid, and R. Nuari, “Performance Comparison of BERT Metrics and Classical Machine Learning Models (SVM, Naive Bayes) for Sentiment Analysis,” Scientific Journal, vol. 10, no. 2, 2025.
[11] D. Dhelviana, T. Amelia, J. Sulaksono, and D. W. Widodo, “Sistem Pendeteksi Kekerasan Berbasis Convolutional Neural Network,” 2023.
[12] K. Hadi and E. Utami, “Analisis K-NN dengan Integrasi BoW, TF-IDF, dan N-Grams untuk Klasifikasi Ujaran Kebencian pada Twitter,” JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika), vol. 10, no. 4, pp. 2971–2983, 2025, doi: 10.29100/jipi.v10i4.6694.
[13] P. Ekspansi et al., Penerapan Konsep Machine Learning dan Deep Learning. 2024.
[14] K. Nemkul, “Use of Bidirectional Encoder Representations from Transformers (BERT) and Robustly Optimized BERT Pretraining Approach (RoBERTa) for Nepali News Classification,” Tribhuvan University Journal, vol. 39, no. 1, pp. 124–137, Jun. 2024, doi: 10.3126/tuj.v39i1.66679.
[15] M. D. Maulana, C. Sri, and K. Aditya, “Perbandingan IndoBERT dan Bi-LSTM dalam Mendeteksi Pelanggaran Undang-Undang ITE,” SINTECH Journal, vol. 8, no. 1, Apr. 2025.
[16] M. Amien, G. F. Gunawan, and K. Kunci, “BERT dan Bahasa Indonesia: Studi tentang Model NLP Berbasis Transformer,” Elang: Jurnal Interdisciplinary Research, 2024.
[17] A. A. Hafiza and E. B. Setiawan, “Enhancing Cyberbullying Detection on Platform X Using IndoBERT and Hybrid CNN-LSTM Model,” Jurnal Teknik Informatika, vol. 6, no. 2, pp. 655–672, Apr. 2025, doi: 10.52436/1.jutif.2025.6.2.4321.
[18] C. Raffel et al., “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,” Journal of Machine Learning Research, vol. 21, pp. 1–67, 2020. [Online]. Available: http://jmlr.org/papers/v21/20-074.html
[19] N. Durrani and A. Halai, “Dynamics of Gender Justice, Conflict and Social Cohesion: Analysing Educational Reforms in Pakistan,” International Journal of Educational Development, 2018, doi: 10.1016/j.ijedudev.2017.11.010.
[20] M. Isa, Sunan At-Tirmidzi. Riyadh: Maktabah Al-Ma’arif Li Al-Nashr wa Al-Tawzi’, n.d.
[21] A. Aripin, S. A. Santoso, and H. Haryanto, “Mengoptimalkan Akurasi pada Klasifikasi Emosi Majemuk Berdasarkan Semantik Kalimat Menggunakan XLM-RoBERTa,” Jurnal Nasional Teknik Elektro dan Teknologi Informasi, vol. 12, no. 1, pp. 29–36, 2023, doi: 10.22146/jnteti.v12i1.6084.
[22] N. Rahayu, S. I. Yani, M. Marwah, and A. Pratama, “Analisis Sentimen terhadap Terorisme pada Platform Twitter Menggunakan Support Vector Machine,” JUMISTIK, vol. 4, no. 1, pp. 430–441, 2025, doi: 10.70247/jumistik.v4i1.152.
[23] M. M. Molero et al., “Anxiety and Depression From Cybervictimization in Adolescents: A Meta-Analysis and Meta-Regression Study,” European Journal of Psychology Applied to Legal Context, vol. 14, no. 1, pp. 42–50, 2022, doi: 10.5093/ejpalc2022a5.
[24] M. S. Park et al., “Sociocultural Values, Attitudes and Risk Factors Associated With Adolescent Cyberbullying in East Asia: A Systematic Review,” Cyberpsychology: Journal of Psychosocial Research on Cyberspace, vol. 15, no. 1, 2021, doi: 10.5817/cp2021-1-5.
[25] D. Sebastian and K. A. Nugraha, “Sistem Perbaikan Kata Tidak Baku Bahasa Indonesia Menggunakan Metode Crowdsourcing,” Jurnal Teknik Informatika dan Sistem Informasi, vol. 5, no. 3, 2020, doi: 10.28932/jutisi.v5i3.1983.
[26] S. Hafifah and W. Marsisno, “Permasalahan dan Potensi dalam Diseminasi Official Statistics pada Badan Pusat Statistik,” Seminar Nasional Official Statistics, vol. 2022, no. 1, pp. 323–332, 2022, doi: 10.34123/semnasoffstat.v2022i1.1419.
[27] M. H. Nguyen et al., “Staying Connected While Physically Apart: Digital Communication When Face-to-Face Interactions Are Limited,” New Media and Society, vol. 24, no. 9, pp. 2046–2067, 2021, doi: 10.1177/1461444820985442.
[28] V. Boursier et al., “Facing Loneliness and Anxiety During the COVID-19 Isolation: The Role of Excessive Social Media Use in a Sample of Italian Adults,” Frontiers in Psychiatry, vol. 11, 2020, doi: 10.3389/fpsyt.2020.586222.
[29] G. Gohal et al., “Prevalence and Related Risks of Cyberbullying and Its Effects on Adolescent,” BMC Psychiatry, vol. 23, no. 1, 2023, doi: 10.1186/s12888-023-04542-0.
[30] A. Ragusa et al., “Impact of Cyberbullying on Academic Performance and Psychosocial Well-Being of Italian Students,” Children, vol. 11, no. 8, p. 943, 2024, doi: 10.3390/children11080943.
[31] S. L. Yani, “Sarkasme pada Media Sosial Twitter dan Implikasinya terhadap Pembelajaran Bahasa Indonesia di SMA,” Tabasa: Jurnal Bahasa, Sastra Indonesia, dan Pengajarannya, vol. 1, no. 2, pp. 269–284, 2021, doi: 10.22515/tabasa.v1i2.2628.
[32] S. Musyarrafah, A. Santoso, and G. Susanto, “Generation Z’s Prokem in Social Media: Language Transformation and Social Identity of Makassar Adolescents,” Satwika: Kajian Ilmu Budaya dan Perubahan Sosial, vol. 9, no. 1, pp. 17–28, 2025, doi: 10.22219/satwika.v9i1.37487.
[33] A. C. B. Biringkanae, I. Garim, and N. Nurhusna, “Gaya Bahasa Sarkasme pada Akun Media Sosial TikTok,” Nuances of Indonesian Language, vol. 5, no. 1, pp. 50–61, 2024, doi: 10.51817/nila.v5i1.776.