Tviterasi, tviteraši or twitteraši? Producing and analysing a normalised dataset of Croatian and Serbian tweets

In this paper we discuss the parallel manual normalisation of samples extracted from Croatian and Serbian Twitter corpora. We describe the datasets, outline the unified guidelines provided to annotators, and present a series of analyses of standard-to-non-standard transformations found in the Twitte...

Bibliographic Details
Main Authors:	Maja Miličević, Nikola Ljubešić
Format:	Article
Language:	English
Published:	University of Ljubljana Press (Založba Univerze v Ljubljani) 2016-09-01
Series:	Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
Subjects:	computer-mediated communication CMC corpora Twitter normalisation
Online Access:	https://journals.uni-lj.si/slovenscina2/article/view/7007

Similar Items

Towards an Estonian dataset on document-level subjectivity
by: Karl Gustav Gailit, et al.
Published: (2025-06-01)

Normalisation of passivity during childbirth - positive experiences and trust in the healthcare system in Serbia as generators of justifying passivity
by: Ninković Milica, et al.
Published: (2025-01-01)

People power: a content analysis of #VAWG tweets in India
by: Jayanthi Iyengar, et al.
Published: (2024-12-01)

An Extensible Schema For Building Large Weakly-labeled Semantic Corpora
by: S. Matthew English
Published: (2016-05-01)

Tagging Named Entities in Croatian Tweets
by: Krešimir Baksa, et al.
Published: (2016-06-01)

From Tweets to Trades: A Bibliometric and Systematic Review of Social Media’s Influence on Cryptocurrency
by: Sheela Sundarasen, et al.
Published: (2025-05-01)

Co-creation of social innovations for healthy ageing in rural Europe – a process evaluation of a volunteer-led guided conversation toolkit using Normalisation Process Theory (NPT)
by: Basharat Hussain, et al.
Published: (2024-12-01)

Why Did People Argue on Twitter? Scrutinizing Argumentation Strategies Used in the Tweets of #InternationalWomensDay
by: Nur Alfiana Isnaini
Published: (2023-11-01)

Hospital Tweets on H1N1 and Death Panels: Text Mining the Situational Crisis Communication Response to Health Crises and Controversies
by: Aimee Kendall Roundtree
Published: (2018-06-01)

Membandingkan Nilai Akurasi BERT dan DistilBERT pada Dataset Twitter
by: Faisal Fajri, et al.
Published: (2022-12-01)

Effective tweets classification for disaster crisis based on ensemble of classifiers
by: Christopher Ifeanyi Eke, et al.
Published: (2025-08-01)

Monitoring Land Surface Temperature on Urban Expansion using Normalised Difference Vegetation Index and Google Earth Engine
by: Nur Suhaili Mansor, et al.
Published: (2025-07-01)

Experimental Comparative Investigations to Evaluate Cavitation Conditions within a Centrifugal Pump Based on Vibration and Acoustic Analyses Techniques
by: Ahmed Ramadhan AL-OBAIDI
Published: (2020-07-01)

Annotated data for semantic role labeling of crisis events in Indonesian Tweets
by: Amelia Devi Putri Ariyanto, et al.
Published: (2025-08-01)

EVALUASI FISIK SEDIAAN SUSPENSI DENGAN KOMBINASI SUSPENDING AGENT PGA (Pulvis Gummi Arabici) DAN CMC-Na (Carboxymethylcellulosum Natrium)
by: Ni Made Dharma Shantini Suena
Published: (2020-04-01)

Navigating the manufacturing, testing and regulatory complexities of regulatory T cells for adoptive cell therapy
by: Larissa A. Pikor, et al.
Published: (2025-07-01)

A protective role of resveratrol against the effects of immobilization stress in corpora lutea of mice in early pregnancy
by: Saif ULLAH, et al.
Published: (2020-07-01)

Effect of Adding Hydrocolloid as A Stabilizer on The Rheological Properties and Total Lactic Acid Bacteria of Yogurt Drinks During Cold Storage
by: Nurul Latifasari, et al.
Published: (2024-12-01)

THE INFLUENCE OF DIGITAL COMMUNICATION COMPETENCE ON STUDENT MOTIVATION
by: Shilvia Rahma Pratiwi, et al.
Published: (2025-06-01)

State-of-the-art on monolingual lexicography for Croatia (Croatian)
by: Kristina Štrkalj Despot, et al.
Published: (2019-04-01)

A Thought Too Far: A Case for a Corpus Approach to Bad Knowledge in Old English Literature
by: Rían Boyle
Published: (2025-07-01)

The Incorporation of the Corpus-Based Approach in the Teaching of Second Language Writing
by: Abdeldjalil BOUGHEZAL
Published: (2020-06-01)

Bell’s Inequalities and Entanglement in Corpora of Italian Language
by: Diederik Aerts, et al.
Published: (2025-06-01)

Size of corpora and collocations: The case of Russian
by: Maria Khokhlova, et al.
Published: (2020-08-01)

Learning languages from parallel corpora
by: Johannes Graën
Published: (2022-12-01)

Le test de compréhension de l’IRonie et des Requêtes Indirectes – version courte (IRRI-C) : développement, validité de contenu et données normatives préliminaires.
by: Natacha Cordonier, et al.
Published: (2024-08-01)

Optimization of anticoagulant therapy in patients undergoing mechanical heart valve replacement
by: S. A. Tkachenko, et al.
Published: (2023-09-01)

Elaborating a Methodology for Gauging a Politician’s Communicative Personality
by: Denis S. Mukhortov, et al.
Published: (2025-12-01)

Emotion analysis in socially unacceptable discourse
by: Jasmin Franza, et al.
Published: (2022-12-01)

Copyright Infringement on Twitter: The Unauthorized Use of K-Pop Fan Photography by Fanfiction Author Azzamine
by: Fahmi Fairuzzaman, et al.
Published: (2025-07-01)

Introducción
by: Beatriz Gallardo Paúls, et al.
Published: (2024-12-01)

Accelerating the speed of innovative anti-tumor drugs to first-in-human trials incorporating key de-risk strategies
by: Yuqi Wang, et al.
Published: (2023-12-01)

Sensorimotor correlates of sit-to-stand in healthy adults
by: Caitlin McDonald, et al.
Published: (2025-07-01)

IMPLEMENTING EDUCATIONAL DIGITAL GAMES INTO TEACHER-TRAINING PROGRAMMES: A CASE-STUDY BASED ON GERMAN EXPERIENCE
by: Amélia Lopes, et al.
Published: (2025-06-01)

Speech Emotion Recognition Based on Voice Fundamental Frequency
by: Teodora DIMITROVA-GREKOW, et al.
Published: (2019-04-01)

The MuLeCo project: A learner corpus of L1 German learners of romance languages
by: Stephan Lücke, et al.
Published: (2025-12-01)

About Lexical Differences between Old Shtokavian Croatian and New Shtokavian Serbian Dialects
by: E. I. Yakushkina
Published: (2020-08-01)

Monitoring UK saltmarsh restoration using earth observation for national greenhouse gas accounting
by: Hannah Clilverd, et al.
Published: (2025-09-01)

Nicknames of English-Speaking Adolescent Users of Social Networks (on the Example of Twitter)
by: V. V. Kaziaba, et al.
Published: (2020-02-01)

Quando a lamentação leva à adesão: o Twitter enquanto suporte para a construção do ethos discursivo
by: Albylene da Silva
Published: (2021-12-01)