Tviterasi, tviteraši or twitteraši? Producing and analysing a normalised dataset of Croatian and Serbian tweets
In this paper we discuss the parallel manual normalisation of samples extracted from Croatian and Serbian Twitter corpora. We describe the datasets, outline the unified guidelines provided to annotators, and present a series of analyses of standard-to-non-standard transformations found in the Twitte...
Main Authors: | Maja Miličević, Nikola Ljubešić |
---|---|
Format: | Article |
Language: | English |
Published: | University of Ljubljana Press (Založba Univerze v Ljubljani), 2016-09-01 |
Series: | Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave |
Subjects: | |
Online Access: | https://journals.uni-lj.si/slovenscina2/article/view/7007 |
Similar Items
- Towards an Estonian dataset on document-level subjectivity
  by: Karl Gustav Gailit, et al.
  Published: (2025-06-01)
- Normalisation of passivity during childbirth - positive experiences and trust in the healthcare system in Serbia as generators of justifying passivity
  by: Ninković Milica, et al.
  Published: (2025-01-01)
- People power: a content analysis of #VAWG tweets in India
  by: Jayanthi Iyengar, et al.
  Published: (2024-12-01)
- An Extensible Schema For Building Large Weakly-labeled Semantic Corpora
  by: S. Matthew English
  Published: (2016-05-01)
- Tagging Named Entities in Croatian Tweets
  by: Krešimir Baksa, et al.
  Published: (2016-06-01)