Tviterasi, tviteraši or twitteraši? Producing and analysing a normalised dataset of Croatian and Serbian tweets

In this paper we discuss the parallel manual normalisation of samples extracted from Croatian and Serbian Twitter corpora. We describe the datasets, outline the unified guidelines provided to annotators, and present a series of analyses of standard-to-non-standard transformations found in the Twitte...

Bibliographic details
Main authors:	Maja Miličević, Nikola Ljubešić
Format:	Article
Language:	English
Published:	University of Ljubljana Press (Založba Univerze v Ljubljani) 2016-09-01
Series:	Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
Subjects:	computer-mediated communication CMC corpora Twitter normalisation
Links:	https://journals.uni-lj.si/slovenscina2/article/view/7007