The Serbian Movie Review Dataset collection consists of three Serbian-language movie review datasets constructed for sentiment analysis:
- Collected movie reviews in Serbian (ISLRN 252-457-966-231-5) – an imbalanced collection of 4725 movie reviews in Serbian.
- SerbMR-2C – The Serbian Movie Review Dataset (2 Classes) (ISLRN 016-049-192-514-1) – a two-class balanced dataset that contains 1682 movie reviews (841 positive and 841 negative).
- SerbMR-3C – The Serbian Movie Review Dataset (3 Classes) (ISLRN 229-533-271-984-0) – a three-class balanced dataset that contains 2523 movie reviews (841 positive, 841 neutral, and 841 negative).
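As a quick sanity check on the figures above, the following minimal sketch recomputes the totals of the two balanced datasets from their per-class counts. It also reports the share of the largest class, which is the accuracy a trivial majority-class predictor would reach. The `DATASETS` dictionary and the `summarize` helper are illustrative names introduced here, not part of the SerbMR distribution, and the counts are hard-coded from the descriptions rather than read from the actual files.

```python
# Class counts copied from the dataset descriptions; not loaded from the corpus files.
DATASETS = {
    "SerbMR-2C": {"positive": 841, "negative": 841},
    "SerbMR-3C": {"positive": 841, "neutral": 841, "negative": 841},
}


def summarize(class_counts):
    """Return (total reviews, whether all classes are equal in size, majority-class share)."""
    total = sum(class_counts.values())
    balanced = len(set(class_counts.values())) == 1
    majority_share = max(class_counts.values()) / total
    return total, balanced, majority_share


for name, counts in DATASETS.items():
    total, balanced, share = summarize(counts)
    print(f"{name}: {total} reviews, balanced={balanced}, majority-class share={share:.3f}")
```

Under these assumptions the totals come out to 1682 and 2523 reviews, matching the stated sizes. The imbalanced collection of 4725 reviews is left out because its per-class breakdown is not given here.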
Author
Vuk Batanović
Availability
All corpora, together with extensive documentation, can be downloaded from the SerbMR GitHub repository.
Publications
Vuk Batanović, Boško Nikolić, Milan Milosavljević (2016). Reliable Baselines for Sentiment Analysis in Resource-Limited Languages: The Serbian Movie Review Dataset. Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pp. 2688-2696, Portorož, Slovenia.