The materials from "Regional variation in gender marking: a hands-on tutorial on extracting data from corpora" are now available for download from https://github.com/clarinsi/workshop_reg_mark. They introduce the process of using corpora to study a research problem, linguistic or otherwise, with information on:
- how to find (comparable) South Slavic corpora in the CLARIN.SI repository
- how to explore corpora through the noSketchEngine and KonText concordancers
- how to study gender marking by looking at the frequencies of feminine and masculine nouns describing occupations, and at the distribution of feminine and masculine forms of different verbs
- how to draw conclusions about gender bias in society based on corpus results
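
As a rough illustration of the frequency comparison mentioned above, the following minimal Python sketch normalises raw concordance counts to a per-million rate and computes the share of feminine forms. It is not taken from the workshop materials: the function names, the corpus labels and all counts are hypothetical placeholders, and real counts would be read off a concordancer query.

```python
def per_million(count, corpus_size):
    """Normalise a raw hit count to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def feminine_share(fem_count, masc_count):
    """Proportion of feminine forms among all feminine + masculine hits."""
    total = fem_count + masc_count
    return fem_count / total if total else 0.0


# Placeholder numbers for illustration only; these are NOT real corpus results.
corpora = {
    "corpus_A": {"size": 2_000_000, "fem": 150, "masc": 850},
    "corpus_B": {"size": 1_000_000, "fem": 300, "masc": 700},
}

for name, c in corpora.items():
    fem_rate = per_million(c["fem"], c["size"])
    masc_rate = per_million(c["masc"], c["size"])
    share = feminine_share(c["fem"], c["masc"])
    print(f"{name}: fem {fem_rate:.1f}/M, masc {masc_rate:.1f}/M, "
          f"feminine share {share:.2f}")
```

Normalising by corpus size matters because comparable corpora are rarely the same size, so raw counts alone cannot be compared across regions.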
The materials were prepared by Mirjana Starović and Tanja Samardžić as part of the online workshop held on 6 and 7 November 2021, organised by the University of Zurich – URPP “Language and Space”, the CLARIN knowledge centre for South Slavic languages – CLASSLA and the ReLDI centre. The programme also included a keynote talk by Yves Scherrer from the University of Helsinki, Darja Fišer’s presentation of opportunities for students to present at the JTDH Language Technologies and Digital Humanities Conference, and an interactive workshop on regional variation in text led by Sara Košutar, Larissa Schmidt and Leyla Feiner.
Around 30 students and colleagues took part in the workshop, split between GatherTown and Zoom, with lively and fun interactive sessions and some surprising findings. A follow-up mentoring session for students took place on 16 December 2021.
For CLASSLA accounts of the workshop, see here: