Examples¶
This is an incomplete list of public Tsakorpus instances, sorted by language families and languages. Note that some corpora may use older or customized versions of Tsakorpus.
Afro-Asiatic
Chukchi–Kamchatkan
Chukchi: multimedia, Amguema dialect (obsolete version of the platform)
Indo-European
Albanian: written modern, historical
Luwian: written, historical
Contact-influenced Russian (obsolete version of the platform)
Tajik: written modern
Shugnhi: spoken/written
Slavic dialects of Albania: spoken
NE Caucasian
Dargwa: written modern, standard
Lezgi: spoken (Tsnal variety)
NW Caucasian
Adyghe: written modern
Sino-Tibetan
Tungusic
Turkic
Bashkir: spoken
Dolgan: spoken
Khakas: spoken
Sakha (Yakut): spoken (Yakut/Russian code switching)
Uralic
Beserman: multimedia
Enets: spoken
Erzya: written modern, social media
Kamas: spoken
Komi Zyrian: written modern, social media
Meadow Mari: written modern, social media
Moksha: written modern, social media
Nenets: spoken
Selkup: spoken
Udmurt: written modern, social media, spoken, spoken (Tatyshly dialect)
Multilingual