Download the Taiga corpus:

Our corpus is currently in beta. Feedback is welcome!

Entire corpus (92 GB)

Our special collections for

All news

All social media

All subtitles

Small datasets


Fake news


Users are informed that corpus data is intended for personal and research purposes.

The Taiga corpus collective is not responsible for any user violations of these rules.

For authors:

If you have found your text in the corpus and do not want researchers to have it in their experimental data, please contact us and we will delete it.