Polish & English Language Corpora for Research & Applications

Conversational Spoken Corpus of Polish

 The most recent version of our spoken-conversational corpus is available at: and

Offline time-aligned corpus

This citation is required to fulfill the CC attribution condition of the license.

  • Piotr Pęzik. 2012. Język mówiony w NKJP. In Narodowy Korpus Języka Polskiego. Wydawnictwo Naukowe PWN, Warsaw.

More recent corpora

Several more recent corpora, listed below, can be downloaded together with their recordings.

The following paper should be cited to fulfill the CC attribution condition of the license for these resources:


A corpus of focused interviews (people reflecting upon their emotions).




A corpus of open interviews.




Samples of spoken parliamentary data.




A corpus of Polish emigrants to Scotland.