Pelcra

Polish & English Language Corpora for Research & Applications

PELCRA Learner English Corpus (PLEC)

The Polish-English Learner Corpus project aims to investigate selected aspects of the English used by Polish speakers by applying corpus linguistics methodology, and to disseminate the results among teachers, authors of handbooks, and the academic community.

The empirical basis for this project is a 3-million-word corpus of spoken (200,000 words) and written texts produced by Polish learners of English. The corpus is annotated with errors and linguistic phenomena typical of Polish speakers. The annotation is done partly automatically, by comparing the corpus with existing corpora of Polish and English; some phenomena, such as phonetic errors, are annotated manually. The corpus will serve as the basis for an index of the most frequent phonetic, linguistic and grammatical errors, linguistic transfers and fossilisations, which significantly hamper the communicative performance of Polish speakers of English.

The project has been registered as research project no. N N104 205039 of the Ministry of Science and Higher Education.

Project home page: http://pelcra.pl/plec/