I am always looking for corpora of spoken French for my research, so I was pleasantly surprised to come across several freely available resources on the internet in the past week. Most of these corpora contain audio and/or video with transcripts of authentic, spontaneous spoken French – perfect for self-study or use in a language lab.
- SACODEYL (System Aided Compilation and Open Distribution of European Youth Language) is available in seven EU languages (English, French, German, Italian, Spanish, Romanian, and Lithuanian) and was designed specifically for teaching purposes. Click on Resources after choosing a corpus to access the learning packages.
- FLEURON (Français langue étrangère universitaire : ressources et outils numériques) is a collection of audio and video resources that cover aspects of student life in France. Captions and a glossary are also provided.
- CFPP2000 (Corpus de français parlé parisien des années 2000) contains several interviews with Parisians recorded within the past decade. Audio files and transcripts are available for download.
- CFPQ (Corpus de français parlé au Québec) is a multimodal corpus that also includes information on non-verbal aspects of communication (such as gestures and facial movements). It also dates from the 2000s; however, only PDFs of the transcripts are available.
Other corpora of spoken French or simply videos with transcripts that I’ve mentioned in the past include: