Date: 
03 March 2020 - 14:00 to 15:00
Location: 
Online

 

A high-quality orthographic transcript is the basis for all types of analyses of spoken language data. However, transcribing speech is a time-consuming and tedious task. Automatic speech recognition, together with NLP and text annotation tools, can make this task much quicker and less frustrating.

In this first of a series of SSHOC webinars, organised by the consortium partner CLARIN ERIC, we will discuss the theoretical basis and the technology available for transcribing spoken language. In particular, we will focus on the role of automatic speech recognition: what are the opportunities, what are the pitfalls, and where can it be applied successfully?

ABOUT THE WEBINAR

Introduction to working with interview data. Henk van den Heuvel - Head of the Humanities Lab at the Faculty of Arts, Radboud University Nijmegen - will briefly introduce the topic of using interviews as a research instrument, and the cross-disciplinary nature of using interview data. He will also give some background on the Transcription Chain initiative, which originated from oral history research but has a much larger potential.

Demonstration of automatic transcription of speech. Christoph Draxler - Researcher at the Institute of Phonetics and Speech Processing at Ludwig Maximilian University Munich - will demonstrate the web portal for the automatic transcription of speech. This portal currently supports three languages (English, German, Dutch), with Italian and Czech in the pipeline. The portal provides a user-friendly interface, so that researchers without a technical background can use state-of-the-art recognizers, optimized annotation editors and powerful segmentation services to produce high-quality time-aligned transcripts. These transcripts are the basis for subsequent in-depth scientific analysis, e.g. topic modeling, analysis of linguistic structures, and named entity recognition.

ABOUT THE SPEAKERS

Henk van den Heuvel

  • since 1990, researcher at Radboud University Nijmegen;
  • since 2003, director of the Centre for Language and Speech Technology at Radboud University;
  • since 2016, also Head of the Humanities Lab at the Faculty of Arts;
  • since 2019, member of the CLARIN Knowledge Centre for Atypical Communication (ACE);
  • participation in various academic and industrial speech database projects, e.g. SpeechDat, OrienTel, SALA;
  • involved in many projects with automatic speech recognition;
  • specialised in management of critical and sensitive research data.

Christoph Draxler

  • since 1991, researcher at the Institute of Phonetics and Speech Processing at Ludwig Maximilian University Munich;
  • since 1997, co-director of the Bavarian Archive for Speech Signals, a CLARIN-D centre since 2010;
  • participation in various academic and industrial speech database projects, e.g. Verbmobil, SpeechDat, Ph@ttSessionz, VOYS;
  • participation in the development of SpeechRecorder for scripted recordings, WebTranscribe and Octra for web-based transcription, Percy for online perception experiments.

 

Slides are available here: https://doi.org/10.5281/zenodo.3694223

 

Watch the full recording of the webinar: