EUROCALL 2014


Partial and synchronized captioning: A new tool for second language listening development

This study investigates the use of a novel captioning method, partial and synchronized captioning, as a listening tool for second language (L2) learners. In this method, the term partial and synchronized caption refers to the presence of only a selected set of words in the caption, each synchronized with its corresponding aural cue. The approach relies on recent advances in speech recognition technology: a state-of-the-art automatic speech recognition system was trained on the relevant corpora. Empowered by this technology, the system presents the caption text word by word, aligned in precise timing with the speech signal of each word, which effectively shows the correspondence between the words and the audio channel. The output of this alignment is used to generate partial captions by automatically selecting the words and phrases that are likely to hinder learners' listening comprehension. The selected words are presented in the caption while the rest are masked by dots, so that comprehension depends more on listening to the speech than solely on reading the caption text. The selection criteria are defined by two features: word frequency and speech rate. The method is based on the premise that the occurrence of infrequent words and fast delivery of speech attenuate L2 listening comprehension; thus, the learner's vocabulary size and tolerable rate of speech were adopted as the basis for generating the captions. Partial and synchronized captioning is anticipated to serve not only as an assistive tool to enhance L2 learners' listening comprehension skills but also as a medium to decrease dependence on captions, thereby preparing learners for real-world situations.
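The abstract does not give implementation details, so the following is only a minimal Python sketch of the word-selection logic it describes, assuming word-level timings from ASR forced alignment. The frequency ranks, vocabulary-size cutoff, tolerable speech rate, and syllable estimator below are all hypothetical placeholders, not the authors' actual system.

    import re
    from typing import List, Tuple

    # Hypothetical corpus frequency ranks (1 = most frequent); a real
    # system would use a full frequency list built from large corpora.
    FREQ_RANK = {"words": 250, "infrequent": 8000,
                 "attenuate": 9500, "comprehension": 2800}

    VOCAB_SIZE = 3000            # assumed learner vocabulary size (rank cutoff)
    MAX_SYLL_PER_SEC = 4.0       # assumed tolerable speech rate

    def syllable_estimate(word: str) -> int:
        """Very rough syllable count: contiguous vowel groups."""
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def partial_caption(aligned: List[Tuple[str, float, float]]) -> List[str]:
        """Show a word if it is infrequent or spoken fast; otherwise mask it.

        `aligned` holds (word, start_sec, end_sec) tuples from ASR alignment.
        """
        out = []
        for word, start, end in aligned:
            duration = max(end - start, 1e-3)
            rate = syllable_estimate(word) / duration      # local speech rate
            rank = FREQ_RANK.get(word.lower(), 10**6)      # unknown -> rare
            if rank > VOCAB_SIZE or rate > MAX_SYLL_PER_SEC:
                out.append(word)                           # keep difficult word
            else:
                out.append("." * len(word))                # mask easy word
        return out

    if __name__ == "__main__":
        segment = [("infrequent", 0.0, 0.4), ("words", 0.4, 0.7),
                   ("attenuate", 0.7, 1.0), ("comprehension", 1.0, 1.6)]
        print(" ".join(partial_caption(segment)))
        # -> "infrequent ..... attenuate comprehension"

In this toy example, "infrequent" and "attenuate" are shown because their ranks exceed the vocabulary cutoff, "comprehension" is shown because its local speech rate exceeds the threshold, and the frequent, slowly spoken "words" is masked by dots, mirroring the two selection criteria described in the abstract.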
To evaluate the system, an experiment was conducted with Japanese learners of English in a CALL class at Kyoto University. The participants' performance on a listening comprehension test was assessed after they watched videos under three conditions: no caption, full caption, and partial and synchronized caption. Analysis of the results revealed that the students' performance with the proposed captioning method was as good as with full captions and significantly better than under the no-caption condition. Immediately after finishing the test, the learners were asked to watch more of the same video without any captions (as in a real-world situation) and to take a comprehension test in the same manner. The results of this part of the experiment showed a statistically significant improvement in the learners' performance when they had first watched the video with the proposed captioning method, as compared with the other two conditions. Although the participants were not familiar with this type of captioning, their feedback on the method was positive. The findings indicate that partial and synchronized captioning leads to the same level of comprehension as full captioning while presenting less than 30% of the transcript. They further suggest that this form of captioning can be effectively incorporated into CALL systems as an alternative method of enhancing L2 listening comprehension.

Author(s):

Maryam Sadat Mirzaei    
Graduate School of Informatics
Kyoto University
Japan

Maryam Sadat Mirzaei received her B.A. in linguistics from Teacher Training University of Tehran in 2009 and her master's degree in informatics from Kyoto University in 2012. She is currently pursuing a Ph.D. in informatics with an emphasis on developing CALL systems, for which she has received a scholarship from the Ministry of Education of Japan (MEXT). She holds a Diploma of Teacher Training Course from the Iran Language Institute, where she worked as an English instructor from 2006 to 2010. Her current project focuses on a novel captioning technique for L2 listening development. Her research interests lie in the fields of CALL, educational technology, human language technology, information and communication technology, and computer-mediated communication.

Yuya Akita    
Graduate School of Informatics
Kyoto University
Japan

Yuya Akita received his B.E., M.Sc., and Ph.D. degrees from Kyoto University in 2000, 2002, and 2005, respectively. Since 2005, he has been an assistant professor at the Academic Center for Computing and Media Studies, Kyoto University. His research interests include spontaneous speech recognition and spoken language processing. He is a member of IEICE, IPSJ, ASJ, and IEEE.

Tatsuya Kawahara    
Graduate School of Informatics
Kyoto University
Japan

Tatsuya Kawahara received his B.E. in 1987, M.E. in 1989, and Ph.D. in 1995, all in information science, from Kyoto University, Kyoto, Japan. He is currently a Professor in the Academic Center for Computing and Media Studies and an Affiliated Professor in the School of Informatics, Kyoto University. He has also been an Invited Researcher at ATR and NICT. He has published more than 250 technical papers on speech recognition, spoken language processing, and spoken dialogue systems. He has been conducting several speech-related projects in Japan, including free large-vocabulary continuous speech recognition software (http://julius.sourceforge.jp/) and the automatic transcription system for the Japanese Parliament (Diet).

 
