“Low-Resource Spoken Language Understanding”

Progress in speech processing has been facilitated by shared datasets and benchmarks. Historically these have focused on automatic speech recognition (ASR), speaker identification, or other lower-level tasks. Interest has been growing in higher-level spoken language understanding (SLU) tasks, including using end-to-end models, but there are fewer annotated datasets for such tasks, and the existing datasets tend to be relatively small. At the same time, recent work shows the possibility of pre-training generic representations and then fine-tuning for several tasks using relatively little labeled data.

In this special session, we would like to foster a discussion and invite researchers in the field of SLU working on tasks such as named entity recognition (NER), sentiment analysis, intent classification, dialogue act tagging, or others, using either audio or ASR transcripts.

We invite contributions on any topic relevant to low-resource SLU, including (but not limited to):

Training/fine-tuning approaches using self-/semi-supervised models for SLU tasks
Comparisons between pipeline and end-to-end SLU systems
Self-/semi-supervised learning approaches focusing on SLU
Multi-task/transfer/student-teacher learning focusing on SLU tasks
Theoretical or empirical studies of low-resource SLU problems

Resources

For this special session, we will provide support for several benchmark tasks using the new Spoken Language Understanding Evaluation (SLUE) benchmark suite (https://arxiv.org/abs/2111.10367). SLUE includes annotations for ASR, NER, and sentiment analysis. We also provide a toolkit with scripts for pre-processing the data and fine-tuning baseline models. Submissions are not required to use SLUE, but we offer it as a well-defined experimental setting for low-resource SLU.

SLUE Dataset:
- slue-voxceleb
- slue-voxpopuli
SLUE Toolkit: GitHub repo
SLUE Website: https://asappresearch.github.io/slue-toolkit

Note that the special session places no restriction on which datasets/benchmarks may be used. Other datasets/benchmarks we recommend are (in alphabetical order):

ASR-GLUE
ESPnet-SLU
SLURP
SUPERB (limited to SLU-related tasks)
Timers and Such

Paper submission

Papers for the Interspeech special session must be submitted following the same schedule and procedure as regular INTERSPEECH 2022 papers. Submitted papers will undergo the same review process by anonymous and independent reviewers.

Submission URL: (TBA)

Important dates

Paper submission due         : Mar. 21, 2022
Paper update due             : Mar. 28, 2022
Acceptance notification date : Jun. 13, 2022
Final paper upload           : Jun. 23, 2022
Conference date              : Sep. 18-22, 2022

Organizers

Suwon Shon - ASAPP
Felix Wu - ASAPP
Pablo Brusco - ASAPP
Kyu J. Han - ASAPP
Karen Livescu - TTI at Chicago
Ankita Pasad - TTI at Chicago
Yoav Artzi - Cornell University
Katrin Kirchhoff - Amazon
Kaisheng Yao - Amazon
Samuel R. Bowman - New York University
Zhou Yu - Columbia University

Contact

low-resource-slu "at" googlegroups.com
