Publications

You can also find my articles on my Google Scholar profile.

Journal

Multi-attribute Learning for Multi-level Emotion Recognition from Speech

Authors: Y. Gao, H. Shi, C. Chu, T. Kawahara
Submitted to: IEEE Transactions on Affective Computing (under review)

Adversarial domain generalized transformer for cross-corpus speech emotion recognition

Authors: Y. Gao, L. Wang, J. Liu, J. Dang, S. Okada
Published in: IEEE Transactions on Affective Computing, 2023

Conference

Speech Emotion Recognition with Multi-level Acoustic and Semantic Information Extraction and Interaction

Authors: Y. Gao, H. Shi, C. Chu, T. Kawahara
Published in: Proc. INTERSPEECH, 2024

Enhancing Two-stage Finetuning for Speech Emotion Recognition Using Adapters

Authors: Y. Gao, H. Shi, C. Chu, T. Kawahara
Published in: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition

Authors: H. Shi, Y. Gao, T. Kawahara
Published in: IEEE-SLT, 2024

A Study on Multimodal Fusion and Layer Adaptor in Emotion Recognition

Authors: X. Shi, Y. Gao, J. He, J. Mi, X. Li, T. Toda
Published in: APSIPA ASC, 2024

Two-stage finetuning of wav2vec 2.0 for speech emotion recognition with ASR and gender pretraining

Authors: Y. Gao, C. Chu, T. Kawahara
Published in: Proc. INTERSPEECH, 2023

Semi-supervised multimodal emotion recognition with consensus decision-making and label correction

Authors: J. Tian, D. Hu, X. Shi, J. He, X. Li, Y. Gao, et al.
Published in: Proceedings of the 1st International Workshop on Multimodal and Responsible Affective Computing, 2023

Domain-invariant feature learning for cross corpus speech emotion recognition

Authors: Y. Gao, S. Okada, L. Wang, J. Liu, J. Dang
Published in: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Domain-adversarial autoencoder with attention based feature level fusion for speech emotion recognition

Authors: Y. Gao, J. Liu, L. Wang, J. Dang
Published in: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Metric learning based feature representation with gated fusion model for speech emotion recognition

Authors: Y. Gao, J. Liu, L. Wang, J. Dang
Published in: Proc. INTERSPEECH, 2021

Temporal attention convolutional network for speech emotion recognition with latent representation

Authors: J. Liu, Z. Liu, L. Wang, Y. Gao, L. Guo, J. Dang
Published in: Proc. INTERSPEECH, 2020