#Expressive Speech

2 posts

[Paper Review] MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation

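This paper addresses the problem that existing speech-to-speech translation (S2ST) systems achieve high semantic accuracy but fail to preserve non-verbal vocalizations (NVs) such as laughter and crying, and so lose the emotional context of real conversation. Existing systems lack expressiveness for two reasons: high-quality NV data is scarce, and their model architectures struggle to handle complex states that combine multiple emotions.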

#Review #Speech-to-Speech Translation #Non-verbal Vocalizations #Mixture of Experts #AudioLLMs #Expressive Speech #Data Efficiency

April 21, 2026

[Paper Review] NVSpeech: An Integrated and Scalable Pipeline for Human-Like Speech Modeling with Paralinguistic Vocalizations

This work addresses the problem that paralinguistic vocalizations such as laughter, breathing, and interjections, which are essential to natural spoken communication, are overlooked by existing ASR and TTS systems.

#Review #Paralinguistic Vocalizations #Speech Recognition #Text-to-Speech #Speech Synthesis #Data Annotation #Mandarin Speech #Expressive Speech

August 13, 2025