πŸ“ Publications

πŸŽ™ Speech Synthesis

NeurIPS 2019
sym

FastSpeech: Fast, Robust and Controllable Text to Speech
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Project

  • FastSpeech is the first fully parallel end-to-end speech synthesis model.
  • Academic Impact: This work is included by many famous speech synthesis open-source projects, such as ESPNet . Our work are promoted by more than 20 media and forums, such as ζœΊε™¨δΉ‹εΏƒγ€InfoQ.
  • Industry Impact: FastSpeech has been deployed in Microsoft Azure TTS service and supports 49 more languages with state-of-the-art AI quality. It was also shown as a text-to-speech system acceleration example in NVIDIA GTC2020.
ICLR 2021
sym

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Project

ICLR 2024
sym

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis \ Ziyue Jiang, Jinglin Liu, Yi Ren, et al.

Project

  • This work has been deployed on many TikTok products.
  • Advandced zero-shot voice cloning model.
AAAI 2022
sym

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu, Chengxi Li, Yi Ren, Feiyang Chen, Zhou Zhao

NeurIPS 2021
sym

πŸ‘„ TalkingFace & Avatar

ICLR 2024
sym

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis, Zhenhui Ye, Tianyun Zhong, Yi Ren, et al. (Spotlight) Project | Code

πŸ“š Machine Translation

🎼 Music & Dance Generation

πŸ§‘β€πŸŽ¨ Generative Model

Others