@naoyukikandaslp

150 Followers, 48 Following, 4 Posts, Joined 20.11.2024

Latest posts by @naoyukikandaslp

I was just notified that our E2 TTS paper received the Best Paper Award at IEEE #SLT2024! Many thanks to all the remarkable collaborators who made this happen!

Paper: arxiv.org/abs/2406.18009
Demo: aka.ms/e2tts

05.12.2024 03:38 👍 5 🔁 2 💬 0 📌 0

Ah, no, TS3-Codec was trained with 10-second audio segments, while BigCodec-S was trained with 2.5-second audio segments (Section 4.5). This was a somewhat tricky (and perhaps debatable) part of the configuration, and we did our best to tune the hyperparameters within the constraints of GPU memory.

03.12.2024 06:18 👍 1 🔁 0 💬 0 📌 0

Thanks! To the extent that we checked, yes. The important point is limiting the attention window.

03.12.2024 06:04 👍 0 🔁 0 💬 1 📌 0

TS3-Codec: yet another audio codec from my former team—simple, fast, and high-quality.

Simple—just a stack of Transformer and linear layers; no convolutions.

Faster and better—superior audio reconstruction quality with fewer MACs compared to strong convolution-based baselines.

03.12.2024 03:53 👍 0 🔁 0 💬 1 📌 0
Research Scientist Intern, AI Research - Speech & Audio (PhD). Meta's mission is to build the future of human connection and the technology that makes it possible.

Our GenAI-Speech team at Meta is hiring RS interns for summer 2025 to work on speech, LLMs, dialog generation, and other exciting stuff! Check out the job posting here: www.metacareers.com/jobs/3841154...

22.11.2024 03:41 👍 9 🔁 1 💬 0 📌 0