#speechtech

Latest posts tagged with #speechtech on Bluesky

KittenTTS Nano, Small Text to Speech LLM That Runs on Standard CPUs https://softtechhub.us/2026/02/24/kittentts-nano-small-text-to/

#KittenTTS #TextToSpeech #TTSModel #SpeechSynthesis #OpenSourceAI #EdgeAI #LLM #AIModels #VoiceAI #MachineLearning #AIAudio #CPUBasedAI #AIInnovation #GenerativeAI #SpeechTech #AIForDevelopers #TechTrends #DeepLearning #LightweightAI #AIAssistants

Mistral AI launches its new generation of speech transcription models | LeMagIT. Voxtral Transcribe 2 is Mistral's new family of speech recognition models. The offering comes in a batch version and a real-time version, released as open weights under the Apa... license

🔊 Do you know Voxtral Transcribe 2 (Mistral AI)? Two models: batch (Voxtral Mini) and Realtime ⚡ <200 ms, open-weights, deployable on-prem. Supports 🇫🇷 🇬🇧 🇪🇸 🇩🇪, priced at $0.003–0.006/min. #SpeechTech https://bit.ly/4agUty2

Users reported mixed accuracy for the Mandarin tone correction tool, especially at conversational speeds and with tone transformations. This highlights the significant challenge in building models that capture the nuanced reality of spoken Mandarin. #SpeechTech 2/6

Microsoft Launches 'Mini' GPT Voice Models in Azure Foundry to Cut Latency and Cost - WinBuzzer Microsoft's new gpt-realtime-mini and gpt-4o-mini models in Azure AI Foundry offer 70% lower costs and 50% better accuracy, targeting enterprise voice agents.

winbuzzer.com/2025/12/18/m...

Microsoft Launches ‘Mini’ GPT Voice Models in Azure Foundry to Cut Latency and Cost

#AI #Microsoft #Azure #GenAI #OpenAI #VoiceAI #CloudComputing #SpeechTech #EnterpriseAI

Front page - Doktorska szkoła zimowa

The tech industry loves billion-dollar voices, but what about the other 7,000 languages?

Excited to keynote the Winter School of Innovation tmrw at Uni of Warsaw. We’ll be tracing the innovation pathway that enables #SpeechTech w/ minimal data.

szkolazimowa.szkolydoktorskie.uw.edu.pl/en/


Delighted to welcome Prof. Renauld Govain as part of the #ANR CREAM project on creole languages!
The #LLL has delivered a unique #corpus: 1,400 hours of spoken data in #kreyòl, automatically transcribed and aligned down to the character 🎧✨
@univorleans.bsky.social
#créolehaïtien #speechtech #NLP

Far-Field Speech and Voice Recognition Market by Size, Share and Forecast by 2035 Far-Field Speech and Voice Recognition Market size is expected to grow to USD 32.49 Billion at a CAGR of 18.33% by 2035, Global Far-Field Speech and Voice Recognition Industry by Component, Microphone...

www.marketresearchfuture.com/reports/far-...
#FarFieldRecognition #VoiceRecognition #SpeechTech #SmartSpeakers #AIInteraction

Realistic AI voices on PNL Reader YouTube video by Programming N' Language

🚀 I can’t resist adding these two ‘Realistic AI’ voices to PNL Reader. Hey Swedes, how does this sound to you?

👉 github.com/pnlpal/pnl-r...

#TTS #VoiceAI #PNLReader #Sverige #svenska #Sweden #SpeechTech #webdev #devlog #buildinpublic #indiedev

www.youtube.com/watch?v=7nV0...

Our pick of the week by @zhihangxie.bsky.social: "#Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in #SpeechLLMs" by Dingdong Wang, Junan Li, Mingyu Cui, et al. (#EMNLP2025)

aclanthology.org/2025.emnlp-m...

#SLU #SpeechTech

Our #PickOfTheWeek by @beomseok-lee.bsky.social: "Can Speech LLMs Think while Listening?" by Yi-Jen Shih, @rdesh26.bsky.social, Chunyang Wu, Wei Zhou, SK Bong, Yashesh Gaur, Jay Mahadeokar, Ozlem Kalinli, Mike Seltzer (2025).

#Speech #SpeechLLM #LLM #SpeechTech #AI


Marco Gaido and Roldano Cattoni presenting our SimulStream Demo at the DI Center Demo Day at FBK!

The open-source tool, which is going to be released soon, natively supports any speech-to-text #HuggingFace model! 🤖

#SpeechTech #Translation

ASR Review for African Low‑Resource Languages Highlights Gaps and Paths

A systematic review of ASR research (Jan 2020–Jul 2025) found 71 studies covering 74 datasets for 111 African languages and about 11,200 hours of speech. Read more: getnews.me/asr-review-for-african-l... #asr #africanlanguages #speechtech

Unsupervised CNN Learns Mandarin Tonal Categories Without Labels

A Wasserstein GAN separated the four Mandarin tones without labeled data, forming distinct clusters; training on male speech tokens consistently encoded tone. Read more: getnews.me/unsupervised-cnn-learns-... #speechtech #unsupervisedlearning

Comparing VibeVoice to ElevenLabs, Chatterbox, & Kokoro, the consensus is it's promising but has room to grow. The impressive multilingual capability, especially English/Mandarin, highlights its unique strength. #SpeechTech 3/6

Our pick of the week by @zhihangxie.bsky.social: "SimulMEGA: MoE Routers are Advanced Policy Makers for Simultaneous Speech Translation" by Chenyang Le, Bing Han, Jinshun Li, Songyong Chen, and Yanmin Qian (2025)

#Speech #Simultaneous #Translation #MOE #SpeechTech

Microsoft's expressive MAI-Voice-1 model is now live in Copilot Daily & Podcasts. Enterprise AI expands! #AI #Microsoft #Copilot #SpeechTech

Announcements Keynote Speaker Announcement 🔊 30.07.2025 We are delighted to announce the keynote speech that will happen at the special session! Speaker: Prof. Karen Livescu, Toyota Technological Institute at Ch...

📢 #SpeechTech & #SpeechScience researchers!
We are thrilled to announce that Prof. Karen Livescu will keynote our Special Session on Interpretable Audio and Speech Models at #Interspeech2025:
"What can interpretability do for us (and what can it not)?"
🗓️ Aug 18, 11:00
@interspeech.bsky.social


🔥 Is your real-time SimulST system REAL?

Our TACL paper analyzes 110 works and reveals:
🚫 Overreliance on short-form speech
🌀 Terminology chaos
📉 Real-world deployment gaps
We bring order: new taxonomy, trends & recommendations!

📍#ACL2025 Poster: Monday 11-12:30, Hall 4/5

#Speech #SpeechTech


Mistral Challenges OpenAI and Google with New Voxtral Open-Source Voice AI Model

#AI #MistralAI #Voxtral #OpenSource #VoiceAI #GenerativeAI #SpeechTech

winbuzzer.com/2025/07/15/m...


New paper in Interspeech 2025 🚨
@interspeech.bsky.social

A Robust Model for Arabic Dialect Identification using Voice Conversion

Paper 📝 arxiv.org/pdf/2505.24713
Demo 🎙️ https://shorturl.at/rrMm6

#Arabic #SpeechTech #NLProc #AI #Speech #ArabicDialects #Interspeech2025 #ArabicNLP

Author Guidelines - ICMI 2025 :: 27th ACM International Conference on Multimodal Interaction

🚨 Call for Demos & Exhibits at #ICMI2025 🇦🇺
Showcase your innovative multimodal systems & interfaces in Canberra, Oct 13–17!

📝 Submit by: July 7

🌐 Details: icmi.acm.org/2025/call-fo...

#HCI #Multimodal #ACM #AI #AffectiveComputing #SpeechTech

IBM Pushes AI Boundaries With Granite 3.3 AI Models IBM releases Granite 3.3 AI models with powerful speech-to-text and translation abilities.

IBM launches Granite 3.3, a fresh open-source AI series focused on speech recognition and translation.

Real-world use, solid accuracy, and open collaboration.
Big step for AI transparency.

🔗 itmatterss.in/ibm-pushes-a...

#IBM #GraniteAI #SpeechTech #OpenSourceAI

Our pick of the week by @mgaido91.bsky.social: "OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis" by Luo et al. (2025)

#SpeechProcessing #LLM #SFM #NLProc #speechtech #audio


OpenAI’s new audio models boost STT accuracy & TTS prosody—low-latency transcription & emotive voice gen for real-time apps. API-ready! openai.com/index/introd... #AI #MachineLearning #NLP #SpeechTech


Just put our #VoiceAI through its paces with the fastest-talking TV show we could find - Gilmore Girls.

Here's how we did (outperforming other #speechtech platforms by 33% 😏)

What other notoriously chatty content should we throw at it next? Drop your Speechmatics #AI versus suggestions below!

Kungliga biblioteket (the National Library of Sweden) launches AI that transcribes speech. The end is near for funny and embarrassing mistranscriptions courtesy of AI. The KB-Whisper model is described as a milestone for Swedish speech recognition.

Yay! KB-Whisper launched today - a freely available speech-to-text service able to transcribe many varieties of Swedish. Available now at Hugging Face! #NLP #speechtech www.dn.se/kultur/kungl...

Our pick of the week by @sarapapi.bsky.social: "Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction" by StepFun (2025).

#speech #speechtech #audio

Associate/Assistant Professor: Language and Speech Technology | Radboud University Do you want to work as an Associate/Assistant Professor: Language and Speech Technology at the Faculty of Arts? Check our vacancy!

Come work with us! Our department is hiring an associate/assistant prof in language and speech technology www.ru.nl/en/working-a...
#interspeech #speech #SpeechTech #SpeechScience

Innovating Speech Tech for Smaller Languages New PhD position combining AI and bilingual speech research in the Netherlands

🎓 Open PhD Position: Speech Tech for Minority Languages

Working on Frisian-Dutch bilingual speech + AI at Fryske Akademy/Campus Fryslân. Fully funded, 4 years, starts Sept 2025.

More info ⬇️
open.substack.com/pub/voicetec...

#SpeechTech #PhD #LowResourceLanguages #AcademicJobs #AcademicSky
