#speechToText

Latest posts tagged with #speechToText on Bluesky

Overall architecture of AlphaFlowTSE. Given a mixture waveform y and an enrollment utterance e, we compute complex STFT features and form the mixture feature Y and enrollment feature E (real/imaginary concatenation). During training, the backbone takes the current state feature z_t; during inference we initialize z_0 = Y. The enrollment feature is concatenated as a temporal prefix, yielding [E ∥ z_t] (or [E ∥ z_0] at inference), which is fed to the UDiT backbone. The backbone is conditioned via AdaLN on the absolute time t and the interval length Δ = r − t (with r = 1 at inference), and predicts the mean velocity for finite-interval transport, denoted u_θ(t, r, [E ∥ z_t]). One-step inference (NFE = 1) produces an estimated complex STFT Ŝ = (Ŝ_Re, Ŝ_Im), which is converted to the target waveform ŝ by iSTFT. The dashed module is an optional mixing-ratio predictor used only in the background-to-target ablation to predict the start coordinate

Imagine a noisy group call where 3 people talk at once.

This paper builds a model that can focus on a single speaker (using a short voice sample) and extract that voice.

This cleanup results in better, faster audio transcription.

Summary and full paper 👇

#AudioML #SpeechToText
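
The one-step inference described in the AlphaFlowTSE caption above can be sketched in a few lines. This is a minimal illustration, not the authors' code. It assumes the finite-interval update z_r = z_t + (r − t)·u_θ(t, r, [E ∥ z_t]), assumes the backbone returns one velocity frame per input frame so that the enrollment prefix can be dropped, and uses a hypothetical `toy_backbone` in place of the UDiT network. Features are plain lists of frames (real parts followed by imaginary parts).

```python
from typing import Callable, List

Frame = List[float]    # one STFT frame: real parts, then imaginary parts
Feature = List[Frame]  # time-major sequence of frames


def one_step_extract(
    Y: Feature,
    E: Feature,
    mean_velocity: Callable[[float, float, Feature], Feature],
) -> Feature:
    """One-step (NFE = 1) transport from z_0 = Y to the estimate at r = 1."""
    t, r = 0.0, 1.0
    z0 = Y                             # inference starts from the mixture feature
    prefixed = E + z0                  # [E || z_0]: enrollment frames as a temporal prefix
    u = mean_velocity(t, r, prefixed)  # single backbone call
    u_mix = u[len(E):]                 # assumption: drop the velocities at prefix positions
    delta = r - t                      # interval length
    return [
        [z + delta * v for z, v in zip(z_frame, u_frame)]
        for z_frame, u_frame in zip(z0, u_mix)
    ]


def toy_backbone(t: float, r: float, x: Feature) -> Feature:
    """Hypothetical stand-in for the backbone: velocity that moves each frame to zero."""
    return [[-v for v in frame] for frame in x]


Y = [[1.0, 2.0], [3.0, 4.0]]  # two mixture frames
E = [[0.5, 0.5]]              # one enrollment frame
S_hat = one_step_extract(Y, E, toy_backbone)
```

In a real system the returned feature would be split back into real and imaginary parts and passed to an inverse STFT to obtain the waveform; that step is omitted here.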

Screenshot from Google Docs, with spelling suggestion of the word deaf in place of the word death

Yes, I mean 'deaf', I always mean 'deaf', I am not talking about 'death' individuals or the 'death' community... #Google #VoiceTyping #SpeechToText #DeafNotDeath

Would anyone know of any good speech-to-text software/app that costs little to no money? The current ones I'm using are so useless that I end up typing to correct mistakes, which, on a bad day, means I can't do anything anyway.
#disability #accessibilitytools #speechtotext #dystonia

🎤 Voice dictation + AI

As with other tasks, AI makes this practice more effective by simplifying the process. ✍️

Murmure transcribes your voice to text, 100% offline:
👉 www.it-connect.fr/murmure-dict...

#Murmure #IA #OpenSource #SpeechToText

Empty lecture hall with wooden seats, facing a chalkboard, featuring the text “The Top 75 Community College Titles” and a “CHOICE” logo.

This week on #LTIBlog
Our contributors from Ontario Council of University Libraries (OCUL) evaluate AI speech-to-text tools for access, discovery, and preservation. ow.ly/rLLv50Yl72V
#AI #Librarytech #speechtotext @ocul-libraries.bsky.social

Ideal for live subtitling and voice assistants! 🛠️⚡ #SpeechToText #TechNews #Mistral

I even set up satellite sites with just a simple landing page to drive traffic:

speechtotext.online
texttospeech.site

All forwarding traffic to voicetotextonline.com

#buildinpublic #speechtotext #voicetotext #texttospeech #STT #TTS #Productivity #Dictation #google

I developed VoiceToTextOnline dot com over a period of 3 months, and the site now sees 200 users per day.
It ranks No. 2 on Bing, but it's a different story on Google.

#buildinpublic #speechtotext #voicetotext #texttospeech #STT #TTS #Productivity #Dictation #google

Voice to Text | Speech to Text Online Free | Transcribe Audio in 55+ Languages Convert speech to text instantly with our free online transcription tool. Real-time voice recognition in 55+ languages including Hindi, Spanish, Arabic, Russian, Turkish, Chinese. No signup required, ...

Free voice-to-text that actually works.

→ 55+ languages (Hindi, Spanish, Arabic, and more)
→ No signup required
→ Works right in your browser
→ Real-time transcription

Try it: voicetotextonline.com

#buildinpublic #speechtotext #voicetotext #texttospeech #STT #TTS #Productivity #Dictation

🎙️ Transcribe a full 60-minute meeting… in one go! 😱
VibeVoice-ASR (from Microsoft) is an open-source model at just 9B… and it does things the expensive models don't!

Why is VibeVoice a treasure for developers and content creators? 👇

#VibeVoice #ASR #SpeechToText #Diarization #OpenSourceAI #MicrosoftAI #حسام_الدين_حسن #خبير_اونلاين

Most speech-to-text tools are privacy nightmares. Handy changes everything by keeping your voice on your computer.
It's open-source, free, and works with a simple keyboard shortcut. No cloud, no subscription, no nonsense.
Take your privacy back at handy.computer 🎙️
#SpeechToText #OpenSource

GitHub - chaosslabs/whisper-server-apple-silicon: small fast api server that serves OpenAI compatible API for audio transcriptions

github.com/chaosslab... is a #FastAPI wrapper that serves an #OpenAI compatible API to run #SpeechToText transcriptions on a machine in your network, so you can use #Whisper to turn your audio into text :D

GitHub - silvabyte/Audetic Contribute to silvabyte/Audetic development by creating an account on GitHub.

https://github.com/silvabyte/Audetic

Looks interesting.

#Opensource #SpeechToText #VoiceCommands #Dictate

Whispr Flow together with Le Chat is a godsend. It works so nicely, and dictation with Flow is really great compared with Apple's own built-in dictation feature.

#ai #speechtotext #dictation #whisprflow #lechat

Letterly: the app for dictating and writing with AI. Turn a voice memo into perfect text? That is Letterly's promise. A look at the features, pricing, and alternatives of this up-and-coming SaaS.

Letterly: the app for dictating and writing with AI #IAGénérative #Productivité #dictéeIA #Letterly #productivitémobile #speechtotext #transcriptionintelligente

Overview: Hacker News explored "Handy," a free, open-source speech-to-text app. Users praise its speed, accuracy, and local processing, comparing it to paid options. The discussion also covered its potential integration with coding and LLMs. #SpeechToText 1/6

Academic success starts here 📚
Transcribe lectures and interviews easily with Transgate.

transgate.ai

#Transgate #SpeechToText #Students #Education #Translation #audiototext #Transcription #Transcribe

#TRANSKRIPTION #AUDIO #SPEECHTOTEXT #KI #EFFIZIENZ #CONTENTCREATION #ACCESSIBILITY #PODCASTING #DIGITALISIERUNG #UNTERTITEL #TRANSCRIPTION #AUDIO #SPEECHTOTEXT #AI #EFFICIENCY #CONTENTCREATION #ACCESSIBILITY #PODCASTING #DIGITALIZATION #SUBTITLES

Scribe v2: fast AI transcription (Speech-to-Text). Scribe v2 turns transcription into a production building block: captions, subtitles, and reusable content. Designed for creators and for content and product teams, it emphasizes readability, fine-grained synchronization, and simple integration into a modern workflow.

✦ Scribe v2: transcription that becomes a product building block. What if your subtitles finally became an asset, not a chore?

#KingLand #ElevenLabs #ScribeV2 #SpeechToText #Transcription #SousTitres #Audio #Podcast #IA #Productivité #Workflows

📌 Read the impact sheet:…

Stop Using Your Keyboard and Start Using This Simple, Free Speech-to-Text App It’s called Handy, and it uses AI models to accurately convert your speaking voice into text—all for free.

www.wired.com/story/handy-...

Handy is a free speech-to-text app that uses AI to make it simple to dictate documents on your system. Hold Ctrl+Space while talking, and what you say is transcribed into the currently active text box.

#Software #AI #Dictation #SpeechToText #Productivity

Top 58 AI Dictation Statistics, Data & Trends in 2026

Discover the top 58 AI dictation statistics, data, and trends in 2026 shaping accuracy, adoption, productivity, and voice-first workflows worldwide.

blog.9cv9.com/top-58-ai-di...

#AIDictation, #SpeechToText, #VoiceAI, #AIDictationStatistics, #AIDictationTrends, #SpeechRecognitionAI,

Top 10 Best AI Tools For Dictation in 2026

Discover the top 10 AI dictation tools in 2026, comparing accuracy, features, pricing, enterprise use cases, and future voice technology trends.

blog.9cv9.com/top-10-best-...

#AItools2026, #DictationSoftware, #SpeechToText, #VoiceRecognition2026, #AItranscription,

Turn any audio or video into another language with AI using Transgate.
Fast and easy translation.

transgate.ai

#Transgate #SpeechToText #AITranslation #audiototext #translatevideo #videoTranslation
