This AI tool just killed... | Rez Karim OKX Feed

This AI tool just killed customer call jobs overnight.
Cartesia's Sonic 3:
- handles 1,000+ simultaneous calls
- speaks 42 languages
- works 24/7, never stops
- costs 95% less than human agents
The ROI is insane. How it works (+ free credits) 👇

Sonic 3 doesn’t sound like “IVR menu hell.” It talks like a real person: natural pacing, laughter, breathing, pauses, and even tone shifts mid-sentence. It can mirror human energy in a conversation. That is what lets you drop it into support, concierge, or sales without people hanging up.

You get surgical control. This is the first TTS model where you can tune speed, volume, pacing, and emphasis, down to a single word, in real time and in production. You can tell it to “Repeat that slower” for legal terms or “Speed this up” to skip boilerplate nobody wants to hear. Add emotion tags within the text to get exactly the output you want.

One voice, 42 languages. Sonic can carry the same personality into a different language with no weird accent drift. That includes 9 major Indian languages. So you can have one support agent that handles global customers across time zones, in their native accent, 24/7. Companies are already running millions of calls per month on top of this.

This thing is real time. We’re talking ~190ms latency end to end. Your brain can’t even detect the delay. Instead of Transformers (like rereading an entire book and comparing every word), Sonic uses State Space Models, which “read page by page” like humans do. That’s why it responds 3-5x faster than OpenAI and more accurately than ElevenLabs, while staying stable on long calls.
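To make the “page by page” idea concrete, here is a toy sketch in Python. It is not Sonic's architecture and not a Transformer. It is a one-number linear recurrence, the simplest relative of a state space model, computed two ways: once by keeping a running state, and once by replaying the whole history for every output. The parameters `A`, `B`, `C` and both function names are made up for illustration.

```python
from typing import List

# Hypothetical scalar parameters, chosen only for illustration.
A, B, C = 0.9, 0.5, 2.0


def ssm_stream(inputs: List[float]) -> List[float]:
    """Recurrent form: carry one running state and update it once per input."""
    h = 0.0
    outputs = []
    for x in inputs:
        h = A * h + B * x  # constant work per step, however long the history is
        outputs.append(C * h)
    return outputs


def replay_everything(inputs: List[float]) -> List[float]:
    """Unrolled form of the same recurrence: every output revisits all past inputs."""
    outputs = []
    for t in range(len(inputs)):
        total = 0.0
        for k in range(t + 1):  # work grows with the position t
            total += (A ** (t - k)) * B * inputs[k]
        outputs.append(C * total)
    return outputs


if __name__ == "__main__":
    xs = [1.0, 0.0, 2.0, -1.0]
    print(ssm_stream(xs))
    print(replay_everything(xs))
```

Both functions produce the same numbers (up to floating-point rounding). The difference is the work per output: the running-state version does one update per step, while the replay version touches every earlier input again, so its cost keeps growing as the call gets longer. That growth pattern is the intuition behind the analogy above, under the simplifying assumptions of this toy.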

Karan Goel

We've raised $100M from Kleiner Perkins, Index Ventures, Lightspeed, and NVIDIA. Today we're introducing Sonic-3 - the state-of-the-art model for realtime conversation.
What makes Sonic-3 great:
- Breakthrough naturalness - laughter and full emotional range
- Lightning fast - 90ms model latency, 190ms end-to-end (fastest on market)
- Supports 42 languages
The difference: We build on State Space Models (SSMs) instead of Transformers.
Transformers (what everyone else uses) are like rewatching the entire conversation from the start before saying each new word. Every word requires reviewing everything.
SSMs (what Sonic-3 uses) are like humans, remembering the topic and vibe of the conversation. Enough context to speak naturally without replaying everything.
My co-founder, Albert, and I pioneered the SSM paradigm at Stanford AI Lab (S4, Mamba), and it is now being adopted industry-wide.
Thousands of businesses like ServiceNow, Cresta, and Decagon power millions of conversations monthly with Sonic.
Try for free or book a demo here:
If you're qualified and we can't make your voice AI better than what you're using now, I'll donate $5K to your chosen charity.
As part of this launch, we cooked something super cool for you 👇🏻

Cloning. You can clone a voice in about 3 seconds of audio, fast and cheap. Not hours of studio-quality samples. Not expensive per custom voice. That means:
• Your CEO can “personally” talk to every lead
• Your in-game NPCs all get unique voices
• Your clinic’s assistant sounds like the same warm receptionist every time
Here I cloned SpongeBob's voice with just 3-5 seconds of audio instantly.

Cartesia is built for founders and builders. You can use the API to integrate Sonic 3 into your SaaS or your N8N workflows. You can use their MCP to plug it into your AI workflow. See how simple it is to build an agent that transcribes your notes in Notion with Sonic 3, using Vapi, N8N, and a Notion connection.

This is what this means for businesses:
- Hotel concierge that never sleeps
- Healthcare assistant that can schedule you and explain billing without getting impatient
- A support agent that handles 1000 calls at once, remembers policy, and still sounds empathetic
- AI characters in games that improvise, banter, react
Cartesia raised $100M to build exactly this and they already power companies like ServiceNow, Cresta, and Decagon.

🚨 Giveaway alert
I’m also giving away:
- a step-by-step guide to cloning your voice + spinning up your own AI voice agent
- $100 in Cartesia credits
Reply “VOICE” and I’ll send it to you. (Must be following me so I can DM)
