

spyder on March 2, 2025 | on: Crossing the uncanny valley of conversational voic...

This seems similar to the Moshi model from six months ago, though more refined. Moshi is a little crazy, but it was still an impressive demo of how low-latency responses, continuous listening, and interruptions can improve voice chat and make it feel more real, or more uncanny. (Sometimes its "latency" is even too low, because it interrupts you before you finish.) https://www.youtube.com/watch?v=-XoEQ6oqlbE

They even released some models on Hugging Face: https://huggingface.co/collections/kyutai/moshi-v01-release-...

lelag on March 2, 2025

Saying this is similar to Moshi is like saying GPT-2 is similar to GPT-4. You can't have any sort of conversation longer than 30 seconds with Moshi before it goes bananas. You can talk to this model for an hour and it remains completely coherent.
