
This seems similar to the Moshi model from 6 months ago, though this one is more refined. Moshi is a little crazy, but it was still an impressive demo of how low-latency responses, continuous listening, and interruptions can improve voice chat and make it feel more real, or more uncanny. (Sometimes its "latency" is even too low, because it interrupts you before you finish.) https://www.youtube.com/watch?v=-XoEQ6oqlbE

They even released some models on Hugging Face:

https://huggingface.co/collections/kyutai/moshi-v01-release-...



Saying this is similar to Moshi is like saying GPT-2 is similar to GPT-4. You can't have any sort of conversation longer than 30s with Moshi before it goes bananas. You can talk to this model for an hour and it remains completely coherent.



