Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It’s nonzero, because they carry state while performing inference, and in the surrounding processes like chain-of-thought and mixture-of-experts.


I think they have working memory but not short term memory. I suppose that's pedantic or anthropomorphizing but it feels like I felt tbh




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: