
Claude has quietly been better on everything but pricing for a while. It just got buried because they announced on “AI Tuesday” (iirc the GPT4 and Bing announcement day).

The ChatGPT equivalent runs at 3x the speed and landed somewhere between ChatGPT and GPT4 on my replication of the TriviaQA benchmark.

A couple of tweets with data and examples. Note they’re from 8 weeks ago. I know Claude got a version bump, GPT3.5/4 accessible via API seem the same.

[1] brief and graphical summary of speed and TriviaQA https://twitter.com/jpohhhh/status/1638362982131351552?s=46&...

[2] ad hoc side by sides https://twitter.com/jpohhhh/status/1637316127314305024?s=46&...
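
For anyone who wants to try a TriviaQA-style comparison like the one mentioned above, here is a minimal sketch of exact-match scoring against a list of accepted answers. This is not the harness behind the tweets; the function names (`normalize_answer`, `exact_match`, `accuracy`) are made up for illustration, and the normalization (lowercase, strip punctuation, drop articles) is just the common convention for this kind of QA scoring. Chat models tend to answer in full sentences, so strict exact match will likely undercount correct answers unless you prompt for short answers.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, strip ASCII punctuation, drop articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_aliases: list[str]) -> bool:
    """True if the normalized prediction equals any normalized accepted answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(alias) for alias in gold_aliases)


def accuracy(predictions: list[str], gold: list[list[str]]) -> float:
    """Fraction of predictions that exactly match one of their accepted answers."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, g) for p, g in zip(predictions, gold))
    return hits / len(predictions)


if __name__ == "__main__":
    # Toy data, invented for the example (not from any benchmark run).
    preds = ["The Eiffel Tower!", "London"]
    answers = [["Eiffel Tower", "Tour Eiffel"], ["Berlin"]]
    print(accuracy(preds, answers))
```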



> I know Claude got a version bump, GPT3.5/4 accessible via API seem the same.

GPT3.5 just got an update a few days ago that resulted in a pretty good improvement in its creativity. I saved some sample outputs from the previous March model, and for the same prompt the difference is quite dramatic. Prose is much less formulaic overall.


Meanwhile GPT 4 got lobotomised around the same time.

It can't even solve simplified versions of problems it had zero issues with just a week ago.


Wait, really? I've only been using GPT4 and it seemed like it's been getting incrementally better. Do you have any test cases?


Literally all of them! Comprehension, translation, problem solving, etc.

It’s worse at everything.


You know, I'm noticing it seems to work better on off-hours - I get better quality responses late at night. I wonder if they're using a lighter model when there's more demand on the system.


Nah


Thank you, every little comment I get from fellow boots on the ground is so valuable, lotta noise these days.

Random Q: I haven’t used the ChatGPT front end much in the past month or two, but I used it a week back and it seemed blazingly fast compared to my integration. Do you have a sense of whether it got faster too?


Yeah, I've noticed the front end for 3.5 responds almost instantly to complex questions. It'll pop out an entire page of code in a couple of seconds. I think API responses may be a bit faster than before; some of my queries that took 30s before now take around 20s, but they are obviously prioritizing their own site.
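
If you want numbers rather than impressions for the "30s before, 20s now" kind of comparison, a minimal sketch is to time the same prompts repeatedly and keep the median. `send` here is a hypothetical placeholder for whatever function calls your API integration; nothing below is tied to a specific client library.

```python
import statistics
import time
from typing import Callable


def time_calls(send: Callable[[str], str], prompts: list[str], repeats: int = 3) -> dict:
    """Time send(prompt) for each prompt `repeats` times; report wall-clock stats."""
    if not prompts or repeats < 1:
        raise ValueError("need at least one prompt and one repeat")
    durations = []
    for prompt in prompts:
        for _ in range(repeats):
            start = time.perf_counter()
            send(prompt)  # hypothetical: replace with your own API call
            durations.append(time.perf_counter() - start)
    return {
        "n": len(durations),
        "median_s": statistics.median(durations),
        "max_s": max(durations),
    }


if __name__ == "__main__":
    # Stand-in for a real request so the sketch runs on its own.
    def fake_send(prompt: str) -> str:
        time.sleep(0.01)
        return prompt.upper()

    print(time_calls(fake_send, ["hello", "write a sort function"]))
```

Running it at different times of day, with the same prompts, would also be one way to check the off-hours theory mentioned upthread.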


Is this update made visible somewhere? The language models offered on my Playground are still the ones from March, same with ChatGPT.


On chat.openai.com, below the text box, there is a link to the ChatGPT release notes. The current link text is "ChatGPT May 3 Version". The link leads to https://help.openai.com/en/articles/6825453-chatgpt-release-...


How is Claude's code generation?


Note: all impressions are based on Claude 1.2. I got an email from Poe in the last week saying it was version-bumped to 1.3 with a focus on coding improvements.

Impressions:

Bad enough compared to GPT-4 that I default to GPT-4. I think if I had API access I’d use it instead; right now it requires more coaxing, and going through Poe.

I did find “long-term” chats went better. I was really impressed with how it held up when I was asking it about a nasty problem that was hard to even communicate verbally. It was wrong at first, but as we went back and forth it felt like a real conversation.

GPT4 seems to circle a lower optimum. My academic guess is that it’s what Anthropic calls “sycophancy” in its papers. tl;dr: GPT really, really wants to produce more of what’s already in the context, so the longer a conversation with initial errors goes, the harder it is to talk it out of those errors.


I have access to Claude. It's not bad, but decently behind GPT4 for code.


And is code generation ability equivalent to code understanding and search ability?



