If you were given the task "mimic the sound of Japanese as best you can", at first, you would just learn the basic phonology and just try and mimic the general sounds of the language. You would get good at that, and eventually you would be perfectly mimicking the pronunciation of Japanese phonemes. At some point, it would become worth your time to actually learn how different sounds are put together, e.g. Japanese can't have an S sound followed by a T sound. After that, you may start to learn how the different syllables interact. And at some point, you would learn how words are put together to form sentences.
At a certain point, there really is no difference between "being really good at mimicking the sound of Japanese" and actually knowing it, because in order to mimic it to a high level you will have to actually know it. "Mimic the sound of Japanese" is the equivalent task here to "predict the next token in this text".
Where that argument falls apart is when you’re talking about something that’s still bad at mimicry. ChatGPT still completely fails the Turing test; it’s not even vaguely close to that threshold.
For now it’s slightly better than the fake crowd noise you hear in movies, but still frequently just gibberish.
ChatGPT passes the Turing test and is not even the first bot to have done so. No idea why you're so keen on downplaying it but soon there won't be anyone left that you will be able to convince that this is no big deal. You're fighting an uphill battle.
The Turing test allows unlimited topics, time, etc. There are competitions with rules that heavily favor bots, where bots have “won” in the past, but those competitions aren’t actually performing the test.
ChatGPT seems amazing at first, but that’s because its flaws are so novel. People just aren’t used to looking for them, so they can overlook how quickly it completely forgets about previous parts of a conversation, etc.
And when it passes your current interpretation of the Turing test you will find another excuse why it still "completely fails" and is "not even vaguely close". That's called moving the goalposts.
Passing the Turing test does not imply general intelligence but saying that what it outputs is "just gibberish" is obviously just another hyperbole from you.
[edit: You edited your comment, but it used to say it fails "the test as described" and that the competitions are invalid since they do not follow the rules. I presume you looked up the actual test afterwards and realized how wild your "completely fails" comments were - and did a 180 and rewrote the comment. Keeping my original response below.]
> the test as described
I hope you realize that the original Turing test is where you have a man and a woman trying to convince an interrogator that they are of the opposite sex. The test is to replace one with a machine and see if the interrogator would decide the wrong sex as often as when there's an actual human playing.
So if we're talking about the actual test, as described, the most basic bots have passed it a long time ago. If we're talking about the standard interpretation (convince the interrogator that the bot is human) it's a derived version that has no intrinsic rules and was not described by Turing.
You can read the original paper; it’s clear that in his version the goal for the computer is to convince someone communicating with it that it’s human, even though the form is to convince someone they are male. “The game may perhaps be criticised on the ground that the odds are weighted too heavily against the machine. If the man were to try and pretend to be the machine he would clearly make a very poor showing. He would be given away at once by slowness and inaccuracy in arithmetic.” https://redirect.cs.umbc.edu/courses/471/papers/turing.pdf
It’s also clear he’s referring to the spirit of the game not the specific details: “It might be urged that when playing the "imitation game" the best strategy for the machine may possibly be something other than imitation of the behaviour of a man. This may be, but I think it is unlikely that there is any great effect of this kind. In any case there is no intention to investigate here the theory of the game, and it will be assumed that the best strategy is to try to provide answers that would naturally be given by a man.”
He does give a benchmark of 70% accuracy after five minutes of questioning, but that wasn’t a success criterion, just a benchmark.
I was just going into excessive detail. My point was that those limitations mean the competitions stop following the spirit of the original test.
I don’t specifically object to changing the judge’s role from interrogation to observation of a conversation. But it should be clear his version doesn’t have all the loopholes the modern interpretation does.