'Facts' aren't as black and white as people think.
"What does Charmander evolve into?"
"What does the spell 'avada kedavra' do?"
"What is the Sindarin word for 'friend'?"
"What are the names of Santa's reindeer?"
"Where did Robin Hood live?"
"Where did Achilles die?"
These are all 'factual questions' you can find answers to from reputable sources like Wikipedia. Google displays 'fact boxes' for several of them. Wolfram Alpha provides answers for three of them. The answers to some of these questions are part of what passes for 'general knowledge' in some societies.
It's no surprise that LLMs trained on human writings produce text that presents untrue things as facts. Humans do that all the time.
There are well attested reputable sources that will tell you Abraham Lincoln was a vampire hunter, others that say he was a Lego Master Builder, and others still will tell you that among his notable quotes is "Party on dudes - be excellent to each other". So what's an LLM to do when it's trying to extend a paragraph of information about Abraham Lincoln?
When an LLM is suggesting what might come next in a piece of text... it doesn't know if it's supposed to guess a probable word from a Wikipedia article, an Onion article, a Project Gutenberg manuscript, or an Archive Of Our Own fanfic. So you get a bit of all that.
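A toy sketch of that "bit of all that" effect (nothing like a real LLM, just a made-up bigram counter over invented sentences): the model only sees aggregate word statistics, with no record of which source a continuation came from.

```python
from collections import Counter, defaultdict

# Hypothetical mixed "training corpus": encyclopedia-style and fiction-style
# sentences thrown together, as in the Lincoln example above.
corpus = [
    "lincoln was the sixteenth president",   # encyclopedia-style
    "lincoln was a vampire hunter",          # fiction-style
    "lincoln was a lego master builder",     # fiction-style
]

# Count bigram continuations with no notion of source or truth.
counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

# Continuations of "was" blend factual and fictional sources freely:
print(counts["was"].most_common())  # [('a', 2), ('the', 1)]
```

Here the fictional continuation is actually the statistically *favoured* one, which is the whole problem.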
Tangential: I was going to suggest "protocoli(s|z)ed" instead of protocollar, but I Googled "protocollar statements" just to check and found 2 things. First, this page was the top result! Second, "protocolar" (one ell) and "protocolary" are apparently real words. New to me, thanks.
You had me check a few sources, and I found several expressions in use for the concept: simply "protocols" (in that sense), "protocol statements", the "protocol-sentence debate", "protocollar propositions"...
The use of '-ize' (a Graecism) is indicated by the OED as International English, as opposed to British, American, etc. In fact, some call International English "British spelling with -ize" - it is not exactly that, but close. One exception is 'analyse', but that is because linguists compromised on the "difficult" original 'analysize'.
I think 'protocollar' is, in this context, a misspelling of 'protocolar' - hence its high placement for "protocollar statements". If I google "protocolar statements", this is the highest result (for me).
> When an LLM is suggesting what might come next in a piece of text... it doesn't know if it's supposed to guess a probable word from a Wikipedia article, an Onion article, a Project Gutenberg manuscript, or an Archive Of Our Own fanfic.
The obvious start seems to be having separate fiction and nonfiction LLMs and not training the nonfiction ones on Archive Of Our Own. People also end up confused about the truth when nobody points out the difference between fiction and nonfiction.
But there's a fundamental issue here. The real strength of LLMs is not just information retrieval, but being able to dynamically recombine that information. Of course that's also their weakness. The reason GPT will regularly produce code with nonexistent API calls is not because it's been trained on 'fictional APIs', but because it's combining various real calls to make new fictional ones.
The obvious answer then is to tell it to make sure that what it's finally outputting really is part of the "real" API, but there's clearly some technical hitch there: OpenAI probably spent quite a lot of energy trying to solve code hallucinations and was ultimately unable to do so. I'd guess that the more you restrict its recombination ability, the more you end up with it inappropriately (and incorrectly) regurgitating large chunks of its training input verbatim. Basically, it becomes more like a keyword-hunting search engine and less like a generative LLM.
Right. A lot of the magic of LLMs probably comes from the broader appreciation of language and cultural reference that they get from being trained on a diverse corpus, rather than just a bunch of dictionaries and reference books.
And anyway - answers to all my ‘fictional facts’ questions above can be sourced from Wikipedia - there’s tons of made up stuff on there.
Hopefully such statements are sufficiently rare that they don't get reinforced, I guess. I don't know. A very real problem occurs with people too when fictional things are repeated often enough without direct mention of their fictional nature.
Which of these is more true: a newspaper article about a battle in the War of 1812, or the Star-Spangled Banner, which was written by someone witnessing a battle in the War of 1812?
Hint: how many stadiums are filled with people standing up to recite a newspaper article about a battle in the War of 1812?
> it doesn't know if it's supposed to guess a probable word from a Wikipedia article, an Onion article, a Project Gutenberg manuscript, or an Archive Of Our Own fanfic. So you get a bit of all that.
This is true of base LLM models that are just trained on missing-word prediction on the training corpus, but one of the main points of RLHF[1] is to tune this model to make these kinds of inferences the way a human would expect. For example, if you asked an untuned model to write a poem in the style of ... etc., a valid internet response might be "hmm no thanks, you go first"; you need to steer the model away from replying like this.
I'm not saying it's perfect, but it's wrong to say e.g. GPT-4 has had no information about the difference between a good and bad response and is just generating internet-like text at random, the big players have made progress on this already.
Reinforcement learning trains them that question and answer sessions contain answers which statistically correlate with factual statements in their broader learning corpus.
When formulating answers, this leads them to produce ones that reflect the factual information on which they were trained.
My point is that the source data contains a far muddier range of information than just unarguable facts.
We largely want LLM based Q&A bots to answer questions about fictional or mythical characters in their own terms. As I said, those questions above all have reasonably ‘correct’ answers.
That LLMs do as well as they do from all that is remarkable. But it also seems to require us to assume that LLMs are capable of a remarkable degree of cultural nuance, media literacy and contextual awareness, in order to figure out the different authorship, salience, trustworthiness, agenda, biases, and assumptions of all the gigareams of text they’ve ingested.
“ When an LLM is suggesting what might come next in a piece of text... it doesn't know if it's supposed to guess a probable word from a Wikipedia article, an Onion article, a Project Gutenberg manuscript, or an Archive Of Our Own fanfic”
LLMs are very good at inferring context, so that only really applies if you’re using an un-RLHFed base model with no context given.
Here, "supposed to guess" means "having the goal of..."
So no LLM knows what it's supposed to do. If you prefer, you could say it only ever has one goal: to generate a sequence of tokens which are jointly the most probable to occur along with the prompt tokens, given such probabilities in a historical corpus.
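To make that "one goal" concrete, here's a toy Markov-style sketch with invented conditional probabilities (real models condition on the whole context, not just the previous token): the only thing the system "prefers" is the higher-probability sequence, truth notwithstanding.

```python
import math

# Hypothetical toy conditional probabilities P(next | prev), standing in for
# what a model would estimate from a historical corpus.
cond_p = {
    ("the", "cat"): 0.6, ("the", "dog"): 0.4,
    ("cat", "sat"): 0.9, ("cat", "ran"): 0.1,
}

def log_joint(tokens):
    # log P(t1..tn) = sum of log P(t_i | t_{i-1}) in this toy Markov model
    return sum(math.log(cond_p[(a, b)]) for a, b in zip(tokens, tokens[1:]))

# The "goal" reduces to: emit whichever sequence scores higher.
assert log_joint(["the", "cat", "sat"]) > log_joint(["the", "cat", "ran"])
```

Nothing in that objective references truth, intent, or context beyond the corpus statistics, which is the point being made here.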
This imitates knowledge, goal-directedness, "inferring context", etc. without doing any of those things. Consider what the aim of knowing, goal-directedness, inferring, etc. is --- it is never "consistency with a historical text corpus".
For knowing: that beliefs correspond to the way the world is; for goal-directedness: that one's acts and desires can realise changes; for 'inferring context': that one is sensitive to reasons to speak outside of what is literally spoken.
LLMs are never sensitive to reasons to speak outside of what has been spoken.
What does RLHF do then? I feel like you completely ignored the central point of GP's comment.
RLHF is the difference between GPT-3.5 and ChatGPT, and it's the whole reason why LLMs are suddenly such a big deal. ChatGPT demonstrated that it's possible to give language models a goal beyond just "complete most likely next word" and that they can actually be somewhat competent at achieving those goals despite not being explicitly trained for them.
> competent at achieving those goals despite not being explicitly trained for them.
Well, (1) it doesn't achieve goals, since a "goal" is observer-relative. We have goals; the LLM has a formal optimisation objective which gives it the appearance of goal-directed behaviour (in a similar way, e.g., to how pens appear to "want" to fall when dropped).
And (2), reading your "goal" here even in observer-relative ways, I don't think there's much evidence of this. These models are "trained" on everything ever written, including all of the internet and basically every digitised book. I don't see any evidence of much generalisation -- if you can find it via Google, then the LLM has it stored compressed (i.e., in the "weights").
The innovation in LLMs is being able to compute `max P(answer|prompt, historical_corpus)` for increasingly longer prompts --- there's no innovation in goal-directed behaviour.
That's VC propaganda to disguise the fact that LLMs are mostly an innovation in copyright laundering.
(1) This is a tired, pointless semantic argument. "It doesn't have a goal, it just acts like it has a goal for all intents and purposes. But, you see, it's actually a machine and not a human and therefore it can't really have goals according to my narrow definition of the term." Either point to an actually relevant difference in the resulting behavior or stop objecting when people use human behavioral terms to describe the behavior of machine learning systems. We're all well aware it's a program; that's not the point. (Sorry, just a frustration I have with the larger discussion around this topic.)
(2) "I don't see any evidence of much generalisation" Seriously? So when I tell ChatGPT to rewrite a paragraph in the style of Shakespeare and it does it, despite never being trained to do that, never seeing the source or target paragraph before, and having no information other than my text prompt and its past training, that's not evidence of generalization? And that's only one of millions of different possible tasks that the same model excels at, despite being trained on nothing but a bunch of unstructured text and a few examples indicating its goal should be to follow instructions given in the prompt text. Up until a couple years ago this level of flexibility in a machine learning model would have been considered science fiction by nearly everyone, and now it's "[not] evidence of much generalization". Okay.
Well (1), the reason this distinction is relevant is so we can separate out whether the system has developed a capacity or an apparent capacity.
Is the child a genius or are they just reading out of a textbook? Can the toddler really compose a sonata or did they just press play on the piano keyboard?
(2) This is indeed the power of interpolating between the data points of "everything ever written in human history" as digitised and compressed by ChatGPT.
If you have 1 billion circles of radii between 0 and 1, it isn't generalisation for the machine to produce one with a radius of 0.0000100003000001, i.e., one not in the set but a mere interpolation of points within it.
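The circles analogy as a few lines of code (the particular radii are invented for illustration, and far fewer than a billion): the "new" radius is not in the training set, yet it lies entirely within the span of what was already seen.

```python
# 1,000 hypothetical "training" radii evenly spanning [0, 1].
training_radii = [i / 999 for i in range(1000)]

new_radius = 0.0000100003000001  # not one of the training radii...
assert new_radius not in training_radii

# ...yet it sits inside the range the training data already covers:
# interpolation within the set, not generalisation beyond it.
assert min(training_radii) <= new_radius <= max(training_radii)
```

Whether interpolation in a space this vast should still count as "mere" interpolation is, of course, exactly what the two commenters are disputing.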
It would be expensive, but imagine "reversing" ChatGPT from its output to the sources which made a non-trivial difference to generating that output.
So the function there is: response -> verbatim text in the training corpus.
Then, maybe, each source could be "bolded" by how much each paragraph "made a difference" to the output.
What you'd find is thousands of pages: everything Shakespeare ever wrote, all papers about Shakespeare, all books about Shakespeare, and so on.
Then, with the bolding applied and a little summarising, the trick would be revealed: it would be apparent how a naive statistical interpolation between sequences of characters could produce the effect.
ChatGPT exists because of ebooks and social media: without them, it could do almost nothing. That is, the appearance of these capacities is strictly derivative of the work of a billion people who actually had them.
Without vast, unimaginable amounts of work produced on Shakespeare, this system wouldn't work. It's just a copyright laundering system: all the school essays on Reddit, all the forum posts, all of Usenet, all PDFs, all digitised works, all academic papers.
Is this generalisation? Is this a system which starts with little and makes a lot?
Or is it a system which is more like a child reading from a textbook? I.e., one with a haphazard ability to repeat what's already written.
The size of the weights of a modern LLM is sufficient to compress everything ever written in human history: and that's exactly what they do.
It isn't apparent that anything you've just described is relevant. You've described how it works (in a highly simplified way), but that doesn't discredit the end result.
If there's truly a difference between "a capacity [and] an apparent capacity" then you should be able to point out what that difference actually is in practice. A child pressing play on a piano can only play one song. An LLM composing poems can compose billions upon billions of unique, never-before-seen poems about every conceivable topic. Whether under the hood it does that by "interpolating numbers in n-dimensional spaces" or "some incomprehensible arrangement of neurons linked together" or some other, yet to be invented process doesn't matter if the result is the same. The fact that you can explain how something works doesn't make it less real.
This is something which GPT generally isn't confused about though: it knows the answer to these questions and it knows that these are questions and statements about well-known works of fiction. I don't really think this is the source of the tendency for LLMs to make stuff up.
Mellon. The rest are left as an exercise to the reader.
It does always amaze me that we trained LLMs on a dump of the internet and then people are shocked that they're about as trustworthy as a random web page.
The issue here is one of semiotics and morphemology. Mapping meaning into a narrative and ontological protocol is going to be the requisite work if we want the engine to be "smart." As explored in the discussion at hand, tokenization creates a great mimic but it's a parlor trick. We must employ a robust thinking-thing that correlates not only a static, contextually indexed dictionary <lexicography>, we must also route that through a network to distill meaning itself into tokens. Perhaps languages which rely on morphemes for written language - a logosyllabary - are somewhat more or less suited for this task? I ask as a dummy.
There also exists the consideration of allographemical contextualization, the nature of relevance, pragmatics, conjunct identification of context, semantics. To be honest the linguistics side alone is vast. Knowledge and cognition however. . . A whole other ballgame. But the only tool we have to really get down to the bottom of how knowledge works is language, it's to epistemological pursuit what math is to physics.
While GPT is super impressive and can do a lot of quasi-brute-force things, we're only now finding the rudiments of the machined intelligence paradigm, and it will behoove any reader to brush up on their classics; true pursuants of philosophy and many-order logic are about to be in high demand, if I had to reckon.
I prefer to think that most humans actually distinguish the fictional context, and so should LLMs. As such, if they are to be of any use, they'd better figure out it's fiction if someone's flying on a winged horse, levitating trolls or (obviously harder) running around a forest with a bow.
And when answering a question, unambiguously specify this fictional context, or at least indicate that it might be fiction if unsure.
I'm not sure a higher level "intelligence" (which some folks think AI is moving towards) should be overly-reliant on human "intelligence", lest it inherit flaws which may outnumber benefits. (Humans believe a variety of outlandish things, such as "Q-Anon has the real facts", etc.)
By this logic, "Is Moby Dick a sperm whale" also can't be answered factually because Moby Dick is a fictional creation and doesn't exist in the natural world?