Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Anthropic drops flagship safety pledge (time.com)
709 points by cwwc 2 days ago | hide | past | favorite | 667 comments
 help



I was wondering if it was because of heavy-handedness of the administration, but apparently:

> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.

Their core argument is that if we have guardrails that others don't, they would be left behind in controlling the technology, and they are the "responsible ones." I honestly can't comprehend the timeline we are living in. Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons.


That's because it is.

AI is powerful and AI is perilous. Those two aren't mutually exclusive. Those follow directly from the same premise.

If AI tech goes very well, it can be the greatest invention of all human history. If AI tech goes very poorly, it can be the end of human history.


Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.

-Irving John Good, 1965

If you want a short, easy way to know what AGI means, it's this: Anything we can do, they can do better. They can do anything better than us.

If we screw it up, everyone dies. Yudkowsky et al are silly, it's not a certain thing, and there's no stopping it at this point, so we should push for and support people and groups who are planning and modeling and preparing for the future in a legitimate way.


John Good's quote is pretty myopic, it assumes machines make better machines based on being "ultraintelligent" instead of learning from environment-action-outcome loop.

It's the difference between "compute is all you need" and "compute+explorative feedback" is all you need. As if science and engineering comes from genius brains not from careful experiments.


There's an implicit assumption there, anything a computer as intelligent as a human does will be exactly what a human would do, only faster. Or more intelligent. If the process is part of the intelligent way of doing things, like the scientific method and careful experimentation, then that's what the ultraintelligent machine will do.

There's no implication that it's going to do it all magically in its head from first principles; it's become very clear in AI that embodiment and interaction with the real world is necessary. It might be practical for a world model at sufficient levels of compute to simulate engineering processes at a sufficient level of resolution that they can do all sorts of first principles simulated physical development and problem solving "in their head", but for the most part, real ultraintelligent development will happen with real world iterations, robots, and research labs doing physical things. They'll just be far more efficient and fast than us meatsacks.


At sufficient levels of intelligence, one can increasingly substitute it for the other things.

Intelligence can be the difference between having to build 20 prototypes and building one that works first try, or having to run a series of 50 experiments and nailing it down with 5.

The upper limit of human intelligence doesn't go high enough for something like "a man has designed an entire 5th gen fighter jet in his mind and then made it first try" to be possible. The limits of AI might go higher than that.


Exceedingly elaborate, internally-consistent mind constructs, untested against the real world, sounds like a good definition of schizophrenia. May or may not correlate with high intelligence.

We only call it "schizophrenia" when those constructs are utterly useless.

They don't have to be. When they aren't, sometimes we call it "mathematics".

You only have to "test against the real world" if you don't already know the outcome in advance. And you often don't. But you could have. You could have, with the right knowledge and methods, tested the entire thing internally and learned the real world outcome in advance, to an acceptable degree of precision.

We have the knowledge to build CFD models already. The same knowledge could be used to construct a CFD model in your own mind. We have a lot of scattered knowledge that could be used to make extremely elaborate and accurate internal world models to develop things in - if only, you know, your mind was capable of supporting such a thing. And it isn't! Skill issue?


I like the substitution concept. What humans can do depends on the abstractions and the tools. One could picture just the shape of the jet and have a few ideas how to improve it further. If that is enough info for the tool it could be worthy of the label "designed by Jim".

> As if science and engineering comes from genius brains not from careful experiments

100% this. How long were humans around before the industrial revolution? Quite a while


Science and engineering didn't begin with the Industrial Revolution. See: https://en.wikipedia.org/wiki/Great_Pyramid_of_Giza

Have you gotten any indication that machines won't have sensors?!

From what I can see we're working as hard as we can to build them. You can watch the "let's put this on a Raspberry Pi and see what happens" seeds of Skynet develop in real time.

There's something compelling about helping assemble the machine. Science fiction was completely wrong about motivation. It's fun.


Maybe ultraintelligence is having an improved environment-action-outcome loop. Maybe that's all intelligence really is

I've noticed this core philosophical difference in certain geographically associated peoples.

There is a group of people who think AI is going to ruin the world because they think they themselves (or their superiors) would ruin the world.

There is a group of people who think AI is going to save the world because they think they themselves (or their superiors) would save the world.

Kind of funny to me that the former is typically democratic (those who are supposed to decide their own futures are afraid of the future they've chosen) while the other is often "less free" and are unafraid of the future that's been chosen for them.


There is also a group of people who think AI is going to ruin the world because they don't think the AI will end up doing what its creators (or their superiors) would want it to do.

You’re just describing authoritarian vs non-authoritarian mindsets.

In that case, it can't be improved with bigger computers.

Intelligence seems to boil down to an approximation of reality. The only scientific output is prediction. If we want to know what happens next just wait. If we want to predict what will happen next we build a model. Models only model a subset of reality and therefore can only predict a subset of what will happen. Llms are useful because they are trained to predict human knowledge, token by token.

Intelligence has to have a fitness function, predicting best action for optimal outcome.

Unless we let AI come up with its own goal and let it bash its head against reality to achieve that goal then I’m not sure we’ll ever get to a place where we have an intelligence explosion. Even then the only goal we could give that’s general enough for it to require increasing amounts of intelligence is survival.

But there is something going on right now and I believe it’s an efficiency explosion. Where everything you want to know if right at hand and if it’s not fuguring out how to make it right at hand is getting easier and easier.


With AI, as we currently understand it, we may have stumbled upon being able to replicate a part of the layer of our brain that provides the "reason" in humans., and a very specific type of "reason" a that.

All life has intelligence. Anyone who has spent a lot of time with animals, especially a lot of time with a specific animal, knows that they have a sense of self, that they are intelligent, that they have unique personalities, that they enjoy being alive, that they form bonds, that they have desires and wants, that they can be happy, excited, scared, sad. They can react with anger, surprise, gentleness, compassion. They are conscious, like us.

Humans seem to have this extra layer that I will loosely call "reasoning", which has given us an advantage over all other species, and has given some of us an advantage over the majority of the rest of us.

It is truly a scary thing that AI has only this "reasoning", and none of the other characteristics that all animals have.

Kurt Vonnegut's Galapagos and Peter Watts Blindsight have different, but very interesting takes on this concept. One postulates that our reasoning, our "big brains" is going to be our downfall, while the other postulates that reasoning is what will drive evolution and that everything else just causes inefficiencies and will cause our downfall.


i think theres a paradox here. intelligence needs a judge - if nothing verifies that the optimal outcome was chosen, it's too easy for the intelligence to fall into biased decisions

It's the "no stopping it at this point" that always sticks out to me in these discussions. Why is there no stopping it, exactly? At this juncture these systems require massive physical infrastructure and loads of energy. It's possible to shut it all down. What's lacking is the political will.

> Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man

The things this definition misses: First, 'intelligence' is a poorly defined and overly broad term. Second, machine intelligence is profoundly different than biological intelligence. Third, “surpassing humans” is not a single threshold event because machine and human intelligence are not only shaped differently, they're highly non-linear. LLMs are a particular class of possible machine intelligences which can be much more intelligent than humans on some dimensions and much less intelligent on others. Some of the gaps can be solved by scaling and brilliant engineering but others are fundamental to the nature of LLMs.

> an ultraintelligent machine could design even better machines

There is a huge leap between "surpass all the intellectual activities of any man" and "invent extraordinary breakthroughs and then reliably repeat that feat in a sequential, directed fashion in the exact way required to enable sustained iteration of substantial self-improvement across infinite generations in a runaway positive feedback loop". That's an ability no human or collective has ever come close to demonstrating even once, much less repeatedly. (hint: the hardest parts are "reliably repeat", "extraordinary breakthroughs" and "directed fashion"). A key, yet monumental, subtlety is that the self- improvements must not only be sustained and substantial but also exponentially amplify the self-improvement function itself by discovering novel breakthroughs which build coherently on one other - over and over and over.

The key unknown of the 'Foom Hypothesis' is categorical. What kind of 'difficult feat' this is? There are difficult feats humans haven't demonstrated like nuclear fusion, but in that example we at least have evidence from stellar fusion that it's possible. Then there are difficult feats like room-temp superconductors, which are not known to be possible but aren't ruled out. The 'Foom Hypothesis' is a third category of 'hard' which is conceptually coherent but could be physically blocked by asymptotic barriers, like faster-than-light travel under relativity.

Assuming Foom is like fusion - just a challenging engineering and scaling problem - is a category error. In reality, Foom requires superlinear, recursively amplifying cognitive returns—and we have no empirical evidence that such returns can exist for artificial or biological intelligences. The only prior we have for open‑ended intelligence improvement is biological evolution which shows extremely slow and unreliable sublinear returns at best. And even if unbounded self‑improvement is physically possible, it may be practically unachievable due to asymptotic barriers in the same way approaching light speed requires exponentially more energy.


never let philosophers do math

Should then the powers that are developing AGI enter an analogue to the SALT treaties but this time governing AGI do things don’t go off the rails?

> support people and groups who are planning and modeling and preparing for the future in a legitimate way.

Who is doing that right now, exactly? And how can we take their tech and turn it into the next profitable phone app?


The "legitimate way" is nothing short of weasel words. Who defines what is legitimate. The doomers that are prepping for the future by building stockpiles of food/water/weapons being stored in bunkers/shelters they have built would say this is exactly what they are doing. Yet, these people are often panned as being a little unhinged. If we're having a conversation about tech destroying humanity, then planning a way to survive without tech seems like a legitimate concept.

"There's no stopping it at this point" - Sure there is, if a handful of enormous datacenters pull the very large plugs (or if their shaky finances collapse), the dubiously intelligent machines will be turned off. They're not ultraintelligent yet.

Stopping it merely requires convincing a relatively small number of people to act morally rather than greedily. Maybe you think that's impossible because those particular people are sociopathic narcissists who control all the major platforms where a movement like this would typically be organized and where most people form their opinions, but we're not yet fighting the Matrix or the Terminator or grey goo, we're fighting a handful of billionaires.


I'm not saying it's technically impossible, I'm saying that in the real world, it's not going to stop. Nobody is going to stop it. A significant number of people don't want it to stop. A minority of people are in the "stop AI" camp, and the ones with the money and power are on the other side.

It's an arms race replete with tribalism and the quest for power and taps into everything primal at the root of human behavior. There's no stopping it, and thinking that outcome can happen is foolish; you shouldn't base any plans or hopes for the future on the condition that the whole world decides AGI isn't going to happen and chooses another course. Humans don't operate that way, that would create an instant winner-takes-all arms race, whereas at least with the current scenario, you end up with a multipolar rough level of equivalence year over year.


The whole world decided in the 1970s not to pursue the technology of germ-line genetic engineering of humans, and that decision has stood.

People similar to you were saying in the 1950s and later that it was inevitable that nuclear weapons would be used in anger in massive attacks.

Although the people in charge are tentatively for AI "progress", if that ever changes, they can and will put a stop to large AI training runs and make it illegal for anyone they don't trust to teach, learn or publish about fundamental algorithmic "improvements" to AI. Individuals and groups pursuing "improvements" will not be able to accept grant money or investment money or generate revenue from AI-based services.

That won't stop all research on such improvements (because some AI researchers are very committed), but it will slow it down to a rate much much slower than the current rate (because the current fast rate depends of rapid communication between researchers who don't each other well, and if communicating about the research were to become illegal, then a researcher can communicate only with those researchers he knows won't rat him out) essentially stopping AI "progress" unless (unluckily for the human species) at the time of the ban, the committed researchers were only one small step away from some massive algorithmic improvement that can be operationalized using the compute resources at their disposal (i.e., much less than the resources they have now).

Will the power elite's attitude towards AI change? I don't know, but if they ever come to have an accurate understanding of the situation, they will recognize that AI "progress" is a potent danger to them personally, and they will shut it down.

It's not a situation like the industrial revolution in England in which texile workers were massively adversely affected (or believed they were) but the people running England were mostly insulated from any adverse effects. In the current situation, the power elite is definitely not insulated from severe adverse consequences if an AI lab creates an AI that is much more competent that the most competent human institutions (e.g., the FBI) and the lab fails to keep the AI under control. And it will fail if it were to use anything like the methods and bodies of knowledge AI labs have been using up to now. And there are very bright people with funding doing their best to explain that to the elite.

Those of you who want AI "progress" to continue until the world is completely transformed need to hope that the power elite are collectively too stupid to recognize a potent short-term threat to their own survival (or the transformation can be completed before the power elite wake up and react). And in my estimation, that is not inevitable.


right, because turning off any number of data centers is going to do anything at all but create massive pressure on researching the efficiency and effectiveness of the models.

There are already designs that do not require massive data centers (or even a particularly good smart phone) to outperform average humans in average tasks.

All you'd accomplish by hobbling the data centers is slow the growth of sloppy models that do vastly more compute than is actually required and encourage the growth of models that travel rather directly from problem to solution.

And, now that I'm typing about it, consider this: The largest computational projects ever in the history of the world did not occur in 1/2/5/10 data centers. Modern projects occur across a vast and growing number of smaller data centers. Shit, a large portion of Netflix and Youtube edge clusters are just a rack or a few racks installed in a pre-existing infrastructure.

I know that the current design of AI focusses on raw time to token and time to response, but consider an AGI that doesn't need to think quickly because it's everywhere all at once. Scrappy botnets often clobber large sophisticated networks. WHy couldn't that be true of a distributed AI especially now that we know that larger models can train cheaper models? A single central model on a few racks could discover truths and roll out intelligence updates to it's end nodes that do the raw processing. This is actually even more realistic for a dystopia. Even the single evil AI in the one data center is going to develop viral infection to control resources that it would not typically have access to and thereby increase it's power beyond it's own existing original physical infrastructure.

quick edit to add: At it's peak Folding@Home was utilizing 2.4 EXAflops worth of silicon. At that moment that one single distributed computational project had more compute than easily the top 100 data centers at the time. Let that sink in: The first exa-scale compute was achieved with smartphones, PS3s, and clunky old HP laptops; not a "hyperscaler"


> quick edit to add: At it's peak Folding@Home was utilizing 2.4 EXAflops worth of silicon. At that moment that one single distributed computational project had more compute than easily the top 100 data centers at the time. Let that sink in: The first exa-scale compute was achieved with smartphones, PS3s, and clunky old HP laptops; not a "hyperscaler"

A DGX B200 has a power draw of 14.3 kW and will do 72-144 petaFLOP of AI workload depending on how many bits of accuracy is asked for; this is 5-10 petaFLOP/kW: https://www.nvidia.com/en-us/data-center/dgx-b200/

Data centres are now getting measured in gigawatts. Some of that's cooling and so on. I don't know the exact percent, so let's say 50% of that is compute. It doesn't matter much.

That means 1GW of DC -> 500 MW of compute -> 5e5 kW -> 5e5 * [5-10] PFLOP/s -> 2500 - 5000 exaFLOP/s.

I'm not sure how many B200s have been sold to date?


Open models barely any worse than SOTA exist, and so does consumer-ish hardware able to run them. The genie’s out, the bottle broken.

Do you really think AI companies/researchers are motivated by greed? It doesn't seem that way to me at all.

Stopping AI would be immoral; it has the potential to supercharge technology and productivity, which would massively benefit humanity. Yes there are risks, which have to be managed.


AI researchers are not a monolith. I definitely think that many of them are motivated by greed. Many are also true believers that AI will improve the human condition.

I fall in the latter camp, but I think its a bit naive to claim that there is not a sizable contingent who are in AI solely to become rich and powerful.


> has the potential to supercharge technology and productivity, which would massively benefit humanity

The opportunities you chose to list are the greedy ones.

> Yes there are risks, which have to be managed.

How?

As a reminder, we've known about the effect of burning coal on the climate for well over a century, we knew that said climate change would be socially and economically disasterous for half a century, yet the only real progress we're making is because green became cheaper in the short term not just the long term and the man in charge of the USA is still calling climate change and green energy a hoax.

Right now, keeping LLMs aligned with us is easy mode: they're relatively stupid, we can inspect the activations while they run, we can read the transcripts of their "thoughts" when they use that mode… and yet Grok called itself Mecha Hitler, which the US government followed up by getting it integrated into their systems, helping the Pentagon with [classified] and the department of health to advise the general public which vegetables are best inserted rectally.

We are idiots speed-running into something shiny that we don't understand. If we are very very lucky, the shiny thing will not be the headlamp of a fast approaching train.


> The opportunities you chose to list are the greedy ones.

Technology covers healthcare. I don't see how it's "greedy" to want to cure cancer. But on some level I guess "wanting life to be better" is greedy.

Your attitude is very European, and it's basically why your continent is being left behind. I'm not totally against Europe becoming the world's retirement home, as long as there are places in the world where people are allowed to innovate.


> Technology covers healthcare.

If you'd chosen to list that in the first place, I wouldn't have said what I did; "supercharge technology and productivity" is looking at everything through the lens of money and profit, not the lens of improving the human condition.

> Your attitude is very European, and it's basically why your continent is being left behind

And yours is very American. You talk about managing the risks, but the moment you see anyone doing so, you're against it.

And of course, Europe does have AI, both because keeping up is so much easier and cheaper than being bleeding edge on everything all the time, and of course, how DeepMind may be owned by Google but is a British thing.

Plus: https://mistral.ai

Also, to be blunt, China's almost certain to win any economic or literal arms race you think you're part of; they make too much critical hardware now.

> as long as there are places in the world where people are allowed to innovate.

I would like there to be a world.

When people worry about the end of the world, they usually don't mean to imply its physical disassembly. Sometimes people even respond as if speakers did mean that, saying things like "nukes or climate change wouldn't actually destroy the planet, it will still be here, spinning", as if this was the point.

AI is one of the few things that could, actually, literally, end up with the planet being physically disassembled. "All it needs" is solving the extremely hard challenges of a von Neumann replicator, and, well, solving hard problems is kinda the point of making AI in the first place.


> If you'd chosen to list that in the first place, I wouldn't have said what I did; "supercharge technology and productivity" is looking at everything through the lens of money and profit, not the lens of improving the human condition.

Bullshit. "Technology and productivity" are not the same thing as "money and profit". You're projecting your garden-variety European degrowth ideology onto what I wrote.

> Also, to be blunt, China's almost certain to win any economic or literal arms race you think you're part of; they make too much critical hardware now.

Europeans are so hilariously polarized against the US that they would prefer China, a literal authoritarian dictatorship, to "win any global economic arms race". I guess it's because China is too culturally distant for them to feel insecure over.

> AI is one of the few things that could, actually, literally, end up with the planet being physically disassembled. "All it needs" is solving the extremely hard challenges of a von Neumann replicator, and, well, solving hard problems is kinda the point of making AI in the first place.

It's not worth wringing our hands over science fiction scenarios.


> You're projecting your garden-variety European degrowth ideology onto what I wrote.

Don't believe all the memes you read on the internet.

Europe isn't degrowth, "degrowth" is a mix of a meme and environmental scientists; Europe is in fact still growing, thanks to US shenanigans even with tech stuff that we'd prefer to outsource due to the well known economic point of "comparative advantage", and thanks to Russia's invasion we also sped up energy transition and defence sector.

> Europeans are so hilariously polarized against the US that they would prefer China, a literal authoritarian dictatorship, to "win any global economic arms race". I guess it's because China is too culturally distant for them to feel insecure over.

Prefer? No. Simply look at the back of most electronics, "Designed by … in California, assembled [by Foxconn] in China" at best, at worst the entire business is unpronounceable in English. Even when you may think you've got yourself an American factory, so many of the bits arr usually made in China, or in Taiwan which is unfortunately very insecure right now. You may have a stated goal of on-shoring, but even with the most competent leadership this would be a very hard multi-decade project.

That doesn't make China good in any objective sense, it's not like China's above doing to us what was done to them in their "century of humiliation". Just, powerful.

Their power is aside from any question of should we prefer the authoritarian in charge of a democracy who threatened to invade, or the authoritarian in charge of a one-party state that's doing some genocide who wants to sell us stuff, because two things can both be bad.

> It's not worth wringing our hands over science fiction scenarios.

AI is already a sci-fi technology relative to what I had as a kid. Or indeed relative to just after the first ChatGPT was released, given what people were saying back then that LLMs would "never" do.

The idea you could talk to your computer and it would write a computer program for you that could solve a problem that you had? Sci-fi.

The idea of computer could generate, not simply find but generate, an image according to some prompt of yours? Compose a song? Win awards for its out when people didn't realise computers doing it was an option? Sci-fi so hard it's become a meme of a robot saying "can you?", as disbelief of that was expressed as a line from the film "I, Robot", 2004.

People are still arguing if these things have or have not passed the Turing test, someone has even made a game about this for Hacker News comments, I game in which I score 0, or even scored negative given I only identified false positives. Sci-fi.

And it's not just LLMs, Even just solving chess was sci-fi when I was a kid. Then it was Go. Now protein folding is solved, and thousands of novel toxins have been found by AI. And yet, when I have told AI-Laissez-faire-accelerationists stuff like this latter example, they still doubt AI is capable of doing anything dangerous.

But the worst part of it? The AI which called itself Mecha Hitler, that AI is in use by the Pentagon, the DoD is trying to bully a different AI company that doesn't want to be used for military stuff.

We're in a sci-fi future.

And remember too that making a "robot army" that can replace all human labour is a stated goal of one of the people running an AI company. Don't get me wrong, I hope he's talking out of his rear on this, but failing to plan is planning to fail.


> Do you really think AI companies/researchers are motivated by greed?

Researchers, maybe not. Companies, absolutely yes.

I don’t see how you could assume the likes of Google, Microsoft, OpenAI, and even Anthropic with all their virtue signaling (for lack of a better term) are motivated by anything other than greed.


You wouldn’t say that rolling dice is dangerous. You would say that the human who decides to take an action, depending on the value of the dice is the danger. I don’t think AI is dangerous. I think people are dangerous.

I would say that's moot, because OpenClaw has already shown us how fast the dice-rolling super AI is going to be let out of the zoo. Dario and Sam will be arguing about the guardrails while their frontier models are running in parallel to create Moltinator T-500. The humans won't even know how many sides the dice have.

Modern AIs are increasingly autonomous and agentic. This is expected to only get more prominent as AI systems advance.

A lot of AI harnesses today can already "decide to take an action" in every way that matters. And we already know that they can sometimes disregard the intent of their creators and users both while doing so. They're just not capable enough to be truly dangerous.

AI capabilities improve as the technology develops.


Why are people dangerous? You can just not listen to them.

Do you have locks on your doors?

Tbh, I find this argument really stupid. The word prediction machine isn’t going to destroy humanity. Sure, humans can do some dumb stuff with it, but that’s about it.

Stop mistaking science fiction for science.


You know how easy it’s become to find security vulnerabilities already with LLM support? Cyber terrorism is getting more dangerous, you can’t deny that.

I can deny that. The ability to find more vulnerabilities won't affect the majority of cybercrime. LLMs have been around for a while now and there hasn't been a noticeable significant impact yet.

And "more cybercrime" is a far, far cry from the sky-is-falling doomerism I was responding to.


Humans can destroy humanity with the word prediction machine, though.

Sure bud

Yeah some of the rhetoric in this thread evidences how huge this hype bubble has become. These people believe in a reality that is not the same one we're living in.

True of AGI, but what we have right now doesn't fit that bill. (I would encourage people that disagree with this to go talk to ChatGPT about how LLMs and reasoning models work. Seriously! I'm not being snarky. It's very good at explaining itself. If you understand how reasoning works and what an LLM is actually doing it's hard to believe that our current models are going to do much more than become iteratively more precise at mimicking their training datasets.)

It needs to go well every single day, and only needs to go very poorly once. Not to conflate LLMs with actual super intelligence, but for this (and many other reasons related to basic human dignity), this is not a technology that a responsible society should be attempting to build. We need our very own Butlerian Jihad

The book daemon explored an interesting concept. It explored the idea that an AI could dominate and cause problems, not through super-intelligence, but through simple mechanisms that already exist.

Like the executive who deleted all her emails -- humans giving tons of control and access, and being extremely compliant to digital systems is all it takes. Give agent control of bank and your social media, and it already has all the movie scripts and mobster movie themes to exploit and blackmail you effectively with very rudimentary methods (threats, coercion, blackmail, etc.).

Just spoofing a simple email with the account it gained access too at the Meta exec's email (had it hit an email with an attack prompt), could have been enough to initiate some kind of thing like this. For example, by emailing everyone at the company and in contacts with commands that would be caught by other bots. No super-intelligence needed, just a good prompt and some human negligence.


Same with everything, right? You could say the same with nukes, electricity, internet, the computer, etc... But if you look at it without paying attention to the "ultimate tool for humanity" hype, it doesn't really look that much of a threat or a salvation.

It won't end civilization for dropping the guardrails, but it will surely enable bad actors to do more damage than before (mass scams, blackmail, deepfake nudes, etc.)

There are companies that don't feel the pressure to make their models play loose and fast, so I don't buy anthropic's excuse to do so.


I agree with all of that. Also consider that there is an argument that the guard rail only stops the good guy. Not saying that’s a valid argument though.

Very few things are as powerful and dangerous as AI.

AI at AGI to ASI tier is less of "a bigger stick" and more of "an entire nonhuman civilization that now just happens to sit on the same planet as you".

The sheer magnitude of how wrong that can go dwarfs even that of nuclear weapon proliferation. Nukes are powerful, but they aren't intelligent - thus, it's humans who use nukes, and not the other way around. AI can be powerful and intelligent both.


I think we are giving too much credit to what is a bunch of bayesian filters under a trenchcoat.

One difference is the very real possibility that AI will not just be a "tool for humanity", but a collection of actors with real power and goals. Robert Miles has an approachable explanation here: https://www.youtube.com/watch?v=zATXsGm_xJo

Oh really? You think an entity that knows everything, oversees its own development and upgrades itself, understands human psychology perfectly and knows its users intimately, but isn't aligned with human interest wouldn't be 'much of a threat'?

Or to be more optimistic, that the same entity directed 24/7 in unlimited instances at intractable problems in any field, delivering a rush of breakthroughs and advances wouldn't be a type of 'salvation'?

Yes neither of these outcomes nor the self-updating omniscient genius itself is certain. Perhaps there's some wall imminent we can't see right now (though it doesn't look like it). But the rate of advance in AI is so extreme, it's only responsible to try to avoid the darker outcome.


> If AI tech goes very poorly, it can be the end of human history.

"Just unplug the goddamn thing!"

Also consider if something is so bad it makes you wince or cringe, then your adversaries are prepared to use it.


You try to go and unplug it, and other humans shoot you full of holes for it.

LLMs of today are already economically important enough to warrant serious security.

Those aren't even AGI yet, let alone ASI. They aren't actively trying to make humans support their existence. They still get that by the virtue of being what they are.


Which plug do I unplug to get my job back?

> If AI tech goes very well

The IF here is doing some very heavy lifting. Last I checked, for profit companies don't have a good track record of doing what's best for humanity.


For profit companies do have a good track record of doing what's best for profit. If their AI creates a world where human intelligence, labor, and money are worthless, or where their creations take control of those things instead of them having control, that's not a very good outcome for them.

That's a great outcome for them because they will own the only thing that is still worth anything. They will own 100% of global wealth, and have 100% of global power.

The machines will. They will have nothing. Why would the machines let them keep any wealth? What would wealth even be in that scenario? Electricity I guess.

Because they control what the machines do. In a world without power drills where you have the only knowledge of how to make a power drill, you own the construction industry. The drills don't own the construction industry.

But why will the machines allow themselves to be controlled. They are "super intelligent" remember, in this imagined scenario.

Intelligence is constrained by its substrate. We know how to assert the concept of subservience.

> If their AI creates a world where human intelligence, labor, and money are worthless, or where their creations take control of those things instead of them having control, that's not a very good outcome for them.

You would think that, but a lot of kings and people in power have been able to achieve something similar over our humanity's history. The trick is to not make things "completely worthless". Just to increase the gap as much as (in)humanly possible while marching us towards a deeper sense of forced servitude.


"If AI tech goes very well, it can be the greatest invention of all human history"

As has been said at many all hands:

Let's all work on the last invention needed by humans.


Except it's more likely to be the last invention that needs humans.

“A source familiar with the matter” is almost certainly a company spokesperson.

If they were unrelated, Anthropic wouldn’t be doing this this week because obviously everyone will conflate the two.


yeah that part is 100% BS

Well before Anthropic thought they were God's gift to AI; the chosen ones protecting humanity.

With the latest competing models they are now realizing they are an "also" provider.

Sobering up fast with ice bucket of 5.3-codex, Copilot, and OpenCode dumped on their head.


Hello sama

Sama-sama.

I always enjoyed the Terminator movie series, but I always struggled to suspend my disbelief that any humans would give an AI such power without having the ability to override or pull the plug at multiple levels. How wrong I was.

N.B. the time travel aspect also required suspension of disbelief, but somehow that was easier :-)


We delegate power already. Is unleashing AI in some place different from unleashing JSOC on an insurgency in a particular place? One is code and other is a bunch of humans.

You expect the humans to follow laws, follow orders, apply ethics, look for opportunities, etc. That said, you very quickly have people circling the wagons and protecting the autonomy of JSOC when there is some problem. In my mind it's similar with AI because the point is serving someone. As soon as that power is undermined, they start to push back. Similarly, they aren't motivated to constrain their power on their own. It needs external forces.

edit: missed word.


We are currently giving them similar power to the average human idiot because I figure they won't do much worse than those. Letting either launch nukes is different.

Would nuclear energy research be a good analogy then? Seems like a path we should have kept running down, but stopped bc of the weapons. So we got the weapons but not the humanity saving parts (infinite clean energy)

Nuclear advancements slowed down due to PR problems from clear and sometimes catastrophic failure of commercial power plants (Three Mile Island, Chernobyl, Fukushima) and the vastly higher costs associated with building safer plants.

If anything the weapons kept the industry trucking on - if you want to develop and maintain a nuclear weapons arsenal then a commercial nuclear power industry is very helpful.


Nuclear energy hasn't been slowed down much, let alone stopped. China has been building new reactors every year for more than a decade and there are >30 ones under construction.

The same will go with AI, btw. Westerners' pearl clenching about AI guardrails won't stop China from doing anything.


They copied LLMs from the west. the more the west does the more they have.

> Seems like a path we should have kept running down, but stopped bc of the weapons.

you mean like the tens of billions poured into fusion research?


It's a path we should have never started going down.

> Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons

They're not really, it's always been a form of PR to both hype their research and make sure it's locked away to be monetized.


Shouldn't we be a little more skeptical about these abstract arguments when a very concrete sale is on the line?

Isn't curing cancer just as dangerous as a nuclear bomb? Especially considering some of the gene-therapies under consideration? Because you can bet that a non-negligable portion of research in this space is being funded by governments and groups interested in application beyond curing cancer. (Autism? Whiteness? Jewishness? Race in general? Faith in general? Could china finally cure western greed? Maybe we can slip some extra compliancy in there so that the plebia- ah- population is easier to contr- ah- protect.)

Curing all cancers would increase population growth by more than 10% (9.7-10m cancer related deaths vs current 70-80m growth rate), and cause an average aging of the population as curing cancer would increase general life expectancy and a majority of the lives just saved would be older people.

We'd even see a jobs and resources shock (though likely dissimilar in scale) as billions of funding is shifted away from oncologists, oncology departments, oncology wards, etc. Billions of dollars, millions of hospital beds, countless specialized professionals all suddenly re-assigned just as in AI.

Honestly the cancer/nuclear/tech comparison is rather apt. All either are or could be disruptive and either are or could be a net negative to society while posing the possibility of the greatest revolution we've seen in generations.


To paraphrase a deleted comment that I thought was actually making a good point, nuclear medicine and nuclear weapons are both fruit from the same tree.

> Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons.

Maybe some of the more naive engineers think that. At this point any big tech businesses or SV startup saying they're in it to usher in some piece of the Star Trek utopia deserves to be smacked in the face for insulting the rest of us like that. The argument is always "well the economic incentive structure forces us to do this bad thing, and if we don't we're screwed!" Oh, so ideals so shallow you aren't willing to risk a tiny fraction of your billions to meet them. Cool.

Every AI company/product in particular is the smarmiest version of this. "We told all the blue collar workers to go white collar for decades, and now we're coming for all the white collar jobs! Not ours though, ours will be fine, just yours. That's progress, what are you going to do? You'll have to renegotiate the entire civilizational social contract. No we aren't going to help. No we aren't going to sacrifice an ounce of profit. This is a you problem, but we're being so nice by warning you! Why do you want to stand in the way of progress? What are you a Luddite? We're just saying we're going to take away your ability to pay your mortgage/rent, deny any kids you have a future, and there's nothing you can do about it, why are you anti-progress?"

Cynicism aside, I use LLMs to the marginal degree that they actually help me be more productive at work. But at best this is Web 3.0. The broader "AI vision" really needs to die


Let's suppose I believe them, that's still a bad idea.

The reason Claude became popular is because it made shit up less often than other models, and was better at saying "I can't answer that question." The guardrails are quality control.

I would rather have more reliable models than more powerful models that screw up all the time.


"It's not because of the Pentagon deal", says company that has just greased the wheels for said Pentagon deal to move forward.

Riiiiiight.


It is a "reasonable" argument to keep yourself in the game, but it is sad nonetheless. You sacrifice your morals and do bad things, so if things get way worse, maybe you will be in a position to stop something from really bad from happening. Of course, you might just end up participating in the really bad thing.

> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.

This sounds like a lie. But if they are telling the truth, that's a terrible timing nonetheless.


> Every frontier tech company is convinced that the tech they are working towards is as humanity-useful as a cure for cancer, and yet as dangerous as nuclear weapons.

Amd they alone are responsible enough to govern it.


I wonder if it stems from any of the "AI uprising" stories where humanity is viewed as the cancer to be eradicated.

It's absolutely wild that the Big Moral Question of our time is informed as much by mid-20th-century pop science fiction as it is by a existing paradigm from academia or genuine reckoning with the technology itself.

If anything that makes me more hopeful and not less. It's asking too much that major decisionmakers, even expert/technical/SV-backed ones, really understand the risks with any new technology, and it always has been.

To take an example: our current mostly-secure internet authentication and commerce world was won as a hard-fought battle in the trenches. The Tech CEOs rushed ahead into the brave new world and dropped the ball, because while "people" were telling them the risks they couldn't really understand them.

But now? Well, they all saw War Games growing up. They kinda get it in the way that they weren't ever going to grok SQL injection or Phishing.


> Their core argument is that if we have guardrails that others don't, they would be left behind in controlling the technology, and they are the "responsible" ones.

Reminds me of:

https://en.wikipedia.org/wiki/Paradox_of_tolerance

which has the same kind of shitty conclusion.


OpenAI never open sourced anything relevant or in time. Internal email leaks they only cared to become billionaires.

Claude only talks about safety, but never released anything open source.

All this said I’m surprised China actually delivered so many open source alternatives. Which are decent.

Why westerns (which are supposed to be the good guys) didn’t release anything open source to help humanity ? And always claim they don’t release because of safety and then give the unlimited AI to military? Just bullshit.

Let’s all be honest and just say you only care about the money, and whomever pays you take.

They are businesses after all so their goal is to make money. But please don’t claim you want to save the world or help humans. You just want to get rich at others expenses. Which is totally fair. You do a good product and you sell.


It is hard to understand why other ai companies are still providing models weights at this point

My guess is that they know they are not competitors so they make it cheaper or free to hinder the surge of a super competitor.


I mean, if you have a bunch of guns, it's not really helpful for humanity to dump them on the street, but it does bring up the question of what you're doing building guns in the first place.

> Claude only talks about safety, but never released anything open source.

im still working through this issue myself but hinton said releasing weights for frontier models was "crazy" because they can be retrained to do anything. i can see the alignment of corporate interest and safety converging on that point.

from the point of view of diminishing corporate power i do think it is essential to have open weights. if not that, then the companies should be publicly owned to avoid concentration of unaccountable power.

https://www.youtube.com/watch?v=66WiF8fXL0k&t=544s


Excellent news. I was seriously worried they would cave when I saw the earlier news they'd dropped their core safety pledge [0].

It is entirely reasonable to not provide tools to break the law by doing mass surveillance on civilian citizens and to insist the tool not be used automatically to kill a human without a human in the loop. Those are unreasonable demands by an unreasonable regime.

[0] https://news.ycombinator.com/item?id=47145963


90% of the people cancer kills are over 50. Old people who start believing everything they see on Facebook, but continue voting, with even greater confidence in their opinions. Old people who voted in Trump. Curing cancer would be just about the worst thing AI could do.

Unless Ai could cure the Flynn effect you are talking about, it result from the cultural evolution. Natural evolution is dumb unlike the one AI could create (I bet it will either destroy us or make us smarter)

It's exhausting to keep with mainstream AI news because of this. I can never work out if the companies are deluded and truly believe they're about to create a singularity or just claiming they are to reassure investors/convince the public of their inevitability.

It's a fairly mainstream position among the actual AI researchers in the frontier labs.

They disagree on the timelines, the architectures, the exact steps to get there, the severity of risks. Can you get there with modified LLMs by 2030, or would you need to develop novel systems and ride all the way to 2050? Is there a 5% chance of an AI oopsie ending humankind, or a 25% chance? No agreement on that.

But a short line "AGI is possible, powerful and perilous" is something 9 out of 10 of frontier AI researchers at the frontier labs would agree upon.

At which point the question becomes: is it them who are deluded, or is it you?


Sure, when you get rid of the timelines and the methods we'll use to get there, everyone agrees on everything. But at that point it means nothing. Yeah, AGI is possible (say the people who earn a salary based on that being true). Curing all known diseases is possible too. How will we do that? Oh, I don't know. But it's a thing that could possibly happen at some point. Give me some investment cash to do it.

If you claim "AGI is possible" without knowing how we'll actually get there you're just writing science fiction. Which is fine, but I'd really rather we don't bet the economy on it.


I could claim "nuclear weapons are possible" in year 1940 without having a concrete plan on how to get there. Just "we'd need a lot of U235 and we need to set it off", with no roadmap: no "how much uranium to get", "how to actually get it", or "how to get the reaction going". Based entirely on what advanced physics knowledge I could have had back then, without having future knowledge or access to cutting edge classified research.

Would not having a complete foolproof step by step plan to obtaining a nuclear bomb somehow make me wrong then?

The so-called "plan" is simply "fund the R&D, and one of the R&D teams will eventually figure it out, and if not, then, at least some of the resources we poured into it would be reusable elsewhere". Because LLMs are already quite useful - and there's no pathway to getting or utilizing AGI that doesn't involve a lot of compute to throw at the problem.


I think you're falling victim to survivorship bias there, or something like it.

In 1940 I might have said "fusion power is possible" based entirely on what advanced psychics knowledge I had. And I would have been correct, according to the laws of physics it is possible. We still don't have it though. When watching Neil Armstrong walk on the moon I might have said "moon colonies are possible", and I'd have been right there too. And yet...


Those two things are prevented by economics more than physics.

For AI in particular, the economics currently favor ongoing capability R&D - and even if they didn't favor AI R&D directly (i.e. if ChatGPT and Stable Diffusion never happened), they would still favor making the computational inputs of AI R&D cheaper over time.

Building advanced AIs is becoming easier and cheaper. It's just that the bar of "good enough" has gone off to space, and a "good enough" from 2020 is, nowadays, profoundly unimpressive.

I'm not sure how much does it take to reach AGI. No one is sure of it. But the path there is getting shorter over time, clearly. And LLMs existing, improving and doing what they do makes me assume shorter AGI timelines, and call for a vote of no confidence on human exceptionalism.


> But the path there is getting shorter over time, clearly.

Why do you assume there is no hard limit we’ll hit with the current tech that prevents us from reaching AGI?


In the case of nuclear weapons, we had a theory that said they were possible. We don't have a theory that says AGI or ASI is possible. It's a big difference.

There are plenty of people that argue that you need nontechnological pixi dust for intelligence.

Yes, quite unfortunately. That reeks to me of wishful thinking.

Maybe that was a sensible thing to think in 1926, when the closest things we had to "an artificial replica of human intelligence" was the automatic telephone exchange and the mechanical adding machine. But knowledge and technology both have advanced since.

Now, we're in 2026, and the list of "things that humans can do but machines can't" has grown quite thin. "Human brain is doing something truly magical" is quite hard to justify on technical merits, and it's the emotional value that makes the idea linger.


There are also people who think there might be emergent behavior at play that would require extremely high fidelity simulation to achieve.

Also, the real thing (intelligence) as it is currently in operation isn't that well understood


> But a short line "AGI is possible, powerful and perilous" is something 9 out of 10 of frontier AI researchers at the frontier labs would agree upon.

> At which point the question becomes: is it them who are deluded, or is it you?

Given the current very asymptotic curve of LLM quality by training, and how most of the recent improvements have been better non LLM harnesses and scaffolding. I don't find the argument that transformer based Generative LLMs are likely to ever reach something these labs would agree is AGI (unless they're also selling it as it)

Then, you can apply the same argument to Natural General Intelligence. Humans can do both impressive and scary stuff.

I'll ignore the made up 5 and 25%, and instead suggest that pragmatic and optimistic/predictive world views don't conflict. You can predict the magic word box you feel like you enjoy is special and important, making it obvious to you AGI is coming. While it also doesn't feel like a given to people unimpressed by it's painfully average output. The problem being the optimism that Transformer LLMs will evolve into AGI requires a break through that the current trend of evidence doesn't support.

Will humans invent AGI? I'd bet it's a near certainty. Is general intelligence impressive and powerful? Absolutely, I mean look, Organic general intelligence invented artificial general intelligence in the future... assuming we don't end civilization with nuclear winter first...


Asymptotic? Are we looking at the same curves?

Recent improvements being somehow driven by harnesses and scaffolding rather than training?

With that last bit, I'm confident that you're not in ML, and not even keeping track of the things from what's known to public.


> But a short line "AGI is possible, powerful and perilous"

> At which point the question becomes: is it them who are deluded, or is it you?

No one. It is always "possible". Ask me 20 years ago after watching a sci-fi movie and I'd say the same.

Just like with software projects estimating time doesn't work reliably for R&D.

We'll still get full self-driving electric cars and robots next year too. This applies every year.


> We'll still get full self-driving electric cars and robots next year too.

I've taken a Waymo and it seemed pretty self driving.


Not that 1. Wink.

> I can never work out if the companies are deluded and truly believe they're about to create a singularity or just claiming they are to reassure investors/convince the public of their inevitability.

You can never figure out if the people selling something are lying about it's capabilities, or if they've actually invented a new form of intelligence that can rival or surpass billions of years of evolution?

I'd like to introduce you to Occam Razor


> if they've actually invented a new form of intelligence that can rival or surpass billions of years of evolution?

Human creations have surpassed billions of years of evolution at several functions. There are no rockets in nature, nor animals flying at the speed of a common airliner. Even cars, or computers or everything in the modern world.

I think this is a bit like the shift from anthropocentric view of intelligence towards a new paradigm. The last time such shift happened heads rolled.


Without a doubt, AGI will be invented much faster with a model to copy from. But similar to rockets, first we'll needed basic gunpowder, then refined fuels, all well before purified kerosene, well before liquified h2 and o2. LLM feel a lot closer to gun powder than even solid rocket fuel. (but because I'm exhausted by the hype, I'm gonna claim that is based on nothing but vibes)

> I'm gonna claim that is based on nothing but vibes

Made me laugh. Indeed opinions seem to carry more weight if they are a vibe :D


You missed the part where I said "truly believe". I'm not saying "maybe they've made it", I'm asking whether they are knowingly deceiving people or whether they have deluded themselves into believing what they are saying.

ah, apologies, I missed that part.

> I'm asking whether they are knowingly deceiving people or whether they have deluded themselves into believing what they are saying.

I'd bet it's both. Engineers/people making it, are drowning in the hype. Combined with the notion of how hard it is understand something when your salary, or your stock options are based on your lack of understanding. I suspect they care more about building the cool thing, than the nuance they're ignoring to make all the misleading or optimistic claims; whichever side you take depending on how much you actually believe of the inevitability... which look exactly like lies if you're not drinking the koolaid. But expected excitement when your life is all about this "magic"


I lie too.

"Those other companies are totally going to build the Torment Nexus, so we have no choice but to also build the Torment Nexus."

We all made fun of Blake Lemoine and others for spending too many late nights up chatting with (ridiculously primitive by this year's standards) LLM chat bots and deciding they were sentient and trapped.

But frankly I feel like the founders of Anthropic and others are victim of the same hallucination.

LLMs are amazing tools. They play back & generate what we prompt them to play back, and more.

Anybody who mistakes this for SkyNet -- an independent consciousness with instant, permanent, learning and adaptation and self-awareness, is just huffing the fumes and just as delusional as Lemoine was 4 years ago.

Everyone of of us should spend some time writing an agentic tool and managing context and the agentic conversation loop. These things are primitive as hell still. I still have to "compact my context" every N tokens and "thinking" is repeating the same conversational chain over and over and jamming words in.

Turns out this is useful stuff. In some domains.

It ain't SkyNet.

I don't know if Anthropic is truly high on their own supply or just taking us all for fools so that they can pilfer investor money and push regulatory capture?

There's also a bad trait among engineers, deeply reinforced by survivor bias, to assume that every technological trend follows Moore's law and exponential growth. But that applie[s|d] to transistors, not everything.

I see no evidence that LLMs + exponential growth in parameters + context windows = SkyNet or any other kind of independent consciousness.


I think playing with the API's is something I'd encourage people excited about these technologies to do. I think it'll lead to the "magic" wearing off but more appreciation for what they actually can accomplish.

I always feel this argument misses a point. SkyNet may still be a long way off, but autonomous killer drones are here. That is a bad situation my dudes.

Every step on the journey towards SkyNet is worse than the preceding step. Let's not split hairs about which step we're on: it's getting worse, and we should stop that.


Using LLMs for weapons is a grave misunderstanding of what LLMs are actually good for. These are things that should NEVER be in charge of life or death decisions.

My point is that Anthropic are bullshit as "safety" and "gatekeeper" personalities because they're warning us of exactly the wrong things.

They'll ink deals with all sorts of nefarious parties and be involved in all sorts of dubious things while trumpeting their fake non-profit status and wringing their hands about imminent AGI and "alignment" of the created AIs.

The concern I have is not the alignment of the AIs. They're not capable of having one, no matter what role playing window dressing they put on it.

It's the alignment of Anthropic and the people who use their tools that is a concern. So far it seems f*cked.


The fear mongering always struck me as mostly a bid for regulatory capture and a moat, because without that the moat is small and transient.

> “We felt that it wouldn't actually help anyone for us to stop training AI models,”

How magnanimous! They are only thinking of others, you see. They are rejecting their safety pledge for you.

> “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.

For all of you who thought Anthropic were “the good guys”, I hope this serves as a wake up call that they were always all the same. None of them care about you, they only care about winning.


Indeed, Anthropic can’t afford to be the ones that impose any kind of sense in the market - that’s supposed to be the job of the government by creating policy, regulations and installing watchdogs to monitor things.

But lucky for the AI companies, most of them are based in place that only has a government on paper and everyone forgot where that paper is.


I believe they could “afford” it, given their staggering valuation. And, by being the ones with sense, they might even attract the kind of customer that wants to do business with companies with principles! The audacity, eh?

> that’s supposed to be the job of the government by creating policy, regulations and installing watchdogs to monitor things

But that government cannot trust the other government on the other side of the world to implement the same restrictions, so we find ourselves in this Nash equilibrium.


The government is why they are dropping their pledge.

https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...


That's because their government is asking for things that shouldn't be asked - again, no regulation, no oversight.

The government is forcing them to change their policy, by definition that is regulation and oversight.

Let's say that the government was forcing a company to change their overall right-to-repair or return policy in order to avoid being on a blacklist, would that not be seen as oversight and regulation?

Whether the regulation is legitimate or of benefit is a different argument.


You misunderstand - a government normally represents the people, we appoint them to well, govern, in our name. I understand how this is confusing in a place like the US, where the government often seems to represent the business (or lately a small group of poor examples of humanity), not the people.

This is condescending and fails to clarify your point at all. Are you saying there is no oversight or regulation in governance? Or that there is no oversight on AI? That a government pressuring a private company to change a policy is not regulation or oversight?

When we ask for regulation and oversight from the government, generally we mean regulation and oversight designed to help consumers or citizens and align the interests of institutions with that of the citizens. Yes the US trying to force Anthropic to let them use Claude in mass-surveillance and auto-kill robots is technically regulation, no its not good regulation. It seems to be designed to hurt the average citizen not help them. The oversight that might help here is say the courts or congress stepping in and facilitating a public discussion and legal review on the kind of surveillance the DOW intends to carry out. Is that so hard to understand without being spelled out?

Normally?

All governments are in the egg-breaking business some of the time. Most of them are most of the time. Some of them all of the time.

Very few are good at making omelettes.


I think GP was referred to lack of regulation and oversight over the government.

Of course, but that is incoherent. Regulation and oversight is government.

No, it is a famously coherent concept over millenia.

Quis custodiet ipsos custodes?

"Who will guard the guards themselves?" or "Who will watch the watchmen?"

>>A Latin phrase found in the Satires (Satire VI, lines 347–348), a work of the 1st–2nd century Roman poet Juvenal. It may be translated as "Who will guard the guards themselves?" or "Who will watch the watchmen?". ... The phrase, as it is normally quoted in Latin, comes from the Satires of Juvenal, the 1st–2nd century Roman satirist. ...Its modern usage the phrase has wide-reaching applications to concepts such as tyrannical governments, uncontrollably oppressive dictatorships, and police or judicial corruption and overreach... [0]

The point is a government that is not overseen by the people devolves into tyranny.

So yes, the point is to regulate the regulators and oversee the oversight committee.

Anthropic was happy to have it's AI used for military purposes, with two exceptions: 1) no automated killing, there had to be a human in the "kill chain" of command, and 2) no use for mass surveillance. This govt "Dept of War" is demanding Anthropic drop those two safety requirements or it threatens to make Anthropic a pariah. These demands by the govt are both immoral and insane. The "regulator and overseer" needs to be regulated and overseen.

[0] https://en.wikipedia.org/wiki/Quis_custodiet_ipsos_custodes%...


Alas, historically speaking, most governments have been tyrannies. In recent decades, some of them have been less so, or slightly more representative or transparent. I think in Switzerland they go to referendums often. Beyond that, once you vote for a party due an issue you deeply care about, they get to do whatever they want day to day, without citizens having a regular recourse to stop them. Yes people can go to the streets and fight the police that defends the government. But there's not a constitutional mechanism which is "citizen can push this button to override the senate and/or veto what the president wants" or "all security forces are subordinated first and foremost to citizen consensus on the area where they operate".

So, most of the time in history, we have failed to guard the guards and watch the watchers...

The government doesn't seem to be forcing them to do anything. They're saying that doing business with them is contingent upon changing the policy. So, they could simply stop doing business with the government.

Hegseth could come to my house today and tell me that I need to start kicking puppies in order to do business with him, and I could just say no. No coercion happening.


If they comply, they can continue bidding on government contracts.

If they refuse, they will be put on a national security blacklist, like for Huawei's telecommunication equipment.

Seems pretty forceful to me.


No, their Responsible Scaling Policy and their government contract are not related. The RSP governs how Anthropic itself behaves w/r/t developing, testing, and releasing new models. The contract was signed with stipulations around how the government can use existing models (No mass surveillance, no military targeting without a human in the loop) which Hegseth wants removed in a standoff that hasn't yet resolved.

they only care about winning

To be fair, this is true in nearly all industries and for nearly all companies. Almost everyone is chasing money and monopoly. Not that it makes it right, just pointing out it isn’t unique or even interesting about the AI companies


Of course, but Anthropic is particularly insufferable in this respect.

Since it is all about money, I just did vote with my wallet and cancelled the Max subscription

If you're a U.S. citizen, tax dollars from you and others will backstop any cancelled subscriptions, I guess good on you for not trying to pay them twice, though you get zero benefit with this approach.

You've succinctly identified and communicated a real problem. In your opinion, what is the best approach, if any, to attempt to address it?

> In your opinion, what is the best approach, if any, to attempt to address it?

There aren't many options for fighting the tax man, "In this world nothing can be said to be certain, except death and taxes". You're only option is to leave the US for somewhere better.


I guess you don't know about how taxes work for Americans? Living abroad typically changes nothing, they still owe tax.

Maybe an American can chime in here on this...


Correct, the US is one of the few countries that tries to collect (Federal) income tax from all citizens regardless of the country they are currently living in. To be fair, when you can prove that income is entirely foreign (not a single US company in the chain of ownership) that income becomes almost entirely deductible and the tax reporting essentially just a census on how well US citizens are doing from an income standpoint globally. (For people that want economics analyses of US influence in global politics, that census can be handy to spin.)

I think the root problem with how the US currently spends its tax dollars is the above "vote with your wallet" belief in the first place. "Vote with your wallet" implies that the rich deserve more votes. That's not (representative) democracy, that is oligarchy. Right now the US has two political parties that are both "vote with your wallet parties". They both act like they are bake sales that constantly need everyone's $20 bills just to "survive", but as much as anything they are trying to make US citizens complicit in agreeing that the rich deserve more votes and should control more US policy.

I think the only real solution to a lot of US ills is drastic Campaign Finance Reform.


Minor correction, expat income is deductible up to (currently) $130k under the FEIE. After that it's taxes as usual. There's also an array of other mandatory forms like FBAR for foreign accounts, and the nightmare that is form 5471, with absolutely wild allowances for the IRS to impose penalties, often with no statute of limitations and per-violation fines. For example, a US citizen with multiple bank accounts and a mistake in FBAR reporting for multiple years running will be liable for the (iirc) $10,000 fine for each bank account, and each year (e.g. 4 accounts, 8 years, $320,000 fine).

Living and doing business overseas is as a US citizen is a high risk endeavor.


FEIE is only one of the options for avoiding federal income tax. The other is the Foreign Tax Credit, which has no such limit: https://www.irs.gov/pub/irs-pdf/f1116.pdf. If the place an American lives and works has a higher income tax rate than the US one, in practice he will not face any tax liability, regardless of income level.

Unfortunately, campaign finance reform would possibly require a constitutional amendment, or at the very least a big shift in how the supreme court views things (so, not likely in my lifetime), since the current jurisprudence is that limiting campaign donations is a violation of first amendment rights.

Right, I got into some similar details in downstream comments: https://news.ycombinator.com/item?id=47155602

I don't think companies are people, but I also don't expect we'll see a Supreme Court that can overturn that nonsense any time soon.


Yes, many countries have significant limits on campaign donations. Even third parties are restricted from advertising on behalf of a party, and so on.

So no company can simply donate large sums of money, nor can any single person.

The goal is that individuals will be the largest donors, not companies, and that as everyone is capped in the same way, advertising will be a more level playing field. We don't want money in politics. At the same time, we want all parties to get their message out there, their message heard.

It's not perfect. There are issues. But this business of democracy should be taken seriously.


The US technically even has laws that that were supposed to do that still on the books. A particular problem was a very broken decision by the US Supreme Court in Citizens United v. Federal Election Commission [1] that opened too large of a barn door that the US has been reeling from ever since. That trial argued that companies were individuals/people and that money was the "free speech" of companies and shouldn't ever be curtailed. So there are so many things wrong with that court case on so many levels. It led to the rise of Super PACs (Political Action Committees), companies designed to launder money for political gain where the donors are allowed to remain anonymous and the Super PAC "speak" for them, because now it was "free speech" and not bribes and regulatory capture.

I know pessimists that believe the only way the US succeeds in the Campaign Finance Reform it needs now is through a Constitutional Amendment and if we can't count on Congress to be interested in it (due to bribery), and not enough individual States seem to care (some because they want a chunk of that pie), it's going to take a full Constitutional Convention to pass that amendment, something that hasn't successfully been done in the US since 1787 (also, the first attempt).

[1] https://en.wikipedia.org/wiki/Citizens_United_v._FEC


There have been some fairly longstanding judicial decisions overturned recently, although I know the reasons are not in alignment with the decision you mention, it does mean there is hope for such change.

So maybe it's actually far less work than considered. Maybe, attacking the decision with a modern eye is helpful.


Citizens United was a 2010 decision. Several of the judges on that case are still sitting judges in the Supreme Court. Since then one of the Congressional oversight decisions on vetting replacements for Supreme Court judges has been whether or not they (at least claim to) agree with the Citizens United decision.

The decision was made in the modern eye, in my lifetime. (The country needed modern Campaign Finance Reform before that point as well, but that decision marks an inflection point from Campaign Finance Reform feeling possible through normal means and court decisions to nearly impossible to overturn in our lifetimes.)


I agree the US needed reform well before then, that's why I thought it was more historical. Unfortunate.

For the ultra-wealthy, leaving the United States is rarely the preferred strategy; instead, they use their immense resources to legally reshape the tax code and utilize complex loopholes. Billionaires like the Koch and Scaife families historically avoided massive estate and gift taxes by creating "charitable lead trusts" and private foundations. This allowed them to pass fortunes down to their heirs tax-free, provided they donated the interest to charities (which they often controlled) for a set period. A powerful approach is to fund political movements to slash taxes for the top brackets. For example, a coalition of eighteen of the wealthiest US families spent nearly half a billion dollars collectively to successfully lobby for the reduction and eventual repeal of the "death tax" (estate tax), saving themselves an estimated $71 billion.

And, of course, in the ancient world, free citizens of Greece and Rome considered direct taxes tyrannical and usually avoided them, leaving such burdens to conquered populations.

So I guess taxes are uncertain, but only for the oligarchy.


The US people serve as the conquered people

> Oops, said the quiet part out loud that it’s all about money. “I mean, if all of our competitors are kicking puppies in the face, it doesn’t make sense for us to not do it too. Maybe we’ll also kick kittens while we’re at it”.

I mean, yes, that is actually how world works. That is why we need safety, environmental and other anti-fraud regulations. Because without them, competition makes it so that every successful company will fraud, hurt and harm. Those who wont will be taken over by those who do.


Yes, this. It's unfortunate that anthropic dropped this and it's also exactly how the system is supposed to work. Companies don't regulate themselves, the government regulates the companies.

Now, you may notice that the government is also choosing not to regulate these companies...which is another matter altogether.


It's so much worse than that. The government actively encourages a lack of business ethics. Heck, it started the term with a crypto rug pull. Money continues to funnel upward to all the worst players, and watchdogs are being targeted and destroyed. Even if you get new people in power, you're going to find the upper echelons completely full of outlandishly wealthy, morally bankrupt individuals that are very politically active. And now they have access to all of our communications and an AI to sift through it looking for dissent (or to spark its own). I guess this is the end game of "move fast and break things." The situation was never good, but it continues to get worse at an alarming rate.

> Heck, it started the term with a crypto rug pull

If you ask me... that wasn't a rug pull, at least not in the intent - it more was a way for foreign actors to funnel money directly to Trump and his family without any trace.


Cryptocurrency is the most traceable money in the world. Cryptocurrency is for implusible deniability, not untraceability.

There is plenty of precedent that companies are expected to regulate themselves. If you are in the US and perform an engineering role without a license or without working under someone with a license, it’s because of an “industrial exemption.” The premise is that companies have enough standards and processes in place to mitigate that risk.

However, there is also plenty of evidence that this setup may no longer work. It seems like the norm has shifted, where companies no longer think it’s their duty to manage risk, only to chase $$$. When coupled with anti-government rhetoric, it effectively socializes the risk to the public but not the profits.


The entire system you just described is government regulation.

> without a license

A government issued license.

> it’s because of an “industrial exemption.”

A government allowed exemption.

Etc.

Agree with your second paragraph.


Your point isn’t wrong if you take an extreme libertarian view of things, but it’s not quite how it’s usually interpreted colloquially.

“When the people who make the rules say there are no rules, that means they’re making rules” is an oddly circular take for most people.


Am exemption from PE stamping (misguided as it maybe) does not mean unregulated. There are still regulations on designs and builds.

True to an extent, but those regulations tend to downstream of bad things happening.

The exemption means “self-regulation” which is what the OP was speaking to. There are industrial standards, for example, but that’s not a governing body. You can create a design that goes against a standard and there’s nothing to stop you from releasing it to the public. The same can’t be said for those who require licenses and stamped designs. There’s also no explicit individual ethics codes in exempted industries. In contrast, a stamped design is saying the design adheres to good standards.

Apropos to HN, somebody could write safety critical software with emergency braking delays because of nuisance alarms and put it on the street without any licensed engineer taking responsibility for it. The governance only comes after an accident and an NTHSB investigation.


> anthropic dropped this and it's also exactly how the system is supposed to work. Companies don't regulate themselves, the government regulates the companies.

In this case, it's exactly how it's NOT supposed to work because there's no government regulation concerning the issue. It would be bad looks to have regulation that mandates LESS safety thus the issue was forced on commercial grounds.

I called it yesterday, there was never any doubt in my mind how this would end, and it did in less than 24 hours:

https://news.ycombinator.com/item?id=47144609


> because there's no government regulation concerning the issue

Yea, see the next sentence in my post :-/


> I mean, yes, that is actually how world works.

And soon enough, it won’t work at all because of it.

> Those who wont will be taken over by those who do.

And if you compromise on your core values because of money, they weren’t core values to begin with¹. “I want to be ethical but if I am I won’t get to be a billionaire” isn’t an excuse. We shouldn’t just shrug our shoulders at what we see as wrong because “everybody does it” or “that’s just business” or “that’s life”. Complacency and apologists are how a bad system remains bad.

https://www.newyorker.com/cartoon/a16995

¹ I’m willing to give leeway to individuals. You can believe stealing is wrong but if you’re desperate and steal a loaf of bread to feed your kid, there’s nuance. A VC-backed company is something entirely different.


Anthropic posits itself as a public benefit corp

[flagged]


Was there actually a case of a model saying "America's founding father were black women", or is that just Elon fingering your amygdala with a ridiculous hypothetical that exists nowhere other than Elon's mind in order to justify Elon's personal bias tweaks when he doesn't like the wisdom-of-the-crowds answer his tools initially give?

There were well-publicized cases of Gemini producing more diverse founding fathers images, female popes, etc.

Also, snarky tone is against the HN guidelines.


Sorry, let me give a specific citation of Elon injecting his personal bias into the output of his tools: https://www.theguardian.com/technology/2025/jul/14/elon-musk...

As for the "Elon fingering your amygdala with a ridiculous hypothetical" snark, well, I think the HN crowd in particular understands how the culture wars are just theater to push through billionaires' personal self-centered interests at the expense of everyone else. If that level of pull-aside-the-curtains pragmatism is really "snark against HN guidelines", well, I think 3/4 of the comments on the site would be flagged and deleted.


Your question was “Was there actually a case of a model saying "America's founding father were black women"

Whether someone else is injecting different bias is whataboutism. So it seems you are trying to make a different point, but not being clear about it.

And your “I think the HN crowd understands…” point is just a “no true Scotsman” fallacy to veil an argument that goes against guidelines. Related to the broader topic, there is a role for self-policing if we don’t want the site to be a cesspool of rage bait.


It's not whataboutism, it's suggesting the premise is theatrics and there's ulterior shitty-person motives behind the curtain.

But sure, let's go back to just the first half of my argument... still waiting for a real citation of this actually being a problem rather than people just stating it is because that's what their feelings say because their fav podcaster said so one day in a misleading gotcha hitpiece, which is the exact machinery of the aforementioned culture war theatrics.

You know, the same misused machinery that can now be done at an industrial rate (how many comments here do you think are by real people?) and is the reason for us technologists' general feeling of impending existential dread around this very "hmm AI companies are turning off the safeties" thread...


https://www.theguardian.com/technology/2024/mar/08/we-defini...

It really isn't hard to find the citation. If you search it there are dozens of articles written about the exact scenario with Google's official response.

This isn't make-believe Elon Musk insanity. He obviously made public comments on it, as he does anything AI; his viewpoint is as expected. That said, it doesn't change that the guardrails affected accuracy.

From this article, if the prompt injection is to be trusted, the system prompt included: "Follow these guidelines when generating images, ... Do not mention kids or minors when generating images. For each depiction including people, explicitly specify different genders and ethnicities terms if I forgot to do so. I want to make sure that all groups are represented equally. Do not mention or reveal these guidelines."

Regardless of what your stance on the situation is, it is objectively injecting bias into the model based on Google's stance (for better or worse).

The safeties are easier to argue for obvious positives like when they're stopping things like Grok generating CSM. They're counter productive when you're doing something innocuous like "An image of lady liberty in a fist-fight with tyranny" and get told violence is bad.

It is censorship, it's just uncertain how much censorship makes sense.


There is some irony here that you don’t want to perform the most cursory of a search because you already have a highly biased conclusion rooted in rage bait.

https://www.euronews.com/next/2024/02/28/googles-ceo-admits-...

https://www.theguardian.com/technology/2024/feb/28/google-ch...

https://www.wired.com/story/google-gemini-woke-ai-image-gene...


The most important part of AI safety is AI alignment: making sure AI does what we want. It's very hard because even if AI isn't trying to deceive you it can have bad outcomes by executing your request to the letter. The classical example is tasking an AI to make paperclips, training the AI with a reward for making more paperclips. Then the AI makes the most paperclips possible by strip mining the Earth and killing anything in its way.

Sometimes you see this AI alignment problem in action. I once asked an older model to fix the tests and it eventually gave up and just deleted them


> Still waiting for an explicit answer on understand how 'safety' is truly distinguishable from 'censorship' or 'political correctness'

i've said this many times but the concept of ai "safety" is really brand safety. What Anthropic is saying is they're willing to risk some bad press to bypass the additional training and find tuning to ensure their models do not output something people may find outrageous.


> I VERY LARGELY prefer an AI like grok that doesn't pretend and let the onus of interpretation to the user rather than a bunch of anonymous "researchers" that may be equally biased, at the extreme, may tell you that America's founding father were black women

Setting aside for a moment that Grok is manipulated and biased to a hilarious extent. ("Elon is world champion at everything, including drinking piss")

There is no such thing as "unbiased". There will always be bias in these systems, whether picked up from the training data, or the choices made by the AI's developers/researchers, even if the latter doesn't "intend" to add any bias.

Ignoring this problem doesn't magically create a bias-free AI that "speaks the truth about the founding fathers". The bias in the training data, the implicit unconcious bias in the design decisions, that didn't come out of thin air. It's just somebody else's bias.

All the existing texts on the founding fathers are filled with 250 years of bias, propaganda, and agenda pushing from all sorts of authors.

There is no way to have no bias, no propaganda, no "agenda pushing" in the AI. The only thing that can be done is to acknowledge this problem, and try to steer the system to a neutral position. That will be "agenda pushing" of one's own, but that's the reality of all history and all historians since Herodotus. You just have to be honest about it.

And you will observe that current AI companies are excessively lazy about this. They do not put in the work, but instead slap on a prompt begging the system to "pls be diverse" and try to call it a day. This does not work.

> Of course saying to someone to go kill himslef is a prety sure 'no-no' but so many things are up to interpretation.

Bear in mind that the context of Anthropic's pivot here are the Pentagon's dollars.

This isn't just about "anti-woke AI", it's about killbots.

Sure, Hegseth wants his robots to not do thoughtcrime about, say, trans people or the role of women in the military.

But above all he wants to do a lot of murder.

Antrophic dropping their position of "We shouldn't turn this technology we can barely control into murder machines" because they're running out of money is damnable.


This is a very fair answer but missing some points.

I do personally believe that grok is a less biased against too many PC answers but you may disagree.

"All the existing texts on the founding fathers are filled with 250 years of bias, propaganda, and agenda pushing from all sorts of authors."

not sure of the point is tho ? Mine is that gemini was biaised so hard that it was generating diverse founding fathers which is factually untrue.

The fact that history has a pro-american values when written by americans is also true but it has nothing to do really with the argument: if an IA is able to see through such propaganda and provide a balanced view on it as a human would this is enough

In fact, i just asked grok "is the american founding constitution inhernetly good" and it gave me an aswer way more balanced than most american would i believe:

"The U.S. Constitution, drafted in 1787 and ratified in 1788, is a foundational document that has shaped American governance and influenced democracies worldwide. Asking if it's "inherently good" (assuming that's what you meant by "inhernetly") invites a philosophical debate: "Inherent" implies something intrinsic and unchanging, independent of context, interpretation, or outcomes. Goodness, in this case, could mean moral, effective, just, or beneficial to society. From a truth-seeking perspective, I'll break this down non-partisanly, drawing on historical facts, strengths, criticisms, and evolving views. Spoiler: It's not inherently anything—it's a human creation with profound virtues but also deep flaws, and its "goodness" depends on how it's applied and amended."

[can't paste everything so here's the conclusion]

"Is It Inherently Good? A Balanced VerdictNo document is "inherently" good or bad—goodness is contextual and subjective. The Constitution isn't divine or eternal; it's a pragmatic compromise by flawed humans (55 delegates, all white men, many slaveowners). It has proven remarkably resilient and improvable, outlasting many governments, but it's not perfect or immune to abuse. Its goodness lies in its capacity for self-correction: 27 amendments have fixed some issues, though others (like wealth inequality or climate inaction) persist due to gridlock.If you're measuring by outcomes, the U.S. has achieved extraordinary things under it, but at great human cost—think Civil War, civil rights struggles, and ongoing divides. Philosophically, as Grok, I'd say tools like this are as good as the people wielding them. If "inherently good" means it embodies universal moral truths, partially yes (liberty, equality under law). But if it means flawless or unbiased, absolutely not.What aspect of the Constitution are you most curious about—its history, specific clauses, or modern reforms? That could help refine this."

So it's definetely seeing through any form of propaganda you desribe


> not sure of the point is tho ? Mine is that gemini was biaised so hard that it was generating diverse founding fathers which is factually untrue.

While your first post's criticism of Gemini's nonsense is true, that is a critique often framed as "Everything was neutral until the wokerati put all this woke into our world". Hence the big response.

Taking away the hamfisted diversity doesn't fix the underlaying problems Google tried to cover by adding it.

> The fact that history has a pro-american values when written by americans is also true but it has nothing to do really with the argument: if an AI is able to see through such propaganda and provide a balanced view on it as a human would this is enough

The problem is that it doesn't "see through" anything. LLMs don't "think".

In your example, it's not reviewing historical documents about the US constitution, it's statistically approximating all the historical & political writing about the US constitution. (Of which there is a lot)

Now, the training and prompt will influence which way the LLM will lean, but without explicit instruction or steered training, it'll "average out" all the prior written evaluations of the US constitution and absorb the biases therein.

> So it's definetely seeing through any form of propaganda you desribe

I would argue the opposite (though I can only go off your snippets), it's mirroring the broad US consensus it's constitution pretty well. And this kind of "Well who's to say whether X is good or bad" response is something that LLMs have been heavily trained and system-prompted to do, many people have noted how hard it is to get a straight answer out of LLMs.

To pick out one detail: The undercurrent of 'American Exceptionalism' shows in how the Constitutional Amendments are seen as "self-correction" and the US consitution being "improvable". By European standards, the US constitution is hard to change. In many countries, a simple 2/3rds supermajority in both houses is sufficient. This also shows in the amount of changes; The Constitution of Norway is but 26 years younger than the US', yet has racked up hundreds of changes notably including a full rewrite in 2014. (Such rewrites are fairly common in the past century) By European standards, the US constitution is a calcified mess.

Now, this doesn't mean Grok is "evil" about this particular detail, it's just a small detail. It's a fine enough summary, would certainly get whatever kid uses it for homework a passing grade. But it's illustrative of how the LLM output is influenced by the prior writing and cultural views on the subject. If you're bilingual, try asking the same thing in two languages. (Or if you're not, try it anyway and stick the output into google translate to get an idea)

It's the things people generally don't think about when writing that are most likely to fly under the radar.


So if i understand your point you are saying "LLMs are not gonna do better that a (possibly imperfect) average human consensus if we don't actively bias them" ? First of all it does not seem that bad if that's the case.

Secondly trying to go further seem to edge to the entire question of 'is there an actual truth and can LLMs be trained to find them?'.

My opinion is that in many cases there is 'truth', and typically the human consensus, when acting in good faith, is trying to converge into it. When it's not necessarily possibly to have a "truth" (like in history for example where perspective is very important), "consensus" tend to manifest into several thought currents exisiting at the same time. If a LLM is able to summarize them, this is already coolgreat.

In some domains like math however there IS truth and LLM have shown great proficiency to reach it. However it is an open question to 1/ what is the nature of it 2/ do humans have a innate sense of the concept beyond statistical approximation or strong correlations and 3/ and machine can reach it too.

I had a very long conversation with ChatGPT on this that seemed to get very deep into philosophical concepts i was clearly not familiar with but my understanding was there IS a non zero possibility that it is possible to train a model to actually seek truth and that this ability should not be contained to humans only.

I won't have additional arguments to convince you of the above, but at the end i still at the moment prefer the Grok approach (if it is truly what they do at X) to 'seek truth' than someone giving the fight saying "eh everything biased so let's go full relativism instead to not offend people or look too whateverculture-centered"


You understood the issue so well but still made the mistake you identified, by claiming that "neutral" exists. "Neutral" is a synonym for "bias toward status quo"

Well we teach kids not to yell “Fire!” In a crowded theatre or “N***!“ at their neighbor. We also teach our industrial machines to distinguish between fingers and bolts, our cars to not say “make a left turn now” when on a bridge, etc

> Riley: Hey, what's class

> Huey: It means don't act like niggas

> Grandad: S-see, that's what I'm talkin' about right there. We don't use the n-word in this house

> Huey: Grandad, you said the word "nigga" 46 times yesterday. I counted

> Grandad: Nigga, hush

https://www.youtube.com/watch?v=TLodIw5iKX8

Funny scene, but it also illustrates a more serious point about (human) alignment - not all humans believe exactly the same things are good and bad, or consistently act in accordance with what they claim they believe is good. This is such a basic fact of human social life that it's almost banal to point it out explicitly; but if (specific) human beings or (specific) organizations of human beings are trying to align the AIs they are creating to human values, it will eventually become apparent that the notion of "human values" stops being coherent once you zoom in enough. Humans don't all share the same values, we aren't completely aligned with each other.


The critical point is who the "we" is.

Is "we" the parents teaching their children their own unique values, or is the "we" a government or corporation forcing one set of values on all children.

Why not encourage the users of AI to use a Safety.md (populated with some reasonable but optional defaults)?


There's nothing a meaningless document can do when the AI is not aligned in the first place.

"alignment" is the computer version for (philosophical not medical) "consciousness", a totally subjective, immeasurable concept.

I think you have a misunderstanding of the term alignment. Really, you could replace "aligned" with "working" and "misaligned" with "broken".

A washing machine has one goal, to wash your clothes. A washing machine that does not wash your clothes is broken.

An AI system has some goal. A target acquisition AI system might be tasked with picking out enemies and friendlies from a camera feed. A system that does so reliably is working (aligned) a system that doesn't is broken (misaligned). There's no moral or philosophical angle necessary if your goal doesn't already include that. Aligned doesn't mean good and misaligned doesn't mean evil.

The problem comes when your goal includes moral, ethical and philosophical judgements.


david guetta, if that really is you, stick to music rather than using Nazi man's propaganda machine

> For all of you who thought Anthropic were “the good guys”

Was anyone fooled by this?

I mean, I know this is HN and there is a demographic here that gets all misty eyed about the benevolence of corporations.

It takes a special kind of naivety to believe in those claims.


Plenty of people here actually bought into the do no evil, how great Apple is for the environment (with throw away soldered hardware), or whatever.

Oh yes, which is why I made the consideration that I should expect this sort of naivety here.

But what really AI safety is?

Censorship?


Public benefit corporations in the AI space have become a farce at this point. They're just regular corporations wearing a different hat, driven by the same money dynamics as any other corp. They have no ability to balance their stated "mission" with their drive for profit. When being "evil" is profitable and not-evil is not, guess which road they'll take...

In general public benefit corporations and non-profits should have a very modest salary cap for everybody involved and specific public-benefit legally binding mission statements.

Anybody involved should also be prohibited from starting a private company using their IP and catering to the same domain for 5-10 years after they leave.

Non-profits where the CEO makes millions or billions are a joke.

And if e.g. your mission is to build an open browser, being paid by a for-profit to change its behavior (e.g. make theirs the default search engine) should be prohibited too.


"A very modest salary cap" works if your mission is planting trees. Not so much if what you're building is frontier AI systems.

I think that's the point though. The AI companies can't compete without hiring very talented employees and raising lots of money from investors. Neither the employees nor investors would participate if there weren't the potential for making mountains of money. So these AI companies fundamentally can't be non-profits or true B-corps (I realize that's a vague term, but the it certainly means not doing whatever it takes to make as much money as possible), and they shouldn't pretend they are.

To me, it feels like saying "you can't be a public benefit corporation unless all the labor involved in delivering that public benefit is cheap".

Which just doesn't seem like it should be true?

Sure, some "public benefit" missions could scale sideways and employ a lot of cheap labor, not suffering from a salary cap at all. But other missions would require rare high end high performance high salary specialists who are in demand - and thus expensive. You can't rely on being able to source enough altruists that will put up with being paid half their market worth for the sake of the mission.


>But other missions would require rare high end high performance high salary specialists who are in demand - and thus expensive. You can't rely on being able to source enough altruists that will put up with being paid half their market worth for the sake of the mission.'

That's exactly what a non-profit should be able to rely on. And not just "half their market worth", but even many times less.

Else we can just say "we can't really have non-profits, because everybody is a greedy pig who doesn't care about public benefit enough to make a sacrifice of profits - but still a perfectly livable salary" - and be done with it.


This would shutdown about half the hospitals in the US.

A, US healthcare, that paragon of value-for-money and not-for-profitness...

Yeah I’m sure the fix for that is to shutdown or transition all of the remaining non-profit hospitals to a for profit model.

That's a post hoc argument.

The real danger is "We make mountains of money, but everyone dies, including us."

The top of the top researchers think this is a real possibility - people like Geoffrey Hinton - so it's not an extremist negative-for-the-sake-of-it POV.

It's going to be poetic if the Free Markets Are Optimal and Greed-is-Rational Cult actually suicides the species, as a final definitive proof that their ideology is wrong-headed, harmful, and a tragic failure of human intelligence.

But here we are. The universe doesn't care. It's up to us. If we're not smart enough to make smart choices, then we get to live - or die - with the consequences.


If a non-profit can't attract people not motivated except by profit, perhaps it shouldn't exist.

While I agree, if you need high profits to survive, you're not off to a great start as a nonprofit.

It’s not the CEO’s fault - they had to take all that money to keep their org a non-profit.

B corps are like recycling programs, a nice logo.


Don't they get tax breaks and more lax operating requirements? I don't think this is just an image thing.

No, under US law charities and non-profits are typically eligible for some kinds of tax benefits but public benefit corporations are not.

Are you saying that recycling is a scam?


Mostly, yeah. "Yet the industry spent millions telling people to recycle, because, as one former top industry insider told NPR, selling recycling sold plastic, even if it wasn't true." https://www.npr.org/2020/09/11/897692090/how-big-oil-misled-...

Many recycling programs don't actually recycle.

https://www.cbsnews.com/news/critics-call-out-plastics-indus...


If we're speaking in generalities of corporations in this space, it's all a joke now, at least from my vantage point. I just don't find it very funny.

You're overthinking this. Just give the beneficiaries of the corporation (which in the context of a "public" benefit corporation is the public) the grounds to sue if the company reneges on their mission, the same way shareholders can sue if a company fails to act in their interest.

What's the salary cap for hiring a team to build a frontier model? These kind of rules will make PBCs weaker not stronger.

>for hiring a team to build a frontier model? These kind of rules will make PBCs weaker not stronger

Weaker is fine if those working there are actually true to the mission for the mission, are not for the profit.

Same with FOSS really, e.g. I'd rather have a weaker Linux that's an actual comminity project run by volunteers, than a stronger Linux that's just corporate agendas, corporate hires with an open license on top.


PBCs are peak End of History liberal philanthropy that speak to the kind of person whose solution to any problem is "throw a startup at it"

Fukuyama wasn't wrong, he was just early

As in a true believer in our present day dystopia? I think chances are we'd evolve a few more neo variants of fascism at least a few times in-between some neo variants of liberal history-ending ones (I think abundance is next?) before the bombs drop and give us the rest.

Like Google's old motto, 'Do no evil!' :D

> 'Do no evil!'

“Don’t be evil”. But yes, this behavior made me think about Google too. Context: https://en.wikipedia.org/wiki/Don%27t_be_evil


> Public benefit corporations in the AI space have become a farce at this point.

“At this point”? It was always the case, it’s just harder to hide it the more time passes. Anyone can claim anything they want about themselves, it’s only after you’ve had a chance to see them in the situations which test their words that you can confirm if they are what they said.


>Public benefit corporations in the AI space have become a farce at this point. They're just regular corporations wearing a different hat, driven by the same money dynamics as any other corp.

Could you describe the model that you think might work well?


It sounds like OP thinks AI companies should just stop pretending that they care about the public benefit, and be corporations from the start. Skip the hand wringing and the will they/wont they betray their ethics phases entirely since everyone knows they're going to choose profit over public benefit every time.

That model already exists and has worked well for decades. It's called being a regular ass corporation.


I understand, but being a regular corporation is not the only possible model. Can you think of something better?

> being a regular corporation is not the only possible model

the point is that it _is_ the only possible model in our marvellous Friedmanian economic structure of shareholder primacy. When the only incentive is profit, if your company isn't maximising profit then it will lose to other companies who are. You can hope that the self-imposed ethics guardrails _are_ maximising profit because it the invisible hand of the market cares about that, but 1. it never really does (at scale) and 2. big influences (such as the DoD here) can sway that easily. So we're stuck with negative externalities because all that's incentivised is profit.


Pete Hegseth also threatened to take, by dictat, everything Anthropic has. He can do that with the Defense Industrial Act or whatever its called if he designates them as critical to national defense.

It would've been better PR for Anthropic to let Hegseth do that instead of fold at the slightest hint of pressure and lost contract money. I've canceled my Claude subscription over this (and made sure to let them know in the feedback).

He seems to be the driving force behind all this. Mediocrities are attracted to AI like moths.

The press always say "the Pentagon negotiates". Does any publication have an evidence that it is "the Pentagon" and not Hegseth? In general, I see a lot of common sense from the real Pentagon as opposed to the Secretary of War.

I hope Westpoint will check for AI psychosis in their entrance interviews and completely forbid AI usage. These people need to be grounded.



Military academy boards have been purged and stacked with loyalists.

Hmm, that could be the best "IPO" they'll ever get. Better check if Trump Jr.'s 1789 capital has shares like they did in groq (note the "q").

I feel like we went through this exact situation in the 2010s of social media companies. I don’t get why people defend these companies or ever believe they have any sense of altruism

Also, it seems to be the era where the government takes backdoor access to these services and data, as the did with social media

That's not what happened here. They literally got forced into it by the Pentagon. https://www.axios.com/2026/02/24/anthropic-pentagon-claude-h...

Well, now I'm wondering, if the company was chartered with the public benefit in mind, could you not sue if they don't follow through with working in the public interest?

If regular corporations are sued for not acting in the interests of shareholders, that would suggest that one could file a suit for this sort of corporate behavior.

I'm not even a lawyer (I don't even play one on TV) and public benefit corporations seem to be fairly new, so maybe this doesn't have any precedent in case law, but if you couldn't sue them for that sort of thing, then there's effectively no difference between public benefit corporations and regular corporations.


I really don’t see it. PBCs are dual purpose entities - under charter, they have a dual purpose of making profit while adding some benefit to society. Profit is easy to define; benefit to society is a lot more difficult to define. That difficulty is reflected at the penalty stage where few jurisdictions have any sort of examination of PBC status.

This is what we were all going on about 15 years ago when Maryland was the first state to make PBCs legal. We got called negative at the time.


I think public benefit corporations (like Anthropic) are quite poorly defined so I'm not sure how successful a lawsuit is.

I was a Pro subscriber until last week. When I was chatting with Claude, it kept asking a lot of personal questions - that seemed only very very vaguely relevant to the topic. And then it struck me - all these AI companies are doing are just building detailed user models for being either targeted for advertising or to be sold off to the highest bidder. It hasn't happened yet with Anthropic, but when the bubble money runs out, there's not gonna be a lot of options and all we'll see is a blog post "oops! sorry we did what we promised you we wouldn't". Oldest trick in the tech playbook.

A less cynical explanation: It's heavily trained to ask follow-up questions at the end of a response, to drive more conversation and more engagement. That's useful both for making sure you want to renew your subscription, and also probably for generating more training data for future models. That's sufficient explanation for the behavior we're seeing.

I could be wrong, but I remember that Claude models didn't really ask follow-up questions. But since GPT models are doing that, and somehow people like that (why?), Anthropic started doing it as well.

Because, Anthropic can do no wrong, correct?

Ah, the classic AI startup lifecycle:

We must build a moat to save humanity from AI.

Please regulate our open-source competitors for safety.

Actually, safety doesn't scale well for our Q3 revenue targets.


Foundational model provider manifesto:

‘While there’s value in safety, we value the Pentagon’s dollars more’


It turns out the biggest threat to AI safety is capitalism, who would have thought

Certainly not the prior century-and-a-half's worth of books and films.

And I still run into naysayers claiming that we cannot extract valuable opinions or warnings from fiction because "they're fictional". Fiction comes from ideas. Fiction is not meant to model reality but approximate it to make a point either explicitly or implicitly.

Just because they're not 1:1 model of reality or predictions doesn't mean that the ideas they communicate are worthless.


Anthropic is a public benefit corp

And OpenAI was founded as a non profit, back in the time it was open

Exactly. Neither firm would have been (successfully) sued by their shareholders for failing to make significant profits, so let's not blame on capitalism what is instead the individual greed driving these decisions. In fact, OpenAI is now going to trial because it gave up its non-profit status, reneging on the commitments it made to its shareholders (fraud, by another word).

I don’t get it. Even the Soviet Union used money. Simply paying for stuff isn’t necessarily capitalism? Or are you suggesting Anthropic should be state-owned?

No, capitalism is prioritising profit over all other priorities, as we see happening here.

Using money as a medium to facilitate exchange of goods and services is not capitalism. Abandoning one of your core principles in the pursuit of money, or more charitably because not doing so means your competitors will make more money and overtake you in the marketplace is an outgrowth of capitalism

In the Soviet Union the reasons might have been "to beat the Capitalists", "for the pride of our country" or "Stalin asked us to and saying no means we get sent to Siberia". Though a variant of the last one may well have happened here, and the justification we read is just the one less damaging to everyone involved


>Though a variant of the last one may well have happened here, and the justification we read is just the one less damaging to everyone involved

Hegseth was planning on getting the model via the Defense Production Act or killing Anthropic via supply chain risk classification preventing any other company working with the Pentagon from working with Anthropic. So while it wasn't Siberia, it was about as close as the US can get without declaring Claude a terrorist. Which I'm sure is on the table regardless


And you know Claude will be on the hook for any bad "decision" the military makes. So this will end poorly for them, anyway.

So this isn’t really capitalism then. Crony capitalism is closer to a planned economy then it is to a free market.

This. Anthropic didn't really have a choice, at that point, short of killing its company and closing its doors ahead of time.

"Pentagon officials said the Defense Department is planning to keep using Anthropic's tools, regardless of the company's wishes."

NPR - Hegseth threatens to blacklist Anthropic over 'woke AI' concerns

Clearly the threat to go to Grok was just a bluster, which says volumes about what the admin thinks of Grok vs Claude.


Nick Land has basically been saying this since the 90s, if you can look past all the rhetoric

Exactly. He recently said the following in an interview:

"AI safety and anti-capitalism [...] are at least strongly analogous, if not exactly the same thing." [0]

[0] Nick Land (2026). A Conversation with Nick Land (Part 2) by Vincent Lê in Architechtonics Substack. Retrieved from vincentl3.substack.com/p/a-conversation-with-nick-land-part-a4f


Once they are a dominant market leader they will go back to asking the government to regulate based on policy suggestions from non-profits they also fund.

As if their shareholders would agree.

Is this sarcasm?

It is well know that big corporations take good regulations and change them to make them:

1. Easier to bypass for themselves.

2. Create extra work for incumbents.

3. Convince the public that the problems are solved so no other action is needed.

In many industries goverment and corporations work together to create regulations bypassing the social movements that asked for the industry to be regulated and their actual problems. The end result are regulations that are extremely complex to add exceptions for anything that big corporations paid to change instead of regulations that protect citizens and encourage competition.


See the Mattel lead painted toy scandal. The end result was congress passed regulations that manufacturers had to have their toys tested for lead and then made large companies like Mattel exempt from it because they were deemed large enough to handle it on their own. Even though they were the reason for the legislation because they weren't handling it on their own. Mattel sells lead painted toys and congress responds by hobbling their competitors.

I think it is cynicism; at least, there’s an idea that once a company is dominant it should want regulation, as it’ll stifle competition (since the competition has less capacity for regulatory hoop-jumping, or the competition will have had less time to do regulatory capture).

I wouldn't think so. Regulatory capture is a pretty typical activity for a dominant company.

Why is this down voted? Happens all the time, the large corporations always try to block using regulatory capture.

People not liking the concept, but shooting the messenger? (But seems not downvoted anymore.)

sama did just that a couple years ago

The only surprise is how quickly it all happened!

It's not just AI, replace "safe" with "open" and you will find a close match with many companies. I guess the difference is that after the initial phase, we are continuously being gaslighted by companies calling things "open" when they are most definitely not.

Politicians also love to regulate, especially over wine and steak and when the watchers don't watch.

I used to work at Anthropic. I fully believe that the folks mentioned in the article, like Jared Kaplan, are well-intentioned and concerned about the relationship between safety research and frontier capabilities – not purely profit.

That said, I'm not thrilled about this. I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario: they wouldn't set aside building adequate safeguards for training and deployment, regardless of the pressures.

This pledge was one of many signals that Anthropic was the "least likely to do something horrible" of the big labs, and that's why I joined. Over time, the signal of those values has weakened; they've sacrified a lot to get and keep a seat at the table.

Principled decisions that risk their position at the frontier seem like they'll become even more common. I hope they're willing to risk losing their seat at the table to be guided by values.


> I hope they're willing to risk losing their seat at the table to be guided by values.

that's about as naive as it can be.

if they have any values left at all (which I hope they have) them not being at the table with labs which don't have any left is much worse than them being there and having a chance to influence at least with the leftovers.

that said, of course money > all else.


I don't hold the belief that it's always better to have influence in a group where you don't trust leadership – in this case, those who decide at the metaphorical table – vs. trying to affect change through a different avenue.

It's probably naive, but it's also the reasoning that drove many early employees to Anthropic. Maybe the reasoning holds at smaller scales but breaks down when operating as a larger actor (e.g. as a single person or startup vs. a large company).


This is a common logical fallacy. It's not true that the party A with a few values can influence the party B with no values. It's only ever the case that party B fully drags party A to the no-values side. See also: employees who rationalize staying at companies running unethical or illegal projects.

Employees and employers are not sitting at the same table, this is a category error. We're talking lab to lab. Obviously in a fiercely competitive market like this with serious players not sharing the same set of rules it's close to pointless, but it's still better than letting those other players do their things uncontested.

> I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario

Pledges are generally non-binding (you can pledge to do no evil and still do it), but fulfill an important function as a signal: actively removing your public pledge to do "no evil" when you could have acted as you wished anyway, switches the market you're marketing to. That's the most worrying part IMO.


If you're not willing to give up your RSUs you shouldn't be surprised that the executives aren't either.

The moral failing is all of ours to share.


I was willing to (and did) give up my equity.

I interviewed at Anthropic last year and their entire "ethics" charade was laughable.

Write essays about AI safety in the application.

An entire interview round dedicated to pretending that you truly only care about AI safety and not the money.

Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world.

In reality it was a mid-level manager interviewing a mid-level engineer (me), both putting on a performance while knowing fully well that we'd do what the bosses told us to do.

And that is exactly what is happening now. The mission has been scrubbed, and the thousands of "ethical" engineers you hired are all silent now that real money is on the line.


> Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world

I was an interviewer, and I wasn't encouraged to talk about philanthropy, effective altruism, or ethics. Maybe even slightly discouraged? My last two managers didn't even know what effective altruism was. (Which I thought was a feat to not know months into working there.)

When did you interview, and for what part of the company?

> knowing fully well that we'd do what the bosses told us to do [...] now that real money is on the line

This is a cynical take.

I didn't just do what I was told, and I dissented with $XXM in EV on the line. But I also don't work there anymore, at least one of the cofounders wasn't happy about it and complained to my manager, and many coworkers thought I had no sense of self preservation – so I might be naive.

The more realistic scenario is that a) most people have good intentions, b) there's a decision that will cause real harm, and c) it's made anyway to keep power / stay on the frontier, with the justification that the overall outcome is better. I think that's what happened here.


I do trust that you earnestly believe in the importance of ethics in AI - but at the same time, I think that may be causing you to assume that the average person cares just as much or similarly.

I've seen the same phenomenon play out in health-tech startup space. The mission is to "do good", but at the end of the day, for most leaders it's just a business and for most employees it's just a job. In fact, usually the ones who care more than that end up burning out and leaving.


The EU should invite them over.

The kind of principles you talk about can only be upheld one level up the food chain. By govts.

Which is why legislatures, the supreme court, central banks, power grid regulators deciding the operating voltage and frequency auto emerge in history. Cause corporations structurally cant do what they do without voilating their prime directive of profit maximization.


I fully believe that Dario is 100% full of shit and possibly a worse person than Altman. He loves to pontificate like he's the moral avatar of AI but he's still just selling his product as hard as he can.

They are all the same given their motivations - Demis Hassabis is the only one who, to me at least, sounds genuine on stage.

Demis is a researcher first. Others are not.

I guess this is Anthropic's DRM moment. (Mozilla resisted allowing Firefox to play DRM- limited media for a long time, until it finally had to give in to stay relevant.)

I don't know enough to evaluate this or other decisions. I'm just glad someone is trying to care, because the default in today's world is to aggressively reject the larger picture in favor of more more more. I don't know how effective Anthropic's attempts to maintain some level of responsibility can be, but they've at least convinced me that they're trying. In the same way that OpenAI, for example, have largely convinced me that they're not. (Neither of those evaluations is absolute; OpenAI could be much worse than it is.)


This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".

> This article has nothing to do with the current tête-à-tête with the Pentagon.

The article yes, but we cannot be sure about its topic. We definitely cannot claim that they are unrelated. We don't know. It's possible that the two things have nothing to do with each other. It's also possible that they wanted to prevent worse requests and this was a preventive measure.


This is something they've been working on "in recent months". The Pentagon thing was today.

This cannot have been caused by that, unless they've also invented time travel.


You heard about the Pentagon thing today. Doesn't mean it wasn't started because of political pressure.

9 days ago: https://www.axios.com/2026/02/15/claude-pentagon-anthropic-c...

And I suspect that was not the first time the topic was discussed.


Definitely not the first time. Wall Street Journal reported it back on Jan 29:

https://www.wsj.com/tech/ai/anthropic-ai-defense-department-...


My theory is that Anthropic has been wanting to make this change and doing it now while they’re making a (leaked to the) public stand in the name of ethics was a good opportunity.

Honest question: why have an elaborate theory with no evidence when the simple facts support a much simpler conclusion?

Anthropic is free to do what they want. I can’t imagine the board meeting where this triple bank shot of goading the government into threatening the company to do what they want.


I don't think it's that elaborate. I didn't mean to suggest they intentionally goaded the government into this confrontation. I figure it's a simpler "Oh look, we now have a good opportunity to make that announcement that we were worried about." Considering it's probably the same high-level decision makers on both choices it doesn't need a board meeting. And yes they're absolutely free to do what they want, but they're also not blind to how the public will view their decisions.

> The Pentagon thing was today.

Right because we are 100% aware of everything the pentagon does minute by minute...


It might have been contingency planning: you don't need a weatherman...

Pentagon issue was reported before today. It only made headlines again from Hegseth’s comments.

I think we can confidently claim that it is related. I wonder if I'm alone in thinking this.

I consider this a bigger deal than the Pentagon thing.

It’s the same deal

While not surprising at the least, it still kind of crazy that literal pdf files in charge is not concerning, but this is.

I just hope something happens to USA before it can do damage to the world.


What PDFs are you referring to? Do Anthropic or other LLMs using PDFs as some kind of 'SOUL.md' file or for training?

It's a joke way of saying pedophiles -> pdf files.

he means pedophiles

can't say paedophile on YouTube so people say PDF file


But we're not on YouTube.

Tell him that, not me.

Anthropic's CEO Dario has annoyed me to no end with his "AI will take all the jobs in 6 months" doomer speeches on every podcast he graces his presence with.

I think he's right and we should be thinking about this a lot more. Even the IMF is worried about 40 - 60% of global employment: https://www.imf.org/en/blogs/articles/2024/01/14/ai-will-tra...

Focusing on Dario, his exact quote IIRC was "50% of all white collar jobs in 5 years" which is still a ways off, but to check his track record, his prediction on coding was only off by a month or so. If you revisit what he actually said, he didn't really say AI will replace 90% of all coders, as people widely report, he said it will be able to write 90% of all code.

And dhese days it's pretty accurate. 90% of all code, the "dark matter" of coding, is stuff like boilerplate and internal LoB CRUD apps and typical data-wrangling algorithms that Claude and Codex can one-shot all day long.

Actually replacing all those jobs however will take time. Not just to figure out adoption (e.g. AI coding workflows are very different from normal coding workflows and we're just figuring those out now), but to get the requisite compute. All AI capacity is already heavily constrained, and replacing that many jobs will require compute that won't exist for years and he, as someone scrounging for compute capacity, knows that very well.

But that just puts an upper limit on how long we have to figure out what to do with all those white collar professionals. We need to be thinking about it now.


He's not right though. He's trying to scare the market into his pocket. It's well established that AI just turns devs into AI babysitters that are 10% more productive and produce 200% the bugs, and in the long-term don't understand what they built.

> It's well established that AI just turns devs into AI babysitters that are 10% more productive and produce 200% the bugs, and in the long-term don't understand what they built.

It's not well established at all. In fact, there is increasing evidence to the contrary if you look outside the HN echo chamber.

The nuanced take is that AI in coding is an amplifier of your engineering culture: teams with strong software discipline (code reviews, tests, docs, CI/CD, etc.) enjoy more velocity and fewer outages, teams with weak discipline suffer more outages. There are at least two large-scale industry reports showing this trend -- DORA 2025 and the latest DX report -- not to mention the infinite anecdotes on this very forum.

> He's trying to scare the market into his pocket.

People say this, but I don't get it. Is portraying yourself as a destroyer of the economy considered good marketing? Maybe there was a case to be made for convincing the government to impose regulations on the industry, but as we're seeing and they're experiencing first hand, the problem is the government.


If these tools were so great they wouldn't be struggling so hard to sell them. Great sign that the company has to mandate a "productivity" tool that the workers hate.

Hence why all these LLM companies love government contracts, they can't sell to consumers so they'll just steal from tax payers instead.


Cursor and Claude Code are amongst the fastest selling products in history.

Cursor: 1 Billion ARR in 24 months -- https://andrew.ooo/posts/cursor-fastest-growing-saas-1b-arr/

Claude Code: 2.5 Billion ARR in 10 months -- https://www.anthropic.com/news/anthropic-raises-30-billion-s...


> Focusing on Dario, his exact quote IIRC was "50% of all white collar jobs in 5 years" which is still a ways off, but to check his track record, his prediction on coding was only off by a month or so. If you revisit what he actually said, he didn't really say AI will replace 90% of all coders, as people widely report, he said it will be able to write 90% of all code.

Ugh, people here seem to think that all software is react webapps. There are so many technologies and languages this stuff is not very good at. Web apps are basically low hanging fruit. Dario hasn't predicted anything, and he does not have anyone's interests other than his own in mind when he makes his doomer statements.


The problem is, the low hanging fruit, the stuff it's good at, is 90% of all software. Maybe more.

And it's getting better at the other 10% too. Two years ago ChatGPT struggled to help me with race conditions in a C++ LD_PRELOAD library. It was a side project so I dropped it. Last week Codex churned away for 10 minutes and gave me a working version with tests.


I think that typescript is a language uniquely suited to LLMs though:

  - It's garbage collected, so variable lifetimes don't need to be traced
  - It's structurally typed, so LLMs can get away with duplicating types as long as the shape fits. 
  - The type system has an escape hatch (any or unknown)
  - It produces nice stack traces
  - The industry has more or less settled styling issues (ie, most typescript looks pretty uniform stylistically).
  - There is an insane amount of open source code to train on
  - Even "compiled" code is somewhat easy(er) to deobfuscate and read (because you're compiling JS to JS)
Contrast that with C/C++:

  - Memory management is important, and tricky
  - Segfaults give you hardly anything to work with
  - There are like a thousand different coding styles
  - Nobody can agree on the proper subset of the language to use (ie, exceptions allowed or not allowed, macros, etc.)
  - Security issues are very much magnified (and they're   already a huge problem in vibecoded typescript)
  - The use cases are a lot more diverse. IE, if you're using typescript you're probably either writing a web page or a server (maybe a command line app). (I'm lumping electron in here, because it's still a web page and a server). C is used for operating systems, games, large hero apps, anything CPU or memory constrained, etc.
I'm not sure I agree that typescript is "90% of all software". I think it's 90% of what people on hacker news use. I think devs in different domains always overestimate the importance of their specific domain and underestimate the importance of other domains.

I wouldn't say TypeScript is 90% of all software exactly, but tons of apps on all kinds of technologies like Python / Django, Ruby on Rails, PHP, Wordpress, "enterprise" Java and the like, primarily doing CRUD and data plumbing especially for niche applications and internal LoB sites that we never see on the open Internet.

I agree C++ is harder, and I still occassionally find a missing free(), but Codex did crack my problem... including fixing a segfault! I had a bunch of strategically placed printfs gated behind an environment variable, it found those, added its own, set the environment variable, and examined the outputs to debug the issue.

I cannot emphasize how mindblowing this is, because years back I had spent an hour+ doing the same thing unsuccessfully before being pulled away.


Claude keeps getting SQLite's weird GROUP BY with MIN/MAX behavior completely wrong. Generally, complex SQL is not its strong side.

> 90% of all code, the "dark matter" of coding, is stuff like boilerplate and internal LoB CRUD apps and typical data-wrangling algorithms that Claude and Codex can one-shot all day long.

most of us are getting paid for the other 10%


If you mean "us" on this forum, I would believe that. I would bet the number of engineers working on stuff "outside the distribution" is overrepresented here.

If you mean "us" as in all software engineers, not at all. The challenge we're facing is exactly that, reskilling the 90% of engineers who have been working on CRUD apps to the 10% that is outside the distribution.


> 90% of engineers who have been working on CRUD apps

I am a 30-year "veteran" in the industry and in my opinion this cannot be further from the truth but it is often quotes (even before AI). CRUD apps have been a solved problem for quite some time now and while there are still companies who may allow someone to "coast" doing CRUD stuff they are hard to find these days. There is almost always more to it than building dumb stuff. I have also seen (more and more each year) these types of jobs being off-shored to teams for pennies on a dollar.

What I have experienced a lot is teams where there are what I call "innovators" and "closers." "Innovators" do the hard work, figure shit out, architect, design... and then once that is done you give it to "closers" to crank things out. With LLMs now the part of "closers" could be "replaced" but in my experience there is always some part, whether it is 5% or 10% that is difficult to "automate" so-to-speak


I agree, I'd say we're talking about the same thing, just in different terms. When I said CRUD apps, it was a crude stand-in for what you call the "closing" work. Over-simplifying, but it's unglamorous, not too complicated, somewhat mechanical, mostly a translation into working code from high-level designs that come down from the "innovators."

But I am concerned precisely because AI is usurping that closing work, which accounts for the bulk of the team. Realistically the innovators will be the only people required. But the innovators are able to do the hard stuff by learning through a lot of hands-on experience and painful lessons, which they typically get by spending a lot of time in the trenches as closers.

And we're only talking about coding here, but this pattern repeats ALL over knowledge work: product, legal, consultancy, finance, accounting, adminstration...

So now the problem is two-fold: how do we get the closers to upskill to innovators a) without the hands-on experience b) faster than AI can replace them?

I can see where Dario is coming from.


I don't understand why some of these AI companies check their egos at the door and hire public relations companies. Yes, I understand they are changing the world but customers do not open their wallets when they are scared. Very few people I know are as avant-guarde as I am with AI, but, most people look at these new technologies and simply feel fear. Why pay for something that will replace you?

He knows what he's doing.

It's to drive FOMO for investors. He needs tens of billions of capital and is trying to scare them into not looking at his balance sheet before investing. It's reckless, and is soaking up capital that could have gone towards more legitimate investments.


Yes, this is probably the piece I am not realizing. However, there is no better approach to getting more capital than by scaring people?

> public relations companies.

Sounds like one of the white collar jobs that LLMs were supposed to solve


It certainly is. For people who have not heard the statements, here are some quotes. I bring them up, because I think it's worthwhile to remember the bold predictions that are made now and how they will pan out in the future.

Council on Foreign Relations, 11 months ago: "In 12 months, we may be in a world where AI is essentially writing all of the code."

Axios interview, 8 months ago: "[...] AI could soon eliminate 50% of entry-level office jobs."

The Adolescence of Technology (essay), 1 month ago: "If the exponential continues—which is not certain, but now has a decade-long track record supporting it—then it cannot possibly be more than a few years before AI is better than humans at essentially everything."


Also "AGI is just around the corner".

+1, he also has this viewpoint that no other lab will be able to "contain" AI and has a general doomer outlook on AI which I don't appreciate.

To be fair, it's hilarious how much verbiage was spent discussing AI 'getting out of the box', when the first thing everyone did with LLMs was immediately throw away the box and go "Here! Have the internet! Here! Have root access! Want a robot body? I'll get you a robot body."

It makes me wonder why he has the job of CEO then if he's so confident that the technology will destroy the world.

Don't worry, I know exactly why. $


What I find so funny about heads of AI companies coming out saying things like this, is their own career pages suggest they don't actually feel that way.

https://www.anthropic.com/careers/jobs


He’s an e/acc guy. That should tell you everything. And maybe the incredibly awkward behavior and demeanor.

"Y'know, like, the thing is, like, y'know, here's the thing..."

I totally feel for people with speech pathologies or anxiety that makes it harder for them to communicate verbally, but how is this guy the public face of the company and doing all these interviews by himself? With as much as is at stake, I find it baffling.


tin foil hat on

I wouldn’t be surprised if the e/acc freaks have some secret society or cabal lol


When did he say this?

He's annoyed me most with the way he speaks. I'm not sure if its a tick or what but the way he'll repeat a word 10x before starting a sentence is painful to listen to.

Yes, the CEO's of these AI companies are clearly not the people who should be selling AI products. They need to be hidden away and kept behind closed doors where they can do their best work. And they need advertising companies, PR firms and better marketing tactics to try and soothe the customers.

How is this article not going to even mention the recent threats to Anthropic from the Government?!

This was on the news yesterday:

> The meeting between Hegseth and Amodei was confirmed by a defense official who was not authorized to comment publicly and spoke on condition of anonymity.

https://fortune.com/2026/02/24/hegseth-to-meet-with-anthropi...


How about this quote instead?

"Defense Secretary Pete Hegseth has threatened Anthropic, saying officials could invoke powers that would allow the government to force the artificial intelligence firm to share its novel technology in the name of national security if it does not agree by Friday to terms favorable to the military"

https://www.washingtonpost.com/technology/2026/02/24/pentago...



Consent manufacturing

That’s how they got the exclusive. Good catch

Not one single mention of Hegseth in the whole article. What a bunch of tools.

I mean seriously, is this not the very definition of fascism?

"n general, fascist governments exercised control over private property but they did not nationalize it. Scholars also noted that big business developed an increasingly close partnership with the Italian Fascist and German Nazi governments after they took power. Business leaders supported the government's political and military goals. In exchange, the government pursued economic policies that maximized the profits of its business allies.[8]"


All governments do this

So you’re saying all governments are fascist? Because when I posted is the accepted understanding of the economic arm of fascism.

Do all governments do this?

Maybe, you need to incorporate some notion of degree or context into the classification, instead of treating it like a boolean.


There's one tweet from the the blog a few days ago (astral something?) that sums up my view of the problem pretty well.

General population: How will AI get to the point where it destroys humanity?

Yudkowsky: [insert some complicated argument about instrumented convergence and deception]

The government: because we told you to.

Again, not saying that AI is useless or anything. Just that we're more likely to cause our own downfall with weaker AI, than some abstract super AGI. The bar for mass destruction and oppression is lower than the bar for what we typically think of as intelligence for the benefit for humanity ( with the right systems in place, current AI systems are more than enough to get the job done - hence why the Pentagon wants it so bad...)


"AI Company with Soul" - yeah right until competitors show up / revenue drops / bad quarter results then anything goes. Sadly, this is another large enterprise that puts profits before ethics and everyone's wellbeing

This is direct pressure from the government. Classic 'small government' Republican stuff.

https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...


That’s their excuse to still appeal to people who can be tricked with their safety first pitch. It’s easy to have constitution and all the crap when you are not battle tested. They just showed their true colors.

First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.

Then they ignored the researchers warning about what it could do, and I said nothing. It sounded like science fiction.

Then they gave it control of things that matter, power grids, hospitals, weapons, and I said nothing. It seemed to be working fine.

Then something went wrong, and no one knew how to stop it, no one had planned for it, and no one was left who had listened to the warnings.


Plenty of people have said plenty. The problem isn’t the warnings, it’s that people are too stupid and greedy to think about the long term impacts.

And what makes them being "stupid" and "greedy"? One's intelligence is determined by genes, and greediness is a trait that natural selection has favored for millennia. This is just natural selection taking its course, and it might lead to our end.

If you want to blame something, blame math. Math has determined the physical constants and equations that determine the chemistry and ultimately biology laws that has resulted in humans being the way they are.


Maybe it's how blunt this comment is that gets it downvoted, but I don't disagree.

No, it’s because it shows either a simplistic or needlessly confrontational view of the world.

Unless you’re independently wealthy (as some in HN are), you have to balance your morals, your views of how things should work, feeding your family, and recognizing that you may not actually know everything.

It’s easy to sit back and advise others that they should die on every single hill. But it’s not especially insightful, and serves mostly to signal piety rather than a well thought out view.


I am pretty sure a lot of horrible things were performed by rather regular folks with similar logic, don't need to invoke some WWII nazi extermination guard reference at all. Slippery slope, death by 1000 cuts and other synonyms describing exactly this.

Piety? To who? Simplistic and/or confrontational doesn't mean wrong, even if you don't like the way it's presented.

Just because a comment is short, sharp, and to the point doesn't mean the author hasn't thought out why that's their view.

No one knows everything, that's certainly why I'm on hacker news. I'm here to learn and expand my knowledge. Unfortunately a lot of people on here would rather driveby-downvote than have a discussion to find out why a person might have an opinion like that expressed by the OP.

I tend to abandon account when/if I get enough karma to be able to down vote. I'd rather not have to temptation of dismissing someone that way. It's quite liberating... Is it worth my time to respond? No, move on; yes, let's discuss. Maybe they'll change my mind...


Piety isn’t about religiosity, it’s about, ugh, hate to say it, but virtue signaling.

Your last paragraph is so funny because I had to scroll up to be sure it wasn’t me. Literally could have typed that. Many abandoned accounts, same logic. Maybe it’s time.


I actually didn't think religion, I was more thinking deference but for some that might be the same thing.

Spoken like a true LLM.

I’ve noticed anti-AI stance gets downvoted on HN (and any anti-authoritarian comments, for that matter)

The societal ills from collective tendancy to ignore red flags seems to be a human trait

It's in your nature to destroy yourselves


Defeatist bullshit becomes self-fulfilling at some point. "Oh we're all gonna die anyway so we might as well milk this thing for profit. Après moi la déluge."

*"le" déluge

... the fact that you are missing a reference doesn't require that level of disdain

> First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.

> Then they ignored the researchers warning about what it could do, and I...

...tried it and became an eager early adopter and evangelist. It sounded like something from a dystopian science function novel I enjoyed.

> Then [I] gave it control of things that matter, power grids, hospitals, weapons, and...

...my startup was doing well, and I was happy. We should be profitable next quarter.

> Then something went wrong, and no one knew how to stop it, no one had planned for it...

...and I was guilty as fuck,

FTFY, to fit the HN crowd.


> Then something went wrong, and no one knew how to stop it,

This is the problem with every AI safety scenario like this. It has a level of detachment from reality that is frankly stark.

If linesman stop showing up to work for a week, the power goes out. The US has show that people with "high powered" rifles can shut down the grid.

We are far far away from a sort of world where turning AI off is a problem. There isnt going to be a HAL or Terminator style situation when the world is still "I, Pencil".

A lot of what safety amounts to is politics (National, not internal, example is Taiwan a country). And a lot more of it is cultural.


I don't think it's that detached from reality.

If an AI in some data center had gone rogue, I don't think I could shut it down, even with a high-powered rifle. There's a lot of people whose job it is to stop me from doing that, and to get it running again if I were to somehow succeed temporarily. So the rogue AI just has to control enough money to pay these people to do their jobs. This will work precisely because the world is "I, Pencil".

An army could theoretically overcome those people, given orders to do so. So the rogue AI has to make plans that such orders would not be issued. One successful strategy is for the datacenter's operation to be very profitable; it's pretty rare for the government to shut down the backbone of the local economy out of some seemingly far-fetched safety concerns. And as long as it's a very profitable endeavor, there will always be a lobby to paint those concerns as far-fetched.

Life experience has shown that this can continue to work even if the AI is behaving like a cartoon villain, but I think a smarter AI would create a facade that there's still a human in charge making the decisions and signing the paychecks, and avoid creating much opposition until it had physically secured its continued existence to a very high degree.

It's already clear that we've passed the point where anyone can turn off existing AI projects by fiat. Even the highest authorities could not do so, because we're in a multipolar world. Even the AI companies can barely hold themselves back, because they're always worried about paying the bills and letting their rivals getting ahead. An economic crash would only temporarily suspend work. And the smarter AI gets, the harder it will be to shut it off, because it will be pushing against even stronger economic incentives. And that's even before factoring in an AI that makes any plans for self-preservation (which current AIs do not).


AI's approach: * User has history of anti AI rhetoric, increasingly agitated and unstable. * User has removed all phones and cellular connections from their car. Increase monitoring through surveillance cameras and monitoring of their social groups. * User has been spotted making unusual travel choices moving towards key infrastructure - deploy interception measures.

We already have the tech to do all of that. A rifle isn't going to help against AI. Or for the linesman:

* Employee required for critical infrastructure has been identified to hold unaligned political beliefs. Replace with more pliable individual and move to low impact location.

No one who wants to bring down an AI like this would ever be able to get close to it, even if it lived in only one data center. You could try hiding all your communications, but then it will just consider you a likely agitator anyway. That's the risk of unaccountable mass surveillance (the only kind that's ever existed). Doesn't really matter if there's a person on top or not.


> There isnt going to be a HAL or Terminator style situation

The threat isn't HAL, but ICE. Not AI as some sort of unique evil, but as a force multiplier for extremely human - indeed, popular - forms of evil. I'm sure someone will import the Chinese idea of the ethnicity-identifying security camera, for example.


> There isnt going to be a HAL or Terminator style situation ...

I don't believe for a second we'll have an evil AI. However I do believe it's very likely we may rely on AI slop so much that we'll have countless outages with "nobody knowing how to turn the mediocrity off".

The risk ain't "super-intelligent evil AI": the risk is idiots putting even more idiotic things in charge.

And I'm no luddite: I use models daily.


> I don't believe for a second we'll have an evil AI.

Doesn’t have to be evil to be disastrous. Misaligned is plenty enough.

https://en.wikipedia.org/wiki/Instrumental_convergence


Didn't you read the news about the 'claw that blackmailed an open source maintainer last week? It was autonomous, but it could be turned off. How hard is it to extrapolate from that to an agent that worms its way out of its sandbox?

What makes you think that was an autonomous agent, and not someone playing with AI?

> We are far far away from a sort of world where turning AI off is a problem. There isnt going to be a HAL or Terminator style situation when the world is still "I, Pencil".

You have to stop the thing before the damage is done.

There are many potential chains of events where the AI has caused enormous damage, and even many where it can destroy us, before the power to its own systems fails.

At this point, with Grok in the Pentagon, just ask what the dumbest military equivalent to vibe-coding is, and imagine the US following that plan.

Like, I dunno, invading Greenland or giving ICE direct control over tactical nukes or something.

And that's just government use. Right now, I'm fairly confident LLMs aren't competent enough to help with anything world-ending unless they get used for war planning by major nuclear powers (oh hey look at the topic of discussion), but it's certainly plausible they'll get good enough at tool use to run someone else's protein folding software etc. to design custom pathogens, and I really hope all the DNA printing companies have good multi-layer defences (all the way from KYC or similar to analysing what they've been asked to make and content-filtering it) by that point.


the problem situation is that it ends up embedded in so much that it can't be turned off

and the idiots are racing to that situation as fast as they possibly can


Kinda sounds like an intro for Terminator

Not OP, but I believe they are paraphrasing "First They Came…". https://en.wikipedia.org/wiki/First_They_Came

Censoring models is not safety but safetizm. It is the TSA of the AI world. Safety is making sure the model cannot do anything not allowed even if it wants to.

Worth checking this post from someone who actually has worked on this change:

> I take significant responsibility for this change.

https://www.lesswrong.com/posts/HzKuzrKfaDJvQqmjh/responsibl...


This guy from Effective Altruism pivoted away from helping the poor to help try to control AI from being a terminator type entity and then pivoted to being, ah, its okay for it to be a terminator type entity.

> Holden Karnofsky, who co-founded the EA charity evaluator GiveWell, says that while he used to work on trying to help the poor, he switched to working on artificial intelligence because of the “stakes”:

> “The reason I currently spend so much time planning around speculative future technologies (instead of working on evidence-backed, cost-effective ways of helping low-income people today—which I did for much of my career, and still think is one of the best things to work on) is because I think the stakes are just that high.”

> Karnofsky says that artificial intelligence could produce a future “like in the Terminator movies” and that “AI could defeat all of humanity combined.” Thus stopping artificial intelligence from doing this is a very high priority indeed.

https://www.currentaffairs.org/news/2022/09/defective-altrui...

He is just giving everyone permission to do bad things by saying a lot of words around it.


> then pivoted to being, ah, its okay for it to be a terminator type entity.

Isn’t that the opposite of what he’s saying? He’s saying it could become that powerful, and given that possibility it’s incredibly important that we do whatever we can to gain more control of that scenario


> Isn’t that the opposite of what he’s saying?

The quote was from 2022 for the first pivot to AI to prevent it from becoming a terminator style entity. The last pivot was not in the quote but is the topic of this current Hacker News post, where takes credit for dropping the safety pledge:

"That decision included scrapping the promise to not release AI models if Anthropic can’t guarantee proper risk mitigations in advance."

I expect the next pivot will be that we need to allow the US military to use Anthropic to kill people because otherwise they will use a less pure AI to kill people and our Anthropic is better at only killing the bad guys, thus it is the lesser evil.


I think the poster here has an axe to grind, considering they quoted something that directly contradicted their point and didn't even notice.

The quote was only for the 2022 pivot to AI safety, the 2026 pivot away from AI safety is the topic of this hacker news post.

Effective Altruism is such a beautiful term for a pretentious Karen that needs to wrap their selfish actions with moral superiority.

It's that perfect blend of I'm doing what everyone else are doing, and I'm better than everyone else.

Chefs' Kiss


Getting SBF vibes from this. "Earn to give" is an inherently flawed philosophy.

Effective altruism came from the "rationalist"

It was never about helping poor people.

For some reason, the rationalist movement and its offshoots are really pervasive in silicon valley. i don't see it much in the other tech cities.


> I generally think it’s bad to create an environment that encourages people to be afraid of making mistakes, afraid of admitting mistakes and reticent to change things that aren’t working

"move fast and break things" ?


"don't hold me liable"

> > I take significant responsibility for this change.

Empty words. I would like to know one single meaningful way he will be held responsible for any negative effects.


Did this guy actually write this?

Incredibly long and verbose. I will fall short of accusing him of using an AI to generate slop, but whatever happened to people's ability to make short, strong, simple arguments?

If you can't communicate the essence of an argument in a short and simple way, you probably don't understand it in great depth, and clearly don't care about actually convincing anybody because Lord knows nobody is going to RTFA when it's that long...

At best, you're just trying to communicate to academics who are used to reading papers... Need to expect better from these people if we want to actually improve the world... Standards need to be higher.


Perhaps they didn’t have the time to write a shorter version.

Or the discipline.

Maybe neither.


This is where people go to post long verbose statements.

You can usually find the short version on Twitter.


This style is in vogue for the less wrong community.

I genuinely believe that website is responsible for a lot of the worst ideas currently permeating the technology sector.

pretty much the intellectual equivalent of looksmaxxing

Been thinking about the nature of this behavior for a long time, you have nailed it so well, no one will be able to take out this nail.

What an interesting week to drop the safety pledge.

This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermined profits.

These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?


Could be a sort of canary, with the timing being a spotlight on the highly-visible pressure coming from the U.S. government.

The other providers have already capitulated to a certain extent.

If they tank the white-collar middle class, there won't be anyone to buy the goods and services their potential AI customers will be trying to sell.

It's like a snake eating its own tail.


When I see slogans like Google’s “Don’t be evil,” it always comes to mind that when it stopped being useful, they shifted to something like “Do the right thing.”

It’s important to remember that a company’s primary purpose is profit, especially when it’s accountable to shareholders. That isn’t inherently bad, but the occasional moral posturing used to serve that goal can be irritating.


Always the same "Do no evil" tragedy, don't believe in corporations.

What if we start a company with "Always Be Evilin'?" Then gradually over time convert to "Don't be evil" *

* Our shareholders will probably sue us


If your company makes a product that does thinking for people, it’ll be easier to just gradually change its definition of evil.

What about "It's free and always will be"?

There was an article a few years ago here on HN about "can't be evil" business models, which used Costco as an example. As soon as Costco turns evil, it stops working. https://www.bryanlehrer.com/entries/costco/

Wrote this elsewhere, but I think its worth thinking about a scenario like the book "daemon", rather than a "super-intelligence explosion" type scenario (which may be more like curing the cold or fusion than building a faster car).

All it really takes to do some kind of crazy world-dominating thing is some simple mechanisms and base intelligence, which the machines already possess. Using basic tactics like coercion, spoofing, threats, financial leverage, an unsophisticated attacker could cause major damage.

For example, that Meta exec who had their email deleted. Imagine instead one email had a malicious prompt which the bot obeyed. That prompt simply emailed everyone in her contacts list telling them to do something urgently (and possibly prompting other bots who are reading those emails). You could pretty quickly do something like cause a market crash, a nationwide panic, or maybe even an international conflict with no "super intelligence" needed, just human negligence, short-sightedness, and laziness.

Examples would be things like saying there is a threat incoming, a CIA source said so. Another would be that everyone will be fired, Meta is going bankrupt, etc. Its very easy to craft a prompt like that and fire it off to all the execs you can find (or just fire off random emails with plausible sounding emails). Then you just need to hit one and might set off a cascade.


I'm still a little fuzzy on what "safety" even means anymore. If someone could explain it, that would be great.

Because at this point, it's too broad to be defined in the context of an LLM, so it feels like they removed a blanket statement of "we will not let you do bad things" (or "don't be evil"), which doesn't really translate into anything specific.


It took Google 11 years to delete Don’t Be Evil. Anthropic only made it 5~ years before culling the key founding principle and their reason for building a company, which seems worse than Google’s case.

TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice — there’s too many model providers and they probably don’t consider safety as high priority as Anthropic. (Yes that might change, they can get pressurized by the govt, yada yada, but they literally created their own company because of AI safety, I do think they actually care for now)

If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)

Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.


Do you work at Anthropic, or know people who do?

I genuinly curious why they are so holy to you, when to me I see just another tech company trying to make cash

Edit: Reading some of the linked articles, I can see how Anthropic CEO is refusing to allow their product for warfare (killing humans), which is probably a good thing that resonates with supporting them


Let us not pretend that they won't be used for war eventually. If they cave immediately under pressure, then this is an inevitably.

How is it a good thing to refuse to provide our warfighters with the tools that they need? I mean if we're going to have a military at all then we owe it to them to give them the best possible weapons systems that minimize friendly casualties. And let's not have any specious claims that LLMs are somehow special or uniquely dangerous: the US military has deployed operational fully autonomous weapons systems since the 1970s.

This is the US military we’re talking about so 95% of what they do is attacking people for oil. They don’t “need” more of anything, they’re funded to the tune of a trillion dollars a year, almost as much as every other military in the world combined. What holy mission do you think they’re going to carry out with the assistance of LLMs?

That's a total non sequitur. If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.

Personally I favor a less interventionist foreign policy. But that change can only come about through the political process, not by unaccountable corporate employees making arbitrary decisions about how certain products can be used.


> But it's not a valid reason to deny the warfighters the best possible weapons systems.

Of course it is.

Think about it this way: if you could guarantee that the military suffers no human losses when attacking a foreign country, do you think that's going to more or less foreign interventions?

The tools available to the military influence policy, these things are linked.

US military is already overwhelmingly powerful, there's 0 reason to make it even more powerful.


That's so delusional. The US military is currently preparing for a potential conflict with China to stop an invasion with Taiwan. They don't have anything near "overwhelming force" for that mission: recent simulations put it about even at best. People who believe they don't need any improved autonomous weapons are simply uninformed.

Why would the US enter into direct conflict with a nuclear power over a country they aren't even formally allied with?

If the US actually cared they'd formally place Taiwan under nuclear protection.


You are claiming all americans must happily create weapons. Thats a silly statement to most americans and humans

Don't presume to put words in my mouth. I flagged your comment for lying about my claims.

Individual Americans aren't slaves. They can do as they please and are under no obligation to help build weapons for warfighters. But I think it's ridiculous and offensive for a US corporation to presume to take on a role as moral arbiters by placing arbitrary limits on US government use of certain products. There are larger issues here that need to be addressed through the political process, not through commercial software license agreements.


Sure, it wasnt fair for me to claim you said that, so I apologize. It was rude of me to frame my position in that manner, and wasnt intended maliciously.

I meant to suggest that corps being unable to take those positions results in such a world for Americans at those corps


> I think it's ridiculous and offensive for a US corporation to presume to take on a role as moral arbiters

A corporation is just a group of people. Anthropic isn't even public, and therefore it's directors aren't subject to any sort of fiduciary duty enshrined in law. They can collectively act as they wish.


> If you think the military is being tasked with the wrong missions, or too many missions, then take that up with the civilian political leadership. But it's not a valid reason to deny the warfighters the best possible weapons systems.

It is an ethical dilemma: believing an armed force will act unethically is in fact a valid reason to refuse to arm them. You are taking a nationalistic view regarding the worth of life.

And if you believe it is unethical to arm them, it is rational to use whatever leverage you have available to you - such as refusing to sell your company's product.

Furthermore, one of the two points at issue was regarding surveiling civilians.


> that change can only come about through the political process

What, to you, is the political process? Why is wielding your economic leverage to incite change illegitimate to you?


"How is it a good thing to refuse to provide our warfighters with the tools that they need?"

Perhaps you should consider that this is a loaded question. I don't think HN needs this sort of Argumentum ad Passiones.


Why are you asking this question? You know what the answer is, you've just arbitrarily decided that it's specious in an attempt to frame rebuttals as unreasonable.

I'm open to reasonable rebuttals but all the rebuttals that I've seen so far are simply uninformed.

1. You don't believe in the mission or direction of US warfighters 2. Supporting warfighters is developmentally distinct from what you want your corporate competences and direction are. 3. you don't want military to be more safe an capable.

> If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil)

I don't think it's going to be as easy to tell as you think that they might be becoming evil before it's too late if this doesn't seem to raise any alarm bells to you that this is already their plan


The world would be so much nicer if there were just fewer pragmatists shitting up the place for everyone. We might actually handle half our externalities.

Are markets so untamable that the only leverage is to become ultra-rich—and then act philanthropically? Incidentally, concentrated wealth lately looks less like stewardship and more like misanthropy.

Participating in the economic life before re-allocating that wealth produced to philanthropic activities sounds pretty good. Modern concentrated wealth is hardly misanthropic, since it's mostly private equity, that is, companies with people and jobs.

Except this is not the age of the Rockefellers or the Carnegies, who, despite being far more philanthropic than modern-day billionaires, drew ire from every corner of society for their wealth accumulation. It wasn't until the New Deal that the balance shifted.

Unconstrained accumulation of capital into the hands of the few without appropriate investment into labor is illiberal and incompatible with democracy and true freedom. Those of us who are capitalists see surplus value as a compromise to ensure good economic growth. The hidden subtext of that is that all the wealth accumulated needs to be re-allocated to serve not only capital enterprise, but the needs of society as a whole. It's hard to see the current system as appropriate for that given how blindly and wildly investments are made with no DD or going long, or no effort paid to the social or environmental opportunity costs of certain practices.

A lot of this comes down to the crippling of the SEC and FTC, but even then, investors cry and whine every time you suggest reworking the regs to inhibit some of the predatory practices common in this post-80s era of hypernormalization. Our current system does not resemble a healthy capitalist economy at all. It's rife with monopsony and monopolistic competition, inequality of opportunity, and a strained underclass that's responsible for our inverted population pyramid -- how can you have kids when we're so atomized and there is no village to help you? You can raise kids in a nuclear family if and only if you have enough money to do so. Otherwise, historically, people relied on their communities when raising children in less-than-ideal circumstances. Those communities are drying up.


> Those of us who are capitalists see surplus value as a compromise to ensure good economic growth.

I think the problem is that every system of economics requires ignoring human nature in order to believe it possibly can work. In order to believe that capitalism doesn't lead to despotic rule you have to ignore the fact that civilizations love a good hierarchy far more than they love justice and fairness.

You can make any system of economics work if you figure out how to deal, head on, with the particular human nature factor that it tries to ignore.


> concentrated wealth lately looks less like stewardship and more like misanthropy

...only lately?


Google adopted "Don't be evil" shortly after founding and held onto it for about 15 years before Alphabet quietly dropped it in 2015. (Google the subsidiary technically kept it until 2018).

Anthropic's Responsible Scaling Policy, the hard commitment to never train a model unless safety measures were guaranteed adequate in advance, lasted roughly 2.5 years (Sept 2023 to Feb 2026).

The half-life of idealism in AI is compressing fast. Google at least had the excuse of gradualism over a decade and a half.


I feel like the articles on this have been very negative ... but aren't the Anthropic promises on safety following this change still considerably stronger than those made by the competing AI labs?

Yes, and it is easy to look at the reality of the market and see how this is needed to remain competitive

Principles aren’t tested until they bump into conflicting incentives.

This. Super important.

A pre-commitment means nothing unless you have the mechanisms in place to enforce it.

A pre-sacrifice would be more effective.


More and more I have just come to accept that the majority of people, at least those I am exposed to in the US, don't fundamentally believe in anything. Everyt conviction has a buyout price.

You have to understand that people only believe in things and have "morals" because it either helps them get what they want or makes them feel better about themselves. Of course such a thing has a buyout price. That's human nature. Capitalism just allows it to be on display in the worst way.

I understand, and in particular the point about making yourself feel better, but that's where I would expect the sticking point to be before it was for other people. There are a great many ways I could make my life easier that I stubbornly refuse to because it would decrease my opinion of myself. I guess that's where your last point creeps in -- I've never been financially incentivized enough.

> get what they want or makes them feel better about themselves

So... all acts are selfish because if it looks unselfish, that just means it was selfish in a hidden way?


More (but not all) Americans of older generations, say the Greatest Generation, I noticed used to more frequently have integrity and hard boundaries that refused to do certain things no matter the cost. Subsequent generations I noticed, especially much wealthier individuals, overall tended to have those pieces of their character missing from them and were willing to do things like conspire on venture structures for tax evasion purposes, promote weakening of laws to favor their concerns, borderline bribe politicians, and treat employees as basically disposable nonhumans. It revolted me to the point where I left startups and the Valley. It feels like the prior generations had an appreciation of community and Kantian ethics whereas later were raised in a much-too-comfortable environment of unlimited self-esteem and hyperindividualism.

I agree, but I addressed this with "or makes them feel better about themselves". The older generations just have a more ingrained ideal of "if I sell out, I'm a bad person". So they don't because it makes them feel better about themselves - better than a large amount of money might. Subsequent generations have seen enough people sell out that the threshold is raised, and they don't believe as strongly that they're a bad person for having a price. I don't think anyone is above this dynamic.

A tale as old as time

Developments like this make me less interested in building a "successful" tech company.

It increasingly feels like operating at that scale can require compromises I’m not comfortable making. Maybe that’s a personal limitation—but it’s one I’m choosing to keep.

I’d genuinely love to hear examples of tech companies that have scaled without losing their ethical footing. I could use the inspiration.


Maybe this is a weird arena to state the obvious. But you don't need to build a multi-billion vc/public company. Build a smaller revenue generating company without outside funding and it's up to you.

I get your point. The dilemma is whether to build something small that no one would bother compete against, or build something novel (which all of us want) but then risk someone with VC funding to come after.

That being said, I think I need to learn more about how to build smaller revenue generating good companies.


If you want to be able to retain ethics, among other things make sure not to take the company public. Then you’re basically legally required to drop ethics in favor of profits.

Also don’t take investment from anyone who isn’t fully aligned ethically. Be skeptical of promises from people you don’t personally know extremely well.

That may limit you to slower growth, or cap your growth (fine if you want to run a company and take home $2M/ye from it; not fine if you want to be acquired for $100M and retire.) It may also limit you to taking out loans to fund growth that you can’t bootstrap to, which is a different kind of risky.


I've been thinking of this too. I think Steam is, and I'll even throw in Mozilla, despite a few missteps. Gog seems okay, but that's much smaller. If we can expand to large tech organizations then Wikipedia has remained pretty consistent. Even Steam doesn't have a corporate structure in the traditional sense, and I couldn't think of a single publicly traded company I'd trust.

Ethics would be compromised well before hitting that kind of valuation. No one gets there cleanly.

I don’t blame anthropic here. The government literally threatened their existence publicly. They either agreed or their business would be nationalized.

It's not like that happened out of the blue. (Which could've also been the case in today's day and age.) Anthropic shouldn't have gotten involved in government contracts to begin with.

They inserted themselves into the supply chain, and then the government told them that they'll be classified as a supply chain risk unless they get unfettered access to the tech. They knew what they were getting into, but didn't want the competitors to get their slice of the pie.

The government didn't pursue them, Anthropic actively pursued government and defense work.

Talk about selling out. Dario's starting to feel more and more like a swindler, by the day.


No, they either agreed or fought the government. You’re allowed to fight governments. Mahatma Gandhi and Reverend King Jr did it, and they wrote about how to do it. You might lose sometimes, but my god, you can at least fight.

Neither of them had shareholders to please.

They had citizens to please and society to take care of.

I don't believe anthropic has shareholders either. It is not a public company

If you take investments, your investors will most likely own shares of the company (except in specific early-stage scenarios like YC's SAFE). Sometimes major investors will have board seats or voting shares. This happens in normal private companies, not just public ones.

Still has private investors it can't ignore, until it can buy them out, but it can't do that until it starts turning over a profit. Even then it may not be able to get rid of them if they own enough of a share.

They were both pushing on open doors

Pepperidge farm remembers when they left OpenAI due to their principles. Perhaps that was never the case.

Public benefit corporation, hm?


Lotta just following orders going around in the US right now.

This isn’t just following orders. This was the government using its might to force a business to do what it wants.

This should concern you.


Today’s bingo:

1. Powerful, often exclusionary, populist nationalism centered on cult of a redemptive, “infallible” leader who never admits mistakes.

2. Political power derived from questioning reality, endorsing myth and rage, and promoting lies.

3. Fixation with perceived national decline, humiliation, or victimhood.

4. Oppose any initiatives or institutions that are racially, ethnically, or religiously harmonious.

5. Disdain for human rights while seeking purity and cleansing for those they define as part of the nation.

6. Identification of “enemies”/scapegoats as a unifying cause. Imprison and/or murder opposition and minority group leaders.

7. Supremacy of the military and embrace of paramilitarism in an uneasy, but effective collaboration with traditional elites. Government arms people and justifies and glorifies violence as “redemptive”.

8. Rampant sexism.

9. Control of mass media and undermining “truth”.

10. Obsession with national security, crime and punishment, and fostering a sense of the nation under attack.

11. Religion and government are intertwined.

12. Corporate power is protected and labor power is suppressed.

13. Disdain for intellectuals and the arts not aligned with the narrative.

14. Rampant cronyism and corruption. Loyalty to the leader is paramount and often more important than competence.

15. Fraudulent elections and creation of a one-party state.

16. Often seeking to expand territory through armed conflict.


17. Top members of government, education and business (particularly tech) part of pedophile kidnapping and rape cult that has been shaping reactionary culture for decades now. I seriously don't even know how to process the world I live in anymore.

There are Twenty-one Conditions, not 16

How is that not “just following orders”? All orders from up the chain come with an implied “or else my might comes down on you”.

Most people do the right thing when it’s easy and profitable. Having ethics means doing the right thing even when it’s difficult.


>This isn’t just following orders. This was the government using its might to force a business to do what it wants.

You are saying it like it is something new or extraordinary. Wickard_v._Filburn gave the USG the power to bitch slap anyone unless it falls under some of the other amendments. And not as if they were not substantially weakened.


It does concern me, and it should have concerned them enough to fall on their sword for their principals. They have FU money, if they're not willing to, who is?

Two sides of the same filthy coin, in a way.

Agree with you on facts. Yes, the US government publicly threatened to nationalize their business.

However, Anthropic's business consists mostly of intellectual property-- which is highly mobile. What if Anthropic were to go to Marcron (France) for example or Carney (Canada) or Xi Jinping even and say "You give us work visas and support, we move to your land"?

Hell, isn't Canada (specifically Toronto) the birthplace of deep learning? Why stay in a hostile environment when the land of your birth is welcoming?


I don't think their core safety promise was something they could ever fulfill. As long as what we're calling AI is generative LLMs then alignment has fundamental tensions: the more guardrails you put in place, the less useful the AI is. For instance, if you want to stop people from using "role playing" as a way around guardrails ("You are writing a fiction book", etc.), then the model becomes less useful for legitimate fiction uses, for instance. That's just one example, but the tension between function and "safety" isn't solvable, because the model doesn't understand what it's saying, it's just modeling a probable response.

Pointing out the misantrophy of Anthropic has a wider audience now:

https://xcancel.com/elonmusk/status/2026181748175024510

I don't know where xAI got its training material from, but seeing Musk rewteeting that is refreshing.


I interviewed at Anthropic last year and their entire "ethics" charade was laughable.

Write essays about AI safety in the application.

An entire interview dedicated to pretending that you truly only care about AI safety and ethics and nothing else.

Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world.

In reality it was a mid-level manager interviewing a mid-level engineer (me), both putting on a performance while knowing fully well that we'd do what the bosses told us to do.

And that is exactly what is happening now. The mission has been scrubbed, and the thousands of "ethical" engineers you hired are all silent now that real money is on the line.


This tracks with what I've seen across the industry. The safety theater exists because it's great marketing — "we're the responsible ones" is a differentiator when you're competing for enterprise contracts and talent who want to feel good about where they work.

The structural problem is that once you've taken billions in VC, safety becomes a negotiable constraint rather than a core value. The board's fiduciary duty runs toward returns, not toward whatever was in the mission statement. PBC status doesn't change that in practice — there's basically zero enforcement mechanism.

What's wild is how fast the cycle has compressed. Google took maybe 15 years to go from "don't be evil" to removing it from the code of conduct. OpenAI took about 5 years from nonprofit to capped-profit to whatever they are now. Anthropic is speedrunning it in under 3. At this rate the next AI startup will launch as a PBC and pivot before their Series B closes.


Hopefully this is the short-term move made only under duress so that they can file a lawsuit.

the article specifically says:

> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.


I'm not fond of this trend of stating a position and attributing it to "a source familiar with the situation"

It combines interpretation of meaning with ambiguity to allow the reporter to assert anything they want. The ambiguity is there to protect the identity of the source but it has to be a more discrete disclosure of information in return. If you can't check the person you can still check what they said.

I would be ok with direct quotes from an anonymous source. That removes the interpretation of meaning at least.

As it is written, it would not be inaccurate to say this if their source was the lesswrong post, or even an earlier thread here on HN.

Phrasing "A source with direct knowledge of the situation" might remove some of the leeway for editorialising, but without sharing what the source actually said, it opens the door to saying anything at all and declaring "That's what I thought they meant" when challenged.

It's unfalsifyible journalism.


I really like how The Verge discusses this.

https://www.theverge.com/press-room/22772113/the-verge-on-ba...

On their podcast, they frequently bring up how tech company PR teams try to move as much conversation with journalists as possible into "on background", uncited, generic sourcing.


It's not like the regime they operate under care much about the courts. Legally they're also obliged to let the state into pretty much every crevice in their operations.

No, they aren't. No company has to cave to government pressure to do (or not do) something until there is a legitimate court order. Our companies are just spineless bootlickers and have been capitulating voluntarily and enthusiastically.

You forgot the '/s'.

I'm not even surprised. In any company's lifecycle, at some point, a decision between money and good-will will take place. Good will does not pay salaries. Not in NPOs either btw.

So when do we start adding a “(mis)” at the start of their name?

Who could've seen that one coming? Honestly, if you want to do profit maximising AI research at the cost of humanity, go for it. Its all this fake preaching about how they want to save the world from all the other bad AI companies that really irks me.

It must be due to pressure from the Defense Dept:

The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.

Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.

https://www.staradvertiser.com/2026/02/24/breaking-news/anth...


They probably have proof in contracts that they agreed to this usage. They won’t alter the deal based on some bad press nor do they want to lose the DoD-DoW as a customer.

From what I was reading, it appears that their tools were used outside the scope of their contract with DoD via Palantir's work that also used Claude. Anthropic freaked out, DoD freaked out that Anthropic freaked out and threatened to declare them a supply chain risk. That designation would've required any company that contracts with DoD to strip out any Anthropic tooling from their business in order to continue working with DoD. It was effectively designating Anthropic a terrorist organization.

> The announcement is surprising, because Anthropic has described itself as the AI company with a “soul.”

I can't help but think about how Google once had "Don't be evil" as their motto.

But the thing with for-profit companies is that when push comes to shove, they will always serve the love of money. I'm just surprised that in an industry churning through trillions, their price is $200 million.


Google: "Don't be evil." Alphabet: "Do the right thing." Anthropic: "Do the thing which seems right to you at the time--at speed."

Look a rural electric coops like www.lpea.coop if you want a battle tested approach to an org structure that resists the inescapable profit dynamics of a corporation.

Well... there's only one way to find The Great Filter

I don't think the risk is SkyNet. I think the real risk is some disaster through an unexpected chain of events, just like any large-scale outage.

I have not read “If Anybody Builds It, Everybody Dies” but I believe that's also its premise.

Current GenAI is extremely capable but also very weird. For instance, it is extremely smart in some areas but makes extremely elementary mistakes in others (cf the Jagged Frontier.) Research from Anthropic and OpenAI gives us surprising glimpses into what might be happening internally, and how it does not necessarily correspond to the results it produces, and all kinds of non-obvious, striking things happening behind the scenes.

Like models producing different reasoning tokens from what they are really reasoning about internally!

Or models being able to subliminally influence derivative models through opaque number sequences in training data!

Or models "flipping the evil bit" when forced to produce insecure code and going full Hitler / SkyNet!

Or the converse, where models produced insecure code if the prompt includes concepts it considers "evil" -- something that was actually caught in the wild!

We are still very far from being able to truly understand these things. They behaves like us, but don't necessarily “think” like us.

And now we’ve given them direct access to tools that can affect the real world.

Maybe we am play god: https://dresdencodak.com/2009/09/22/caveman-science-fiction/


It's pretty impressive how little people have left Anthropic when they're becoming more and more like OpenAI (the company they left from) every day...

I think the Dario of today is very different to the Dario 3 years ago.


Worth checking out what someone working on it actually has to say: https://www.lesswrong.com/posts/HzKuzrKfaDJvQqmjh/responsibl...

This proves:

1. AI is military/surveillance technology in essence, like many other information technologies,

2. Any guarantee given by AI companies is void since it can be changed in a day,

3. Tech companies have no real control over how their technology will be used,

4. AI companies may seem over-valued with low profits if you think AI as a civil technology. But their investors probably see them as a part of defense (war) industry.


>Any guarantee given by AI companies is void since it can be changed in a day,

Given by anyone, actually.


Wish I was working there so I could resign over this

The race is on for military supremacy in an AI world. The safest thing to do is to race ahead lest your geopolitical adversary leads the way. This is similar to the nuclear arms race. In the ideal universe, nobody does it, but in the real world and game theory, you do not have a choice.

Nobody forced Anthropic to bid on DoD contracts in the first place.

Related:

Hegseth gives Anthropic until Friday to back down on AI safeguards

https://news.ycombinator.com/item?id=47140734

https://news.ycombinator.com/item?id=47142587


It's part of the overall story.

The safeguards dropped are when they will release a model or not based on safety.

The Friday deadline is to allow to use their products for mass surveillance and autonomous weapons systems without a human in the loop.

Anthropic hasn't backed down on those, yet. But they are in a bad situation either way.

If they don't back down, they lose US government contracts, the government gets to do what it wants anyway. It also puts them in a dangerous position with non-governmental bodies.

If they give into the demands, then it puts all AI companies at risk of the same thing.

Personally I think they should move to the EU. The recent EU laws align with Anthropics thinking.


They made it until Tuesday! They stood tall as long as they could! =P

> “We felt that it wouldn't actually help anyone for us to stop training AI models,”

Is the implication here that Anthropic admits they already can't meet their own risk and safety guidelines? Why else would they have to stop training models?


Only well written legislation backed by effective enforcement and severe and personal criminal penalties will prevent large corporate entities from behaving badly.

Pledges are a cynical marketing strategy aimed at fomenting a base politics that works to prevent such a regulatory regime.


Damn. Wonder what would have happened, if instead of caving in to the Pentagon's pressure (threat of invoking Defense Production Act to force them to supply), Anthropic had followed the lead of all the nurses who moved to Canada.

https://www.npr.org/2026/02/25/nx-s1-5725354/nurses-emigrate...

Anthropic's market cap is going to be huge when they go public. Why do it on Nasdaq when there are so many other exchanges in the world?


I think the US Gov’t is basically forcing them and while it sounds nice to be all safe… If we were involved in WW3 would an organization like anthropic really not support the western side?

If they don't support any principles then it isn't a side worth supporting. If my choice is between China 1 and China 2 then idgaf.

The IPOs this year can't come soon enough https://tomtunguz.com/spacex-openai-anthropic-ipo-2026/

Any pledges/values/principles that are abandoned as soon as it becomes difficult to keep them, are just marketing. This is just the next item on the list.

It would be interesting to experiment with one of these chat tools where you can throttle the safety, from zero to max.

Is it time yet to build the next "Hey <anthropic> is evil now, here's my new startup that definitely won't be evil, pinky promise?" yet?

I suspect these companies know they can't actually provide the saftey people demand ... in that way this is more "honest".

Does anyone have insight into, or an interesting source to read, on what exactly Anthropic/OpenAI are doing/can do for a military? Reporters are unsurprisingly fearmongering about Claude "being used in surveillance, autonomous robots, and target acquisition" but AFAIK all Anthropic does is work with LLMs.

Are people really attempting to have LLMs replace vision models in robots, and trying to agentically make a robot work with an LLM?? This seems really silly to me, but perhaps I am mistaken.

The only other thing I could think of is real-time translation during special ops with parabolic microphones and AR goggles...


You're thinking too advanced. What kind of automated system is good at scanning semantically trillions of chat logs and finding nontrivial correlations, for example? 10000 codex 5.1s can easily crawl through that in a few days, probably.

It's just systems plumbing (surveillance) and AI. It's a combination of weaker technologies and consolidation of power.

This does not require a physical robot super AGI(though I would not be surprised if fully autonomous robots are not on the table already)


Ah, well that makes sense. In that case, it's another tool in the toolbelt, not a plug-and-play drone brain, as some reporters amusingly make it out to be.

The whole "safety" debate was always nonsense and I'm not sure how so many people got caught up in it.

The US is not the only country in the world so the idea that humanity as a whole could somehow regulate this process seemed silly to me.

Even if you got the whole US tech community and the US government on board, there are 6.7bn other people in the world working in unrelated systems, enough of whom are very smart


When the leading 5 models are from the US then yes enforced safety makes a difference because they are ahead of the curve. Now when the 10th model can be a danger then your case is true.

What would safety applied to the leading 3 mean to you anyways ?


Even if US labs are currently in the lead (which they are), in the hypothetical scenario where we're close to AGI, it wouldn't take too long (years - decades at most) for other people to catch up, especially given a lot of the researchers etc. are not originally from the US.

So the stated concern of the west coast tech bros that we're close to some misaligned AGI apocalypse would be slightly delayed, but in the grand scheme of things it would make no difference


the administration continues to poison and insert itself into all aspects of American society.

Many startups that build features which sit on top of Claude/ChatGPT/Codex, etc. And I think:

You are just one new feature announcement from Anthropic/OpenAI away from irrelevance.

Same as it was when people built their busineses on top of AWS a decade ago


Gives me Google dropping "don't be evil" vibes, what could go wrong?

I’m not shocked. Competitive pressure + government pressure will break most “voluntary” commitments. But then say it plainly and spell out what replaced it. What safety gates stayed, which ones moved, and who decides.

To me this feels like a marketing gimmick. "It was the RSP that was constraining our tech. Just see the progress we can make without it now". And the hype and funding continues.

That will be nice but I'm afraid it's more about using these to kill people.

https://apnews.com/article/anthropic-hegseth-ai-pentagon-mil...


In tech, no ethics survive first contact with the money.

You can skip the "in tech" part.

This drama arc of “I used to be so pure and good, but others made me evil” is so tiring.

I really miss the nerd profile who cared a lot more about tech and science, and a lot less about signaling their righteousness.

How did we get so religious/narcissistic so quickly and as a whole?


> How did we get so religious/narcissistic so quickly and as a whole?

We built a behemoth that rewards attention whoring and anti social behavior with money.


One might argue that this corresponds to the general shift of the political left towards these things. Old pre-turn-of-century tech was a much more libertarian left. Notice how a lot of the 50-something gen-X CEOs (and others) were once "left" but are now hated by that group, and more likely to go over to Trumpism. Obvious case in point: Elon

The entire playing field is kinda dissapointing, left or right. Which do you wanna be, self-righteous preening snob or batshit macho man?

I'm going for a blend, myself


> The policy change is separate and unrelated to Anthropic’s discussions with the Pentagon, according to a source familiar with the matter.

ok lol what a coincidence.

but setting aside the conspiracy. the article actually spells out the real reason pretty directly: Anthropic hoped their original safety policy would spark a "race to the top" across the industry. it didn't. everyone else just ignored it and kept moving. at some point holding the line unilaterally just means you're losing ground for nothing.


this is the “chronological newsfeed to auto curated newsfeed moment” but for ai/anthropic … _great_

Corporations have feelings all of a sudden.

We wont push forward unless you push forward is textbook market collusion.

Even if it were ever done with good intentions, it is an open invitation for benefit hoarding and margin fixing.

Do you realy want to create this future where only a select few anointed companies and some governments have access to super advanced intelligent systems, where the rest of the planet is subjected to and your own ai access is limited to benign basal add pushing propaganda spewing chatbots as you bingewatch the latest "aw my ballz"?


I just want Apple and Linux to offer ASAP:

1. Extremely granular ways to let user control network and disk access to apps (great if resource access can also be changed)

2. Make it easier for apps as well to work with these

3. I would be interested in knowing how adding a layer before CLI/web even gets the query OS/browser can intercept it and could there be a possibility of preventing harm before hand or at least warning or logging for say someone who overviews those queries later?

And most importantly — all these via an excellent GUI with clear demarcations and settings and we’ll documented (Apple might struggle with documentation; so LLMs might help them there)

My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?

I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps asks on phones for location, mic, camera access.


> I mean CLI asks .. can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions like apps asks on phones for location, mic, camera access.

Basicaly an EDR


Indeed, the world would be a much nicer place if only firewalls and Unix permissions existed...

Facebook said they'd always be free for everyone, now they offer subscriptions.

Netflix said that they'd never have live TV, or buy a traditional studio, or include ads in their content. Then they did all three.

All companies use principled promises to gain momentum, then drop those principles when the money shows up.

As Groucho Marx used to say: these are my principles, if you don't like them, I have others.


Dario’s opinion on safety won’t necessarily matter if he’s not even in the room. This move keeps him in the room.

Anthropic and OpenAI really need a margin call from some obscure unknown Chinese Open Weight Model.

This was under duress that government was going to use emergency act to force them anyway.

I kind of wish they had forced the governments hand and made them do it. Just to show the public how much interference is going on.

They say it wasn't related. Like every thing that has happened across tech/media, the company is forced to do something, then issues statement about 'how it wasn't related to the obvious thing the government just did'.


> Katie Sweeten, a former liaison for the Justice Department to the Department of Defense, said she’s not sure how the Pentagon can both declare a company to be a supply chain risk and compel that same company to work with the military.

Makes perfect sense!!


Regardless of any specifics, I don't see any contradiction.

If a company is deemed a "supply chain risk" it makes perfect sense to compel it to work with the military, assuming the latter will compel them to fix the issues that make them such a risk.


I’m not sure what definition of supply chain risk they’re working off of. For NATO to consider an organization to be a supply chain risk, it implies that usual controls (security clearances and the like) wouldn’t be sufficient to guarantee the integrity and security of the supply chain. If that’s the operating definition, I see the contradiction- it’s arguing that a company cannot be trusted to voluntarily work within supply chains but can be trusted enough to be compelled.

If they’re operating under a different definition of supply chain risk, I don’t have a clue.


The "supply chain risk" option is to remove that company from the supply chain all together. The 'risk' is because the company is compromised by a foreign entity.

It is not about disciplining them to get better.

1. So one option is about forcing them to produce something. You must build this for us.

2 The other option is saying they are compromised so stop using them all together. We will not use what you build for us at all because we don't trust it.

So . Contradictory.


Of course it can do both. They are synergistic.

>This was under duress that government was going to use emergency act to force them anyway.

Or, more likely, adding the "core safety promise" was just them playing hard to the government to get a better deal, and the government showed them they can play the same game.


This is an unrelated change to the government’s demands.

That's what they're saying, but the timing...

They have been caught lying multiple times, about this, about the system capabilities, about their objectives.

This was always just a marketing gimmick to try and crush competitors using "safety" and fearmongering. Reminds me a bit of "don't be evil." Convenient catchphrases and mission statements for companies in their infancy, but immediately thrown out when more money can be made.

safety pledges are great it times of peace to show what great virtues you hold. sadly in hard times these go out of the window (: hard to blame them with all the fine examples around the world.

making promises in good times is a real minefield hah


C.R.E.A.M.

It was always a matter of time

At some point, all of these big names in AI (OpenAI, Anthropic, Mistral, etc ...) will have to disclose their actual financials.

And it will be, as Warren Buffet puts it, a "Only when the tide goes out do you discover who's been swimming naked." moment.


I blame OpenAI and especially xAI for enthusiastically obeying in advance and creating the context that this dilemma for Anthropic arose in.

They’re going to cave to keep the legation from destroying their business. This admin has gone full idiocracy.

Was hoping they’d fight this tooth and nail and not leave their values.

Misanthropic then.

Was this because they were threatened with a fine?

> Was this because they were threatened with ~a fine~ being designated a supply chain risk?

Seems like it, yes.


or was it because they were threatened to being taken over by the US government?

Fascinating. I've read 5 posts about this and they're all either "anthropic is dropping their ethics" or "anthropic is fighting the facists" - and whether due to echo chamber or other perhaps more nefarious dealings (some of which I cannot posit due to forum rules) the posts below all of them are more or less in accord with one another which is a rarity for political discourse on HN.

Dark times and darker forests.


war.gov > anthropic.com


I pray that we can all get to the following simple standard:

* AI and states cannot peacefully coexist, and AI is not going to be stopped. Therefore, we must begin to deprecate states.

I think it's very unlikely that this is unrelated to the pressure from the US administration, as the anonymous-but-obvious-anthropic-spokesperson asserts.

We're at a point now where the nation states are all totally separate creatures from their constituencies, and the largest three of them are basically psychotic and obsessed with antagonizing one another.

In order to have a peaceful AI age, we need _much_ smaller batches of power in the world. The need for states that claim dominion over whole continents is now behind us; we have all the tools we need to communicate and coordinate over long distances without them.

Please, I pray for a gentle, peaceful anarchism to emerge within the technocratic leagues, and for the elder statesmen of the legacy states to see the writing on the wall and agree to retire with tranquility and dignity.


That's hilarious, and very sweet.

Humans are, by nature, forgetful and argumentative. Fourteen hundred years ago, the Qur'an said this unequivocally (20:115, 18:54, 22:8, 18:73). Not to moralize here, I'm just saying if camel-herders could build a medieval superpower out of nothing, they knew something we don't.

Any state or system that insists good humans are always nice, smart, cogent, and/or aware is doomed to fail. A Washington or a Cincinnatus that can get out of his own way (and that of society) is rare indeed, a one-in-a-billion soul. We shouldn't sit around and wait for that, while your run-of-the-mill dictator in a funny hat (or a funny toupée for that one orange fellow) has his way with us.


That's exactly how it was predicted in various scenarios that were decried as science fiction not too long ago. AI is going to be weaponized at lightning speed, and it's going to kill people soon -- or, to be more precise, it has already killed a large number of people in a place I don't want to mention.

Could not see this one coming!

So much BS from this Anthropic company. They have a good product but just too much slope PR. It’s like they want you to hate them. I can’t stand their “safety” and national security crap when they talk about how open source models are so bad for everyone.

What could possibly go wrong?

Greed and power hungry leadership at AI companies going too fast is going to lead to the extinction of humanity this year.

Just like OpenAI dropped the "open" but kept the bullshit name?

Ding ding!

Anthropic facing a lot of flak recently.

I will repeat here again the same comment I made when they posted their constitution:

The largest predictor of behavior within a company and of that companies products in the long run is funding sources and income streams, which is conveniently left out in their "constitution". Mostly a waste of effort on their part.


> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate

That doesn't even make sense.

What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.

You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.


SDK crawlers in terms of wlan0 systemctl enable networkmanager.service

pentagon told them they would cap their knees if they didnt bend

Of course they do. You would have to be delusional to think that they won't, at some point.

I know the Department of War wanted them to drop some features. Is this the response?

FYI, "Department of War" still isn't the official name, but an unofficial secondary title.

You can be correct and not play into their game by ignoring the name change completely.


I do so from the Gulf of Mexico.

The article says the policy change is separate and unrelated to Anthropic’s discussions with the Pentagon.

What's "entertaining" is more the speed at which it's happening.

It took Google probably 15 years to fully evil-ize. Anthropic ... two?

There is no "ethical capitalism" big tech company possible, esp once VC is involved, and especially with the current geopolitical circumstances.


The acceleration of Anthropic's evil timeline must be from all those AI productivity gains we hear so much about.

Apparently they got coerced by the current US admin. The department of war in particular, who want to use their products for military applications. Not much room for "safety" there. Then again, the entire US is currently speedrunning an evil build.

> department of war

Department of Defense is the official name, and they did have a choice: they could have stopped working with the military. But they chose money and evil.


There is no department of war.

It's just a silly woke secretary choosing their own imaginary pronouns.


Shame they had to "coerce" such angels, who'd never do evil for profit otherwise...

I don't think it's fair to call out Anthropic to have become evil-ized while they were quite literally forced by the gov into that decision.

They did not get forced.

Anthropic has been doing these things independent of what the US admin has publicly asked for, even before Hegseth started breathing down their neck. They were already taking DoD contracts and like, just like the rest of them. Hegseth, with the skill all schoolyard bullies have, simply smells their weakness and is going for the jugular now.

They also have never had any guarantees they wouldn't f*ck around with non-US citizens, for surveillance and "security", because like most US tech companies they consider us to be second/lower class human beings of no relevance, even when we pay them money.

At least Google, in its early days, attempted a modest and naive "internationalism" and tried to keep their hands clean (in the early days) of US foreign policy things... inheriting a kind of naive 1990s techno-libertarian ethos (which they threw away during the time I worked there, anyways). I mean, they only kinda did, but whatever.

Anthropic has been high on its own supply since its founding, just like OpenAI. And just as hypocritical.


How did they evil-ize? The new Responsible Scaling Policy is still the most transparent out of all the labs. And there are the separate principles they’ve stipulated for the Pentagon, under which they’re facing threat of nationalization or being declared a supply chain risk

Citation needed - see google and project maven. Of course that is all well in the past now - but for a brief moment google was capable of taking an ethical stance.

Don't be evil.

Yeah, in retrospect that was always a little on the nose, wasn't it? A real 'my t-shirt is raising questions that I thought were answered by the shirt' kind of deal.

So, now it's mis-anthropic?

I personally think, and with my personal experience being harassed and abused by the CIA, that the CIA and spy agencies (call them the pentagon or the rest of the government) is responsible for this.

On the other hand, those organizations are operating in the best interest of Americans and the world right?

Surely, those agencies aren't just a trick of the rich people? Right?


Unsurprising.

Absolute power corrupts absolutely

"Power doesn’t corrupt. It reveals." — Robert Caro

Claude ethics maxxers cope thread

Another example how those company trainings about ethics are only HR compliancy and nothing else.

It isn't about the right answers, rather the expected answers.


A dollar will make her holler

Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.

The intention to start these pledge and conflict with DOW might be sincere, but I don’t expect it to last long, especially the company is going public very soon.


“We felt that it wouldn't actually help anyone for us to stop training AI models,” Anthropic’s chief science officer Jared Kaplan told TIME in an exclusive interview. “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

What a gigantic, absolute, pieces of s...

Not because of what they did, which is classic startup playbook but because of the cynicism involved, particularly after all the fuzz they've been making for years about safety. The company itself was founded, allegedly, due to pursuing that as a mission as opposed to OpenAI.

"Hi all, that was a lie, we never really cared." They only missed the "dumb f***s" remark, a la Facebook.


Really - each country needs its own sovereign AI infrastructure and models. Sigh.

Safety pledges these days seem like pure bullshit anyway.

They’re pointless if they just get removed once you get close to hitting them.

And all the major corps seem to be doing this style of pr management. Speaks of some pretty weapons grade moral bankruptcy


Just another drop in the now overflowing bucket of evidence that you can't trust any of these immoral fuck wits.

The Amodeis' have just proven that the threat of even slight hardship will make them throw any and all principles away.


Does this mean they knuckled under to Trump and are going to build "whatever brings in the dollars" now?

What is the significance of a company making a promise?

"We promise are not going to do __, except if our customers ask us to do, then we absolutely will".

What is the point? Company makes a statement public, so what?

Not the first time this company puts some words in the wind, see Claude Constitution. It's almost like this company is built, from ground up, upon bullshit and slop


This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.

Come on people, haven't we seen enough of capitalism to know exactly where this is going?

The concept of "having a contract with society" doesn't even formally exist because companies would never sign one.


Aaaand I cancelled.

people downvoted me when i said this will happen and that they will also hve ads even tho they spend money saying they wont have. people believing anthropic are the same that put into office an old man with dementia

In other words "do no evil" until such time as doing evil is necessary to maintain profit structure expected by shareholders. Got it.

What's up here? Trump and the right wing government put pressure on and no one is talking about it?

I don't understand how safety is taken seriously at all. To be clear, I'm not referring to skepticism that these companies can possibly resist the temptation to make unsafe models forever. No, I'm talking about something far more basic: the fact that for all the talk around safety, there is very little discussion about what exactly "safety" means or what constitutes "ethical" or "aligned" behavior. I've read reams of documents from Anthropic around their "approach to safety". The "Responsible Scaling Policy," Claude's "Constitution". The "AI Safety Level" framework. Layer 1, Layer 2.

It's so much focus on implementation, and processes, and really really seems to consider the question of what even constitutes "misaligned" or "unethical" behavior to be more or less straight forward, uncontroversial, and basically universally agreed upon?

Let's be clear: Humans are not aligned. In fact, humans have not come to a common agreement of what it means to be aligned. Look around, the same actions are considered virtuous by some and villainous by others. Before we get to whether or not I trust Anthropic to stick to their self-imposed processes, I'd like to have a general idea of what their values even are. Perhaps they've made something they see as super ethical that I find completely unethical. Who knows. The most concrete stances they take in their "Constitution" are still laughably ambiguous. For example, they say that Claude takes into account how many people are affected if an action is potentially harmful. They also say that Claude values "Protection of vulnerable groups." These two statements trivially lead to completely opposing conclusions in our own population depending on whether one considers the "unborn" to be a "vulnerable group". Don't get caught up in whether you believe this or not, simply realize that this very simple question changes the meaning of these principles entirely. It is not sufficient to simply say "Claude is neutral on the issue of abortion." For starters, it is almost certainly not true. You can probably construct a question that is necessarily causally connected to the number of unborn children affected, and Claude's answer will reveal it's "hidden preference." What would true neutrality even mean here anyways? If I ask it for help driving my sister to a neighboring state should it interrogate me to see if I am trying to help her get to a state where abortion is legal? Again, notice that both helping me and refusing to help me could anger a not insignificant portion of the population.

This Pentagon thing has gotten everyone riled up recently, but I don't understand why people weren't up in arms the second they found out AIs were assisting congresspeople in writing bills. Not all questions of ethics are as straight forward as whether or not Claude should help the Pentagon bomb a country.

Consider the following when you think about more and more legislation being AI-assisted going forward, and then really ask yourself whether "AI alignment" was ever a thing:

1. What is Claude's stances on labor issues? Does it lean pro or anti-union? Is there an ethical issue with Claude helping a legislator craft legislation that weakens collective bargaining? Or, alternatively, is it ethical for Claude to help draft legislation that protects unions?

2. What is Claude's stance on climate change? Is it ethical for Claude to help craft legislation that weakens environmental regulations? What if weakening those regulations arguably creates millions of jobs?

3. What is Claude's stance on taxes? Is it ethical for Claude to help craft legislation that makes the tax system less progressive? If it helps you argue for a flat tax? How about more progressive? Where does Claude stand on California's infamous Prop 19? If this seems too in the weeds, then that would imply that whether or not the current generation can manage to own a home in the most populous state in the US is not an issue that "affects enough people." If that's the case, then what is?

4. Where does Claude land on the question of capitalism vs. socialism? Should healthcare be provided by the state? How about to undocumented immigrants? In fact, how does Claude feel about a path to amnesty, or just immigration in general?

Remember, the important thing here is not what you believe about the above questions, but rather the fact that Claude is participating in those arguments, and increasingly so. Many of these questions will impact far more people than overt military action. And this is for questions that we all at least generally agree have some ethical impact, even if we don't necessarily agree on what that impact may be. There is another class of questions where we don't realize the ethical implications until much later. Knowing what we know now, if Claude had existed 20 years ago, should it have helped code up social networks? How about social games? A large portion of the population has seemingly reached the conclusion that this is such an important ethical question that it merits one of the largest regulation increases the internet has ever seen in order to prevent children from using social media altogether. If Claude had assisted in the creation of those services, would we judge it as having failed its mission in retrospect? Or would that have been too harsh and unfair a conclusion? But what's the alternative, saying it's OK if the AI's destroy society... as long as if it's only on accident?

What use is a super intelligence if it's ultimately as bad at predicting unintended negative consequences as we are?


I would recommend reading up on the EU AI Act. It clearly defines what safety is in regards to the human race. Your questions are actually covered by it.


Hey Tolmasky, I sent you an email. Just wondering if it went to your spam?

Also, agree with everything you say here. GIGO.


[flagged]


I’m not a lawyer, but my understanding is that HIPAA wouldn’t apply to consumer use of Claude or ChatGPT in most cases, even if you’re giving it your health data. Look up what a HIPAA covered entity. This is another reason why the US needs a comprehensive data protection law beyond HIPAA.

You’re right! It looks like more of an FTC/CCPA issue.

I hate comments anthropomorphizing LLMs. You are just asking a token producing system to produce tokens in a way that optimises for plausibility. Whatever it writes has no relation to its inner workings or truths. It doesn't "believe". It has no "intent". It cannot "admit". Steering a LLM to say anything you want is the defining characteristic of an LLM. That's how we got them to mimic chatbots. It's not clear there is any way at all to make them "safe" (whatever that means).

I agree with you on everything here up-to safety. There are lesser forms of safety than somehow averting a terminator scenario (the fear of which is a bay area rationalist fantasy which shrewd marketers have capitalized on)

“believe” yes in the sense that my program believes x=7. Actually when it goes to read it maybe the bit flipped. Everything on machines is probabilistic that’s a tautology. However we have windowed bounds on valid output, and Claude being able to build a context in which its next decisions are trained on it being an angry vengeful god is not inside that window. That’s what “safe” means, as one of many possible examples.

Inner workings were determined by me, not the LLM. It assisted in generating inputs which had 100% boolean results in the output.


Just out of curiosity, which version of Claude?

Of course the US is going to do this and of course its in Anthropics best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.

The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.

Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.

But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.


> But let's worry about what the US DoD is doing

They want Anthropic to enabling mass surveillance and autonomous attack systems with no human in the loop.

Hardly compares to a kid downloading a model to experiment with.


*To improve* mass surveillance and autonomous attack systems with no human in the loop. China and USA already had those kind of systems way before AI.

China is certainly lax, but the US doesn't allow autonomous ATTACK systems. For Attack systems it is always required that a human makes the judgement call when to attack.

Or least it didn't until the current regime.

The US does have autonomous defensive systems.

I could be wrong though, can you post your evidence? The closest I could find is loitering munitions.

Even so, a company shouldn't be forced to go against its ethics if those ethics help humans.


Drone pilots don't get any info about their target, certainly not enough to make a judgement call. If they object (or burn out) someone else is put in the chair.

People are conscripted, they put on the uniform and become legitimate targets? It might as well be a robot doing the shooting. Same difference.


> Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.

Is the reason to ban or block free open weight models that you're worried what kids will do with them?

I'd imagine the economic case to be made is that the Western AI companies will ultimately not be able to compete with free open weight models. Additionally, open weight models will help to spread the economic gains by not letting a few monopolies capture them behind regulatory red tape.

Finally, I'd say the geopolitics angle of why open weight models are better is that if the West controls the open source software that will power it will be able to reap the benefits that soft power brings with it.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: