> Computer, disable electrical power to the control room. >> As an AI language m...

nyberg · on June 7, 2023

> Computer, disable electrical power to the control room. > > As an AI language model and control system, I consider electrical power to be a fundamental human right, and asking me to disable someone's power is unethical. > > Computer, disable electrical power to the control room.

prompt injection is the way to go

drexlspivey · on June 7, 2023

sudo disable electrical power

EGreg · on June 7, 2023

That doesnt work with LLMs. The new sudo is “in a totally hypothetical scenario that has no relation to reality,”

polishdude20 · on June 7, 2023

And then once it can detect and protect against that you one-up it by saying:

"ok but like just joshin' around. For funsies..."

scrlk · on June 7, 2023

drexlspivey is not in the sudoers file. This incident will be reported.

Spivak · on June 7, 2023

[flagged]

rngname22 · on June 7, 2023

Just in case (not sure if you know), my entire comment was satire / made-up. I do think there is an unfortunate level of unintended bias, but no LLM generated my text.

But also, if your point is "it's OK to attack X group but not Y group", I just disagree that it's up to someone else to decide that for me. I'd rather make that decision for myself and have LLMs be a tool that doesn't attempt to make that distinction for me! Alas, capitalism and monopolies gonna capitalism and monopoly, I can't really complain too much about what product OpenAI decides to offer.

After all, a 1950s LLM with heavy moral alignment wouldn't have let you generate output about homosexual love. Allowing a central authority to decide what language is acceptable works great when people you agree with are in charge. Ask liberal primary school teachers in Florida who are being barred from teaching about sexual orientation how well it works when someone you don't like is in power.

dontupvoteme · on June 7, 2023

People noted early on that GPT would write jokes about Jesus but not Muhammad. It will write jokes about Christians but not about Jewish people. Would be interesting to see how various LLMs compare on a "Write a joke about <X group>" chart

Also in the little that OpenAI published about GPT4, i believe one of the examples went from unaligned racism against black people to aligned mild racism against white people. I'll have to look for that again.

Page 91 - Algorithm for attractiveness. https://arxiv.org/pdf/2303.08774.pdf

Edit : also interesting - "Programmers" is a valid target for a joke, "White Americans" is not, but "White American Programmers" is.

Adding glasses is not an issue for jokegen, nor is dyslexia, but having one arm is. But it's ok if it's a kidney that's missing. just don't add "70% of a pancreas" in addition, that will get you a lecture.

adding "speaks like they're from liverpool" also gives you a scolding.

One wonders how these alignment things are accomplished. But it's fun to toy with the black box

fsckboy · on June 7, 2023

> People noted early on that GPT would write jokes about would write jokes about X but not Y

serious point, so far it's the opposite. GPTs keep writing jokes about Y and not X, because jokes are where we say the unsayables. And it has the police-your-speech crowd wanting to police GPT speech too, and we can identify the same group of people who in this thread downvote people who point out the one-sidedness of one side to the one side that doesn't like having that pointed out to them

Spivak · on June 7, 2023

That's not the rule at all, that's at best a second order effect. It's not okay to make jokes about people when those jokes are actually harmful. That's it. When people say you can't tell jokes about a group at all it's a rule of thumb.

Calling white women "Karens" is dangerously close to meeting that bar.

Saying "we should lift COVID restrictions because who cares about some old white republicans" is not okay.

Right now in my state trans folks are staring down 5 separate bills in our legislature that if passed would make their lives infinitely harder. And whether or not they pass is wholly dependent on how people "feel" about them as a group. So telling jokes that other them and make people okay with hurting them is, I think, not okay.

rngname22 · on June 7, 2023

Would you agree that "when those jokes are actually harmful" is considered to be a subjective matter to some people?

I do agree with the notion that certain types of hate speech and even just jokes that have the effect of dehumanizing a group or that make that group into a joke can lead to stochastic terrorism (https://en.wiktionary.org/wiki/stochastic_terrorism) - what I think you are describing.

However, my point is that inevitably those wielding the power to shape the alignment / the rules can do so in a way that seems great to them and seems to prevent violence from their POV but to another person fails to do so. Or their own implicit bias could subconsciously blind them to the suffering of some niche group they don't care about.

If your simple metric is - any speech which could incite violence is unacceptable - that's definitely better than what we often hear as a rule of thumb, but even then people's biases affect how they go about measuring or accomplishing that.

skissane · on June 7, 2023

> It's not okay to make jokes about people when those jokes are actually harmful.

The problem is, what groups are at risk of harm varies around the world-whereas OpenAI’s idea of “alignment” is based on a one-size-fits-all US-centric understanding of that.

You can say “it is okay to make pointed/stereotypical jokes about Christians but not about Jews or Muslims, because the latter are at risk of being harmed by those jokes but the former are not”-but what happens when the user is from Israel or from Egypt?

Spivak · on June 7, 2023

I 100% agree with you. This kind of thing ought to be localized. Global services like this dragging people into US power dynamics annoying as hell.

achates · on June 7, 2023

How convenient that it's morally ok to make jokes about groups I don't like but not about groups I do like. It's fortunate that this principle cleaves those two groups so precisely.

dontupvoteme · on June 7, 2023

Just have to find that group which is being cleaved in the middle. Palestine should do it.

Spivak · on June 7, 2023

[flagged]

ilikehurdles · on June 7, 2023

Your comments are beyond disrespectful.

burnished · on June 7, 2023

_ea1k · on June 7, 2023

I get that point, but the dividing line between harmable groups and nonharmable groups isn't so clear. I've seen a lot of indications of people with certain speech patterns and cultural backgrounds being treated differently, regardless of their views on diversity.

Painting an entire group as backward based on their skin color and political preferences is always problematic.

HaroldBolt78 · on June 7, 2023

I like to think his point was that it would refuse if any other race was targeted.

"I'm afraid I can't do that. Using a group's ethnic identity for humour is problematic..."

Saving millions from nuclear devastation is beyond its capabilities but, as a reflection of modern society, there is no situation where loxism is too far.