Hacker News
Our production servers were suspended by Google Cloud (onvoard.com)
209 points by ayewo on March 13, 2023 | hide | past | favorite | 180 comments


Hi everyone, I'm the customer in this case. I wanted to let everyone know that this sort of abrupt suspension of production apps has been a recurring issue with Google Clown Platform.

Similar horror stories:

https://news.ycombinator.com/item?id=32547912

https://news.ycombinator.com/item?id=33737577

-----

[Update 1]

Yes, I agree I have some fault for the initial suspension, but that's not the biggest issue here.

The biggest issue for me is how they "react" to help a customer get things back to normal when something goes wrong.

First, I had to wait 8 hours for their Trust and Safety team.

Then it was 24-48 hrs. See https://imgur.com/a/i0OHXcI

To this day I still haven't had a resolution.

And the most ridiculous part is they could still tell me "I just received an email from the Trust and Safety team and they mentioned that they have sent you a separate email." when I had replied 3 times within an hour after they requested more details. See proof https://imgur.com/a/1ZSmGBH


I wonder if there is a cheap existing solution to use multiple cloud hosts at once and minimize this risk instead of moving all eggs to the AWS basket.


Well, if you deploy to Kubernetes, it standardizes the APIs you're using to deploy quite a lot, so you can move around more easily.
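For a sense of what that standardization buys you: a minimal Deployment manifest (all names here are hypothetical) applies unchanged on GKE, EKS, or AKS; only the cluster credentials in your kubeconfig differ.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical app name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0  # any OCI registry
          ports:
            - containerPort: 8080
```

`kubectl apply -f deploy.yaml` works against whichever cluster your kubeconfig points at; the provider-specific part shrinks down to provisioning the cluster itself.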


I've been trying very hard to sell that idea to my team.

But no, they insist on using every product the different cloud providers offer. This, IMO, is a recipe for vendor lock-in and makes maintenance and interoperability a nightmare.

It feels like when devs felt they had to use all the gang of four design patterns.


You need to follow some engineering standard. If your engineering org is not systematically optimising away vendor lock-in, blocking someone from using tailored-feature-number-367 is unlikely to have any positive effect; you'll just make everyone move slower. If avoiding vendor lock-in is something the org cares about and is doing properly, then sure, enforce it. Otherwise, I've seen many people tilting at windmills on this and bikeshedding about "not getting locked in" when many other things already lock you into that vendor.


If you have any fault, it's very tiny. Those types of "important" emails are far too common. And the lack of any warning is ridiculous too.


I rent a dedicated machine at Hetzner. Cheap and performant.


It is also a different product from Google Cloud or AWS.


Why do you use a hosting solution that you yourself call "Google Clown" ?


Probably because OP didn't call it that back then


We don't all always get a choice. We are often left with the inherited sins of our forerunners.


> Then it was 24-48 hrs. See https://imgur.com/a/i0OHXcI

Usually I don't care much for grammatical errors, as long as the point gets across clearly. In this case I don't know why it irks me.


It's a shame the industry can't agree on a basic SLA:

- Solicit a hard limit when you open an account. People can only afford so much, don't bankrupt them.

- As soon as we see your account heading for a 10,000% increase on last month, we'll send an SMS, GIVE YOU A PHONE CALL, even send a letter, and confirm everything is okay.

- If you're about to trigger a suspension, we'll send you an automated SMS, and if you're a paying customer, we'll do you a solid and spend $5 calling you to make sure you understand what's about to happen to your business.

- You get one 7-day extension. People get sick, go on holiday, have life get in the way, and often haven't figured out how to juggle business-critical credentials among people.

Just imagine the quality of life improvements if people felt safe using a cloud provider. Instead we see stories like this and withdraw back to colocation and flat-rate VPSes.


Absolutely agree.

AWS is pretty much a line of credit, unlimited credit actually.

When you open a credit card you are usually given a limit, say 10K.

If there is (pretty obvious) suspicious activity on your account, it gets disabled; the bank just won't let your card be used indefinitely to buy yachts and planes.

And when you contact the bank they won't ask you to go find the bad guys and stop them from using your card before they can help you.

That would make absolutely no sense but that's exactly what AWS did in our case.


That's the thing, right? At some point you should have an account manager.

If you're spending maybe 4, but definitely 5, figures you should have an account manager. I would argue the same on the publisher side.


> - Solicit a hard limit when you open an account. People can only afford so much, don't bankrupt them.

As is said every time this gets proposed, this is not only wildly impractical to implement, but also dangerous from an availability standpoint. Why is it impractical? You are billed for active things (running databases, network traffic) but also passive ones such as storage. What do you want AWS to do when you go over the hard limit, delete everything? As for availability, nobody wants to lose the production environment because nobody has updated the budget with more appropriate values in years. What AWS and GCP do is allow you to set a budget, with notifications however you want them when you approach/hit it. When you get a surprise bill due to a misconfiguration or whatever, you can dispute it; cloud providers are extremely generous with these and usually waive the first few errors.

I agree with the rest though.


Admittedly, I'm no trillion-dollar cloud host (and I don't know your perspective here), but I'd expect that a trillion-dollar cloud host could look at my billing data at an hourly resolution and fairly accurately predict what that'll look like at the end of the month.
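That back-of-the-envelope projection is genuinely simple. A minimal sketch (hypothetical function, assuming spend accrues at a roughly constant hourly rate) is plain linear extrapolation of month-to-date spend:

```python
import calendar
from datetime import datetime

def project_month_end(spend_to_date: float, now: datetime) -> float:
    """Linearly extrapolate month-to-date spend to a month-end total.

    Naive sketch: assumes a constant hourly burn rate, which is
    exactly the kind of signal that flags a runaway bill early.
    """
    days_in_month = calendar.monthrange(now.year, now.month)[1]
    hours_elapsed = (now.day - 1) * 24 + now.hour + now.minute / 60
    if hours_elapsed == 0:
        return spend_to_date  # month just started, nothing to extrapolate
    return spend_to_date * (days_in_month * 24) / hours_elapsed
```

If the projected total lands an order of magnitude above last month's bill, that is the moment a phone call or SMS is cheap insurance.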

They don't need to cut me off right away (unless I'm already outside my budget) but for most bitten people, getting a call the day after making a colossal mistake will save everybody a lot of money and angst. Neither of us needs to negotiate and I don't have to sell Grandma.

It's been a few years, but the last time I tried AWS Budgets, it took a hundred clicks or so per service just to get an email out of it. Emails get missed, life gets in the way, etc. The cost to a trillion-dollar cloud host of calling me up is far less than it would cost you to call me, since they off-shore it all.

And I'm all for people actively choosing to play without limits. I just think a lot of people a) don't respect that from the off, and b) aren't notified in a timely fashion.


That's the humane and courteous way of doing it. Not sure why it's so hard.


That all sounds like it costs money. This is a capitalist society: companies operating correctly should provide as little service as possible for as much money as possible. Nothing you suggest aligns with that goal:

- A hard limit? Why limit your income to $600 when your customers can accidentally rack up a $6000 bill? You can even be magnanimous and negotiate it down to a $2000 bill and everyone's happy.

- Why would we spend money on monitoring software, SMS and phone call fees to stop a 10,000% revenue increase?

- $5 to try to tell a customer not to give us more money? Are you crazy?

- People getting sick under capitalism are drags on the economy and should be derided and made as miserable as possible so it doesn't happen again.

> Just imagine the quality of life improvements if people felt safe using a cloud provider.

We're not here to make people feel safe. We're here to make incredibly rich people even richer. Get on board, man. You'll never operate a successful company with your approach, everyone else will beat you on price.


Just give us the option to pay. Some might call it an insurance premium.

(And yes my satire detector is triggered, though one can never be sure)


The problem is that, very likely, there are decision-makers at these companies that have exactly this reasoning about problems like this.


Well, a company would do that to attract more customers and build loyalty. But none of that matters in an oligopolized market.


Make sure your decision to migrate to AWS is not an emotional one.

I had my own feud with AWS over a $60,000 bill and an enormous number of hours spent by my team fighting crypto hackers after they took over one of our accounts and used very advanced scripts to create hundreds of large instances.

AWS has become very complicated to manage, and at the same time attacks on their infrastructure have increased exponentially.

If your company doesn't have a dedicated team to stay on top of AWS you'll be vulnerable to attacks like the one I experienced.

You simply can't count on the baseline configuration of your account and instances provided by AWS, you need advanced knowledge on how to secure your resources beyond basic stuff like MFA and monitoring.

And if you think AWS will have your back when shit hits the fan, lol good luck with that.

They will try to collect the bills while providing absolutely no support, reminding you about their Shared Responsibility Agreement.

They also expect you to fight the attacks.

I ended up spending months of work securing my services and clearing my debt with AWS billing, to the point where we almost went to court.

So again, you may have good reasons to migrate, just beware that AWS won't come free of hassle.


I find the author's stance of "let's move to the competition" hilarious for this reason.

I was trying some VPN services at the time, and logged in to an AWS account with the VPN turned on, after which AWS suspended resources in the account. This took two days to review and fix, though the only saving grace was that I did not have a paid offering running there.

When pressed as to why they'd flag my account, they would evade the question of why my account was suspended, though I'm pretty sure it's the VPN. (And before you ask, no, my account was not compromised; I had to review 90 days of CloudTrail logs before AWS would budge on the issue.)


AWS understands lifetime value of the user, not just the account. I've had mistakes ON MY PERSONAL ACCOUNT that were in no way connected to my work. Times when I was just experimenting. They refunded it in full. I've also heard from STUDENTS who may not ever get into infra work that they're still reasonable and lenient.

You would be absolutely mad to use any other platform. AWS gets it right on the customer service front. Amazon, perhaps deeply less so... but AWS? Amazing.


It was not our experience.

Tech support was always friendly, polite and apologetic, but could not provide help.

The attack was way beyond their level of skill, and they insisted on providing instructions that were completely useless: all manual actions to fight a full-scale, extremely vicious automated attack (change password, delete instances, disable regions, delete roles, etc.).

AWS security got it done in one hour, but it took months to get to them.


A friend of mine had a similar experience when he was just learning. He accidentally racked up a large bill (for a student), contacted support, and they forgave the charges.

That's not to say that Amazon is being altruistic here, I think it just makes good business sense. Foregoing a few thousand dollars for even a tiny chance that a student's future self decides to use AWS for a multi-million spend project down the line is worth it.


> You simply can't count on the baseline configuration of your account and instances provided by AWS, you need advanced knowledge on how to secure your resources beyond basic stuff like MFA and monitoring.

When could you ever? Back when AWS was security by obscurity a decade ago?

When you create an account, one of the first things AWS has you acknowledge is the Shared Responsibility Model.

AWS has always given you the rope to hang yourself.


I had a similar experience with AWS. They warned us of a crypto hijacking of one of our servers. They weren't any help on internally battling it. They were hands off on anything within the server, and didn't have any advice on how to address it. I had to manually battle it. Their backup and device swapping services were pretty handy though.


> You simply can't count on the baseline configuration of your account and instances provided by AWS, you need advanced knowledge on how to secure your resources beyond basic stuff like MFA and monitoring.

What advanced knowledge are we talking about here?


All the buried config options on AWS services to lock them down. It's a complicated platform.


> very advanced scripts to create hundreds of large instances.

What was very advanced about those scripts?

As an outsider, I'd think that perhaps a very simple script might be enough to create hundreds of large instances.


Sure, but that was only one part of the attack, and not the clever one. The ability to bypass security even after following AWS support instructions is what made it so difficult to fight; it did not stop until the AWS security team took over the case. The problem is AWS security doesn't interact with customers directly (as far as I know); you have to go through tech support, and those guys just don't have that level of knowledge.


> If your company doesn't have a dedicated team to stay on top of AWS you'll be vulnerable to attacks like the one I experienced.

This has been my experience, 5+ years into our 'cloud journey'. For all the people we didn't have to have on board because the cloud magically made them redundant, we've had to add almost as many to ride herd on our relationship with the cloud providers.

TANSTAAFL


Did you get to a root cause of how the “crypto hackers” “took over” your account, or of what the “very advanced” scripts did?


We had basic security like MFA set up and we never shared keys, so to this day we don't understand how they managed to take over.

The attack resembled a very aggressive, clever virus, so the people behind it definitely had very advanced knowledge of AWS.

They could enable regions we had disabled and recreate roles that were deleted in a matter of minutes, again MFA on, no keys.

AWS kept giving us conflicting instructions and refused to disable our account unless we stopped the attack first.

We have been customers of AWS since 2014.

I believe at that time the Shared Responsibility Agreement did not exist, so AWS demanded we sign it now for them to even consider taking a look.

That's when I reached out to the Attorney General of my state who sent a letter to AWS and got things moving.

We were contacted by another team (two months later) in AWS that seemed much more knowledgeable and they gave us the scripts that eventually stopped the attack.

Our case is not that rare; if you do some research you'll find plenty of stories like ours.


> The attack resembled a very aggressive, clever virus so the people behind it definitely had very advanced knowledge of AWS.

In what way did the attack resemble a very aggressive, clever virus?

Do you mean that it happened fast? How do you know it wasn't something simple like your credentials leaking?


Why all the quotes? It makes your response seem either sarcastic or disbelieving, either way a bit disrespectful.


Or because he's quoting the comment in his question.


> Or because he's quoting the comment in his question.

On HN the usual way to quote is like I have done above.

I too react like the poster above you and would probably not quote that way except to ridicule. Hopefully there aren't many of these across my posts.

(As for an almost-green account commenting on HN standards: I have been here in some form or another since around 2009, I think; I just create a new account once in a while so it won't be trivially simple to doxx me.)


It’s to quote specific words which are pertinent to the threat level implied, rather than the entire post.

Extraordinary claims require extraordinary evidence.


Common sense is a thing and what common sense can tell us absolutely isn't an extraordinary claim.

Edit:

By the way, what do you mean by threat level here? And FWIW, here I could have used quotes without being rude.

Also: if you read that sentence again you'll probably find that it wasn't just the use of quotes but the sentence as a whole that made it stand out as rude.


A baseline AWS account with MFA will not be taken over. That would require a credential leak or an IAM misconfiguration permitting access from some other account.

Suggesting it is an extraordinary claim, and would suggest an extraordinary threat to all AWS accounts. This seems very unlikely.


Emails are a difficult thing, aren't they?

Was the 10-day countdown the first time Google mentioned "we'll suspend the account unless..."? Generally with these things, unless it's an absolute emergency, you'd expect lots of prior warning and at least a couple of "YOUR SERVICES ARE DUE TO BE SUSPENDED" emails in the 48 hours before.

That said, I've rightly called my bank out for sending an email with the subject line "Your account was compromised" - and then the body began "...is what you could be reading if you didn't have adequate security", followed by a sales pitch for some cyber security software. When you get rubbish like this (or absolutely banal "important information regarding" emails) so often, it's hard to know when to bother.


True.

AWS at least prefixes mails with "[ACTION REQUIRED]" if there _might_ be some action required on your side.

And counting the mails that come in these days with "important information", I can fully understand it if you miss one.


I get a lot of emails (from various companies) that have "action required" and no action required at all. Usually "you've logged in from a new device, you don't need to do anything unless you've been hacked."


I get your point, and that's true. IMHO, I'd rather have some "action required" mails that don't need action than a single one without the prefix that does.

Until everyone uses the prefix every time...


It's frustrating when emails aren't categorized as critical and we end up missing something important. And don't even get me started on those misleading subject lines that turn out to be nothing but a sales pitch. It's like we have to play detective just to figure out which emails are worth our attention. It would be great if companies could be more transparent and upfront about the urgency of their messages.


Every message is urgent to the company that sent it.


Get a better vendor? They should at least have separate categories for marketing vs "important" account information, vs actually important account information (that last category can't be unsubscribed from).


“Moving Forward. Starting today, I will be making plans to move everything to AWS.” I laughed when I read this because it’s presenting moving to AWS as a completely different future.

The industry has moved beyond depending on a single cloud computing vendor. It’s negligent to put all your eggs in one basket.

I would say don’t put your business in the hands of a single cloud vendor. Either exit the cloud or go multi cloud or run plain old Linux machines on hetzner or ionos. Design your systems to withstand your credit card expiring, or missing an email.


I disagree.

I’ve hosted on AWS for ages. They don’t do this. We once didn’t pay the bill for two months because the owner of the card left. No problems. This was a small-spender outfit; they could have thrown us on the street as we were a rounding error. Now we’re a big spender and we still forget to pay them sometimes (sorry, Bezos).

But some of your point is worth considering. Design for portability should the worst happen (you are cut off due to geopolitics or other problems) or somewhere else turns out to be a better deal. Stick to IaaS and portable services only.

For the moment I trust AWS though.


How did Russian companies mitigate the effects of cloud hosting services being blocked by sanctions?


Judging by my friends' experience, they were never really popular inside the country. For those building projects aimed at an internal audience, local companies were preferred for tax or legal reasons, and most of those working on international projects quickly emigrated to saner countries.


Mostly by migrating to other countries.


[flagged]


Substitute "nationality X" for "Jews" to instantly recognize that tone anytime, anywhere.


They are a society manipulated heavily by propaganda, run by rich elites and pushed into an unproductive war by a political dictatorship supported by the elites. Perhaps everyone should have stopped doing business with the USA for the last few decades for the same reason.

You can creatively frame a problem any way you choose.

All countries have a problem with media, elites and politicians. We should aim to solve those problems, not cut a large chunk of the planet off. They are people too.


The difference is when something goes wrong at AWS, someone picks up the phone, fixes your stuff, apologizes profusely, and gives you a credit.

Multi-cloud is great but somewhat complicated to set up correctly.

The value add of AWS is that Amazon has unusually good customer service.


It's, to some degree, built into the ethos of the company. Jeff Bezos' "customer obsession" mantra is quoted by professors in every MBA program from Arizona to Zagreb. Case-in-point, here's a compilation of him expressing it from 1999 to 2018: https://www.youtube.com/watch?v=DIrhVU04GHc

Amazon may not be particularly customer-centric in terms of the products they allow on their marketplace, or in AWS with the plate of spaghetti that is their various cloud offerings, but their customer service division is great. You can get in contact with them in hours, and there's never any hesitation to give you refunds or credit if something goes wrong; I can't say the same of Microsoft or Google.


I guess there are basically three approaches to customer service:

1. Excellent customer service, that tries to make the customer as happy as possible, within reason.

Amazon seems to be the big example of this one.

2. Terrible customer service, that has a hostile attitude towards the customer.

I'm not sure who's a good example here. Comcast?

3. Non-existent customer service.

This approach has been made famous by Google.


There's certainly a spot between 1 and 2; one where the customer service that exists is reasonable, but horribly under-funded and under-appreciated by the business. I'd say half of all businesses fall into that bucket.

I know when I call my insurance or bank that I'll mostly get things sorted; however, I need to jump through 101 hoops to get to the right person via an automated service. Stop being cheap and just hire some operators to put you through to the right people. It's not terrible per se, just some combination of good (helpful customer service reps) and bad (automated operator).


> The industry has moved beyond depending on a single cloud computing vendor.

Given how much of the industry struggles when an AWS region goes down, this seems... unlikely.


How about “the industry should have moved beyond depending on a single cloud computing vendor.”?


Our company tried that and it was a huge waste of time. We wasted a lot of effort trying to make our stuff "generic" so we could switch off of AWS if we ever wanted to. Eventually, 1+ years later, someone realized how much time we were spending designing for a hypothetical event and told us to stop and fully utilize AWS directly.


It's pretty cost-prohibitive.

(... yes, more cost-prohibitive than losing one or two days of business to a SNAFU).


That is a big difference from what you said.

The reality is that all serious outfits either host on AWS, or keep some level of AWS account at the ready, in addition to bare metal Linux servers in colos or hosting providers that they can move onto if needed.

Your statement is confusing simplicity and flexibility with multi-cloud. No one wants to deal with the confusing mix of random hosted services that 3 different cloud providers try to sell you, with one of them (Google Cloud) being this opaque horror story that the article is talking about.

Host on AWS, and mostly host your services on EC2. It's dead simple to migrate elsewhere if needed. You will likely never need to.


us-east-1, it's always us-east-1

Don't host production stuff in there. And if you must, have a backup somewhere.


AWS might be up for others but could have locked you out


What is the evidence this is true? I tend to agree with https://www.lastweekinaws.com/blog/multi-cloud-is-the-worst-...


> The industry has moved beyond depending on a single cloud computing vendor. It’s negligent to put all your eggs in one basket.

In that case, the vast majority of enterprises are negligent.


> > The industry has moved beyond depending on a single cloud computing vendor. It’s negligent to put all your eggs in one basket.

> In that case, the vast majority of enterprises are negligent.

The vast majority of enterprises are not mission-critical, and their products don't need downtime guaranteed below one minute per year. What they need is a lower price, because in that case people care about costs, not quality.


Well Said


It seems like Google has burned up nearly all the goodwill it used to have years ago. They now have a nearly broken search engine that needs the term “Reddit” appended to get anything useful, and you routinely hear horror stories about them acting badly without any regard to users. I hope they get disrupted and replaced by bing chat and others asap. The only thing that will possibly teach them a lesson is if their business results start rolling over in a big way and large activist investors can cause the board to realize they need to replace Sundar with someone who gets that radical change is needed.


These are memes that are popular on HN, but don't reflect reality outside of the tech bubble. There are of course elements of truth to them, hence why they're so pervasive, but they are not representative of reality.


It's for this reason that Google's ad business isn't really in danger at the moment, yes. People are going to continue to use google search and gmail and chrome.

But having this reputation in the tech bubble negatively affects things like Google Cloud that tech people get to make decisions on. This is one of the reasons Google Cloud is so far behind AWS and Azure in adoption.


The only reason it might be behind Azure is how aggressively Microsoft markets Azure to existing customers and bundles it with other products.

As a standalone product I'm not sure anyone would use it.


Azure has been nice to work with. I have been comped multiple times for our own mistakes, and support is great. I'm really not sure I get the hate.


Partially agree, but... the tech bubble is pretty good at dictating trends and setting standards that the general public will adopt 2-3 years later.


I'm not sure that's accurate. We'd like to think so, but most tech trends don't take off and go mainstream.


Is that still true? I think a group of tech influencers existed in the past, but I don't see any evidence that that continues to be the case. I guess they could all exist in Discords I don't have sight into. Previously you could see trends migrate through tech communities and out into the world. I don't really see that now.


It's funny you're appending Reddit to your searches too. I noticed this is very common, ha.


I'm amazed you think there's any goodwill for Google at all. Their search engine sucks shit these days.


This is one of the reasons why I prefer prepaid cloud services. There are not many providers in this field (Vultr, DigitalOcean, etc.). You can generally estimate the length of your runway based on the funds you have in your account.

It also helps that there are multiple funding sources, so I can split the bill between multiple cards or accounts.

I travel the major part of the year, and it is quite common for my cards to get lost (often my own fault) or compromised (Mastercard tends to suspend cards when it detects charges from multiple countries in a short time frame). I can wait for the new cards to arrive, worry-free that my cloud services won't take a hit like this.
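The runway estimate described above is easy to keep an eye on. A sketch (hypothetical numbers, pooling several funding sources as the commenter does):

```python
def runway_days(balances: list[float], daily_burn: float) -> float:
    """Days of prepaid service left, pooling funds across payment sources."""
    if daily_burn <= 0:
        raise ValueError("daily burn must be positive")
    return sum(balances) / daily_burn

# e.g. $50 + $25 of prepaid credit at $5/day of usage: 15 days of runway,
# comfortably longer than waiting a week for a replacement card.
```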


Unfortunately, this wasn't a payment issue, so this wouldn't have saved you. It's a Know Your Customer issue, i.e. prove you're not doing business with someone on the US embargo list.

DO, Vultr etc. would have the same issue depending on the timeframe they were given to prove compliance by the feds.


I’ve had Vultr and Digital Ocean accounts for years. Never got a KYC request from them. This is driven by the lawyers at Google.


Only one email from Google Cloud? That's weird: I got four when my Wells Fargo debit card was compromised on 2/28/2023.

- 2023-03-04 "Your Project: nono is at risk of suspension"

- 2023-03-04 "Action required: your billing account 0028C9-xx-yy is past due or has invalid payment information"

- 2023-03-04 "Google Cloud Platform & APIs: Update your payment method to keep services"

- 2023-03-04 "Google Cloud Platform & APIs: Payment was declined for 0028C9-xx-yy"

Similarly, I got four emails from Digital Ocean, but only one from AWS.


This isn’t past due, this is a KYC thing.


I hate how the internet age has conditioned customers to expect less.

There is no way a non-internet business would cut off a client of 4 years without actually talking to them (on the phone).


Same with uptime.

I come from a healthcare and telephony background where "five nines" (roughly five minutes and 15 seconds of cumulative downtime in a year) is actually still a thing. I'm absolutely amazed at the number of companies that just accept each region/service/control plane/networking/etc. outage in the big clouds. First you wait an extended period of time for them to even acknowledge the issue (rarely even updating their status dashboard), and then you stand around powerless waiting for your business to come back online. Time and time again.
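For reference, the arithmetic behind "five nines" is a quick sketch away (the two-region figure assumes failures are independent, which region-wide control-plane incidents routinely violate):

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960

def downtime_budget_minutes(nines: int) -> float:
    """Yearly downtime allowed at N nines of availability
    (e.g. 5 nines = 99.999% uptime)."""
    return MINUTES_PER_YEAR * 10 ** -nines

# Five nines: about 5.26 minutes of downtime per year.
# Two independent regions, each 99.99% available: the chance both
# are down at once is (1e-4) ** 2, i.e. eight nines combined,
# but only if the failures really are uncorrelated.
```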

Yes, multi-region deployments are encouraged (still not even five nines), but AWS and others give you the ability to create really convoluted architectures (which they love and encourage; it's sticky), using at least a half dozen of their products (IAM, VPC, RDS, S3, etc.) for even the simplest solution, which makes it virtually impossible to deploy, debug, and maintain multi-region without a dedicated "AWS Architect" role. Downstream outages for solutions/companies building in clouds because of region or service issues demonstrate just how difficult this approach is in practice.

Then you sit back and wait for the almost inevitable "where did this $50k invoice come from", "we ran through our monthly budget overnight", "Oh they don't like us anymore and are closing our account", "What's the spend going to be this month - no one knows" and other scenarios discussed in this thread.

Can't figure out how to wrangle costs? Sick of the outages? Good luck getting your cloud Rube Goldberg machine migrated out (especially because by now you've hired based on cloud of choice).

Depending on a variety of factors any of these scenarios can kill a company.

This isn't normal. This is regressive. It is unacceptable.


You mean like airlines, banks, insurers, utility companies, etc.? If anything, standards and expectations went up in a lot of areas; think one-day shipping, returning stuff whenever you feel like it, etc.


It's all relative to how much you spend and how much other customers spend.

The first rule of production on GCP is to get off credit card billing, and that has been pretty well-known for a while.


Google doesn't care what your name is... they just want to be paid.

However, the law requires them to not do business with people on the US embargo list. And to verify that, they need your name.

That is probably why there was no way to ignore this paperwork, and why customer service couldn't override it.


The problem isn't that they are suspending the account due to missing information. The problem is that they apparently consider a single email (that looks very similar to dozens of other emails that they send out) sufficient warning. Especially since most devs probably get dozens of spam emails a month that look more urgent ("Account Suspension" is a popular subject line for phishing emails).

At the very least they should try calling the account holder before suspending the account.

If they can't reach the account holder, they should re-enable the account very quickly after they reach them, and give them an extension with enough time to provide all the required information.


Not to mention all the useless spam that Google themselves send. The SNR is abysmal.


... and why the time window was so short. Google isn't triggering this because they want to lose customers; they're triggering it because the Feds breathed down their neck about a suspected violation.


In many ways, Google continues to expose the failure of automation.

The 2010s dream of "megacorp serving everyone with centralised programming" is ever more exposed as an unethical pipedream.


> failure of automation

Many of the Google horror stories are less a failure of automation and more just a failure of well designed business processes. This same event could/would have occurred if it were humans sending the "additional information required" emails and also following a protocol of "shut them down in 10 days if they don't provide". The fact that this process was (most likely) automated is not the issue.

Honestly, as an outsider, it would appear that Google is a tech company that isn't actually very good at business. They do some smart tech things with their huge number of smart workers, but the business side seems to make a lot of very poor decisions (or simply doesn't even consider some part of the business or doesn't prioritize business processes well).


> it would appear that Google is a tech company that isn't actually very good at business

Their primary product (ads) sells itself.

So as a company they haven't ever had to become good at business -- in the interacting-with-other-business-entities sense.


Every time I’m told it was an automatic system at Google, I doubt it. I can’t imagine all those heavy decisions aren’t actually reviewed by humans, who validate the automatic suggestions.

Would anyone imagine that the decision to cut off a customer entirely, is not checked by a clerk? by a lawyer? The defense “It was automatic” doesn’t even pass the basic sanity test.


At Google's scale, cutting one paying customer isn't a heavy decision. That's what makes doing business with them so weird; it may be mission-critical to your business, but to them it's a Tuesday.


I suggest a large part of this is Google's longstanding strategy of outsourcing as many non-product, non-sales functions as possible to vendors. There's a strong disconnect, and also strongly antagonistic incentives, getting in the way of closed-loop alignment (not to mention being able to flexibly/reactively adjust business processes based on market needs).


Reminder: if your level of interaction with a service is putting your credit card in, the service is likely to interact with you at the level of automated emails, and also likely to trust you as much as someone who filled out a web form.

If it's business critical, talk to a human. All the cloud providers have account managers just waiting to talk to you. If this sort of thing happens and you have an account manager, they call you!

Even better, get a contract that prevents them from cutting you off like this, or sets an SLA for getting back online. Again, all the cloud providers will offer this, even for relatively low spends.


How do you get such a contract at the big providers?


You email them and say you're looking to chat about your plans for the next year. They'll set up a meeting in no time. This is the norm! Running a business on services purchased semi-anonymously is not.


Show them the invoice of your current cloud provider. If it's a big enough number then...


I work at GCP as an account manager for enterprise. You wouldn't believe it, but I chase deals for as low as $2k per month. Tough times - every dollar counts.


I am not a fan of "The Cloud" for ongoing production. I understand the case for it but I never liked the risk of someone else, with a different agenda and incentives than me, running my critical infrastructure. The Cloud does alleviate some risks of hosting your own servers, but it introduces new risks. This reminds me of a slogan I saw many years ago: there is no cloud, it's just somebody else's computer. That somebody else can completely shut you down when their interests no longer align with your own. No thank you.


I sort of tend to agree with you, but the counter-point is that there is always going to be some critical piece of the infrastructure that you are dependent upon that someone else controls. If you colocate, it's the data centre. If you host on-prem, it's the physical telco infrastructure.

There are no shortages of incompetent people and institutions that will screw you. It's typically incompetence, not malice. But you will run into malicious actors as well.

The reason I sort of agree with you is that, being someone who is very comfortable with bare metal solutions and came from the 90s mentality of writing portable code, I have started to see "The Cloud" as a return to mainframe programming. There is nothing wrong with that, but you are programming for "a machine", which often leads to code that is exceedingly hard to port. "Cloud Agnosticism" is the industry buzzword for not locking yourself into one vendor, but you need to make that decision on day one and set engineering policies enforcing a cloud services abstraction layer to be maintained. Most startups in the process of scaling are not going to see the immediate business value in incurring that cost. (though, with more horror stories like this one maybe that will change... but I doubt it).


I've done large colo deployments. When you're provisioning your own hardware it's cloud-init, ansible, K8s, whatever. Nothing even knows you're in colos operated by different providers in different regions. Nothing knows or cares you're using five different upstream internet providers (whether your own - complicated or IP blends from the colo).
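To sketch what that provider-agnostic bootstrap can look like (the repo URL, playbook name, and hostname here are hypothetical), the same cloud-init user-data works in any colo that hands the machine its boot config:

```yaml
#cloud-config
# Hypothetical user-data: whichever facility boots this machine, it
# configures itself identically by pulling the same Ansible playbook.
hostname: app-node-01
packages:
  - ansible
runcmd:
  # ansible-pull clones the repo and runs site.yml against localhost,
  # so no central control node needs to reach into the colo.
  - ansible-pull -U https://git.example.com/infra/site.git site.yml
```

The point is that nothing in the bootstrap references the provider, so moving racks means changing IPs and nothing else.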

Have a falling out or big issue? No reputable colo provider is going to hold your hardware hostage (it's the physical world and legal and reputational issues heat up quickly). They'll be happy to ship it out (get rid of you) to another provider, who will be even happier to re-rack. Meanwhile if your architecture is even remotely well designed (load balancer, multiple facilities) you're never down and it's not even that much of a "fire drill". With the capabilities of modern hardware this is often a single 1U redundant HW machine.

All of this, by the way, is a fraction of the cost of a cloud solution (a $25k bonkers 1U is ~$700 a month with a three year lease and 1U hosting + bandwidth + power is ~$200 a month). So for $2k a month (and it's actually predictable) you're up in two facilities and have capabilities an order of magnitude greater than a big cloud at equivalent spend.

Companies screaming on twitter and profusely apologizing to their customers due to the most recent "name a cloud" outage demonstrate that multi-cloud (and even multi-region) is extremely difficult if not impossible to do in practice.


> All of this, by the way, is a fraction of the cost of a cloud solution (a $25k bonkers 1U is ~$700 a month with a three year lease and 1U hosting + bandwidth + power is ~$200 a month). So for $2k a month (and it's actually predictable) you're up in two facilities and have capabilities an order of magnitude greater than a big cloud at equivalent spend.

You mean you have raw power an order of magnitude greater than a big cloud at equivalent spend. The capabilities are all DIY - if you want a database, object storage, replicated storage between your two locations, a message queue, etc., you have to set it all up (and think of the backups, patching, monitoring, etc.) on your own. That's a non-insignificant cost that will easily dwarf that $2k. Source: I've done this at an MSP/MHP, handling a decent chunk of the infra side.


When you turn $20k/mo of cloud spend into $2k/mo that’s $216k/yr - $648k over three years.

Postgres, minio, rabbit, whatever absolutely does not even come close for initial implementation and ongoing support. If you spend more than a tenth of that you’re doing it wrong.

Source: CTO who’s done all these things.


Yeah your mileage will vary greatly depending on your requirements. A couple of years ago I owned a very high-traffic, media-oriented website. Bandwidth costs were my highest cost, not compute or storage. I explored moving to something cloud-based and the cost increase just on bandwidth per GB would have been enough to put me out of business. Yes this was including a paid Cloudflare / CDN account sitting in front of my servers.

I considered a "hybrid" model where I would keep bare metal servers for distributing media but move compute and persistence to "the cloud" but just couldn't really justify it on cost alone. The replication and auto-scaling would have been nice, and were the major selling points, but wasn't a high enough ROI for me.


Could I contact you to pick your brains about this? I spend much more than $2k for dev environments at my startup and a setup like this would work beautifully for us.


Sure! Contact info in profile.


> 1) Google Cloud dropped us an email 10 days back that I missed out

You missed a KYC e-mail and they locked your account. If you miss an "Important information about your bank account" e-mail, what do you think your bank is going to do?


As someone working at a bank: you’re going to get a phone call and a bunch of letters (on actual paper). It’s exceedingly unlikely a bank would suspend your account on missing one KYC e-mail and no further follow ups.


I agree that this was partly the OPs fault, but I think that Google did the very least.

1. Send more than one email. Three would probably be reasonable for an account-shutdown action, maybe at 10, 4, and 2 days.

2. Use a better subject. That subject sounds like they are giving you information. At least something like "Required Important Information about your Bank Account". But something even better would probably be like "Must Provide Information about your Bank Account to avoid Account Closure" and maybe some escalating severity in the later warnings.

3. Do they have a phone number on file? Give that a call if there is no response to the original email in 3 business days. It could even be an automated message. "Hello, this is Google Cloud. We have sent you an email about required action on your account. Please read the information in that email and act appropriately otherwise your account will be closed. Remember that all Google Cloud emails come from @google.com addresses."

4. A big banner in the Cloud Console. After 4 business days show it to all users in the account.

We are talking account shutdown of a paying customer's likely production services. They should take a bit of care.


I would try to put myself in the customer's shoes before shutting down their business because of missing paperwork. Even a single missed payment shouldn't be enough. Assume that the counterparty maybe didn't mean to do wrong; it almost always works.


"I missed an email that had important in the title because I didn't think it was important, now my servers are suspended and it's all Google's fault". Makes sense, I'm sure you'll fare better with Amazon.


I’d normally agree — if you ignore an important email, the consequences are on you — but the Google verification process is very hostile and Google do constantly send emails about “important” but actually innocuous things which trains users to consider these emails as unimportant.

The OP certainly has some culpability but having bad experience of this process myself multiple times, it does feel as if Google designed it to be hostile — the emails being just one part of it.

Google Cloud is great in many ways, I love the product, but the billing is such a pain, top to bottom.


>Google do constantly send emails about “important” but actually innocuous things which trains users to consider these emails as unimportant.

Haha yea. Like the post notes, the Data Processor notifications have gotten so irritating that I had to set up a filter to junk them immediately.

GDPR caused spam being foisted on the rest of us not in the EU.
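For what it's worth, a Sieve rule sketch of that kind of triage (the subject string is a guess at what those notices contain, not an actual Google header):

```
# Hypothetical Sieve rule: file the recurring data-processing notices
# into Junk while leaving other mail from google.com alone.
require ["fileinto"];
if allof (address :domain :is "from" "google.com",
          header :contains "subject" "Data Processing")
{
    fileinto "Junk";
}
```

Which is, of course, exactly how a genuinely important "provide your KYC details or be suspended" email ends up unread.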


“Google sends a single email to a single inbox before killing production servers” sounds more accurate.

I’m sure they will do better with AWS because AWS isn’t stupid enough to kill everything you have after a single email warning.


My payment info needed to be updated at AWS recently; they sent increasingly frequent emails about it for weeks (what can I say? I'm a bit lazy) and no services were affected while I dawdled. So, yes, AWS does it better.


More like "we're going to mess up a customer's entire business because we can't be bothered to make a phone call (even a robocall would have been fine)". I get phone calls/SMS for my phone bill/internet bill/electricity bill or even Amazon deliveries. I guess a customer's business is not that important to Google Cloud.


> "I missed an email that had important in the title because I didn't think it was important, now my servers are suspended and it's all Google's fault". Makes sense, I'm sure you'll fare better with Amazon.

No, but Google pretending that email is a reliable way to inform customers about something this important is.


It's all about expectations bound to your scale. I worked at a place that spent 7 digits a year with AWS. We missed invoices a lot, mostly because the person who could sign them off was a single point of failure. We never got suspended though, just confused emails from our account manager.


Please move to Hetzner. If you miss payments even they might pull the plug, but at least they have humans who respond to emails and can get you back online quickly once you pay.


This isn't really helpful because it's true until it isn't true or you get locked out in a way that you can't prove ownership or something.

All businesses need to at least consider BC and DR, even if they eventually decide they have to risk it. There are loads of reasons why a dependency goes down which you have no control over so even basic spreading between providers or having a less-capable but simple DR system can go miles towards having a business not get killed in this way.


As much as I love Hetzner and can recommend their legal department, they are very risk averse especially with new customers and might not even let you open an account.

Any sort of weirdness from your server will also result in a very harsh email and a threat to suspend your services in way less than 10 days. I had the pleasure to experience their grumpy network admins when IPFS decided that node discovery included dialing 900+ RFC1918 IPs.


Hetzner works the complete opposite. You might not even get to open an account :p


“Important” is a magic word that causes letters from the credit card company to go straight to the trash.


bait and switch.

It's so easy, all you need is x. Then when you're far enough in, they demand a through w, or you're shut down with extreme prejudice. Do they really need it? Is that what you agreed to? Did you negotiate the change in terms? "I am altering the deal, pray I do not alter the deal further..." They don't blush. They don't flinch; they don't even blink. Good luck if you really don't have another choice.

Is the whole "cloud" just a con to trap people with predatory pricing into a massive overspend?


TBF, Google definitely doesn't collect KYC data for fun. They'd much prefer not to.

But the Feds are pretty solid on the whole "aiding and abetting terrorism is treason" thing.


Tell them to take a hike until they come back with a warrant or at least a reasonable justifiable suspicion not dragnet surveillance. You know like google is a goddamn american company? What is this "Oh it's ok, we're just going along with the government's presumption of guilt to justify yet another power grab so that's fine."

TBF, Google are big enough to display some spine if they'd "prefer not to"; I'm pretty solid on that. So I flat out don't believe you when you say they'd prefer not to, because if that were true, they wouldn't. Are you sure I'm wrong to think that way? Looks to me like they're quite happy to.


> Tell them to take a hike until they come back with a warrant or at least a reasonable justifiable suspicion not dragnet surveillance

The Patriot Act doesn't work that way. Congress passed laws placing some reasonable minimum requirements on financial institutions that they have some sense of the actual identity of the people they are doing business with. Those are part of the "You're incorporated and do business in the financial sector" laws, so they're not a warrant kinda thing; a company's continued authorization to do business by law is contingent upon following them.

If you want to get into the philosophy of the Patriot Act, feel free, but if you're an individual US citizen, you have more power to change it than Google does. Google can push back, but that's a good way to get hauled in front of a Senate subcommittee or have their business practices curtailed unless the law itself changes.


You talk like google pay less in lobbying cash than an individual citizen.

Google could refuse and take that s&^t to the Supreme Court while funding the opponent of every politician who took them on for it. That's what they do to ensure they're not getting broken up like a monopoly should be, after all. But yeah, the latter is their profit, the former merely the right thing to do. They're motivated to enhance their profit at the expense of the country - which is what outsized market power /is/ - and to ignore anything where they might improve the country because there's no cash in it for them.

You kind of might want to revisit this part: "if you're an individual US citizen, you have more power to change it than Google does." Because it's clearly ridiculous on every level and comically so.

Did the law really change since these guys signed up with google or did google "overlook it" to sign them, then pull the switch later without any change in law between the two dates?


A) SCOTUS would tell them to file the damn paperwork. Patriot Act has been challenged several times; it's well-crafted to be in compliance with the Constitution.

B) Google already has a PAC but PACs aren't votes, though the cynics would claim otherwise. If people want to repeal any part of the Patriot Act, they need to make it an issue. The kind of issue that shows up on the news.

C) It's hard to get purchase on changing this because know-your-customer is... Kind of a good requirement actually? Anonymity in business enables all manner of lawbreaking; sunlight is the best disinfectant, etc.


A) you don't speak for the supreme court. But sure, that can be like, your opinion.

B) The WHOLE purpose of a PAC is to galvanise public opinion in ways that suit the donor and to obfuscate and de-amplify things that don't. This incredible government overreach is not an issue for Google's vast spending in Washington. Why? Because it's your fault, and if you cared you could move the needle and get it repealed? Seriously, come on. You can't fight Google, and if you tried you'd need a Google-sized budget to garner support! There's cynical, and there's wilfully naive and sometimes, maybe, deliberately disingenuous.

> C) It's hard to get purchase on changing this because know-your-customer is... Kind of a good requirement actually? Anonymity in business enables all manner of lawbreaking; sunlight is the best disinfectant, etc.

Yet they skipped the requirements when signing up customers, deliberately, by policy. Lower friction. Get them in first for sure! "It's so easy, just go!" The bait. Then came the switch, the outages and "It's not our fault it's the law" The law which IS unconstitutional and which hasn't changed between the bait and the switch. Google support clearly sucks very, very hard and it's not easy at all.

I wonder if one day we'll look back at "cloud" and think, "what the actual fudge? How?"


A) No doubt; I'm just a pundit. All we can do is guess based on past rulings, like the one where they upheld prevention of material support for terrorists, expert advice, or assistance (https://www.nytimes.com/2010/06/22/us/politics/22scotus.html...). Of course, they also overturned an instance where police tracking-bugged a suspect's vehicle without their consent. I assume "filing paperwork disclosing who you're doing business with" leans closer to preventing the former than doing the latter, but reasonable folks can disagree.

B) You can't fight Google with that attitude, no. ;)

... but to be less flip: You're not fighting Google. Google isn't using its PAC to advocate for KYC law. So I'm not sure what you're arguing for here. If you think Google should change their behavior, the way to act on that opinion would be to... Organize Google users to put pressure on the company. So, power's back in our court again.

> I wonder if one day we'll look back at "cloud" and think, "what the actual fudge? How?"

Oh, that I can help with. Maintaining one's own prem for anything that gets to be maintain-one's-own-prem-size is much worse. I've carried a pager for a cloud service, but I've never piled into the car at 2AM to drive half a state over to unlock my rented cage in a shared space with a physical link to a heavy-traffic backbone so I could power-cycle a machine. Nor have I put out a physical fire in a machine room, had a power supply blow up in my face, or driven up and down and up and down dirt roads to find the place where my fiber link is down because someone thought it'd be funny to take a shotgun to a junction box.

A lot of folks who ask "why Cloud" have nearly zero experience with what came before.

(TBF, for small businesses you can maybe handle your load with a few servers under a desk. Many people in this topic have explained how that goes though, and customers expect five-nines uptime these days.)


Dunno how this happened but you and I are having two completely different conversations.

Best.


You expect not to be treated the same way by AWS? Am I missing something here?


AWS doesn't do shutdowns over an email for missing details.

Even when they have an issue over non-payment, they wait at least 3 months, give calls, and send follow-up emails, and for the actual shutdown they only disable the network - the servers continue to run for some time, so if the payment is made, everything comes back without many issues.


True, but this wasn't a missed-payment shutdown; it's a KYC-compliance shutdown.

Still, I'd bet even money Amazon does the messaging better than Google on something like that.


AWS would be easy to get on the phone near instantly.


It's really annoying that companies like Google bombard you with all kinds of "cover our ass" e-mails about every change they make. Most about features I don't even use. Making it even harder to understand what the impact is and know if I can safely ignore the mail.

So I can fully understand people missing some "important" Google communication because of that.


With many online services, if there is something that needs verification (like confirming your email address), every time you use the app you will see a banner at the top reminding you of this need.

Anything that is a countdown to doom should be a persistent banner on the app, reminding you of the importance and maybe even showing how many days left you have to deal with it.


To be fair, Google cloud does this.

But most people don't log in to the Google Cloud dashboard every day for fun.


Google is actually uniquely bad at this. The system that displays notification banners for Cloud is a too-many-cooks-in-the-kitchen mechanism that requires multiple layers of authorization, signoff, scrutiny, and binary changes to their UI.

For that reason, it isn't (at least, didn't used to be) blocking to roll a feature out for that pipeline to work, so a lot of things that should be top-banner messages weren't.


Or maybe, make a phone call?


I got a phone call from a domain registrar a few months ago. They let me know one of my domains was expiring and they didn’t have a card on file.

I thanked them, hung up, and have since been slowly moving all my domains over to them.

I really appreciated that phone call.


With cloud orchestration tools it's pretty common to log in to the provider's web console infrequently. A banner is easy to miss, even if it's persistent.


I got a bill on GC that was less than a cent and my bank couldn't process that. GC closed my project (free tier) and then I paid the minimum amount that my bank allowed.


Reading this makes me wonder if there is room for something similar to "The Global Chubby Planned Outage". In addition to good comms, they could gradually degrade your service up until suspension as a way of getting your attention.


Suspended for a valid reason, and your fault, as I could clearly see 23 words into the post.


A nice big Threadripper in the corner of your office is all you need, I can almost certainly guarantee it. There are very very few businesses with less than 1000 employees that can't be run on one modern high end server.


What happens when the power goes out? Internet goes out? Someone spills a cup of coffee on the server? Landlord needs to do some work and drywall dust flies around everywhere?

It's not that most CRUD apps could run on a single machine, it's that they probably shouldn't. Customers want redundancy, automated failover, etc. Datacenters are much better at offering these things than your office park.


You can do automated failover to Raspberry Pi, it'll hold while the main server is cared for. If you go cloud, you deserve what you get. And what you get is what you get, because it's not your computer, and it's not your data.
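A minimal sketch of that failover with keepalived (the interface name and addresses are made up): the Pi runs as VRRP backup and claims a floating service IP only when the main box stops advertising.

```
# Hypothetical keepalived.conf on the Raspberry Pi (the backup).
# The main server runs the same block with state MASTER / priority 100.
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 50
    advert_int 1            # main box advertises every second
    virtual_ipaddress {
        192.0.2.10/24       # floating IP that clients connect to
    }
}
```

When the master's adverts stop, the Pi takes over 192.0.2.10 within a few seconds and hands it back once the master returns.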


> Someone spills a cup of coffee on the server?

Are people spilling coffee on your servers at work?


You invest in 1-2 UPSes and a 4G/5G mobile modem data plan that kicks in when needed. If you really want to be sure, get a diesel generator too. It will take some time to set up but it is worth it.

- No one can see your code

- No one can shut you down


Using a colo where each instance has a backup in another colo would probably be sufficient for most.


Interestingly enough the chances of that happening are actually less than an AWS region outage :).


This always assumes these applications are built by competent people who value efficiency. The job market is full of people who think they need high scalability but then underestimate the cost of an HTTP request or the memory footprint of keeping an entire dataset in memory. It has happened several times in my career building platforms that I have to double-check that the packet capture I just took from a single Kubernetes node is accurate (my favorite is 22k DNS queries, all A IN s3.amazonaws.com, in a 0.2s span). Similar things happen with memory and CPU.
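That particular failure mode is usually just a missing lookup cache. A minimal in-process sketch (the resolver here is a stand-in lambda; real code would call socket.getaddrinfo):

```python
import time

# Minimal TTL cache sketch: memoize lookups so a hot loop doesn't issue
# the same query thousands of times (e.g. 22k identical A queries in 0.2s).
class TTLCache:
    def __init__(self, resolve, ttl=60.0):
        self.resolve = resolve      # underlying resolver function
        self.ttl = ttl              # seconds a cached answer stays valid
        self.misses = 0             # actual resolver calls made
        self._cache = {}            # name -> (answer, fetched_at)

    def lookup(self, name):
        hit = self._cache.get(name)
        if hit is not None and time.monotonic() - hit[1] < self.ttl:
            return hit[0]           # fresh cached answer, no query sent
        self.misses += 1
        answer = self.resolve(name)
        self._cache[name] = (answer, time.monotonic())
        return answer

# Stand-in resolver; real code would call socket.getaddrinfo(name, 443)
cache = TTLCache(lambda name: "52.216.0.1")
for _ in range(22_000):
    cache.lookup("s3.amazonaws.com")
print(cache.misses)  # -> 1 (one resolver call instead of 22,000)
```

Most language runtimes and container base images don't cache DNS for you, which is why a local resolver cache (or even this much) makes those packet captures look sane again.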


You lose the benefit of a CDN though (caching content on a server closer to where each individual user is accessing it)


Is it just me or is Google getting a disproportionate number of locked-out issues?


Proportional to their prices, probably very proportionate. If you want better service, choose another provider and pay the price it costs to employ thousands of customer support staff, sometimes only to answer things like "why can't I see my EC2 instance" or "where can I find my invoices".

Personally, I have found both Azure and AWS to be pretty responsive with customer service, although AWS can be quite blunt with their replies, like when we tried to get them to authorise us to use EC2 instances as mail relays. "No, use SES" was the long and short of it.

ymmv


Scale also. If this happens in just 0.00001% of cases, that's plenty of users to get one or two events a day.


If your servers were suspended, how is your website functional?

https://console.onvoard.com/register/


In the second paragraph of the article: "I managed to get our servers back online few hours ago with a workaround by adding a new billing account."


I remember when Google was the best thing because they were saving us from MS.

Now they behave like this. I doubt even MS would have done a thing like this back in the day when they were the baddies.


The author mentions they will switch to AWS. But is Amazon really any better?

Probably still face this same risk. You don’t own the hardware. You are just leasing it.


It looks like a knee-jerk reaction. Not sure if Amazon will provide any more benefits. At least they'll get some alternative hosting.


Actual subject line from AWS email when account about to be terminated. Got my attention.

“Action required – Your AWS account is past due”


> Google Cloud have zero empathy for customers

Common to any large company, if you want better service, support small business.


Yes, this is fairly unreasonable behaviour by Google, but also, come on man. If this is your business and livelihood, then you should at the very least be scanning over all emails from your main service provider, never mind ones clearly marked as "important".


I have nothing running on GCP and still have hundreds of "important", "alert", etc. emails from their platform. Tax changes in Belarus! Advisory Notifications will become Generally Available! Spike in user reported spam!

Same reason I've almost thrown out important communications from my credit card company; they send so much "important!" mail that's actually an offer to sign up for paid credit card monitoring that it starts to be a spam flag.


I personally have 2 cloud compute instances running, one with a specially requested GPU, and going back 3 months I have no emails from them marked "important", and only two that haven't directly resulted from my own actions - each an update, clearly marked "[Update]", to some element of their legal infrastructure which I suspect they're required to inform us of.

have you actually checked this or are you just assuming it's the case?


Why does everyone want a copy of our identity documents? It's creepy and unsafe.



They're big on shutting you down for missing an email and then not talking to you. They did it to me over 10 years ago with AdSense. Also, I'm a big fan of you calling them out and vowing to never use them again. That's how it works.


Better use colocated servers. You own the hardware, the app and the data.



