I think the fact that it is not possible to put hard spending caps on API keys might be ruled illegal by some EU court soon enough, at least when they sell to consumers (given the explosion of vibecoding end-users making some apps). When I use OpenAI, Openrouter etc., I can put 10 $ on my API key, and when the key leaks, someone can use these 10 $ and that's it. With Google, there is no way to do that - there are extremely complicated "billing alerts" https://firebase.google.com/docs/projects/billing/advanced-b... , but these are time-delayed e-mails and there is no out of the box way to do the straightforward thing, which is to actually turn off the tap automatically once a budget is spent. The only native way to set a limit enforced immediately is by rate limiting - but I didn't see params which made it safe while usable in my case.
(A legal angle might be the Unfair Contract Terms Directive in the EU, though to my understanding plenty of individual countries have their own laws that may apply. A roughly equivalent situation was the "bill shock" experienced by mobile phone users, who went on vacation and arrived home to an outrageously high roaming bill that they didn't realize they had incurred. This is also limited today in the EU; by law, the service must be stopped after a certain charge is incurred.)
> When I use OpenAI, Openrouter etc., I can put 10 $ on my API key, and when the key leaks, someone can use these 10 $ and that's it.
On that note, I'll just mention that I had discovered over the last while that when you prepay $10 into your Anthropic account, either directly, or via the newer "Extra usage" in subscription plans, and then use Claude Code, they will repeatedly overbill you, putting you into a negative balance. I actually complained and they told me that they allow the "final query" to complete rather than cutting it off mid-process, which is of course silly, because Claude Code is typically used for long sessions, where the benefit of being cut off 52% into the task rather than 51% into it is essentially meaningless.
I ended up paying for these so far, but would hope that someone with more free time sues them on it.
I'm spitballing here, but I suspect that Google (same as AWS) does billing as post-processing: they run a job that scrapes the usage stats and THEN bills you for that, whereas the major AI companies check billing on every API request coming in.
Yes, you are on the money. A cloud service provider needs to maintain reliability first and foremost, which means they won't have a runtime dependency on their billing system.
This means that billing happens asynchronously. You may use queues, you may do batching, etc. But you won't have a realtime view of the costs
>they won't have a runtime dependency on their billing system
Well, that makes sense in principle, but they obviously do have some billing check that prevents me from making additional requests after that "final query". And they definitely have some check to prevent me from overutilizing my quota when I have an active monthly subscription. So whatever it is that they need to do, when I prepay $x, I'm not ok with them charging me more than that (or I would have prepaid more). It's up to them to figure this out and/or absorb the costs.
They do have a billing check, but that check is looking at "eventually consistent" billing data which could have arbitrary delays or be checked out-of-order compared to how it occurred IRL. This is a strategy that's typically fine when the margin of over-billing is small, maybe 1% or less. I take it from your description that the actual over-billing is more like dozens of dollars, potentially more than single-digit percentages on top of the subscription price. Here's hoping they tighten up metering <> billing.
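To make the "eventually consistent" point concrete, here is a toy model, not a description of how any provider actually meters. The function name `simulate_overshoot` and all of its parameters are made up for illustration. It assumes every request costs the same and that the balance check only sees charges that are at least `lag` requests old.

```python
def simulate_overshoot(limit_cents: int, cost_cents: int, lag: int, attempts: int) -> int:
    """Return the total actually charged when the limit check reads a stale ledger.

    Assumption (illustrative only): the check cannot see the most recent
    `lag` charges, because they have not been aggregated yet.
    """
    charges = []  # real charges, in the order they happened
    for _ in range(attempts):
        # The check only sees charges that have already been aggregated.
        visible_count = max(0, len(charges) - lag)
        visible_spend = sum(charges[:visible_count])
        if visible_spend >= limit_cents:
            break  # the provider finally notices and blocks further requests
        charges.append(cost_cents)
    return sum(charges)


if __name__ == "__main__":
    for lag in (0, 5):
        total = simulate_overshoot(limit_cents=1000, cost_cents=100, lag=lag, attempts=100)
        print(f"lag={lag}: charged {total} cents against a 1000 cent limit")
```

With these numbers a lag of 0 stops at exactly 1000 cents, while a lag of 5 requests lets 1500 cents through. In this model the overshoot is simply the lag multiplied by the cost per request, which is why it stays small for cheap requests and gets painful for long, expensive ones.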
Then the right thing to do from a consumer standpoint is to factor that overbilling into their upfront pricing, rather than surprising people with bills that they were led not to expect.
> they obviously do have some billing check that prevents me from making additional requests after that "final query"
No they don't actually! They try to get close, but it's not guaranteed (for example, make that "final query" to two different regions concurrently).
Now, they could stand up a separate system with a guaranteed fixed cost, but few people want that and the cost would be higher, so it wouldn't make the money back.
You can do it on your end though: run every request sequentially through a service and track your own usage, stopping when reaching your limit.
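A minimal sketch of that idea, assuming you know an upper bound on the cost of each request before sending it and can read the actual cost afterwards. `BudgetGuard`, `BudgetExceeded` and `request_fn` are hypothetical names, not part of any provider's SDK; costs are tracked in integer cents to avoid float rounding.

```python
import threading
from typing import Any, Callable, Tuple


class BudgetExceeded(Exception):
    """Raised when a request could push spend past the hard cap."""


class BudgetGuard:
    """Runs requests one at a time and refuses any that might exceed the cap."""

    def __init__(self, limit_cents: int) -> None:
        self.limit_cents = limit_cents
        self.spent_cents = 0
        self._lock = threading.Lock()

    def call(self, max_cost_cents: int, request_fn: Callable[[], Tuple[Any, int]]) -> Any:
        """Run request_fn if its worst-case cost still fits in the budget.

        request_fn is a hypothetical callable that performs the API call and
        returns (result, actual_cost_cents).
        """
        # Holding the lock for the whole call serialises requests, so two
        # concurrent callers can never both pass the check on the same balance.
        with self._lock:
            if self.spent_cents + max_cost_cents > self.limit_cents:
                raise BudgetExceeded(
                    f"spent {self.spent_cents}, request may cost {max_cost_cents}, "
                    f"limit {self.limit_cents}"
                )
            result, actual_cost_cents = request_fn()
            self.spent_cents += actual_cost_cents
            return result
```

The trade-off is the one described above: everything goes through a single sequential choke point, so you give up throughput in exchange for a cap that actually holds. It also only covers spend made through the guard, so it does nothing against a leaked key used elsewhere.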
I don't know if it's still like this, but around 1 year ago I set a spending limit for an OpenAI API key and it turned out it's not a true limit. I spent $80 on a key limited to $20 in a matter of minutes due to some bad code I wrote causing a runaway loop.
I still had to pay it or else I wouldn't have been able to use my account.
OpenAI also does a really fun thing where prepaid credits just straight up expire after a year, which is completely illegal in most (all?) of the EU.
In fact, OpenAI's "billing", "usage tracking" and "billing/spending alerts" features all have terrible UX. They look like completely independent features.
For example, you can set an alert on how much you've spent in a month, but not on how much you have left in your credit bank. So you never really know how much you can still spend unless you go check their slow and confusing UI. You can set it to auto-refill your credits and to limit that to some amount per month (I think?), but again the alerts for this are absolutely atrocious or entirely missing.
Another insane thing I've seen with OpenAI is that, for some reason, your account can be thousands in the red, and some prompts, with some models, or some feature set, still go through. I haven't been able to figure out what heuristic or rule they are using to determine when they let your request through and overbill you, or when they just deny it altogether. Maybe they let all text requests through? Or perhaps it just lets websearch requests through and denies anything else? Maybe it profiles your most common request and lets those go through? Maybe it has something to do with specific endpoints and APIs? Who knows.
We've moved entire projects off of them in part due to these issues. We got tired of constantly being in the red without a proper notification system (actually: with an insufficient, deceitful system), or of having seemingly random drops in requests only to find out suddenly that combination of parameters got blocked. Please, just completely block me and make me pay. Or give me a better alerts system. We have the money. What we haven't got is the patience to deal with such an obtuse system
let's hope it happens soon, I'm pretty sick of this reality where companies get to charge you whatever they want and it's designed to always be your fault
You're configuring something that costs money (electricity, hardware, real estate) to provide. Either it's "pay as you go" or you have a flat rate and a cap.
If you have a cap and then your thing hits the front page and suddenly has 10000% more legitimate traffic than usual, and you want the legitimate traffic, they're going to get an error page instead of what you want. If there is no cap, you're going to get a large bill. People hate both of those things and will complain regardless of which one actually happens.
The main thing Google is screwing up here is not giving you the choice between them.
The main thing Google is screwing up is that if my API key somehow leaks and I end up with extremely out of line billing at Microsoft, I will be on the phone with a customer representative as soon as we or they notice something weird happening and a solution will be found.
Google will probably have me go through five bots and if, by some kind of miracle, I manage to have a human on the phone, they will probably explain to me that I should have read the third paragraph of the fourth page of the self service doc and it's obviously my fault.
It took me approximately 6 months to get a billing dispute resolved with Google. Somehow my maps key got leaked, and someone ran up 1.8k in charges on it.
Super, super painful. That being said, I'm still using Google for geocoding (mostly batch) because their service works better for my data.
Imagine the outrage here when a company credit card expires and the cloud provider terminates all their instances and deletes all their storage and blob backups.
It's not an either/or; they can easily let me configure any kind of behavior that I want. No cap, a hard cap, a soft cap, a cap that I program with a Python script, a cap where I throttle, a cap where I opt in to deleting certain machines to save money. It can all be done. People are complaining because obvious features are not provided. People would not be complaining if they had all the options they need to control how to scale resources in response to load, not just technical load but also financial load.
You can already do any of those things in your own code when making the API requests. The issue here is, if you unintentionally try to make a billion expensive requests or allow someone else to do it against your account, do you want them to automatically turn off your stuff or do you want the bill that comes if they don't?
You seem to not comprehend the concept of informed choice.
Upstream in the comments someone said they expect the EU might soon rule this type of billing illegal. That doesn't mean it becomes illegal, it just means yet another reaffirmation or reminder that - yes - this is indeed illegal.
You said that no fixed response -whether that is allow unexpected billing to increase without limit upon a surge vs serving error pages- will be accepted by the clientele, because some want it one way and others want it the other way.
Why would you force a single shoe size onto a population? Give them the choice. Whenever freedom of choice is violated in the name of market freedom, it is nearly always a violation of law, it's just a matter of hoping one lives in a jurisdiction that upholds its laws
> The issue here is, if you unintentionally try to make a billion expensive requests or allow someone else to do it against your account, do you want them to automatically turn off your stuff or do you want the bill that comes if they don't?
That is precisely the choice people are asking for! And it doesn't have to be just those 2 options: let the user define their own trigger formulas for different levels of increase. A small one might result in a notification delayed until certain working hours on weekdays, and log each visitor's reported origin (Referer header). A slightly larger one might result in a notification during waking hours regardless of the day of the week. A larger increase still may trigger an unconditional notification, and a yet larger one an unconditional notification that requires a timely confirmation by the user/organization. In the absence of that confirmation, a soft measure could be taken, like adding a small header to the page being served, notifying visitors that while the site is still functional a hug of death may be in progress, and asking them to paste the URL of the page where they clicked the link to your site (so that a full URL can be consulted in case the host operators are unable to find the hyperlink that led to their site from the origin domain alone). Yet another increase in traffic may be set to rate limit specifically the users arriving from the originating domains that caused the peak, so that your regular visitors can still make normal use of the page, and so on.
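A rough sketch of what such user-defined escalation tiers could look like. Everything here is hypothetical: the `Tier` type, the threshold ratios and the action names are invented for illustration, and a real policy would need to dispatch the actions rather than just name them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    min_ratio: float  # current usage divided by normal baseline usage
    action: str       # label for what the host should do at this level


# Illustrative thresholds only; each user would pick their own.
TIERS = [
    Tier(1.5, "notify_in_working_hours"),
    Tier(3.0, "notify_in_waking_hours"),
    Tier(10.0, "notify_unconditionally"),
    Tier(30.0, "require_confirmation_else_show_banner"),
    Tier(100.0, "rate_limit_spiking_referrers"),
]


def choose_action(current_rate: float, baseline_rate: float) -> str:
    """Return the action of the highest tier whose threshold has been reached."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    ratio = current_rate / baseline_rate
    chosen = "none"
    for tier in sorted(TIERS, key=lambda t: t.min_ratio):
        if ratio >= tier.min_ratio:
            chosen = tier.action
    return chosen
```

For example, usage at 4x the baseline maps to `notify_in_waking_hours`, while 200x maps to `rate_limit_spiking_referrers`.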
Do freedom, choice, informed choice, preparedness mean something to you?
We could have an open, standard, machine-readable text file format for these choices and settings, so that people can share their settings. The format could have <private> tags to wrap around phone numbers etc. to notify, so that people can easily run a command line program or script that censors exactly those values: the first phone number, like "<private><phone>(+32)474123456</phone></private>", is replaced with "<private><phone>generic phone number 1</phone></private>", and the second email address in the file, like "<private><email>john.smith@nonprofit.org</email></private>", is replaced with "<private><email>generic.email@address.2</email></private>". People could then easily export and share such files, possibly hosting them like robots.txt but as, say, billing_policy.txt, so others can inspect how these situations are handled and popular consensus policies can form.
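The censoring script could be quite small. A sketch, assuming the tags are written exactly as in the examples above (no attributes, no whitespace between `<private>` and the inner tag) and that only `phone` and `email` need handling; the function name `redact` is made up.

```python
import re
from collections import defaultdict

# Matches <private><phone>...</phone></private> and the email equivalent.
PRIVATE = re.compile(r"<private><(phone|email)>(.*?)</\1></private>", re.DOTALL)


def redact(text: str) -> str:
    """Replace each private phone/email value with a numbered generic placeholder."""
    counters = defaultdict(int)  # separate running number per kind of value

    def replace(match: re.Match) -> str:
        kind = match.group(1)
        counters[kind] += 1
        n = counters[kind]
        if kind == "phone":
            placeholder = f"generic phone number {n}"
        else:
            placeholder = f"generic.email@address.{n}"
        return f"<private><{kind}>{placeholder}</{kind}></private>"

    return PRIVATE.sub(replace, text)
```

Note that this numbers every occurrence separately, so the same phone number appearing twice gets two different placeholders. Mapping identical values to the same placeholder would be an easy extension if the shared file needs to preserve that relationship.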
Hosting, compute etc. services that allow users to configure such files and have them be executed by the hosting service will be more attractive than those which don't.
> You said that no fixed response -whether that is allow unexpected billing to increase without limit upon a surge vs serving error pages- will be accepted by the clientele, because some want it one way and others want it the other way.
No, it's because people dislike both of those things and don't want either one of them, and frequently fail to realize ahead of time that choosing between them is even necessary and then get upset by whichever one actually happens.
> Why would you force a single shoe size onto a population?
Here's my original post:
> The main thing Google is screwing up here is not giving you the choice between them.
> And it doesn't have to be just those 2 options
We're talking about an API used by programmers. You don't need them to give you any of that, all you need is for the API to tell you what your current usage is -- and even that is only necessary if something other than your own code is racking up usage. When you're the one making queries and the price of each one is known ahead of time or available via the API, you can already implement any of that logic yourself.
You're oversimplifying the problem in the other direction. Fine-grained scriptability of hard limits would bump up against all of the thorny distributed systems problems. But I do agree that fixing the simple cases is straightforward: maximum spend rates per instant and per unit of time (e.g. per minute, hour, day, month). Providers would shoulder the small costs from the slightly leaky assumptions they have to make to implement those limits, and users can then operate within that framework to optimize what they want on a best-effort basis (e.g. a script that responds within a minute to explicitly scale resources, or a human-in-the-loop notification cycle over the course of hours so that you have the possibility to say "actually this is popularity traffic that I really do want to pay for", etc.).
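As an illustration of the "maximum spend per unit of time" case, here is a single-process sliding-window sketch. `SpendRateLimiter` is a hypothetical name, and this deliberately ignores the distributed part of the problem: a provider would have to approximate this across many machines, which is where the slightly leaky assumptions come in.

```python
from collections import deque


class SpendRateLimiter:
    """Allows at most max_cents of spend within any window_seconds interval."""

    def __init__(self, max_cents: int, window_seconds: float) -> None:
        self.max_cents = max_cents
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, cents), oldest first
        self._total = 0         # sum of cents currently inside the window

    def try_spend(self, cents: int, now: float) -> bool:
        """Record the spend and return True if it fits, else return False."""
        cutoff = now - self.window_seconds
        # Drop charges that have aged out of the window.
        while self._events and self._events[0][0] <= cutoff:
            _, old_cents = self._events.popleft()
            self._total -= old_cents
        if self._total + cents > self.max_cents:
            return False
        self._events.append((now, cents))
        self._total += cents
        return True
```

Stacking several of these (one per minute, hour, day, month) and requiring all of them to accept a charge gives the layered limits described above. The timestamp is passed in explicitly so the behavior is easy to test; in practice you would pass something like `time.monotonic()`.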
> I'm pretty sick of this reality where companies get to charge you whatever they want and it's designed to always be your fault
But have you considered it from the company's POV? Charging whatever you like and it's always the customer's fault is a pretty sweet deal. Up next in the innovation pipeline is charging customers extra fees for something or other. It'll be great!
Why should I care about the company's POV? The company always wants to rat fuck everyone to make money. The company should be legally compelled to care about the customer because that's the only way these things change.
This is just the utility model. It's nothing particularly nefarious. Consider what your electric utility, your water utility, etc. do. If you use more, you pay more. If someone comes around and hooks up a garden hose to your outside faucet and steals your water, or plugs an extension cord into your outside outlet and steals your electricity, you still pay. Unless you can catch the thief and make him pay.
Funny enough, the utility business broadly wants to move away from this model to more of a cap-based prepaid model. Where I live, to get on the standard payment system may require a quite hefty deposit up front, but the prepaid payment option does not. I get the impression that, if not for customer sentiment and inertia, this would be the default option.