I feel like I'm constantly hitting the 5 hour limits without doing very much. If my weekly usage is gone by Tuesday, I'll probably quit using Claude Code outright.
Okay, but when will we get visibility into this beyond a notice that we're at 50% of the limit? If you're going to introduce week-long limits, transparency into usage is critical.
Apparently people are consistently getting thousands of dollars worth of tokens for their $200/mo sub so this was just obviously unsustainable for Anthropic.
I switched to Claude Code because of Cursor’s monthly limits.
If I run out of my ability to use Claude Code, I'm going to just switch back to Cursor and stay there. I'm sick of these games.
If you think it's OK, then have Anthropic dogfood it: put every employee on the Pro plan, keep telling them they must use it for their work but can't upgrade, and see how they like it.
Anthropic has plans such as $150/user and $150 for 5 users with fewer hours per user. I could not work out what 2-user teams or heavy-usage 5-user teams are supposed to do.
There will always be people who try to steal, but there may also be people who just don't understand the pricing.
Also, some people keep going forever in the same session, which maxes things out, since the whole history is sent with every request. Some prompting about that ("your thread has gotten long...") would probably save quite a bit of usage and prevent innocent users from getting locked out for a week.
This was bound to happen at some point, but it probably won't affect most users net-net. I think it's pretty useful for a variety of tasks, but those tend to fall into a rather narrow category (boilerplate, simple UI change requests, simple doc-strings/summaries), and there is only so much of that work required in a month. I certainly won't be cancelling my plan over this change, but so far I also haven't seen a reason to increase it over the hobbyist-style $20/mo plan. When I do run into usage limits, it's usually already at the end of the day, or I just pivot to another task where it isn't helpful.
Vibe pricing. That's all this is. "Pay us $200/mo and get... access". There's no way to get a real usage meter (ccusage doesn't count). I want an Anthropic dashboard showing "you've used x% of your paid quota". Instead we get vibe usage. Vibe pricing. "Hey pay us money and we'll give you some level of access but like you won't know what, but don't worry only 5% of our users will trip the switches" bullshit. Someone else in this thread nailed it:
> sounds like it affects pretty much everyone who got some value out of the tool
Feels that way.
But compared to paying the so-called API pricing (hello ccusage) Claude Code Max is still a steal. I'm expecting to have to run two CC Max plans from August onwards.
Seems like their business plan is unsustainable. What's a sustainable cost model?
Say an 8xB200 server costs $500,000, with 3 years depreciation, so ~$166k/year per server. Say 10 people share that server full time, so you need ~$16.7k/year/person to break even, or a ~$1,388/month subscription at 10 users per server.
If they get it down to 100 users per server (doubt it), then they can break even at $138/month.
And all of this is just server costs...
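A quick back-of-the-envelope check of that arithmetic (the server price, depreciation period, and user counts are the assumptions above, not real figures):

```python
# Rough break-even check for the figures above (all inputs are assumptions).
server_cost = 500_000      # assumed price of an 8x B200 server, USD
years = 3                  # assumed straight-line depreciation
yearly_cost = server_cost / years          # ~$166,667 per year

for users_per_server in (10, 100):
    per_user_per_month = yearly_cost / users_per_server / 12
    print(f"{users_per_server:>3} users/server -> ~${per_user_per_month:,.0f}/month to break even")

# Output:
#  10 users/server -> ~$1,389/month to break even
# 100 users/server -> ~$139/month to break even
```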
Seems AI coding agents should be a lot more expensive going forward. I'm personally using 3-4 agents in parallel as well..
Still, it's a great problem for Anthropic to have. "Stop using our products so much or we'll raise prices!"
This explanation is so vague that it’s hard to take seriously. Anthropic has full access to usage data—they could easily identify abusive users and throttle them specifically. But they don’t. Why? Because it was never really about stopping abuse. The truth is: Anthropic can’t handle the traffic and growth, and now they’re looking for a convenient excuse to limit access and point fingers at so-called “heavy users.”
The problem is, we have no visibility into how much we’ve actually used or how much quota we have left. So we’ll just get throttled without warning—regularly. And not because we’re truly heavy users, but because that’s the easiest lever to pull.
And I suspect many of us paying $200 a month will be left wondering, “Did I use it too much? Is this my fault?” when in reality, it never was.
This is fantastic news. They're burning through too much cash too fast. They're going to have to sooner or later charge more money at which point businesses will balk at the price and the AI hype cycle will come to an end. I can't wait.
I feel rug pull after rug pull ($10->$20, hourly quotas, weekly quotas) because they can't scale and they aggressively focus on the $200+ customers and limit the lower tier to maximize profits.
If you wanted a more equitable experience for all you wouldn't just limit the high-end users, but return the money to low-end users.
Charging a low flat fee per use and still warning when certain limits are hit is possible. But it's market segmentation not to do it: just charge a flat fee, then lop off the high end, and you maximize profit.
Let's start by stating that Opus 4 + Sonnet 4 are a gift to humanity. Or at least to developers.
The two models are not just the best models for coding at this point (in areas like UX/UI and following instructions they are unmatched); they come packaged with possibly the best command-line tool today.
They invite developers to use them a lot. Yet for the first time ever, I feel I can't 100% rely on the tool, and there's a lot of pressure when using it. Not because I don't want to pay, but because the options are either:
> A) Pay $200 and be constantly warned by the system that you are close to hitting your quota (very bad UX)
> B) Pay $$$??? via the API and see how your bill grows to +$2k per month (this is me this month via Cursor)
I guess Anthropic now faces the great dilemma: should they make the models more efficient and lower prices to increase limits and boost usage, or should they milk their cash cows while they can?
I am pretty sure no other model comes even close in terms of developer-hours at this point. Gemini would be my second-best guess, but Gemini is still lagging behind Claude and not that good at agentic workloads.
Guess they ran into the usage limits themselves when they worked on the messaging in Claude Code: "Claude usage limit reached. Your limit will reset at 8pm (UTC)"
For the sake of saying something positive on HN, Claude Code is great. I haven't run into any limits yet. My code is quite minimalist and the output of Claude Code is also minimalist, maybe that's why.
If you work on some overengineered codebase, it will produce overengineered code; this requires more tokens.
I'm guessing less than 5% of the users are just letting Claude Code run in an autonomous loop making slop. I tried this too, and Opus 4 isn't good enough to run autonomously yet. The Rube Goldberg machine needs to be carefully calibrated.
I sincerely enjoy and appreciate Claude; this feedback is based on that:
- It would be nice to have a way to know, or infer, what percentage of capacity you're currently using (your rate of usage) and how much you have left, relative to the available capacity. Being scared to use something is different from being mindful about it.
- Since usage can feel a little subjective/relative (simple things might use more tokens, or fewer) and depends on things beyond the user's own behavior, it would be nice to see how much capacity is left, both under the current model and under the new one a month from now, so users can learn.
- If there are lower "capacity" usage rates available at night versus during the day, or at other slower times, that would be worth knowing. It would help users who would like to plan around it, as opposed to people who are just making the most of it.
I only began using Claude because OpenAI was fumbling in my use cases. Whenever their public facing offering was rate limiting, or experiencing congestion, or having UX issues like their persistent "network error" in the middle of delivering a response, then I would go to Claude.
You having the same issue kills the point of using you.
Afraid I'm in the 5%. Not doing anything nefarious, just lots of parallel usage: no scripting or overnight runs or anything.
I just found ccusage, which is very helpful. I wish I could get this straight from the source; I don't know if I can trust it. According to it, I've been using more than my $200 monthly subscription in token value basically every day, around 30x the supposed cost.
I've been trying to learn how to make Claude Code use Opus for planning and Sonnet for execution automatically; if anyone has a good example of this, please share.
Can we PLEASE fix the bug in VS Code where the terminal occasionally scrolls out of control and VS Code crashes? It is very painful and we have to start the context all over again. This happens at least 1x per day.
Tangential: Is there a similar service we can use in the CLI, a replacement for CC? I like Cursor, and I pay for both Cursor and CC, but I live in the terminal (tmux, nvim, claude code, lazygit, yazi) and I prefer an agentic coding experience there. But CC has deteriorated so much in the past weeks that I constantly use repomix to compress whole projects and ask o3 for help, because CC just can't solve tasks that it previously would have solved in a single shot.
I'm sure it's way more than 5%; ISPs pulled the same thing with bandwidth caps to shame people.
Part of the reason there is so much usage is that using Claude Code is like a slot machine: sometimes it's right, but most of the time it needs to rework what it did, which is convenient for them. Plus, their pricing is anything but transparent about how much usage you actually get.
I'll just go back to ChatGPT. This is not worth the headache.
I'm probably not going to hit the weekly limit, but it makes me nervous that the limit is weekly as opposed to every 36 hours or something. If I do hit the limit, that's it for the entire week—a long time to be without a tool I've grown accustomed to!
I feel like someone is going to reply that I'm too reliant on Claude or something. Maybe that's true, but I'd feel the same about the prospect of losing ripgrep for a week, or whatever. Losing it for a couple of days is more palatable.
Also, I find it notable they said this will affect "less than 5% of users". I'm used to these types of announcements claiming they'll affect less than 1%. Anthropic is saying that one out of every 20 users will hit the new limit.
I love Claude Code, but Anthropic’s recent messaging is all over the map.
1- “More throughput” on the API, but stealth caps in the UI
- On Jun 19 Anthropic told devs the API now supports higher per-minute throughput and larger batch sizes, touting this as proof the underlying infra is scaling. Yay!??
- A week later they roll out weekly hard stops on the $100/$200 “Max” plans, affecting up to 5% of all users by their own admission.
Those two signals don’t reconcile. If capacity really went up, why the new choke point? I keep getting this odd visceral reaction/anticipation that each time they announce something good, we are gonna get whacked on an existing use case.
2- Sub-agents encourage 24x7 workflows, then get punished… The Sub-agent feature docs literally showcase spawning parallel tasks that run unattended. Now the same behavior is cited as “advanced usage … impacting system capacity.”
You can’t market “let Claude handle everything in the background” and then blame users who do exactly that. You’re holding it wrong?
3- Opaqueness forces rationing (another poster comments re: rationing vs. hoarding; I can't see it as hoarding since it's use-it-or-lose-it).
There's still no real-time meter inside Claude/CC, only a vague icon that turns red near 50%. Power users end up rationing queries because hitting the weekly wall means a seven-day timeout. That's a dark pattern if I've ever seen one, and not appropriate for developer tooling. (ccusage is a helpful tool that shouldn't be needed!)
The "you're holding it wrong" framing seems so bizarre to me when all of the other signaling is about more usage, more use cases, more dependency.
I'm not sure how this will play out long term, but I really am not a fan of having to feel like I'm using a limited resource whenever I use an LLM. People like unlimited plans, we are used to them for internet, text messaging, etc. The current pricing models just feel bad.
Confused on the Max 5x vs Max 20x. I'm on the latter, and in my email it says:
> "Most Max 20x users can expect 240-480 hours of Sonnet 4 and 24-40 hours of Opus 4 within their weekly rate limits."
In this post it says:
> "Most Max 5x users can expect 140-280 hours of Sonnet 4 and 15-35 hours of Opus 4 within their weekly rate limits."
How is the "Max 20x" only an additional 5-9 hours of Opus 4, and not 4x that of "Max 5x"? At least I'd expect a doubling, since I'm paying twice as much.
200 bucks a month isn't enough. Fine. Make a plan that is enough so that I will be left alone about time limits and enforced breaks.
NOTHING breaks flow better than "Whoops! Time's up!"; it's worse than credit quotas -- at least there I can make a conscious decision about whether to spend more money on the project.
This whole 'twiddle your thumbs for 5 hours while the GPUs cool off' concept isn't productive for me.
'35 hours' is absolutely nothing when you spawn lots of agents, and the damn thing is built to support that behavior.
It's strange that everywhere I look, people just believe everything they say and blame the users. It's not as if a big corp has ever lied before...? People are just so gullible.
On the $20 Pro sub, for anything above a Hello World project size, it's easy to hit the 5-hour window limit in just 2 hours. Most of the tokens are spent on Claude Code's own stupidity and its mistakes quickly snowballing.
From Anthropic’s Reddit account:
> One user consumed tens of thousands in model usage on a $200 plan. Though we're developing solutions for these advanced use cases, our new rate limits will ensure a more equitable experience for all users while also preventing policy violations like account sharing and reselling access.
This is why we can’t have nice things.
They need metered billing for their plans.
All AI companies are hitting the same thing and dealing with the same play - they don't want users to think about cost when they're prompting, so they offer high cost flat fee plans.
The reality is though there will always be a cohort of absolute power users who will push the limits of those flat fee plans to the logical extremes. Startups like Terragon are specifically engineered to help you optimize your plan usage. This causes a cat and mouse game where they have to keep lowering limits as people work around them, which often results in people thinking about price more, not less.
Cursor has adjusted their limits several times, now Anthropic is, others will soon follow as they decide to stop subsidizing the 10% of extreme power users.
Just offer metered plans that let me use the web interface.
This email could have been a lot more helpful if it read "in the following months, your account would have hit one of these rate limits: Aug 2024, Jan 2025, May 2025" or similar.
I have no idea if I’m in the top 5% of users. Top 1% seems sensible to rate limit, but top 5% at most SaaS businesses is the entire daily-active-users pool.
It turns out that "sell at a loss and make up for it in volume" may not be a solid business strategy. The bait-and-switch continues in the AI bubble.
That's fine, please make it VERY CLEAR how much of my limit is left, and how much i've used.
"... and advanced usage patterns like running Claude 24/7 in the background" this is why we can't have nice things
For most people, this is a tool we use daily. What’s the reasoning behind choosing a weekly usage limit instead of a daily one? Is it because the top 5 percent of users tend to have spiky usage on certain days, such as weekends? If that’s the case, has there been any consideration of offering different usage tiers for weekdays and weekends?
I’m just curious how this decision came about. In most cases, I’ve seen either daily or monthly limits, so the weekly model stood out.
I've gotten some very good use out of LLMs outside of standard U.S. work hours, but I often find that they are quite awful at being helpful coding assistants during my work day. I assume this is due to users competing for resources.
My issue is: a request made during peak usage is treated the same as a request made during low usage times even though I might not be able to get anything useful/helpful out of the LLM during those busy hours.
I've talked with coworkers and friends who say the same.
This isn't a problem with Claude specifically - seems to happen with all the coding assistants.
Possibly dumb suggestion, but what about adaptive limits?
Option 1: You start out bursting requests, and then slow them down gradually, and after a "cool-down period" they can burst again. This way users can still be productive for a short time without churning your servers, then take a break and come back.
Option 2: "Data cap": like mobile providers, a certain number of high requests, and after that you're capped to a very slow rate, unless you pay for more. (this one makes you more money)
Option 3: Infrastructure- and network-level adaptive limits. You can throttle process priority to de-prioritize certain non-GPU tasks (though I imagine the bulk of the processing is on GPU?), and you can apply adaptive QoS rules to throttle network requests for certain streams. Another option is separate pools of servers (assuming you're on k8s or similar): based on incoming request criteria, schedule the heavy jobs onto slower servers and prioritize short, fast jobs onto the faster servers.
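A minimal sketch of what Options 1 and 2 could look like server-side: a token bucket that allows bursts, refills over a cool-down, and falls back to a slow lane instead of a hard weekly cutoff (all numbers are placeholder assumptions, not anything Anthropic actually does):

```python
import time

class AdaptiveLimiter:
    """Sketch of Options 1-2: a burst allowance that refills over a cool-down,
    then a slow 'capped' lane instead of a week-long lockout."""

    def __init__(self, burst_tokens=50, refill_per_hour=10, capped_interval_s=60.0):
        self.capacity = burst_tokens
        self.tokens = float(burst_tokens)
        self.refill_per_s = refill_per_hour / 3600.0
        self.capped_interval_s = capped_interval_s   # slow lane once the bucket is empty
        self.last = time.monotonic()
        self.last_capped_request = 0.0

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill the bucket based on elapsed time, up to its capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_per_s)
        self.last = now
        if self.tokens >= 1.0:            # burst lane: spend a token
            self.tokens -= 1.0
            return True
        # Capped lane: still usable, just slow, so work is never fully blocked.
        if now - self.last_capped_request >= self.capped_interval_s:
            self.last_capped_request = now
            return True
        return False
```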
And aside from limits, it's worth spending a day tracing the most taxing requests to find whatever the least efficient code paths are and see if you can squash them with a small code or infra change. It's not unusual for there to be inefficient code that gives you tons of extra headroom once patched.
> affecting less than 5% of users
It's probably phrased to sound like a small number, but as someone used to seeing 99% uptime (or, conversely, 1% downtime) described as bad because it affects lots and lots of users, this feels massive. If you have half a million users (I have no idea, just a ballpark guess), then you're saying this will affect just shy of 25 thousand people, the ones who use your product the most. Oof!
Anthropic has been incredibly generous. I use regularly ~750 USD worth of opus tokens per month, which is a great deal for 200 USD. I’ve never hit a limit on the Max 20x plan, but the Max 5x plan was laughably limited. The impression I got was that there was basically no limiting at all, and Anthropic was just watching the usage patterns.
It’s an all you can eat buffet, you’re just not allowed takeout!
I'm well within the 95%. I might lack an imagination here, but... What are you guys doing that you hit or exceed limits so easily, and if you do... Why does it matter? Sometimes I'd like to continue exploring ideas with Claude, but once I hit the limit I make a mental note of the time it'll come back and carry on planning and speccing without it. That's fine. If anything, some time away from the slot machine often helps with ensuring I stay on course.
I don't have a problem with companies adding usage limits and whatnot, but it's shady to do for existing customers who have pre-paid for some amount of time. If I pay the yearly up-front amount, I expect my terms of use to stay the same for that entire year.
Counter-take: this is a good thing.
Seems like some people are account-sharing or scripting/repackaging to such an extent that they were able to "max out" the rate limit windows.
Ultimately - this all gets priced in over time; whether that's in a subscription change or overall rate limit change, etc.
So if you just want to use it as intended, stopping this kind of pattern should, over time, be better for us?
OK, I really have to figure out how to run a local setup of the open-source LLMs. I know, I know, the "fixed costs" are high. But I have a strong feeling that being able to set up local LLMs (and the rig for it) is the next build-your-own-PC phase. All I want is a coding agent and the grunt power to run it locally. Everything else I'll build (generate) with it.
I see so many folks claiming crazy hardware rigs and performance numbers, so I have no idea where to begin. Any good starting points on this?
(OK, budget is TBD, but seeing a "you get X for $Y" breakdown would at least help me make an informed decision.)
One feature I would love to have is the ability to switch the model used for a message using a shorthand like #sonnet. Often, I don't want or need opus but I don't want to engage in a 3 step process where I need to:
1. switch models using /model
2. message
3. switch back to opus using /model
Help me help you (manage usage) by allowing me to submit something like "let's commit and push our changes to github #sonnet". Tasks like these rarely need opus-level intelligence and it comes up all the time.
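In the meantime this is easy to approximate with a small wrapper outside Claude Code. A minimal sketch assuming the anthropic Python SDK; the #sonnet/#opus tag convention and the exact model IDs are illustrative assumptions, not a built-in feature:

```python
import re
import anthropic

# Hypothetical shorthand -> model mapping; the IDs shown are assumptions.
MODELS = {"sonnet": "claude-sonnet-4-20250514", "opus": "claude-opus-4-20250514"}
DEFAULT = "opus"

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def send(prompt: str) -> str:
    # Pull a "#sonnet" / "#opus" tag out of the message, if present.
    match = re.search(r"#(sonnet|opus)\b", prompt)
    model_key = match.group(1) if match else DEFAULT
    cleaned = re.sub(r"#(sonnet|opus)\b", "", prompt).strip()

    response = client.messages.create(
        model=MODELS[model_key],
        max_tokens=1024,
        messages=[{"role": "user", "content": cleaned}],
    )
    return response.content[0].text

print(send("let's commit and push our changes to github #sonnet"))
```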
Well I’m ecstatic about this, purely because we can tell the “ai will kill all programming jobs by 2026” people that yes ai will remove all programming jobs but you can’t afford to pay for ai for more than two days a week and the rest of the week the ai bots will be idle and refuse to program
The 5% being ‘abusive’ limit seems high (1/20 users — that feels like an arbitrary cost cut based on customer numbers rather than objective based on costs/profit). I would have much preferred to see a scalpel applied to the abusive accounts than this — and from what I’ve seen those users should be very obvious (I’ve seen posts on Reddit with people running dozens of CC instances 24/7).
I also have to wonder how much sub-agents and MCP are adding to the usage; sub-agents are brand new and won't even be in that 95% statistic.
At the end of this email there are a lot of unknowns for me (am I in the 5%? will I get cut off? am I about to see my usage increase now that I added a few sub-agents?). That's not a good place to be as a customer.
Official Anthropic post on Reddit: https://www.reddit.com/r/ClaudeAI/comments/1mbo1sb/updating_...
I was wondering when the free lunch for these tools was going to end. All the AI stuff has been incredibly subsidized by investors, and it'll be interesting to see what the real cost is going to be when companies like Anthropic and OpenAI need to make money.
Their business model with the pro plan is to sell a dollar for 80 cents for a while to gain market share. Once they have spent the money allocated to this plan and bring it to a close, don’t expect them to resume it in response to righteous indignation: the money will be gone. See also Uber, MoviePass etc
Having 4 separate limits that all are opaque and can suddenly interrupt work is not ok.
We don't know what the limits are, what conditions change the limits dynamically, and we cannot monitor our usage towards the limits.
1. 5 hour limit
2. Overall weekly limit
3. Opus weekly limit
4. Monthly limit on number of 5 hour sessions
Other thread: https://news.ycombinator.com/item?id=44713837
Some equivocation here between legitimate 'heavy use', which is obviously relative and actually referenced in this document, and 'policy violations', which are used as the rationale/justification for it.
I cancelled my subscription.
I'll keep OpenAI; they don't even let me use CLIs with it, but at least they're honest about their offerings.
Also, their app doesn't ever tell you to go fuck off if you're Pro.
Would be great to see how our previous months usage stacked up and when, if at all, we would have been rate limited.
I'd be pretty surprised if I were to get rate limited, but I do use it a fair amount and really have no feel for where I stand relative to other users. Am I in the top 5%? How should I know?
We're all going to end up with free open source/weight models, running locally, with no usage limits. This is temporary.
Darth Viber: "I am altering the deal. Pray I don't alter it any further."
Fix the web interface to not be so slow. On older laptops other AI models run fine. Claude seems to be running locally, and I see no discussion of this.
I hit some limits in Claude Desktop fairly quick and that made it unusable. Paid for the whole year but you can't get your money back if you cancel. Such a bummer.
You can use GLM 4.5 instead. It is matching Claude 4
https://openrouter.ai/z-ai/glm-4.5
It's even possible to point Claude Code CLI to it
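For anyone who wants to try it outside Claude Code first, OpenRouter exposes an OpenAI-compatible endpoint. A minimal sketch assuming the openai Python SDK and an OPENROUTER_API_KEY environment variable, with the model slug taken from the link above:

```python
import os
from openai import OpenAI

# OpenRouter speaks the OpenAI chat-completions protocol at this base URL.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="z-ai/glm-4.5",   # slug from the openrouter.ai link above
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
)
print(response.choices[0].message.content)
```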
Overall I think this is a positive: protecting the system from being hit heavily 24/7 with multiple agents by some users might make the service more sustainable for the wider population of users.
The one thing that bugs me is visibility into how far through your usage you are. Only being told when you're close to the end means I cannot plan. I'm not expecting an exact percentage, but a few notices at intervals (e.g. halfway through) would help a lot. Not providing this kind of makes me worry they don't want us to measure. (I don't want to measure closely, but I do want a sense of where I'm at.)
To be fair - abuse is real. This also shows that "AI" is on "VC ventilators" of greed.
Waiting for higher valuations till someone pulls the trigger for acquisition.
I don't see IPOs being successful, because not everyone gets a conman like Elon as their frontman, someone who can consistently inflate the balloon with unrealistic claims for years.
Does this mean they will force heavy users to have multiple accounts rather than being able to extend an account with extra subscriptions?
Who does that benefit? Does number of accounts beat revenue in their investor reports?
I hit the 5 hour limit almost every work day (pro, not max).
It has become a kind of goal to hit it twice a day. It means I've had a productive day and can go on and eat food, touch grass, troll HN, read books.
I'm on Claude Code after hitting Cursor Pro for the month. It makes more sense to subscribe to a bunch of different tools at $20/month than $100/month on one tool that throws overloaded errors. We'll probably get more uptime with the weekly restriction.
I'm working on a coding agent for TypeScript teams, and I'm curious how people would like to consume these things generally, in terms of price and predictability. I've thought through a bunch of options and I'm not sure what's best. Right now I have a base fee and a concept of credits: the base fee ($500) includes 10k credits, and the credits are tied to PRs. It works out to 100 credits per simple PR and 200 credits per complex PR; a commit is 20 credits; 20 credits are $5; PR reviews are free.
Is this way too complicated? It feels complicated to me and I worked on it, so I presume it is?
I don't want to end up in some "you can work for X number of hours" situation that seems... not useful to engineers?
How do real world devs wanna consume this stuff and pay for it so there is some predictability and it's useful still?
Thank you. :)
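To make the tradeoff concrete, here is a rough calculator for the credit scheme described above (the base fee and credit values are the ones quoted; the example workload mix is invented):

```python
# Rough cost model for the credit scheme described above.
BASE_FEE = 500                 # USD per month, includes 10_000 credits
INCLUDED_CREDITS = 10_000
OVERAGE_PER_CREDIT = 5 / 20    # "20 credits are $5" -> $0.25 per extra credit

CREDIT_COST = {"simple_pr": 100, "complex_pr": 200, "commit": 20, "pr_review": 0}

def monthly_cost(work: dict[str, int]) -> float:
    """work maps action name -> count for the month (hypothetical mix)."""
    credits = sum(CREDIT_COST[action] * count for action, count in work.items())
    overage = max(0, credits - INCLUDED_CREDITS)
    return BASE_FEE + overage * OVERAGE_PER_CREDIT

# Example: a team shipping 40 simple PRs, 10 complex PRs, 300 commits, 25 reviews.
print(monthly_cost({"simple_pr": 40, "complex_pr": 10, "commit": 300, "pr_review": 25}))
# 4_000 + 2_000 + 6_000 = 12_000 credits -> 2_000 over -> $500 + $500 = $1,000
```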
Can I have an option to easily "fall back" to metered spend when hitting these hard limits? I wouldn't mind spending $5-10 on api credits to not interrupt my flow one day, and right now that means switching to another tool or logging out and re-logging in when the rate limit switches back.
They need to come up with a better way of detecting people who are actively breaking the ToS by using the Max plans as a kind of backdoor API key, because those users are obviously not using it in the way it was intended and abusing the system. Not sure how they would do that, but I'm guessing you could fingerprint the pattern of requests and see that some of the requests don't fit the expected pattern of genuine requests made by the Claude Code client.
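As a toy illustration of that kind of request-pattern fingerprinting (the features and thresholds below are invented for illustration; real detection would need far more signals):

```python
import statistics

def looks_like_backdoor_api_use(request_timestamps: list[float]) -> bool:
    """Toy heuristic: interactive Claude Code sessions tend to have bursty,
    irregular gaps (human think time), while scripted API-style traffic is
    steady and runs around the clock. Thresholds are invented for illustration."""
    if len(request_timestamps) < 50:
        return False
    gaps = [b - a for a, b in zip(request_timestamps, request_timestamps[1:])]
    mean_gap = statistics.mean(gaps)
    cv = statistics.stdev(gaps) / mean_gap if mean_gap else 0.0  # coefficient of variation
    hours_active = {int(ts // 3600) % 24 for ts in request_timestamps}
    # Very regular spacing plus activity in nearly every hour of the day is suspicious.
    return cv < 0.3 and len(hours_active) >= 22
```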
Anyway, I've been resigned to this for a while now (see https://x.com/doodlestein/status/1949519979629469930 ) and ready to pay more to support my usage. It was really nice while it lasted. Hopefully, it's not 5x or 10x more.
I bounce off the daily limit and have to take breaks. This is no bueno for me
I wonder if this is related to the capacity/uptime issues Anthropic has had lately. I got quite a lot of errors last week!
Hopefully they sort it out and increase limits soon. Claude Code has been a game-changer for me and has quickly become a staple of my daily workflows.
I understand, but I also will not pay a subscription fee for limited service. I canceled as soon as I got this e-mail. Too bad I signed up for an annual subscription last month.
This is also exactly why I feel this industry is sitting atop a massive bubble.
To make it easier for users to know what to expect, can you release a monitor for users to run locally?
I can understand setting limits, and I'd like to be aware of them as I'm using the service rather than get hit with a week long rate limit / lockout.
The rate limits were already very low, and now they are getting even lower. Wow. On a Max plan I can use Opus for only a few minutes per day.
I'm starting to realise the business model of LLMs has some serious flaws.
I hope that this communication is not typical of the output of Claude, but if it is, it should get a prize of sorts for vagueness and poor style. There is no way for users to find out if they are affected or not, and lots of the statements carry zero information. Not impressed, to put it mildly. Besides, they should have enforced their limits as written from day one, not allowed people to spend $10K worth of resources on a $200 plan. Now they risk even users who are not affected re-thinking their relationship with the company.
Do not love. I used opus on api billing for some time before the new larger plans came out, so I switched. I routinely hit the opus limits in an hour or two ($100 plan). There are some tasks sonnet is good with but for many it’s worse, and sometimes subtly so.
Upshot - I will probably go back to api billing and cancel. For my use cases (once or twice a week coding binges) it’s probably cheaper and definitely less frustrating.
A related question: has anyone looked into secondary markets for services like this and rate-limit "sharing"? The legal and technical aspects, etc.
Why not do tiered pricing like OpenAI api? The more you consume, the more discount, but it’s never unlimited or free, so you can prevent abuse without punishing true demand.
Could this be in part because many of the recent Chinese models (which seem great, tbh) show signs of having been distilled from one or another Claude model?
Or is that a silly idea, because distillation is unlikely to be stopped by rate limits (i.e., if distillation is a worthwhile tactic, companies that want to distill from Anthropic models will gladly spend a lot more money to do it, use many, many accounts to generate synthetic data, etc.)?
Guess that's the reason why they recently introduced agents? ;) This is not a great change if you ask me. I will have to figure out how badly this affects me and, if needed, just cancel the subscription and find an alternative.
> These changes will not be applied until the start of your next billing cycle.
If I'm on annual Pro, does that mean these won't apply to me until my annual plan renews, which is several months away?
I have been using Gemini for some time now. I switched away from Claude because I was frustrated with the rate limits and how quickly I seemed to reach them. Last week I decided to give Claude another try, so I re-signed up. I linked a personal repository I am working on, prompted it for suggestions on potential refactoring, and hit send. It immediately stopped and said this prompt would reach my limits. I immediately reconsidered my subscription.
I am pretty interested in what the person letting it run 24/7 is achieving. Is it a continuously processing workload of some kind that pipes into the model? Maybe a 24/7 chat support with high throughput? Very curious.
Maybe this is an unpopular opinion, but it seems like Anthropic has quietly 4x'd the real cost of the Pro plan. There are 168 hours in a week, and if I can only (safely) bet on 40 hours of use, that's roughly a quarter of the week, so realistically I just lost 75% of the value of the plan.
What are the reasonable local alternatives? 128 GB of RAM, a reasonably-newish proc, 12 GB of VRAM? I'm okay waiting for my machine to burn away on LLM experiments I'm running, but I don't want to simply stop my work and wake up at 3 AM to start working again.
I think they should totally do this but I think they should call it "rate-limiting" instead of "weekly limit". Seems pretty clear to me that the purpose is to avoid situations where people are running 5 background agents 24/7, not the person working during business hours normally. Reframing this makes it more clear this is about bots not users.
Who was it that used it 24/7? THIS IS WHY WE CAN’T HAVE NICE THINGS!
Why is there no link to an official blogpost? I have no way to verify that this HN post is authentic.
I think we'll see a lot more context-engineering effort soon. It is really inefficient to upload your entire codebase with pretty much every request, which is what a lot of people are doing, when in reality very few parts of it need to be in context while programming. Although big token doesn't seem to care, and often encourages this (including the editors).
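A crude sketch of what that selection could look like, picking only the files that mention identifiers from the task before building a prompt (purely illustrative; real tools use embeddings, repo maps, or AST analysis):

```python
import re
from pathlib import Path

def select_context(repo_root: str, task: str, max_files: int = 8) -> list[Path]:
    """Pick the handful of source files most relevant to the task,
    instead of shipping the whole codebase with every request."""
    keywords = {w.lower() for w in re.findall(r"[A-Za-z_]{4,}", task)}
    scored = []
    for path in Path(repo_root).rglob("*.py"):       # narrow to source files
        try:
            text = path.read_text(errors="ignore").lower()
        except OSError:
            continue
        score = sum(text.count(k) for k in keywords)  # naive keyword-hit score
        if score:
            scored.append((score, path))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [path for _, path in scored[:max_files]]

# Only these files (plus the task description) would go into the prompt.
print(select_context(".", "fix the retry logic in the upload_chunks helper"))
```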
Hm. I run Claude Code in several containers, though generally only one is active at a time. I wonder if they’ll see that as account sharing?
Tools that generate code will have a lot of competition. It's good that Anthropic is refining its pricing, but it would have been better if users got to know their exact usage and could apply their own controls.
Frustrated users, who are probably the ones using the tools the most, will try other code-generation tools.
We will see, but I hit the limit multiple times a day, so I am a bit scared this will mean looking for alternatives.
Will this limit also count usage in Claude Chat?
> affecting less than 5% of users
Notice they didn't say 5% of Max users, or 5% of paid users. To take it to the extreme: if the free:paid:max ratio were 400:20:1, then "5% of users" could mean 100% of a tier. I can't tell what they're saying.
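Making that extreme example concrete (the 400:20:1 split is the parent's hypothetical, not real numbers):

```python
# Hypothetical tier ratio from the comment above: free : paid : max = 400 : 20 : 1
free, paid, max_tier = 400, 20, 1
total = free + paid + max_tier            # 421 users

affected = 0.05 * total                   # "less than 5% of users"
print(round(affected))                    # ~21 users
print(paid + max_tier)                    # 21 -> i.e. potentially everyone who pays
```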
These are limits for the $200/mo plan?
This is like when the all-you-can-eat buffet tells you you're only allowed to go the buffet line once.
I am a Max 20x subscriber, and I'm not unhappy that Anthropic is putting this in place.
Claude is vital to me and I want it to be a sustainable business. I won't hit these limits myself, and I'm saving many times what I would have spent in API costs - easily among the best money I've ever spent.
I'm middle aged, spending significant time on a hobby project which may or may not have commercial goals (undecided). It required long hours even with AI, but with Claude Code I am spending more time with family and in sports. If anyone from Anthropic is reading this, I wanted to say thanks.
Zero surprise. Some of you were really going nuts out there.
Then again, to scale is human
These resource constraints create an extremely strong incentive for customers to try all the competitors, and that is what makes for the best products/services. When there are no network effects, it's a fight over algorithms, compute, and fundraising ability, and we actually get real competition instead of natural monopolies.
This is the most exciting business fight of our time and I’m chomping popcorn with glee.
I think Anthropic is grossly overestimating the addressable market of a CLI tool, while also falsely believing they have a durable lead right now in their model, which I’m not so sure of. Also their treatment of their partners has been…shall we say…questionable. These are huge missteps at a time they should be instead hitting the gas harder imo.
They’re getting cocky. Would love to see a competitor to swoop in and eat their lunch.
> affecting less than 5% of users based on current usage patterns.
How about adding a ToS clause to prevent abuse? Wouldn't that be better than a statement that negatively affects the other 95%?
As part of the 95% here, I'm totally cool with this. I'm just a Pro plan user, but holy shit, I hit problems with their service constantly. Claude is my preferred LLM currently, but sometimes during a normal 9-5 I can't use it at all due to outages, which really gets in the way while developing an MCP server.
Anthropic seems like it needs to beef up its infra as well (glad they called this out), but the insane over-use can only be hurting that.
I just can't cosign the waves of hate, which all hinge on them adding limits to stop people from doing things like running up $1k bills on a $100 plan. Can we not agree that that's abuse? If we're harping on the term "unlimited", I get the sentiment, but it's absolutely abuse, and getting to the point where you're part of the 5% likely indicates your usage is abusive. I'm sure some innocent usage will be caught in this, but it's nonsense to get mad at a business for refusing to take a bath on the chunk of users that are annihilating the service.
I wish they would allow some concept of rollover credits, even if only fractional, for cases where someone has to be away for a few days while the clock keeps ticking with no usage.
I feel like I am constantly hitting the 5-hour limits without doing very much. I will quit using Claude Code outright if my usage is gone by Tuesday.
Okay, but when will we get visibility into this beyond "you're at 50% of the limit"? If you're going to introduce week-long limits, transparency into usage is critical.
I'm the other way around: my usage is below the floor where renewing is worth bothering with. Work provides AI, and I don't have too much time to play on the weekend.
I was actually going to $ign up this week. Now I have to study everything before committing.
Apparently people are consistently getting thousands of dollars worth of tokens for their $200/mo sub so this was just obviously unsustainable for Anthropic.
This is bad.
I switched to Claude Code because of Cursor’s monthly limits.
If I run out of my ability to use Claude Code, I’m going to just switch back to Cursor and stay there. I’m sick of these games.
If you think it’s OK, then make Anthropic dogfood it: put every employee on the Pro plan, keep telling them they must use it for their work but can’t upgrade, and see how they like it.
Anthropic has plans such as $150/user and $150 for 5 users but with fewer hours per user. I could not work out what teams of 2, or heavy-usage teams of 5, are supposed to pick.
There are people that will always try to steal, but there may also be those that just don't understand their pricing.
Also, some people keep going forever in the same session, which maxes it out, since the whole history is sent with every request. Some prompting about things like that ("your thread has gotten long...") would probably save quite a bit of usage and prevent innocent users from getting locked out for a week.
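A toy illustration of why one never-ending session burns quota so fast (the per-turn numbers are made up; the point is just that resending the whole history makes cumulative input tokens grow roughly quadratically with the number of turns):

```python
# Toy model: each turn adds ~1,000 tokens, and the full history is resent
# as input on every request.
turns = 50
tokens_per_turn = 1_000

one_long_session = sum(t * tokens_per_turn for t in range(1, turns + 1))
print(f"{one_long_session:,} input tokens in one long session")   # 1,275,000

# Same work, but starting a fresh (or compacted) session every 10 turns:
fresh_every_10 = (turns // 10) * sum(t * tokens_per_turn for t in range(1, 11))
print(f"{fresh_every_10:,} input tokens with periodic restarts")   # 275,000
```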
Am i missing something? Why don’t people just add an api key to Claude…are the subscription models that much better?
I don't get the idea that using more compute or running a continuous agent makes someone a "power user". Consuming more != power user, lol.
This was bound to happen at some point, but it probably won't affect most users, net-net. I think it's pretty useful for a variety of tasks, but those tend to fall into a rather narrow category (boilerplate, simple UI change requests, simple doc-strings/summaries), and there is only so much of that work required in a month. I certainly won't be cancelling my plan over this change, but so far I also haven't seen a reason to upgrade from the hobbyist-style $20/mo plan. When I do run into usage limits, it's usually already at the end of the day, or I just pivot to another task where it isn't helpful.
Yes, yes, let the normies who wait for their pizza to cook while running one prompt at a time "eat", so to speak. Mwuahahah. #deathtopowerusers
LLM token and usage-limit anxiety ought to pair nicely with battery and range anxiety. All part of a head-healthy diet.
You can always create a second account, no?
Vibe pricing. That's all this is. "Pay us $200/mo and get... access". There's no way to get a real usage meter (ccusage doesn't count). I want an Anthropic dashboard showing "you've used x% of your paid quota". Instead we get vibe usage. Vibe pricing. "Hey pay us money and we'll give you some level of access but like you won't know what, but don't worry only 5% of our users will trip the switches" bullshit. Someone else in this thread nailed it:
> sounds like it affects pretty much everyone who got some value out of the tool
Feels that way.
But compared to paying the so-called API pricing (hello ccusage) Claude Code Max is still a steal. I'm expecting to have to run two CC Max plans from August onwards.
$400/mo here we come. To the moon yo.
Seems like their business plan is unsustainable. What's a sustainable cost model?
Say an 8xB200 server costs $500,000 with 3-year depreciation, so ~$166k/year per server. Say 10 people share that server full time, so each needs to pay ~$16.7k/year to break even, i.e. a ~$1,389/month subscription at 10 users per server (see the sketch below).
If they get it down to 100 users per server (doubt it), then they can break even at ~$139/month.
And all of this is just server costs...
Seems AI coding agents should be a lot more expensive going forward. I'm personally using 3-4 agents in parallel as well..
Still, it's a great problem for Anthropic to have. "Stop using our products so much or we'll raise prices!"
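The sketch referenced above, with all of the commenter's assumptions in one place ($500k server, 3-year straight-line depreciation, server cost only; none of these are Anthropic's actual numbers):

```python
# Break-even monthly price per user under the comment's assumptions.
server_cost = 500_000          # assumed 8xB200 server price
years = 3                      # assumed depreciation period
annual_cost = server_cost / years          # ~$166,667/year

for users_per_server in (10, 100):
    monthly = annual_cost / users_per_server / 12
    print(f"{users_per_server:>3} users/server -> ${monthly:,.0f}/month to break even")
# 10 users/server -> $1,389/month
# 100 users/server -> $139/month
```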
So "we are not touching the rate limits for those who don't reach them"... that's passive-aggressive behavior, in my opinion.
Will there be a native way to track in Claude Code how close we are to hitting those weekly rate limits?
Gosh, maybe there's something I am not understanding, but 24/7? Wow.
PS: Ah! Of course. Agents...
How do we know when we are close to hitting these limits? Will there be a way to check that?
I thought Claude code was pay as you go!?
The pitfalls of being beholden to 3rd parties for hardware.
How long until Anthropic, OpenAI, etc introduce surge pricing for LLM usage?
This explanation is so vague that it’s hard to take seriously. Anthropic has full access to usage data—they could easily identify abusive users and throttle them specifically. But they don’t. Why? Because it was never really about stopping abuse. The truth is: Anthropic can’t handle the traffic and growth, and now they’re looking for a convenient excuse to limit access and point fingers at so-called “heavy users.”
The problem is, we have no visibility into how much we’ve actually used or how much quota we have left. So we’ll just get throttled without warning—regularly. And not because we’re truly heavy users, but because that’s the easiest lever to pull.
And I suspect many of us paying $200 a month will be left wondering, “Did I use it too much? Is this my fault?” when in reality, it never was.
I think it’s hilarious they roll this out right after subagent introduction.
This is fantastic news. They're burning through too much cash too fast. Sooner or later they're going to have to charge more money, at which point businesses will balk at the price and the AI hype cycle will come to an end. I can't wait.
I think this was set to happen. Welp.
I feel rug pull after rug pull ($10->$20, hourly quotas, weekly quotas) because they can't scale, and they aggressively focus on the $200+ customers while limiting the lower tiers to maximize profit.
If you're paying per-month why are the limits weekly?
Why don’t they just offer a $500/m plan?
If you wanted a more equitable experience for all, you wouldn't just limit the high-end users; you would also return money to the low-end users.
Charging a low flat fee per use and still warning when certain limits are hit is possible, but it's market segmentation not to do it: just charge a flat fee, then lop off the high end, and you maximize profit.
Let's start by stating that Opus 4 + Sonnet 4 are a gift to humanity. Or at least to developers.
The two models are not just the best models for coding at this point (in areas like UX/UI and following instructions they are unmatched); they come packaged with possibly the best command-line tool today.
They invite developers to use them a lot. Yet for the first time ever, I feel I cannot fully rely on the tool, and I feel a lot of pressure when using it. Not because I don't want to pay, but because the options are either:
> A) Pay $200 and be constantly warned by the system that you are close to hitting your quota (very bad UX)
> B) Pay $$$??? via the API and see how your bill grows to +$2k per month (this is me this month via Cursor)
I guess Anthropic has the great dilemma now: should they make the models more efficient to use and lower the prices to increase limits and boost usage, OR should they cash in on their cash cows while they can?
I am pretty sure no other model comes even close in terms of developer-hours at this point. Gemini would be my 2nd best guess, but Gemini is still lagging behind Claude and not that good at agentic workloads.
Everybody freaking out about this should just pay for API access like an adult
Guess they ran into the usage limits themselves when they worked on the messaging in Claude Code: "Claude usage limit reached. Your limit will reset at 8pm (UTC)"
Why not use the user's timezone?
Well, those who need more than the limits can register a second account and pay for a second subscription... Not the end of the world.
"Most Pro users can expect 40-80 hours of Sonnet 4 within their weekly rate limits."
Deserved, for that 5%; some people lost their minds and abused the service.
A bunch of words to just say:
We're going to punish the 5% that are using our service too much.
We are the 95%!!
It was always too good to last. I assume this is the end of the viability of the fixed price options.
Dial back the vibe :)
Lol enshittification begins [continues].
For the sake of saying something positive on HN, Claude Code is great. I haven't run into any limits yet. My code is quite minimalist and the output of Claude Code is also minimalist, maybe that's why.
If you work on some overengineered codebase, it will produce overengineered code; this requires more tokens.
I'm guessing less than 5% of the users are just letting Claude Code run in an autonomous loop making slop. I tried this too: and Opus 4 isn't good enough to run autonomously yet. The Rube Goldberg machine needs to be carefully calibrated.
Sincerely enjoy and appreciate Claude, feedback based on that:
- It would be nice to have a way to know or infer, percentage-wise, how much capacity a user is currently using (rate of usage) and how much is left, compared to the available capacity. Being scared to use something is different from being mindful about it.
- Since usage can feel a little subjective/relative (simple things might use more tokens, or fewer, etc.) and depends on things beyond the user's prompts alone, it would be nice to know how much capacity is left, both on the current model and over the coming month, so users can learn.
- If lower "capacity" usage rates are available at night versus the day, or just at slower times, it would be worth knowing. It would help users who would like to plan around it, compared to people who might just be making the most of it.
I only began using Claude because OpenAI was fumbling in my use cases. Whenever their public facing offering was rate limiting, or experiencing congestion, or having UX issues like their persistent "network error" in the middle of delivering a response, then I would go to Claude.
You having the same issue kills the point of using you.
Why not just have a usage-based pricing system that people can opt in to so that they just pay-as-they-go once they hit these plan limits?
It makes no sense to me that you would tell customers “no”. Make it easy for them to give you more money.
Play it fair, and it will be fine
Claude's vague limits are literally why I'm not a subscriber.
Let the normies who run a single prompt at a time and wait for it to finish "eat", so to speak. Mwuahahaha.
leveling the playing field i see lol
Ah, the beauty of price discovery.
Economists are having a field day.
Cancelled my subscription.
oh no, anyway!
wow, a team that has one nine of availability and trending downwards fast is relieving pressure. big surprise!
the party is over
Afraid I'm in the 5%. Not doing anything nefarious, just lots of parallel usage; no scripting or overnight runs or anything.
I just found ccusage, which is very helpful. I wish I could get this straight from the source; I don't know if I can trust it... According to it, I've been spending more than my $200 monthly subscription in token value basically daily, around 30x the supposed cost.
I've been trying to learn how to make Claude Code use Opus for planning and Sonnet for execution automatically; if anyone has a good example of this, please share.
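I'm not aware of a built-in Claude Code setting that does this split automatically, but outside of CC you can wire it up yourself with the Anthropic Python SDK; here is a rough sketch (the model IDs below are placeholders; substitute whatever Opus 4 / Sonnet 4 IDs your account actually lists):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder model IDs; replace with the Opus 4 / Sonnet 4 IDs you have access to.
PLANNER_MODEL = "claude-opus-4-20250514"
EXECUTOR_MODEL = "claude-sonnet-4-20250514"

def plan(task: str) -> str:
    """Ask the stronger (pricier) model for a plan only, no code."""
    resp = client.messages.create(
        model=PLANNER_MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content":
                   f"Write a short, numbered implementation plan. Do not write code.\n\nTask: {task}"}],
    )
    return resp.content[0].text

def execute(task: str, plan_text: str) -> str:
    """Hand the plan to the cheaper model to actually write the code."""
    resp = client.messages.create(
        model=EXECUTOR_MODEL,
        max_tokens=4096,
        messages=[{"role": "user", "content":
                   f"Task: {task}\n\nFollow this plan exactly:\n{plan_text}"}],
    )
    return resp.content[0].text

if __name__ == "__main__":
    task = "Add retry-with-backoff to our HTTP client wrapper"
    print(execute(task, plan(task)))
```

Inside Claude Code itself, the closest thing I know of is manually switching models with /model between a planning pass and an implementation pass.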
Can we PLEASE fix the bug in VS Code where the terminal occasionally scrolls out of control and VS Code crashes? It is very painful and we have to start the context all over again. This happens at least 1x per day.
Tangential: Is there a similar service we can use in the CLI, a replacement for CC? I like Cursor, and I pay for both Cursor and CC, but I live in the terminal (tmux, nvim, claude code, lazygit, yazi) and prefer an agentic coding experience there. But CC has deteriorated so much in the past weeks that I constantly use repomix to compress whole projects and ask o3 for help, because CC just can't solve tasks it previously would have solved in a single shot.
Whenever a marketing person uses the word "unlimited", they mean: "limited".
I'm sure it's way more than 5%; ISPs pulled the same thing with bandwidth caps to shame people.
Part of the reason there is so much usage is that using Claude Code is like a slot machine: SOMETIMES it's right, but most times it needs to rework what it did, which is convenient for them. Plus, their pricing is anything but transparent about how much usage you actually get.
I'll just go back to ChatGPT. This is not worth the headache.
I cancelled my subscription. The enshittification is hitting this space massively already.
I'm really tired of all these AI players just winging it.
Can someone please find a conservative, sustainable business model and stick with it for a few months, instead of this MVP moving-target BS?
I saw this one coming. It's going to make more and more people switch over to Gemini.