Inference (i.e. running an already trained LLM) requires top-end hardware today, but over time that hardware will become commodity hardware. That means there is only a small period (perhaps the next decade) in which running an LLM as a service, rather than locally, even makes sense. LLM services will likely go away not because they are too expensive, but because they are too cheap.