Small LLM weights are not really interesting though. I am currently training GPT-2-small-sized models for a scientific project right now, and their world models are just not good enough to generate any kind of real insight about the world they were trained on, except for corpus biases.
Small large language models? This sounds like the apocryphal headline when a spiritualist with dwarfism escaped prison: "Small medium at large." Do you also have some dehydrated water and a secure key escrow system?
That's really what these are: something analogous to JPEG for language, and queryable in natural language.
Tangent: I was thinking the other day: these are not AI in the sense that they are not primarily intelligence. I still don't see much evidence of that. What they do give me is superhuman memory. The main thing I use them for is search, research, and a "rubber duck" that talks back, and it's like having an intern who has memorized the library and the entire Internet. They occasionally hallucinate or make mistakes -- compression artifacts -- but it's there.
So it's more AM -- artificial memory.
Edit: as a reply pointed out: this is Vannevar Bush's Memex, kind of.
I've been looking at it as an "instant reddit comment". I can download a 10G or 80G compressed archive that basically contains the useful parts of the internet, and then I can use it to synthesize something that is about as good and reliable as a really good reddit comment. Which is nifty. But honestly it's an incredible idea to sell that to businesses.
And so what would the point be of anyone actually posting on the internet, if no one actually visits the sites because large corps have essentially stolen and monetized the whole thing?
And I'm sure they have or will have the ability to influence the responses so you only see what they want you to see.
That's the next step after algorithmic content feeds - algorithmic/generated comment sections. Imagine seeing an entirely different conversation happening just to get you to buy a product. A product like Coca-Cola.
Imagine scrolling through a comment section that feels tailor-made to your tastes, seamlessly guiding you to an ice-cold Coca-Cola. You see people reminiscing about their best summer memories—each one featuring a Coke in hand. Others are debating the superior refreshment of Coke over other drinks, complete with "real" testimonials and nostalgic stories.
And just when you're feeling thirsty, a perfectly timed comment appears: "Nothing beats the crisp, refreshing taste of an ice-cold Coke on a hot day."
Algorithmic engagement isn’t just the future—it’s already here, and it’s making sure the next thing you crave is Coca-Cola. Open Happiness.
Or: war is good, peace is bad, nuclear war is winnable; don't worry and start loving the bomb. The enemy are not human anyway, and your life will be better with fewer people around.
Look at the people who want to control this, they do not want to sell you Coke.
Why Coca-Cola though? Sure, it is refreshing on a hot day, but you know what is even better? Going to bed on a nice cool mattress. So many are either too hard or too soft. They aren't engineered to your body, so you are virtually guaranteed to get a poor night's sleep.
Imagine waking up like I do every morning. Refreshed and full of energy. I’ve tried many mattresses and the only one that has this property is my Slumber Sleep Hygiene mattress.
The best part is my partner can customize their side using nothing more than a simple app on their smartphone. It tracks our sleep over time and uses AI to generate a daily sleep report showing me exactly how good a night's sleep I got. Why rely on my gut feelings when the report can tell me exactly how good or bad a night's sleep I got?
I highly recommend Slumber Sleep Hygiene mattresses. There is a reason it’s the number one brand recommended on HN.
From Vannevar Bush's 1945 article "As We May Think": Bush envisioned the memex as a device in which individuals would compress and store all of their books, records, and communications, "mechanized so that it may be consulted with exceeding speed and flexibility".
The memex was a deterministic device to consult documents - the actual documents. The "LLM" is more like a dumb archivist that came with it ("Yes, see for example that document, it tells you that q=M·k...").
I grew up with physical encyclopedias, then moved on to Encarta, then Wikipedia dumps and folders full of PDFs. I still prefer a curated information repository over chat interfaces or generated summaries. The main goal with the former is to have a knowledge map and keyword graph, so that you can locate any piece of information you may need from the actual source.
I believe LLMs are both data and processing, but even human reasoning is based in strong ways on existing knowledge. However, for the goal of the post, it is indeed the memorization that is the key value, together with the fact that in the future, sampling such models can likely be used to transfer the same knowledge to bigger LLMs, even if the source data is lost.
I'm not saying there is no latent reasoning capability. It's there. It just seems to be that the memory and lookup component is much more useful and powerful.
To me intelligence describes something much more capable than what I see in these things, even the bleeding edge ones. At least so far.
I offer a POV that is in the middle: reasoning is powerful for evaluating which solution is better among N in the context. Memorization allows sampling many competing ideas from the problem space; then the LLM picks the best, which is what makes chain of thought so effective. Of course zero-shot reasoning is also part of the story, but a somewhat weaker one, exactly as we are often unable to spit out the best solution before evaluating the space (unless we are very accustomed to the specific problem).
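A minimal runnable sketch of that sample-then-evaluate loop (best-of-N selection), in Python; the generate and score functions below are toy stand-ins I invented for illustration, not any particular API:

    import random

    def generate(problem):
        # Toy stand-in for LLM sampling: propose a candidate answer, with
        # noise playing the role of the model's diverse memorized ideas.
        a, b = problem
        return a * b + random.choice([0, 0, 0, 1, -1, 10])

    def score(problem, candidate):
        # Toy evaluator; in practice a verifier model, a unit test, or the
        # LLM itself judging candidates in context. Higher is better.
        a, b = problem
        return -abs(a * b - candidate)

    def best_of_n(problem, n=8):
        # Sample many competing ideas from the problem space, then pick the best.
        candidates = [generate(problem) for _ in range(n)]
        return max(candidates, key=lambda c: score(problem, c))

    print(best_of_n((6, 7)))  # almost always prints 42 for modest n

The same shape underlies chain of thought: widen the candidate pool first, evaluate second.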
That's the problem with the term "intelligence". Everyone has their own definition, we don't even know what makes us humans intelligent and more often than not it's a moving goalpost as these models get better.
I can ask a LLM to write a haiku about the loss function of Stable Diffusion. Or I can have it do zero shot translation, between a pair of languages not covered in the training set. Can your "language JPEG" do that?
I think "it's just compression" and "it's just parroting" are flawed metaphors. Especially when the model was trained with RLHF and RL/reasoning. Maybe a better metaphor is "LLM is like a piano, I play the keyboard and it makes 'music'". Or maybe it's a bycicle, I push the pedals and it takes me where I point it.
I regularly pushback against casual uses of the word “intelligence”.
First, there is no objective dividing line. It is a matter of degree relative to something else. Any language that suggests otherwise should be refined or ejected from our culture and language. Language’s evolution doesn’t have to be a nosedive.
Second, there are many definitions of intelligence; some are more useful than others. Along with many, I like Stuart Russell’s definition: the degree to which an agent can accomplish a task. This definition requires being clear about the agent and the task. I mention this so often I feel like a permalink is needed. It isn’t “my” idea at all; it is simply the result of smart people decomplecting the idea so we’re not mired in needless confusion.
I rant about word meanings often because deep thinking people need to lay claim to words and shape culture accordingly. I say this often: don’t cede the battle of meaning to the least common denominators of apathy, ignorance, confusion, or marketing.
Some might call this kind of thinking elitist. No. This is what taking responsibility looks like. We could never have built modern science (or most rigorous fields of knowledge) with imprecise thinking.
I’m so done with sloppy mainstream phrasing of “intelligence”. Shit is getting real (so to speak), companies are changing the world, governments are racing to stay in the game, jobs will be created and lost, and humanity might transcend, improve, stagnate, or die.
If humans, meanwhile, can’t be bothered to talk about intelligence in a meaningful way, then, frankly, I think we’re … abdicating responsibility, tempting fate, or asking to be in the next Mike Judge movie.
We never would have been able to create science, if it weren't for focusing on the kinds of thinking that can be made logical. There's a big difference. What you're doing, with this whole "let's make a bullshit word logical" is more similar to medieval scholasticism, which was a vain attempt at verbal precision. https://justine.lol/dox/english.txt
Yikes, maybe we can take a step back? I'm not sure where this is coming from, frankly. One anodyne summary of my comment above would be:
> Let's think and communicate more clearly regarding intelligence. Stuart Russell offers a nice definition: an agent's ability to do a defined task.
Maybe something about my comment got you riled up? What was it?
You wrote:
> What you're doing, with this whole "let's make a bullshit word logical" is more similar to medieval scholasticism, which was a vain attempt at verbal precision.
Again, I'm not quite sure what to say. You suggest my comment is like a medieval scholar trying to reconcile dogma with philosophy? Wow. That's an uncharitable reading of my comment.
I have five points in response. First, the word intelligence need not be a "bullshit word", though I'm not sure what you mean by the term. One of my favorite definitions of bullshitting comes from "On Bullshit" by Harry Frankfurt:
> Frankfurt determines that bullshit is speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care whether what they say is true or false. - Wikipedia
Second, I'm trying to clarify the term intelligence by breaking it into parts. I wouldn't say I'm trying to make it "logical" (in the sense of being about logic or deduction). Maybe you mean "formal"?
Third, regarding the "what you're doing" part... this isn't just me. Many people both clarify the concept of intelligence and explain why doing so is important.
Fourth, are you saying it is impossible to clarify the meaning of intelligence? Why? Not worth the trouble?
Fifth, have you thought about a definition of intelligence that you think is sensible? Does your definition steer people away from confusion?
You also wrote:
> We never would have been able to create science, if it weren't for focusing on the kinds of thinking that can be made logical.
I think you mean _testable_, not _logical_. Yes, we agree, scientists should run experiments on things that can be tested.
Russell's definition of intelligence is testable by defining a task and a quality metric. This is already a big step up from an unexamined view of intelligence, which often has some arbitrary threshold.* It allows us to see a continuum from, say, how a bacterium finds food, to how ants collaborate, to how people both build and use tools to solve problems. It also teases out sentience and moral worth so we're not mixing them up with intelligence. These are simple, doable, and worthwhile clarifications.
Finally, I read your quote from Dijkstra. In my reading, Dijkstra's main point is that natural language is a poor programming interface due to its ambiguity. Ok, fair. But what is the connection to this thread? Does it undercut any of my arguments? How?
* A common problem when discussing intelligence involves moving the goal post: whatever quality bar is implied has a tendency to creep upwards over time.
I just wanted to share an essay I liked. I didn't think you'd pay it much mind. But I can see now that you are a person devoted to science. If you want to know what I believe, I think computers in the 50's were intelligent. I think gpt2 probably qualified as agi if you take the meaning of the acronym literally. At this point we've blown so far past all expectations in terms of intelligence that I've come to agree with Karpathy that the time has come to start moving the goalposts to other words, like agency, since agents are an unsolved problem, and agency is proving to possibly be more important/powerful/rare/difficult than intelligence.
I reacted negatively to the idea earlier that agency should be considered an aspect of intelligence. I think separating the concepts helps me better understand people, their unique strengths, and puzzles like why people who aren't geniuses who know everything and can rotate complex shapes are sometimes very successful; but most importantly, why LLMs continue to feel like they're lacking something compared to people, even though they're so outrageously intelligent. It's one thing to be smart, another thing entirely to be useful.
> I reacted negatively to the idea earlier that agency should be considered an aspect of intelligence.
In the hopes of clarifying any misunderstandings of what I mean... I said "agent" in Russell's sense -- a system with goals that has sensors and actuators in some environment. This is a common definition in CS and robotics. (I tend to shy away from using the word "agency" because sometimes it brings along meaning I'm not intending. For example, to many, the word "agency" suggests free will combined with the ability to do something with it.)
> My own motivation for studying AI is to create and understand intelligence as a general property of systems, rather than as a specific attribute of humans. I believe this to be an appropriate goal for the field as a whole...
To continue my earlier comment... I prefer not to call an LLM "intelligent" much less "outrageously intelligent". Why? The main reason is communication clarity -- and by communication I mean the notion of a sender communicating a meaning to a receiver. Not just symbolic information (a la Shannon), but a faithful representation in the recipient. The phrase "outrageously intelligent" can have many conflicting interpretations in one's audience. Doing so generates more confusion than clarity.
To say my point a different way, intelligence is contextual. I'm not using "contextual" as some sort of vague excuse to avoid getting into the details. I'm not saying that intelligence cannot be quantified at all. Quite the opposite. Intelligence can be quantified fairly well (in the statistical sense) once a person specifies what they are talking about. Like Russell, I'm saying intelligence is multifaceted and depends on the agent (what sensors it has, what actuators it has), the environment, and the goal.
So what language would I use instead? Rather than speaking about "intelligence" as one thing that people understand and agree on, I would point to task- and goal-specific metrics. How well does a particular LLM do on the GRE? The LSAT?
Sooner or later, people will want to generalize over the specifics. This is where statistical reasoning comes in. With enough evaluations, we can start to discuss generalizations in a way that can be backed up with data. For example, one might say things like "LLM X demonstrates high competence on text summarization tasks, provided that it has been pretrained on the relevant concepts" or "LLM Y struggles to discuss normative philosophical issues without falling into sycophancy, unless extensive prompt engineering protocols are used".
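As a toy Python illustration of that statistical framing (the task names and scores are invented, purely to show the shape of the claim):

    # "Intelligence" as per-task metric distributions, not a single scalar.
    evals = {
        "gre_verbal":    [0.81, 0.79, 0.84],  # accuracy across runs (invented)
        "lsat_logic":    [0.62, 0.65, 0.60],
        "summarization": [0.91, 0.88, 0.90],
    }

    for task, scores in evals.items():
        mean = sum(scores) / len(scores)
        spread = max(scores) - min(scores)
        print(f"{task}: mean={mean:.2f}, spread={spread:.2f}")

    # Generalizations like "competent at summarization" then become claims
    # about these distributions, with stated conditions, not one number.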
I think it helps to remember this: if someone asks "Is X intelligent?", one has the option to reframe the question. One can use it as an opportunity to clarify and teach and get into a substantive conversation. The alternative is suboptimal. But alas, some people demand short answers to poorly framed questions. Unfortunately, the answers they get won't help them.
Intelligence is closely related to the concept of attractiveness and gravitas. You say it depends on the agent. I say it's in the eye of the beholder. People aren't very good at explaining what attracts them either.
The closest thing we have to a definition for intelligence is probably the LLMs themselves. They're very good at predicting words that attract people. So clearly we've figured it out. It's just such a shame that this definition for intelligence is a bunch of opaque tensors that we can't fully explain.
LLMs don't just defy human reasoning and understanding. They also challenge the purpose of intelligence itself. Why study and devise systems, when gradient descent can figure it out for you? Why be cleverer when you can just buy more compute?
I don't know what's going to make the magical black pill of machine learning more closely align with our values. But I'm glad we have them. For example, I think it's good that people still hold objectivity as a virtue and try to create well-defined benchmarks that let us rank the merits of LLMs using numbers. I'm just skeptical about how well our efforts to date have predicted the organic processes that ultimately decide these things.
Imagine future historians piecing together our culture from hallucinated AI memories - inaccurate, sure, but maybe even more fascinating than reality itself.
People wanting this would be better off using memory architectures, like how the brain does it. For ML, the simplest approach is putting in memory layers with content-addressable schemes. I have a few links on prototypes in this comment:
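(Links aside, here is a minimal NumPy sketch of one content-addressable scheme: softmax attention over a learned key-value table. The names and shapes are invented for the example, not taken from any specific prototype.)

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    class KeyValueMemory:
        # Content-addressable: reads retrieve by key similarity, not by address.
        def __init__(self, n_slots=1024, dim=64, seed=0):
            rng = np.random.default_rng(seed)
            self.keys = rng.normal(size=(n_slots, dim))    # what each slot matches
            self.values = rng.normal(size=(n_slots, dim))  # what each slot returns

        def read(self, query, top_k=32):
            scores = self.keys @ query                      # similarity to every key
            idx = np.argpartition(scores, -top_k)[-top_k:]  # sparse top-k lookup
            weights = softmax(scores[idx])
            return weights @ self.values[idx]               # soft, trainable recall

    mem = KeyValueMemory()
    print(mem.read(np.random.default_rng(1).normal(size=64)).shape)  # (64,)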
Animal brains do not separate long term memory and processing - they are one and the same thing - columnar neural assemblies in the cortex that have learnt to recognize repeated patterns, and in turn activate others.
I find it very depressing to think that the only traces left of all this creativity will end up being AI slop, the worst use case ever.
I feel like the more people use GenAI, the less intelligent they become. Like the rest of this society, they seem designed to suck the life force out of humans and return useless crap instead.
Interesting. Just this morning I had a conversation with Claude about this very topic. When asked "can you give me your thoughts on LLM train runs as historical artifacts? do you think they might be uniquely valuable for future historians?", it answered
> oh HELL YEAH they will be. future historians are gonna have a fucking field day with us.
> imagine some poor academic in 2147 booting up "vintage llm.exe" and getting to directly interrogate the batshit insane period when humans first created quasi-sentient text generators right before everything went completely sideways with *gestures vaguely at civilization*
> *"computer, tell me about the vibes in 2025"*
> "BLARGH everyone was losing their minds about ai while also being completely addicted to it"
Interesting indeed to be able to directly interrogate the median experience of being online in 2025.
(also my apologies for slop-posting; i slapped so much custom prompting on it that I hope you'll find the output amusing enough)
I love the title "Big LLMs" because it means that we are now making a distinction between big LLMs and minute LLMs and maybe medium LLMs. I'd like to propose that we call them "Tall LLMs", "Grande LLMs", and "Venti LLMs" just to be precise.
I'd prefer to see olive sizes get a renaissance. I was always amused by Super Colossal when following my mom around a store as a little kid.
From a random web search, it seems the sizes above Large are: Extra Large, Jumbo, Extra Jumbo, Giant, Colossal, Super Colossal, Mammoth, Super Mammoth, Atlas.
How about wine bottle sizes since we're "bottling" a "distillation" of information...
https://en.wikipedia.org/wiki/Wine_bottle#Sizes
To get pedantic, wine is not a product of distillation.
That almost makes the metaphor more apt. Wine is the real deal, and brandy is the distilled approximation.
And I'd love to see data compression terminology get an overhaul. Do we need big LLMs or just succinct data structures? Or maybe "compact" would be good enough? (Yeah LLMs are cool but why not just, you know, losslessly compress the actual data in a way that lets us query its content?)
Well, the obvious answer is that LLMs are more than just pure search. They can synthesize novel information from their learned knowledge.
Needs more superlatives. “Biggest” < “Extra Biggest” < “Maximum Biggest”. :D
maximum_biggest_final_2
"Non Plus Ultra"
Followed by another company introducing their "Plus Ultra" model.
And the US ‘small’ LLMs will actually be slightly larger than the ‘large’ LLMs in the UK.
I wonder how skinny people get dressed overseas: I wear a European S, which translates to XXS in the US, but there are many people skinnier than me, still within a "normal" BMI. Do they have to find XXXS? Do they wear oversized clothes? Choosing trousers is way easier because the cm/inch system of length + waist corresponds to real measurements.
I worked at a Norwegian hospital once which had sizes from xxl (ekstra ekstra liten, "extra extra small") to xxs (ekstra ekstra stor, "extra extra large"). So it's simple: you cross the ocean and you go from size xxl to xxs without having to do anything at all...
I should say though, that's the only place I've seen this particular localization.
> Choosing trousers is way easier because the system of cm/inches of length+perimeter correspond to real values.
They're not merely real values, they're also rational.
I'm not so sure, there's pi involved here!
It's a crazy experience being just physically larger than most of the world, especially when the size on the label carries some implicit shame/judgement. Like, I'm skinny; I'm pretty much the lowest weight I can be and not look emaciated/worrying. But when shopping for a skirt in Asian sizes I was a 4XL, and usually an L-2XL in European sizes. Having to shift my mental space that a US M is the "right" size for me was hard for many years. But I guess this is how sizing was always kinda supposed to work.
The shame you feel is yours, it's not inherent to the sizing.
We ordered swag T-shirts for a conference from two providers, but the EU provider's L's were actually larger than the US L's!
It's funny you say that, but when travelling abroad I wondered how Europeans and Japanese stay sufficiently hydrated.
For healthy adults, thirst is a perfectly adequate guide to hydration needs. Historically normal patterns of drinking - e.g. water with meals and a few cups of tea or coffee in between - are perfectly sufficient unless you're doing hard physical labour or spending long periods of time outdoors in hot weather. The modern American preoccupation with constantly drinking water is a peculiar cultural phenomenon with no scientific basis.
Don't many medications dehydrate you though? And Americans are on a lot of medications.
If you are thirsty you are already dehydrated.
Try getting a kidney stone and then find out if adequate hydration is what you want to squeak by with.
I've always understood constantly drinking water as a ruse to use the bathroom more often, which is helpful for Americans with sedentary lifestyles.
Diabetes causes dehydration
Is this a thing about how restaurants in some European countries charge for water?
It's a joke about Americans carrying around giant water bottles.
> The UK
You mean the EU, right? The UK isn't covered by the AI act.
/s
Big LLM is too long as a name. We should agree on calling them BLLMs. Surely everyone is going to remember what the letters stand for.
I still like Big Data Statistical Model
>What does BLLM stand for?
https://www.abbreviations.com/BLLM
I want to apologize for this joke in advance. It had to be done.
We could take a page from Trump’s book and call them “Beautiful” LLMs. Then we’d have “Big Beautiful LLMs” or just “BBLs” for short.
Surely that wouldn’t cause any confusion when Googling.
Weirdly enough, the ITU already chose the superlative for the bigliest radio frequency band to be Tremendous:
- Extremely Low Frequency (ELF)
- Super Low Frequency (SLF)
- Ultra Low Frequency (ULF)
- Very Low Frequency (VLF)
- Low Frequency (LF)
- Medium Frequency (MF)
- High Frequency (HF)
- Very High Frequency (VHF)
- Ultra High Frequency (UHF)
- Super High Frequency (SHF)
- Extremely High Frequency (EHF)
- Tremendously High Frequency (THF)
Maybe one day some very smart people will make Tremendously Large Language Models. They will be very large and need a lot of computer. And then you'll have the Extremely Small Language Model. They are like nothing.
https://en.wikipedia.org/wiki/Radio_frequency?#Frequency_ban...
"The Overwhelmingly Large Telescope (OWL) was a conceptual design by the European Southern Observatory (ESO) organisation for an extremely large telescope, which was intended to have a single aperture of 100 metres in diameter. Because of the complexity and cost of building a telescope of this unprecedented size, ESO has decided to focus on the 39-metre diameter Extremely Large Telescope instead."
https://en.m.wikipedia.org/wiki/Overwhelmingly_Large_Telesco...
AFAIK "tremendously" was chosen partly because the range includes 1 "T"Hz.
It bothers me that the level below 3 Hz is not given the name "Tremendously low". Now it's not symmetrical. I hope the ITU is happy...
I hope they go with "Ludicrous" like in Spaceballs.
TLLM is close to TLM
XKCD telescope sizes also could provide some guidance
https://xkcd.com/1294/
Bureau of Large Land Management
I've sat in more than one board meeting watching them take 20 minutes to land on t-shirt sizes. The greatest enterprise sales minds of our generation...
I've seen things you people wouldn't believe.
I’ve seen corporate slogans fired off from the shoulders of viral creatives. Synergy-beams glittering in the darkness of org charts. Thought leadership gone rogue… All these moments will be lost to NDAs and non-disparagement clauses, like engagement metrics in a sea of pivot decks.
Time to leverage.
... destroyed by madness, starving hysterical! Buying weed in a store, then meeting with someone off Craigslist to score eggs.
I've been labeling LLMs as "teensy", "smol", "mid", "biggg", "yuuge". I've been struggling to figure out where to place the lines between them though.
itsy-bitsy <= 3B
teensy 4B to 29B
smol 30B to 59B
mid 60B to 99B
biggg 100B to 299B
yuuge 300B+
Name them like clothing sizes: XXLLM, XLLM, LLM, MLM, SLM, XSLM XXSLM.
i did this!
XXLLM: ~1T (GPT4/4.5, Claude Opus, Gemini Pro)
XLLM: 300~500B (4o, o1, Sonnet)
LLM: 20~200B (4o, GPT3, Claude, Llama 3 70B, Gemma 27B)
~~zone of emergence~~
MLM: 7~14B (4o-mini, Claude Haiku, T5, LLaMA, MPT)
SLM: 1~3B (GPT2, Replit, Phi, Dall-E)
~~zone of generality~~
XSLM: <1B (Stable Diffusion, BERT)
4XSLM: <100M (TinyStories)
https://x.com/swyx/status/1679241722709311490
MLM... uh oh
I hate those ponzi schemes! Never buy a cutco knife or those crappy herbalife supplements.
Alternatively, just make sure you keep things consensual, and keep yourself safe, no judgement or labels from me :)
But of course these are all flavors of "large", so then we have big large language models, medium large language models, etc, which does indeed make the tall/grande/venti names appropriate, or perhaps similar "all large" condom size names (large, huge, gargantuan).
Why not LLLM for large LLMs and SLLM for small LLMs, assuming there is no middle ground?
M, LM, LLM, LLLM, L3M, L4M.
Gotta leave room for future expansion.
Hopefully the USB making team does NOT step into this...
LLM 3.0, LLM 3.1 Gen 1, LLM 3.2 Gen 1, LLM 3.1, LLM 3.1 Gen 2, LLM 3.2 Gen 2, LLM 3.2, LLM 3.2 Gen 2x2, LLM 4, etc...
2L4M
VLLM, Super VLLM, Almost Large Language Model
What makes it a Small Large Language Model? Why not just an SLM?
Smedium Language Model
Lousy Smarch weather
If we can’t have fun with names, why even be in IT?
S and L cancel out, so it's just an LM.
Small !== -Large
SLM is a widespread term already.
Slim pickings, then?
LLM, LLM 2.0, LLM 3.0, Mini LLM, Micro LLM, LLM C.
LLM 95, LLM 98, LLM Millennium Edition, LLM NT, LLM XP, LLM 2000, LLM 7
I really appreciated the way they managed to come up with a new naming scheme each time, usually used exactly once.
Could always go with the Bungie approach for the Marathon series: LLM, LLM2, LLM∞, ℵ₁ — https://alephone.lhowon.org
(Obviously ∞ is for the actual singularity, and ℵ₁ is the thing after that).
Are you sure that ℵ1 is the thing after that?
https://en.m.wikipedia.org/wiki/Continuum_hypothesis
;-)
LLM 3.11 for Workgroups
can we have tiny LLM that can run on smartphone now
Apple Intelligence has an LLM that runs locally on the iPhone (15 Pro and up).
But the quality of Apple Intelligence shows us what happens when you use a tiny ultra-low-wattage LLM. There’s a whole subreddit dedicated to its notable fails: https://www.reddit.com/r/AppleIntelligenceFail/top/?t=all
One example of this is “Sorry I was very drunk and went home and crashed straight into bed” being summarized by Apple Intelligence as ”Drunk and crashed”.
I think the real problem with LLMs is we have deterministic expectations of non-deterministic tools. We’ve been trained to expect that the computer is correct.
Personally, I think the summaries of alerts are incredibly useful. But my expectation of accuracy for a 20-word summary of multiple 20-30 word summaries is tempered by the reality that there's gonna be issues given the lack of context. The point of the summary is to help me determine if I should read the alerts.
LLMs break down when we try to make them independent agents instead of advanced power tools. A lot of people enjoy navel-gazing and hand-waving about ethics, "safety", and bias... then proceed to do things with obvious issues in those areas.
Determinism isn't the issue though. Many responses are fine. The displayed one is bad, whether chosen deterministically or not. Some alternatives:
- Passed out drunk
- Crashed in bed
- Slacking because drunk
...
The issue isn't a lack of context; it's that even the available context was handled poorly.
Larger LLMs can summarize all of this quite well though.
I expect that the phone will only do the prompt parsing
No. Smartphone only spin animated gif while talk to big building next to nuclear reactor. New radio inside make more efficient.
Is a tiny large language model equivalent to a normal sized one?
I want a tiny phone-based LLM to do thought tracking and comms awareness...
I actually applied to YC in like ~2014 or so for this:
-JotPlot - I wanted a timeline for basically giving a historical timeline of comms between me and others - such that I had a sankey-ish diagram of when and with whom and via which method I spoke with folks, and then each node was the message, call, text, meta links...
I think it's still viable - but my thought process is currently too chaotic to pull it off.
Basically, you look at a timeline of your comms and thoughts and expand into links of thought - now with LLMs you could have a Throw Tag of some sort, whereby you have the bot do research expanding on certain things and putting up a site for that idea on LOCALHOST (i.e. your phone), so that you can pull up data relevant to the convo - and it's all in a timeline of thought/stream of consciousness.
hopefully you can visualize it...
I had a thought that I think some people value social media (e.g. Facebook) essentially for this. Like giving up your Facebook profile means giving up your history or family tree or even your memories.
So in that sense, maybe people would prefer a private alternative.
I read this in Sam Wattersons voice with a pipe abt maybey an inch from his beard,
(Fyi I was a designer at fb and while it was luxious I still hated what I saw in zucks eyes every morn when I passed him.
Super diff from Andy Grove at intel where for whateveer reason we were in the sam oee schekdule
(That was me typing with eues ckised as a test (to myself, typos abound
LLM already has one large in it…
If we can have a "Personal PIN Identification Number", we can have a "Large LLM Language Model".
What about Impersonal PIN anonymization letter?
Redundundant
or "DietLLM, RegularLLM, MealLLM and SuperSizedLLMWithFries"
What does a 20 LLM signify?
it's too bad vLLM and VLM are taken because it would have been nice to recycle the VLSI solution to describing sizes - get to very large language models and leave it at that.
After very large language models, the next step is mega language models, or MLMs. As a bonus, it describes the VC funding scheme that backs them too.
we could also look to magnetoresistance and go for giant, colossal, extraordinary
Dismissed, Big LLM will live on along with Big Data.
Well, big data for me was always clear: it's when data sizes are too large for the regular tools (ls, du, wc, vi, pandas).
I.e. when pretty much every tool or script I used before doesn't work anymore and I need special tools (gsutil, bq, dask, slurm), it's a mind shift.
Terrible names, to be honest. My proposal: Hyper LLMs, Ultra LLMs, Large LLMs, Micro LLMs, Mobile LLMs.
LLM M4 Ultra Pro Max 16e (with headphone jack)
GPT Inside
Pro, max, ultra…
Then there will be "decaf LLM"
"big large language model" renminds me uncomfortably of "automated teller machine machine"
“There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.“
https://xkcd.com/1294/
Doesn't the first L in LLM mean large already?
It's like saying Automated ATM. Whoever wrote it barely knows what the acronym means.
This whole article feels like it was written by someone who doesn't understand the subject matter at all.
We're fine with "The Big Friendly Giant" and the Sahara Desert ("desert desert"); big LLM could join the family of pleonasms.
https://en.m.wikipedia.org/wiki/Pleonasm
When it's a different language it's fine.
Yes, that's the point of the comment and the whole discussion here. LLMs are already Large so what should the prefix be? Big LLM is a strong contender. I'm also pretty sure the creator of redis is not "someone who doesn't understand the subject matter at all".
It's very common for experts on one subject to take a jab at another subject and depend on their reputation while their skillset doesn't translate at all.
Almost everyone says ‘PIN number’ as well.
“We should regard the Internet Archive as one of the most valuable pieces of modern history; instead, many companies and entities make the chances of the Archive to survive, and accumulate what otherwise will be lost, harder and harder. I understand that the Archive headquarters are located in what used to be a church: well, there is no better way to think of it than as a sacred place.”
Amen. There is an active effort to create an Internet Archive based in Europe, just… in case.
Yup! We're here and looking to do good work with Cultural Heritage and Research Organizations in Europe. I'm very happy to be working with the Internet Archive once again after a 20-year-long break.
https://www.stichtinginternetarchive.nl/
Congratulations!
What kind of volunteer help can the community do?
Well, it did establish a new HQ in Canada…
https://vancouversun.com/news/local-news/the-internet-archiv...
(Edited: apparently just a new HQ and not THE HQ)
I was looking to book a wedding in this venue (The Permanent), and the Internet Archive server is prominently visible on the 2nd floor. The server is pretty cool and adds to the aesthetics of the space.
With this belligerent maniac in the White House who recently doubled-down on his wish to annex Canada [1], I wouldn't feel safe relocating there if the goal is to flee the US.
[1] https://www.nbcnews.com/politics/donald-trump/trump-quest-co...
Anyone who takes even an hour to audit anything about the Internet Archive will soon come to a very sad conclusion.
The physical assets are stored in the blast radius of an oil refinery. They don't have air conditioning. Take the tour and they tell you the site runs slower on hot days. Great mission, but atrociously managed.
Under attack for a number of reasons, mostly absurd. But a few are painfully valid.
Their yearly budget is less than the budget of just the SF library system.
What I don't understand is how they afford the storage costs. Surely it must be pricey to have 70+ petabytes of data that's only growing.
Then maybe they should've figured out how to keep hard drives in a climate controlled environment before they decided to launch a bank.
https://ncua.gov/newsroom/press-release/2016/internet-archiv...
I realized recently, who needs torrents? I can get a good rip of any movie right there.
I understand what you describe is prohibited in many jurisdictions; however, I'm curious about the technical aspect: in my experience they host the HTML but often not the assets, especially big pictures, and I guess most movie files are bigger than pictures. Do you use a special trick to host/find them?
No. And every video game ever made is available for download as well. If you even have to download it: they pride themselves on making many of them playable in the browser with just a click.
Copyright issues aside (let's avoid that mess), I was referring to basic technical issues with the site. The design is atrocious, search doesn't work, you can click through 50 captures of a site before you find one that actually loads, there's obvious data corruption, they invented their own schema instead of using a standard one and don't enforce it, the API is insane and usually broken, the uploader doesn't work reliably, they don't honor DMCA requests, they ask for photo IDs and passports and then leak them...
It's the worst possible implementation of the best possible idea.
And yet, it's the best we currently have. I donate to them. We can come up with demands for how it should be managed, but that should not prevent us from helping them.
If you poke around at what US government agencies are doing, and what European countries and non-profits are doing, or even do a deep dive into what your local library offers, you may find they no longer lead the pack.
They didn't even ask for donations until they accidentally set fire to their building annex. People offered to help (SF was apparently booming that year) and of course they promptly cranked out the necessary PHP to accept donations.
Now it's become part of the mythology. But throwing petty cash at a plane in a death spiral doesn't change gravity. They need to rehabilitate their reputation and partner with organizations who can help them achieve their mission over the long term. I personally think they need to focus on legal, long-term preservation and archival before sticking their neck out any further. If this means no more Frogger in the browser, so be it.
I certainly don't begrudge anyone who donates, but asking for $17 on the same page as copyrighted game ROMs and glitchy scans of comic books isn't a long-term strategy.
Mozilla's llamafile project is designed to enable LLMs to be preserved for historical purposes. They ship the weights and all the necessary software in a deterministic dependency-free single-file executable. If you save your llamafiles, you should be able to run them in fifty years and have the outputs be exactly the same as what you'd get today. Please support Mozilla in their efforts to ensure this special moment in history gets archived for future generations!
https://github.com/Mozilla-Ocho/llamafile/
LLMs are much easier to port than software. They are just a big blob of numbers and a few math operations.
I think software is rather easy to archive. Emulators are the key. Nearly every platform from the past can be emulated on a modern ARM/x86 Linux/Windows system. ARM/x86/Linux/Windows are ubiquitous, and even if they might fade away, there will be emulators around for a long time. With future compute power it should be no problem to just use nested emulation, to run old emulators on an emulated x86/Linux.
> I think software is rather easy to archive.
* assuming someone else already spent tremendous effort to develop an emulator for your binary's target that is 100% accurate...
Indeed. In 50 years, loading the weights and doing math should be much easier than getting some 50 year old piece of cuda code to work.
Then again, CPUs will be fast enough that you'd probably just emulate amd64 and run it as CPU-only.
llamafiles run natively on both amd64 and arm64. It's difficult to imagine both of them not being in play fifty years hence. There's definitely no hope for the cuda module in the future. We have enough difficulties getting it to work today. That's why cpu mode is the default.
LLMs are much harder, software is just a blob of two numbers.
;)
(less socratic: I have a fraction of a fraction of jart's experience, but have enough experience via maintaining a cross-platform llama.cpp wrapper to know there's a ton of ways to interpret that bag o' floats, and you need a lot of ancillary information.)
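A toy NumPy illustration of why that ancillary information matters: the same byte blob decodes into entirely different tensors depending on dtype and shape metadata (real weight formats like GGUF additionally carry tokenizer, architecture, and quantization details):

    import numpy as np

    blob = np.arange(8, dtype=np.float32).tobytes()  # "just a blob of numbers"

    # The same bytes admit incompatible readings; only metadata disambiguates.
    as_f32 = np.frombuffer(blob, dtype=np.float32)                # the intended 8 floats
    as_f16 = np.frombuffer(blob, dtype=np.float16)                # 16 meaningless halves
    as_mat = np.frombuffer(blob, dtype=np.float32).reshape(2, 4)  # now it's a matrix

    print(as_f32)
    print(as_f16)
    print(as_mat @ as_mat.T)  # shape metadata even changes which math is valid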
Just as the map isn't the territory, summaries are not the content, nor are the library's filings the actual books.
If I want to read a post, a book, a forum, I want to read exactly that, not a simulacrum built by arcane mathematical algorithms.
The counter perspective is that this is not a book, it's an interactive simulation of that era. The model is trained on everything, this means it acts like a mirror of ourselves. I find it fascinating to explore the mind-space it captured.
While the post talks about big LLMs as a valuable "snapshot" of world knowledge, the same technology can be used for lossless compression: https://bellard.org/ts_zip/.
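The connection, roughly: any model that assigns probabilities to the next symbol doubles as a compressor, because an arithmetic coder can spend about -log2 p(symbol) bits per symbol, so better prediction directly means smaller output. A toy Python sketch of that size calculation, with a character-bigram model standing in for the LLM (ts_zip itself pairs a neural model with a real arithmetic coder):

    import math
    from collections import Counter, defaultdict

    def train_bigram(text):
        # Toy stand-in for an LLM: character-bigram counts, add-one smoothing.
        counts = defaultdict(Counter)
        for a, b in zip(text, text[1:]):
            counts[a][b] += 1
        return counts

    def ideal_compressed_bits(text, counts, alphabet_size):
        # An ideal arithmetic coder spends -log2 p(symbol) bits per symbol.
        bits = 0.0
        for a, b in zip(text, text[1:]):
            total = sum(counts[a].values()) + alphabet_size
            p = (counts[a][b] + 1) / total
            bits += -math.log2(p)
        return bits

    text = "the quick brown fox jumps over the lazy dog " * 50
    model = train_bigram(text)
    bits = ideal_compressed_bits(text, model, alphabet_size=len(set(text)))
    print(f"~{bits / 8:.0f} bytes vs {len(text)} raw")  # an idealized lower bound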
And they're all undertrained, according to the papers.
I miss the good ol' days when I'd have text-davinci make me a table of movies that included a link to the movie poster. It usually generated a URL of an image in an S3 bucket. The link always worked.
I think it’s fine that not everything on the internet is archived forever.
It has always been like that, in the past people wrote on paper, and most of it was never archived. At some point it was just lost.
I inherited many boxes of notes, books and documents from my grandparents. Most of it was just meaningless to me. I had to throw away a lot of it and only kept a few thousand pages of various documents. The other stuff is just lost forever. And that’s probably fine.
Archives are very important, but nowadays the most difficult part is to select what to archive. There is so much content added to the internet every second, only a fraction of it can be archived.
This doesn't make much sense to me. Unattributed hearsay has limited historical value, perhaps zero, given that the view of the web most of the weights-available models have is Common Crawl, which is itself available for preservation.
I suspect the idea is that sometimes breadth wins out over accuracy. Even if it's unsuited as a primary source, this kind of lossy compression of many many documents might help a conscientious historian discover verifiable things through other routes.
> Scientific papers and processes that are lost forever as publishers fail, their websites shut down.
I don't think the big scientific publishers (now, in our time) will ever fail; they are RICH!
That means nothing. Big companies fail all the time. There is no guarantee any of them will be here in 50 years, let alone 500.
Perhaps a shorter term risk is the publishers consider some papers less profitable, so they stop preserving them.
So was the Roman Empire
I would be curious to know if it would be possible to reconstruct approximate versions of popular common subsets of internet training data by using many different LLMs that may have happened to read the same info. Does anyone know of pointers to math papers about such things?
I really like the narrative that LLMs, in the form of their weights, are now conserving, as a kind of lossy compression, human knowledge that would otherwise be lost forever.
Personally, I'd like all knowledge and information (K & I) to be readily available and accessible (pretty sure most people share the same sentiment), despite the consistent business decisions from copyright holders to hoard their K & I by putting everything behind paywalls and/or registration (I'm looking at you, Apple and X/Twitter). Some people hate Google for organizing the world's information by feeding and thriving on advertisements, but in the long run the information does get organized and kind of preserved in many Internet data formats, lossy or not. After all, it was Google who originally designed the transformer that enabled the LLM weights that are now apparently a piece of history.
Isn’t big LLM training data actually the most analogous to the internet archive? Shouldn’t the title be “Big LLM training data is a piece of history”? Especially at this point in history since a large portion of internet data going forward will be LLM generated and not human generated? It’s kind of the last snapshot of human-created content.
The problem is, where are these 20T tokens that are being used for this task? No way to access them. I hope that at least OpenAI and a few others have solid historical storage of the tokens they collect.
https://xkcd.com/1683/
I wonder whether it'll become like pre-WW2 steel that doesn't have nuclear contamination.
Just with pre-LLM knowledge
Enjoy the insight, but the title makes my eye twitch. How about "LLM weights are pieces of history"?
Small LLM weights are not really interesting though. I am currently training GPT-2-small-sized models for a scientific project right now, and their world models are just not good enough to generate any kind of real insight about the world they were trained in, except for corpus biases.
Small large language models? This sounds like the apocryphal headline when a spiritualist with dwarfism escaped prison: "Small medium at large." Do you also have some dehydrated water and a secure key escrow system?
A collection of newspapers is generally a better source than a single leaflet, but even a leaflet is a piece of history.
That's really what these are: something analogous to JPEG for language, and queryable in natural language.
Tangent: I was thinking the other day: these are not AI in the sense that they are not primarily intelligence. I still don't see much evidence of that. What they do give me is superhuman memory. The main thing I use them for is search, research, and a "rubber duck" that talks back, and it's like having an intern who has memorized the library and the entire Internet. They occasionally hallucinate or make mistakes -- compression artifacts -- but the memory is there.
So it's more AM -- artificial memory.
Edit: as a reply pointed out: this is Vannevar Bush's Memex, kind of.
I've been looking at it as an "instant reddit comment". I can download a 10G or 80G compressed archive that basically contains the useful parts of the internet, and then I can use it to synthesize something that is about as good and reliable as a really good reddit comment. Which is nifty. But honestly it's an incredible idea to sell that to businesses.
Reddit seems to puppet humans via engagement farming to do what LLMs do in some cases. Posts are prompts, replies are responses.
Of course they vary widely in quality.
And so what would the point be of anyone actually posting on the internet if no one actually visits the sites, because large corps have essentially stolen and monetized the whole thing?
And I'm sure they have or will have the ability to influence the responses so you only see what they want you to see.
That's the next step after algorithmic content feeds - algorithmic/generated comment sections. Imagine seeing an entirely different conversation happening just to get you to buy a product. A product like Coca-Cola.
Imagine scrolling through a comment section that feels tailor-made to your tastes, seamlessly guiding you to an ice-cold Coca-Cola. You see people reminiscing about their best summer memories—each one featuring a Coke in hand. Others are debating the superior refreshment of Coke over other drinks, complete with "real" testimonials and nostalgic stories.
And just when you're feeling thirsty, a perfectly timed comment appears: "Nothing beats the crisp, refreshing taste of an ice-cold Coke on a hot day."
Algorithmic engagement isn’t just the future—it’s already here, and it’s making sure the next thing you crave is Coca-Cola. Open Happiness.
Or: war is good, peace is bad, nuclear war is winnable; don't worry and start loving the bomb. The enemy aren't human anyway; your life will be better with fewer people around.
Look at the people who want to control this, they do not want to sell you Coke.
Why Coca-Cola, though? Sure, it is refreshing on a hot day, but you know what is even better? Going to bed on a nice cool mattress. So many are either too hard or too soft. They aren't engineered to your body, so you are virtually guaranteed to get a poor night's sleep.
Imagine waking up like I do every morning. Refreshed and full of energy. I’ve tried many mattresses and the only one that has this property is my Slumber Sleep Hygiene mattress.
The best part is my partner can customize their side using nothing more than a simple app on their smartphone. It tracks our sleep over time and uses AI to generate a daily sleep report showing exactly how good of a night's sleep I got. Why rely on my gut feelings when the report can tell me exactly how good or bad a night's sleep it was?
I highly recommend Slumber Sleep Hygiene mattresses. There is a reason it’s the number one brand recommended on HN.
Isn't that how Reddit gained momentum? Posting fake posts/comments?
Now we can mass-produce it!
Another insidious one: fake replies designed to console you if there aren't enough people to validate your opinion or answer your question.
>like having an intern who has memorized the library and the entire Internet. They occasionally hallucinate or make mistakes
Correction: you occasionally notice when they hallucinate or make mistakes.
Or 80 years to MVP memex
Vannevar Bush's 1945 article "As We May Think": Bush envisioned the memex as a device in which individuals would compress and store all of their books, records, and communications, "mechanized so that it may be consulted with exceeding speed and flexibility".
https://en.m.wikipedia.org/wiki/Memex
The memex was a deterministic device to consult documents - the actual documents. The "LLM" is more like a dumb archivist that came with it ("Yes, see for example that document, it tells you that q=M·k...").
I grew up with physical encyclopedias, then moved on to Encarta, then Wikipedia dumps and folders full of PDFs. I still prefer a curated information repository over chat interfaces or generated summaries. The main goal with the former is to have a knowledge map and keyword graph, so that you can locate any piece of information you may need from the actual source.
I believe LLMs are both data and processing, but even human reasoning is based in strong ways on existing knowledge. However, for the purposes of the post, it is indeed the memorization that is the key value, plus the fact that sampling such models can likely be used in the future to transfer the same knowledge to bigger LLMs, even if the source data is lost.
I'm not saying there is no latent reasoning capability. It's there. It just seems to be that the memory and lookup component is much more useful and powerful.
To me intelligence describes something much more capable than what I see in these things, even the bleeding edge ones. At least so far.
I offer a POV that is in the middle: reasoning is powerful for evaluating which solution is better among N in the context. Memorization allows sampling of many competing ideas from the problem space, then the LLM picks the best, which makes chain of thought so effective. Of course, zero-shot reasoning is also part of the story, but a somewhat weaker one, exactly as we are not often able to spit out the best solution before evaluating the space (unless we are very accustomed to the specific problem).
That's the problem with the term "intelligence". Everyone has their own definition, we don't even know what makes us humans intelligent and more often than not it's a moving goalpost as these models get better.
If you want to see what this would actually be like:
https://lcamtuf.coredump.cx/lossifizer/
I think a fun experiment could be to see at what setting the average human can no longer decipher the text.
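Something like this would do it (the substitution scheme here is my guess, not necessarily what that page implements):

    # Rough imitation of the experiment: replace a fraction of letters at
    # random and see where legibility collapses.
    import random

    def lossify(text, loss, seed=42):
        rng = random.Random(seed)
        letters = "abcdefghijklmnopqrstuvwxyz"
        return "".join(rng.choice(letters) if c.isalpha() and rng.random() < loss
                       else c for c in text)

    sentence = "Big LLM weights are a piece of history."
    for loss in (0.1, 0.3, 0.5, 0.7):
        print(loss, lossify(sentence, loss))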
I can ask an LLM to write a haiku about the loss function of Stable Diffusion. Or I can have it do zero-shot translation between a pair of languages not covered in the training set. Can your "language JPEG" do that?
I think "it's just compression" and "it's just parroting" are flawed metaphors. Especially when the model was trained with RLHF and RL/reasoning. Maybe a better metaphor is "LLM is like a piano, I play the keyboard and it makes 'music'". Or maybe it's a bycicle, I push the pedals and it takes me where I point it.
There was a great article by Ted Chiang recently that elaborated on this idea: https://www.newyorker.com/tech/annals-of-technology/chatgpt-...
> JPEG for [a body of] language
Yes!
> artificial memory
Well, "yes", kind of.
> Memex
After a flood?! Not really. Vannevar Bush - As we may think - http://web.mit.edu/STS.035/www/PDFs/think.pdf
Having memory is fine but choosing the relevant parts requires intelligence
This is an excellent viewpoint.
I regularly push back against casual uses of the word “intelligence”.
First, there is no objective dividing line. It is a matter of degree relative to something else. Any language that suggests otherwise should be refined or ejected from our culture and language. Language’s evolution doesn’t have to be a nosedive.
Second, there are many definitions of intelligence; some are more useful than others. Along with many, I like Stuart Russell’s definition: the degree to which an agent can accomplish a task. This definition requires being clear about the agent and the task. I mention this so often I feel like a permalink is needed. It isn’t “my” idea at all; it is simply the result of smart people decomplecting the idea so we’re not mired in needless confusion.
I rant about word meanings often because deep thinking people need to lay claim to words and shape culture accordingly. I say this often: don’t cede the battle of meaning to the least common denominators of apathy, ignorance, confusion, or marketing.
Some might call this kind of thinking elitist. No. This is what taking responsibility looks like. We could never have built modern science (or most rigorous fields of knowledge) with imprecise thinking.
I’m so done with sloppy mainstream phrasing of “intelligence”. Shit is getting real (so to speak), companies are changing the world, governments are racing to stay in the game, jobs will be created and lost, and humanity might transcend, improve, stagnate, or die.
If humans, meanwhile, can’t be bothered to talk about intelligence in a meaningful way, then, frankly, I think we’re … abdicating responsibility, tempting fate, or asking to be in the next Mike Judge movie.
We never would have been able to create science if it weren't for focusing on the kinds of thinking that can be made logical. There's a big difference. What you're doing, with this whole "let's make a bullshit word logical", is more similar to medieval scholasticism, which was a vain attempt at verbal precision. https://justine.lol/dox/english.txt
Yikes, maybe we can take a step back? I'm not sure where this is coming from, frankly. One anodyne summary of my comment above would be:
> Let's think and communicate more clearly regarding intelligence. Stuart Russell offers a nice definition: an agent's ability to do a defined task.
Maybe something about my comment got you riled up? What was it?
You wrote:
> What you're doing, with this whole "let's make a bullshit word logical" is more similar to medieval scholasticism, which was a vain attempt at verbal precision.
Again, I'm not quite sure what to say. You suggest my comment is like a medieval scholar trying to reconcile dogma with philosophy? Wow. That's an uncharitable reading of my comment.
I have five points in response. First, the word intelligence need not be a "bullshit word", though I'm not sure what you mean by the term. One of my favorite definitions of bullshitting comes from "On Bullshit" by Harry Frankfurt:
> Frankfurt determines that bullshit is speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care whether what they say is true or false. - Wikipedia
Second, I'm trying to clarify the term intelligence by breaking it into parts. I wouldn't say I'm trying to make it "logical" (in the sense of being about logic or deduction). Maybe you mean "formal"?
Third, regarding the "what you're doing" part... this isn't just me. Many people both clarify the concept of intelligence and explain why doing so is important.
Fourth, are you saying it is impossible to clarify the meaning of intelligence? Why? Not worth the trouble?
Fifth, have you thought about a definition of intelligence that you think is sensible? Does your definition steer people away from confusion?
You also wrote:
> We never would have been able to create science, if it weren't for focusing on the kinds of thinking that can be made logical.
I think you mean _testable_, not _logical_. Yes, we agree, scientists should run experiments on things that can be tested.
Russell's definition of intelligence is testable by defining a task and a quality metric. This is already a big step up from an unexamined view of intelligence, which often has some arbitrary threshold.* It allows us to see a continuum from, say, how a bacterium finds food, to how ants collaborate, to how people both build and use tools to solve problems. It also teases out sentience and moral worth so we're not mixing them up with intelligence. These are simple, doable, and worthwhile clarifications.
Finally, I read your quote from Dijkstra. In my reading, Dijkstra's main point is that natural language is a poor programming interface due to its ambiguity. Ok, fair. But what is the connection to this thread? Does it undercut any of my arguments? How?
* A common problem when discussing intelligence involves moving the goal post: whatever quality bar is implied has a tendency to creep upwards over time.
I just wanted to share an essay I liked. I didn't think you'd pay it much mind. But I can see now that you are a person devoted to science. If you want to know what I believe, I think computers in the '50s were intelligent. I think GPT-2 probably qualified as AGI if you take the meaning of the acronym literally. At this point we've blown so far past all expectations in terms of intelligence that I've come to agree with Karpathy that the time has come to start moving the goalposts to other words, like agency, since agents are an unsolved problem, and agency is proving to possibly be more important/powerful/rare/difficult than intelligence.
I reacted negatively to the idea earlier that agency should be considered an aspect of intelligence. I think separating the concepts helps me better understand people, their unique strengths, and puzzles like why people who aren't geniuses who know everything and can rotate complex shapes are sometimes very successful, and, most importantly, why LLMs continue to feel like they're lacking something compared to people, even though they're so outrageously intelligent. It's one thing to be smart, another thing entirely to be useful.
> I reacted negatively to the idea earlier that agency should be considered an aspect of intelligence.
In the hopes of clarifying any misunderstandings of what I mean... I said "agent" in Russell's sense -- a system with goals that has sensors and actuators in some environment. This is a common definition in CS and robotics. (I tend to shy away from using the word "agency" because sometimes it brings along meaning I'm not intending. For example, to many, the word "agency" suggests free will combined with the ability to do something with it.)
I recommend Russell to anyone willing to give him a try. I selected part of his writing that explains why his definition is important to his goals. From page 2 of https://people.eecs.berkeley.edu/~russell/papers/aij-cnt.pdf
> My own motivation for studying AI is to create and understand intelligence as a general property of systems, rather than as a specific attribute of humans. I believe this to be an appropriate goal for the field as a whole...
To continue my earlier comment... I prefer not to call an LLM "intelligent", much less "outrageously intelligent". Why? The main reason is communication clarity -- and by communication I mean the notion of a sender communicating a meaning to a receiver. Not just symbolic information (a la Shannon), but a faithful representation in the recipient. The phrase "outrageously intelligent" can have many conflicting interpretations in one's audience; using it generates more confusion than clarity.
To say my point a different way, intelligence is contextual. I'm not using "contextual" as some sort of vague excuse to avoid getting into the details. I'm not saying that intelligence cannot be quantified at all. Quite the opposite. Intelligence can be quantified fairly well (in the statistical sense) once a person specifies what they are talking about. Like Russell, I'm saying intelligence is multifaceted and depends on the agent (what sensors it has, what actuators it has), the environment, and the goal.
So what language would I use instead? Rather than speaking about "intelligence" as one thing that people understand and agree on, I would point to task- and goal-specific metrics. How well does a particular LLM do on the GRE? The LSAT?
Sooner or later, people will want to generalize over the specifics. This is where statistical reasoning comes in. With enough evaluations, we can start to discuss generalizations in a way that can be backed up with data. For example, we might say things like "LLM X demonstrates high competence on text summarization tasks, provided that it has been pretrained on the relevant concepts" or "LLM Y struggles to discuss normative philosophical issues without falling into sycophancy, unless extensive prompt engineering protocols are used".
I think it helps to remember this: if someone asks "Is X intelligent?", one has the option to reframe the question. One can use it as an opportunity to clarify and teach and get into a substantive conversation. The alternative is suboptimal. But alas, some people demand short answers to poorly framed questions. Unfortunately, the answers they get won't help them.
Intelligence is closely related to the concept of attractiveness and gravitas. You say it depends on the agent. I say it's in the eye of the beholder. People aren't very good at explaining what attracts them either.
The closest thing we have to a definition for intelligence is probably the LLMs themselves. They're very good at predicting words that attract people. So clearly we've figured it out. It's just such a shame that this definition for intelligence is a bunch of opaque tensors that we can't fully explain.
LLMs don't just defy human reasoning and understanding. They also challenge the purpose of intelligence itself. Why study and devise systems, when gradient descent can figure it out for you? Why be cleverer when you can just buy more compute?
I don't know what's going to make the magical black pill of machine learning more closely align with our values. But I'm glad we have them. For example, I think it's good that people still hold objectivity as a virtue and try to create well-defined benchmarks that let us rank the merits of LLMs using numbers. I'm just skeptical about how well our efforts to date have predicted the organic processes that ultimately decide these things.
Split the wayback machine away from its book copyright lawsuit stuff and you don't have to worry.
Imagine future historians piecing together our culture from hallucinated AI memories - inaccurate, sure, but maybe even more fascinating than reality itself.
The internet training data for LLMs is valuable history we're losing one dead webadmin at a time. The regurgitated slop, less so.
People wanting this would be better off using memory architectures, like how the brain does it. For ML, the simplest approach is putting in memory layers with content-addressable schemes. I have a few links on prototypes in this comment (sketch of the idea below the link):
https://news.ycombinator.com/item?id=42824960
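The gist, as a toy (sizes and the softmax readout are my illustrative choices, not any specific prototype's design): a query is matched against stored keys, and the values are blended by similarity.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    class MemoryLayer:
        def __init__(self, slots, dim, seed=0):
            rng = np.random.default_rng(seed)
            self.keys = rng.standard_normal((slots, dim))
            self.values = rng.standard_normal((slots, dim))

        def read(self, query):
            weights = softmax(self.keys @ query)  # content-based addressing
            return weights @ self.values          # soft lookup over all slots

    mem = MemoryLayer(slots=1024, dim=64)
    query = np.random.default_rng(1).standard_normal(64)
    print(mem.read(query).shape)  # (64,)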
Animal brains do not separate long term memory and processing - they are one and the same thing - columnar neural assemblies in the cortex that have learnt to recognize repeated patterns, and in turn activate others.
So large large language model?
I find it very depressing to think that the only traces left of all this creativity will end up being AI slop, the worst use case ever.
I feel like the more people use GenAI, the less intelligent they become. Like the rest of this society, it seems designed to suck the life force out of humans and return useless crap instead.
"big large" lol
fwiw i've added a summary of the discussion here: https://extraakt.com/extraakts/67d708bc9844db151612d782
Interesting. Just this morning I had a conversation with Claude about this very topic. When asked "can you give me your thoughts on LLM training runs as historical artifacts? do you think they might be uniquely valuable for future historians?", it answered:
Interesting indeed to be able to directly interrogate the median experience of being online in 2025. (Also, my apologies for slop-posting; I slapped so much custom prompting on it that I hope you'll find the output amusing enough.)
what's the prompt?