A major problem with LLMs is that their core nature is not understood by the vast majority of people - developers included. They are an embodiment of literature, and if that confuses you, you're probably operating on an incorrect definition of them.
I like to think of them as idiot savants, with exponentially more savant than your typical fictional idiot savant. They pivot on every word you use, each word in your series activating areas of training knowledge, until your prompt completes and the LLM is logically located at some biased perspective of the topic you seek (assuming your wording was not vague and full of implied references). Few seem to realize there is no "one topic" for each topic an LLM knows; there are numerous perspectives on every topic. Those perspectives reflect the reason one person/group is using that topic, and their technical seriousness within it. How you word your prompts dictates which of these perspectives your ultimate answer is generated from.
When people say their use of AI reflects a mid level understanding of whatever they prompted, that is because the prompt is worded with the language used by "mid level understanding persons". If you want the LLM to respond with expert guidance, you have to prompt it using the same language and terms that the expert you want would use. That is how you activate their area of training to generate a response from them.
This goes further when using coding AI. If your code has the structure of a mid level developer's, that creates a strong preference for mid level developer guidance - because that is what is relevant to your code structure. It takes a well written prompt using PhD/professorial computer science terminology to operate on a mid level code base and get advice that would improve that code above its mid level architecture.
In more words, "of course it's stupid, it's as complex as a mid-sized rodent where we taught it purely by selective breeding on getting answers right while carefully preventing any mutations which made their brains any bigger".
Not to put too fine a point on your metaphor, but the different training methods deployed by ChatGPT vs Claude, for example, change that a bit regarding who did the "selective breeding" - arguably nurture vs nature, respectively.
It's basically plotting the dots of all easily accessible written word to find your words, finding the words that are answers to your words and then charting a line through them no matter how scattered those points may be, and spitting that back out. It doesn't "know" anything nor is it reasoning even if the results are similar.
You have to come into it with the same "these people are only stupid and lack the experience to answer my questions despite thinking they do; they lack the world view to even process how I arrived at the parameters of my question(s)" apprehension you would have when asking Reddit about some hazardous thing that would make them all screech. AI is the margarine to that butter.
It's a technology with potential to deliver great value, but there are limitations...
AI reminds me of listening to any person on YouTube who seems like an intellectual authority on multiple subjects and is not afraid to wax confidently on any topic. They seem very intelligent and knowledgeable until they actually talk about something you know.
In other words, I try to learn from it whenever it does something I can't do, but when it does something I can do, or something I'm really good at, I find myself wanting to correct it because it doesn't do it that well.
It just seems like a really quick thinking and fast executing but, ultimately, mid skilled / novice person.
I think the even worse problem is that by extension now *everyone* sounds like an expert, even if they aren’t.
In the past when someone wrote an RFC they would need to study to formulate it well. Now anyone can create content sounding like an expert and it becomes difficult for the reader to differentiate real expertise and depth vs shallow fancy words.
In the last few years I have come to realize that the first impression of anything is extremely important. If your first few uses were good and wowed you, you will be positive about it; if not, you will be negative about it. The bias of the first encounter stays with us no matter what.
Don't know about that. I don't think I had any superb first experience with it, but even if I had, I got more turned on to it when I started using it for toy program/code solutions I needed on a one-off basis. If it didn't give me the code I needed to get various things done on the fly, I would maybe be more agnostic.
On non-code stuff, I think it's improved, or there are better options for making it get to the point and be concise, and I find that when I correct it, quite often we actually get somewhere. The answers I remember from my initial use of it, on basically how to do anything in most subjects, were practically a 10 pager with some weird action plan that you were never going to go through.
Most long-term gamblers will tell you that they won the first games they played. This is a real thing, yet we cannot exploit it by making one bet and then stopping, because the probabilities are still fair and unbiased.
What squares these two things is that most of the people who played and lost their first games, did not get addicted to gambling.
You can sometimes run a quick second check by taking the AI's claim and asking for an evaluation of it in a fresh context. It won't be misled by the surrounding text, and its answer will at least be somewhat unbiased (though it might still be quite wrong).
It helps if you phrase the question openly, not obviously fishing for a yes-or-no answer. Or, if you have to ask a yes-or-no question, make it sound like you're obviously expecting the answer that's actually less likely, so the AI will either (1) be more willing to argue against it, or (2) provide good arguments for it you might not have considered, because it "knows" the answer is unexpected and it wants to flatter your judgment.
> It helps if you phrase the question openly, not obviously fishing for a yes-or-no answer. Or, if you have to ask for a yes-or-no question, make it sound like you're obviously expecting the answer that's actually less likely,
I do this all the time and hate that I have to do it, with the additional "do not yes-man me, be critical."
Yeah, enormously.
People will hedge depending on how sure they are about something. They might also have credentials in whatever you ask them about: legal advice from a lawyer can be judged more reliable than advice from a layperson.
Relationships with real people are pretty cool actually.
If you talk to people that you have a longer relationship with, you might also be able to judge their areas of expertise and how prone to bullshitting they are.
Not just when running out of context - always. Once it fixates on a goal, all hell breaks loose and there's nothing it won't sacrifice to get there. At least that's my experience with Claude Code; I am pressing the figurative brakes all the time.
It was sort of funny when codex switched mid-session from "patch complete, I'll now automatically run the tests and verify results" to "patch complete, if you want to run the tests, just paste this instruction into a terminal somewhere: ..."
Apparently this was caused by the context window getting full!
(At least I assume that because it went back to the old behavior after I triggered a compaction)
> In other words, I try to learn from it whenever it does something I can't do...
So you know it can be full of sh1t on all kinds of topics, and you start learning from it the moment it's 'talking' about subjects you know you don't know about? To me that sounds like the moment to stop, not the moment to start. Or am I missing something?
It's a good analogy to comfort yourself. But remember AI is now being deployed on the frontline of mathematics and coming up with new theories.
The reality is much more stark than your description. Yes, in MANY instances it fails at things you know and are an expert at. But in MANY instances it also beats you at what you're good at.
People who say stuff like the parent poster are completely mischaracterizing the current situation. We are not in a place where AI is "good" but we are "better". No... we are approaching a place where we are good and AI is starting to beat us at our own game. That is the prominent trend, and that is the impending reality.
Yet everywhere on HN I see stuff like: oh, AI fails here, or AI fails there. Yeah, AI failing is obvious. It's been failing for most of my life. What's unique about the last couple of years is that it's starting to beat us. Why the denial? Because your typical HNer holds programming not just as a tool, but as an identity. Your skill in programming is also a status symbol, and when AI attacks your identity, the first thing you do to defend it is bend reality and recast everything from a different angle to reach a different conclusion.
I think that's a category error. Current AI is not better or worse than us but fundamentally different. Its main strength and weakness is that it knows too much about everything. It usually knows more than the user about the topic at hand, but it doesn't know what is actually relevant in that particular situation.
If you nudge the AI in the right direction, it may surprise you with what it's capable of. But if you nudge it in a wrong direction or just don't give it sufficient context, it can be very confidently wrong.
Yeah this is another way to comfort yourself. Yes AI is different but again FACE reality.
AI is different but it is also similar; for example, it can speak language. So it is different and similar at the same time. I am obviously referring to AI beating us where we are similar to it. Best example: software engineering. This is obvious. The fact that I need to spell it out shows how deep the delusion goes.
The rest of your response is just regurgitating what the parent post said. Sure. But it doesn't address the fact that while everything you said is true part of the time, the other part of the time it beats us (both with and without nudging it in the right direction).
Additionally, all of this ignores the fact that AI leapfrogged in capability in the past year, with my entire company now forgoing text editors and IDEs and having Claude write everything. If what you say is true only now, and we see this much velocity in improvement, then wait a couple more years and everything you said could be completely false if the trendline continues.
And here’s the craziest part: everything I’m saying is obvious. I’m basically being captain obvious here. The question is why so many people are in total denial.
So you do not need a text editor anymore, the LLM writes everything? That is not my experience. My experience is that it is for sure much smarter than I am, but I still need to hand-hold it to ensure the code stays readable and the amount of code does not blow up relative to the actual use case. It usually does not know enough about the actual problem space. It's also not so good at keeping the correct abstractions at the correct place in the code architecture. Without hand-holding, at a certain point the codebase becomes unmaintainable by human and LLM alike.
You already stated the reason you think so many people are in total denial. If the reason is indeed a sense of threat to what people take themselves to be, then I would be very surprised if the response had been any different. Whether this is in fact the case remains to be seen. I for one do believe there is something different going on here than yet another technological advancement, but again - time will tell.
It's an interesting parallel to how people, especially right wingers, want to project intelligence onto one dimension so that all humans can be ordered from inferior to superior.
That logic was already strained with humans, but with the introduction of AI the wheels are really coming off for that model.
I am pretty sure this comment, like many others by the same poster, was LLM-generated with a slight prompt tweak, which is against the TOS. Check their history for proof.
Have you been actively using paid versions of the flagship models from Ant / OpenAI? I’m just curious if the conclusion was made within the last 6 months or not.
Gell-Mann amnesia. The things it tells you about things you don't know are things that would make a knowledgeable person go "dude, wtf? That's totally wrong."
You can really only use AI for: things that are easy to verify; things that you already know how to do but want done faster; things you're learning to do and are just one step out of your reach (so it's still comprehensible to you); or, things that just plain don't matter.
That's a lot of stuff, but it also doesn't include a lot of the stuff people claim AI can do.
> Across studies, participants with higher trust in AI and lower need for cognition and fluid intelligence showed greater surrender to System 3
So the smart get smarter and the dumb get dumber?
Well, not exactly, but at least for now with AI "highly jagged", and unreliable, it pays to know enough to NOT trust it, and indeed be mentally capable enough that you don't need to surrender to it, and can spot the failures.
I think the potential problems come later, when AI is more capable/reliable, and even the intelligentsia perhaps stop questioning its output, and stop exercising/developing their own reasoning skills. Maybe AI accelerates us toward some version of "Idiocracy", where human intelligence is even less relevant to evolutionary success (i.e. having/supporting lots of kids) than it is today, and gets bred out of the human species? Maybe this is the inevitable trajectory: a species gets smarter when it develops language and tool creation, then peaks, and gets dumber after having created tools that do the thinking for it?
Pre-AI, a long time ago, I used to think/joke we might go in the other direction - evolve into a pulsating brain, eyes, genitalia and vestigial limbs, as mental work took over from physical, but maybe I got that reversed!
I think everyone who believes that they can personally resist the detrimental psychological effects of exposure to LLMs by "remaining aware" or "being careful", because they have cultivated an understanding of how language models work, is falling into precisely the same fallacy as people who think they can't be conned or that marketing doesn't work on them.
Don't kid yourself. If you use this junk, it's making you dumber and damaging your critical thinking skills, full-stop. This is delegation of core competency. You may feel smarter, or that you're learning faster, or that you're more productive, but to people who aren't addicted to LLMs it sounds exactly like gamblers insisting they have a foolproof system for slots, or alcoholics insisting that a few beers make them a better driver. Nobody outside the bubble is impressed with the results.
I fully agree that it’s close to impossible to not eventually fall into the trap of overrelying on them. However, it’s also true that I was able to do things with them that I would never have done otherwise for a lack of time or skill (all sorts of small personal apps, tools, and scripts for my hobbies). Maybe it’s a bit similar to only reading the comment section in a newspaper instead of the news? They will introduce you to new perspectives but if you stop reading the underlying news you’ll harm your own critical thinking? So it’s maybe a bit more grey than black & white?
> If you use this junk, it's making you dumber and damaging your critical thinking skills, full-stop.
Arguably I've been using my critical thinking skills more now that I have a smooth talking, but ultimately not actually intelligent companion.
Every time I put undue trust in it, I regret it, so I got used to verifying what it outputs via documentation and sometimes even library code.
That being said, the worst part of this mess is that my usual sources of knowledge, like search engines and developer forums, have dried up, as everyone else is also using LLMs.
I think this is too broad. If, for example, I get Claude to set up a fine tuning pipeline for rf-detr and it one shots it for me, what have I lost? A learning opportunity to understand the details of how to go about this process, sure. But you could argue the same about relying on PyTorch. Ultimately we all have an overarching goal when engaged in these projects and the learning opportunity might be happening at an entirely different level than worrying about the nuts and bolts of how you build component A of your larger project.
> Don't kid yourself. If you use this junk, it's making you dumber and damaging your critical thinking skills, full-stop. This is delegation of core competency.
This is a good way to frame the problem. Consider the offshoring (delegation) of American manufacturing to China, followed by the realization decades later that the US has forgotten how to actually make things and the subsequent frenzied attempt to remember.
I expect the timelines and second-order (third-order...) effects to play out on a similar decadal scale - long after everybody has realized their profits and the western brain has atrophied into slop.
Maybe this is the solution to the Fermi paradox: intelligent species make thinking machines, lose the capacity for thinking within a few generations, then an EMP wipes out the computers and everyone is too stupid to survive.
Evolution is questionable science. I am not trying to be contrarian. It's not dogma, nor is it an established, scientifically proven theory. Proponents, usually when cornered, shrug and say: 'well, this is the best explanation we have so far'. That's not science. The best possible scenario is speculation by a group of people with mediocre thinking skills.
Mentioning this here because, just like in your comment, this 'theory' is usually slid inside arguments to make it appear as established science or fact. Kinda like this AI debacle.
I've noticed this in my own work with financial data. I used to manually sanity-check numbers from SEC filings and catch weird stuff all the time. Started leaning on LLMs to parse them faster and realized after a few weeks I was just... accepting whatever came back without thinking about it. Had to consciously force myself to go back to spot-checking.
The "System 3" framing is interesting but I think what's really happening is more like cognitive autopilot. We're not gaining a new reasoning system, we're just offloading the old ones and not noticing.
I suggest everyone interested in learning how these theories emerge, and how the social sciences work, give it a read. Also, it kind of dismantles the whole idea of System 1 and System 2, which I guess would call into question the theoretical foundations of this paper too.
This framing points at something important that I think the alignment evaluation literature often misses: the distinction between what a model represents internally and what it does behaviorally. Probing can tell you what's in the representations, and linear probes can be surprisingly accurate. But in experiments I've run on DeepSeek and Qwen models, high probe accuracy for a given behavior doesn't predict whether the model actually routes through that behavior at inference time. The detection layer and the routing layer are architecturally separable, and most evaluation benchmarks are measuring the former while claiming to measure the latter.
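The gap between detection and routing is easy to see with a toy linear probe. Below is a minimal sketch on synthetic data (the dimensions, the planted feature direction, and the "activations" are all made up for illustration; a real experiment would probe a model's actual hidden states):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n = 64, 2000  # hypothetical hidden-state width and sample count

# Plant a "behavior" direction in the representations: examples where the
# behavior is present get shifted along it, mimicking a feature that is
# linearly decodable from the model's activations.
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=n)
acts = rng.normal(size=(n, d)) + 3.0 * labels[:, None] * direction

# A linear probe is just logistic regression from activations to labels.
probe = LogisticRegression(max_iter=1000).fit(acts[:1500], labels[:1500])
acc = probe.score(acts[1500:], labels[1500:])
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy here only shows the feature is present in the representation; it says nothing about whether any downstream computation ever reads that direction at inference time, which is exactly the detection-vs-routing gap described above.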
Contrary to the general opinion, I feel that AI has IMPROVED my cognitive skills. I find myself discovering solutions to problems I've always struggled with (without asking AI about it, of course). I also find myself becoming much better at thinking on my feet during regular conversations. I believe I'm spending more time deep thinking than ever before because I can leave the boring cognitive stuff to AI, and that's giving my mind tougher workouts and making it stronger; but I could be completely wrong.
Without an empirical methodology it's hard to know how true this is. There are known and well-documented human biases (e.g., placebo effect) that could easily be involved here. And besides that, there's a convincing (but often overlooked on HN) argument to be made that modern LLMs are optimized in the same manner as other attention economy technologies. That is to say, they're addictive in the same general way that the YouTube/TikTok/Facebook/etc. feed algorithms are. They may be useful, but they also manipulate your attention, and it's difficult to disentangle those when the person evaluating the claims is the same person (potentially) being manipulated.
I'd love to see an empirical study that actually dives into this and attempts to show one way or another how true it is. Otherwise it's just all anecdotes.
At least in some instances you could frame it that way: You believe that doctors and medicine are effective at treating disease, so when you are sick and a doctor gives you a bottle of sugar pills and you take them, you now interpret your state through the lens that you should feel better. A bias on how you perceive your condition
That's not all that the placebo effect is. But it's probably the aspect that best fits the framing as bias
I keep asking it questions, and as I dialogue about the problem, I walk right into the conclusion myself, classic rubber duck. Or occasionally it will say something back, and it’s like “of course! That’s exactly what I’ve been circling without realizing it!”
This mostly happens with things I’ve already had long cognitive loops on myself, and I’m feeling stuck for some reason. The conversation with the model is usually multiple iterations of explaining to the model what I’m working through.
Same here. I observe what the AI does as a spectator, and it leads me to find problems and solutions way faster than I would have alone, and much faster than the AI could (if it could solve the problem at all).
This in turn has given me the ability to "double" think: I am consciously thinking while another part of my brain is also thinking about it at a bigger scope than I could consciously grasp.
You are not wrong. AI is an amplifier. You chose to amplify something in particular and it works for you. That's good enough. (Give this as a prompt to your ai as I sense self-doubt here)
Because most people either don't know how to use it (multiple reasons, that ai itself can help them solve) or don't have the right mindset going into it (deeper work needed)
This is it for me. I am doing much better high level work since I don’t have to spend much time on lower level work. I have time to think, explore, reframe, and reanalyse.
In the technophile's future people aren't just getting dumber, not wanting to think, or forgetting how - they aren't allowed to think. Maybe about anything. It's too big a liability, costs too much to support, and moreover detracts from the product. Like Sam A telling those Indian students they aren't worth the energy and water. That's what we're dealing with.
When humans have an easy way to do something that is almost as good, we choose that easy way. Call it laziness, energy conservation, coddling, etc. The hard thing then becomes hard to do even when the easy thing isn't available, because the cognitive muscle and the discipline atrophy.
Like kids who are never taught to do things for themselves.
Do you refuse to use a calculator or spreadsheet, because doing long hand division helps you exercise your mental muscle? Do you refuse to use a database, because it will make your memory weaker? Or, do you refuse to use a car, because it makes you less able to walk when the car is unavailable? No. Because the car empowers you to do something that, at the very least, takes a lot longer on foot.
People have worried with every single new technology that it will enfeeble the masses, rather than empower them, and yet in the end, we usually find ourselves better off.
The car seems like a great example of a technology with a lot of problematic side effects. Places that had a more measured adoption ended up a lot better than those that replaced all public transit with cars and routinely demolished neighborhoods to make space for bigger highways
Cars are an essential part of modern life, but the sweetspot for car adoption isn't on either of the extremes
In some parts of the world perhaps? They're not an essential part of life in urban areas designed to work well without them. As in, many people can live their lives never using one, let alone owning one.
I'd call it bad on both levels. The costs imposed by car infrastructure are a tragedy of the commons. But even if you were the only person with a modern car you'd still be hit with the social effects of traveling in the isolation of your private metal box and the health effects of walking or biking less
On the other hand there are also big positives on both the societal and individual level. That's where the balance comes in. You want some individual travel and part of your logistics to run on cars, but not all of it. And probably a lot less of it than what most people in the 60s to 90s thought
> But even if you were the only person with a modern car you'd still be hit with the social effects of traveling in the isolation of your private metal box
For real, the amount of hate and vitriol I see expressed by people behind the “safety” of their steering wheel is unbelievable. Surely driving (excessively) leads to misanthropy like cigarettes lead to cancer.
I do refuse to use a car frequently, I’ll bike or walk because although it’s harder and sometimes scary, there are other times when it’s really great and I feel more connected to the world around me. Also more relaxed after the little bit of exercise.
Personally, I also hurt my learning of trig identities and the like because the symbolic algebra engine on my TI-89 was so good that I could rely on it instead of learning the material. It caught up to me in college with harder calc and physics classes.
I aced algebra and geometry in high school. Next was trigonometry and we had a new teacher who espoused the use of a thick pink and black trig book. It was absolutely alien, as well as ugly, to me. Once I realized the sine, cosine and tangent and co-relations were defined as geometric ratios, I put my mind at rest and determined to use my geometry skills to the max to avoid memorization. The teacher accepted my somewhat odd methodology for the time being.
That was good for a half-semester but then a formidable classroom opponent arose: a "new" boy who had been educated in another state using the very same textbook! I realized I'd have to commit at least a handful of the most useful trig identities to memory to solve problems quickly and remain at the head of the class. A weekend of furious comparison and selection ensued, but that was enough to carry me across the finish line in trig class.
For about 8y I biked for every possible local trip, usually daily. I wanted to reduce local pollution and get the exercise. It was rough in the wind and cold. I'd do it again if I could.
Sometimes I take breaks from the calculator and even review math videos because it's embarrassing when I can't help my kid with their homework.
Taking care in how and when we use AI seems very sensible. Just like we take care how often and how much refined sugar we eat, or how many hours we spend sedentary.
> Do you refuse to use a calculator or spreadsheet, because doing long hand division helps you exercise your mental muscle
Yeah when I was learning in school we weren't allowed electronics for division, and I think I absolutely would be dumber if I had never done that
> People have worried with every single new technology that it will enfeeble the masses, rather than empower them, and yet in the end, we usually find ourselves better off.
If you're posting this from America, you're living in a society that is fatter than ever thanks to cars. So there's surely some nuance here, not every technology upgrade is strictly better with no downsides
The paper puts AI next to System 1 and 2, but those are ways you think. With AI the thinking still happens, you just can't see or control it anymore.
When you googled something and got five contradictory results, that told you the question was hard. A clean AI answer doesn't give you that signal. Coherence looks the same whether the answer is right or wrong.
The failure mode didn't get worse. It got quieter.
The main problem with "System 3" is that it has its own kind of "cognitive biases", like System 1, but these new cognitive biases are designed by marketing, politics, culture, and whatever censors or surfaces the original training data - and that's even if the process and everything around it were perfect (which it is not, i.e. hallucinations).
But we still have System 1, and we survived and reached this stage because of it - because even a bad guess is better than the slowness of doing things right. It has its problems, but sometimes you must reach a compromise.
I suppose the publishing process has always existed as system 3. It's just that now we have a new way to read and write with an abstract "rest of the world".
I'm not sure if this is saying people were given a task and the option to consult an AI. When they did they were influenced by its answer.
Which is kind of duh? Of course. They have some cool language like calling the AI system 3 and calling taking advice 'cognitive surrender' but I'm not sure how this differs from asking your mate Bob and taking his advice?
Why shouldn’t it be? You can gather and form your thoughts, create a draft, and then have a LLM rewrite it for you. You can write in the style you prefer so you can focus on thoughts and then have the LLM rewrite it in the appropriate style for the audience.
One might worry that it would increase the authors' confidence even when the LLM rewrite introduces errors, reducing accuracy overall regardless of moderators.
I mean... I don't really check calculations made by a computer (e.g. by my own programs) all that often either, and I think I'm completely fine :). But I guess the difference is that we more or less know how computers work and that they're generally super accurate and make mistakes incredibly rarely. The "AI" (although I disagree with the "I" part) is wrong incredibly often, and I don't think people appreciate that the difference from the "traditional" approach isn't just significant, it's astronomical: LLMs make things up at least 5% of the time, whereas CPUs make mistakes maybe (10^-12)% of the time or less. That's 12 orders of magnitude or so.
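The arithmetic behind that gap, using the comment's illustrative rates (both numbers are rough assumptions, not measurements):

```python
import math

llm_error_rate = 0.05   # assumed: an LLM fabricates ~5% of its claims
cpu_error_rate = 1e-14  # assumed: (10^-12)% expressed as a fraction

gap = llm_error_rate / cpu_error_rate
print(f"ratio: {gap:.0e}")
print(f"orders of magnitude: {math.log10(gap):.1f}")
```

So the ratio comes out around 5 x 10^12, i.e. roughly 12-13 orders of magnitude, consistent with the estimate above.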
Can it design and implement a plutonium electric fuel cell with a 24,000 year half life? We have yet to witness it. Can it automate Farming and Agriculture? These are the real questions. #Born-Crusty
Damn. I came up with a hypothetical "System 3" last year! I didn't find AI very helpful in that regard though.
Current status: partially solved.
Problem: System 2 is supposed to be rational, but I found this to be far from the case. Massive unnecessary suffering.
Solution (WIP): Ask: What is the goal? What are my assumptions? Is there anything I am missing?
--
So, I repeatedly found myself getting into lots of trouble due to unquestioned assumptions. System 2 is supposed to be rational, but I found this to be far from the case.
So I tried inventing an "actually rational system" that I could "operate manually", or with a little help. I called it System 3, a system where you use a Thinking Tool to help you think more effectively.
Initial attempt was a "rational LLM prompt", but these mostly devolve into unhelpful nitpicking. (Maybe it's solvable, but I didn't get very far.)
Then I realized, wouldn't you get better results with a bunch of questions on pen and paper? Guided writing exercises?
I'm not sure what's a good way to get yourself "out of a rut" in terms of thinking about a problem. It seems like the longer you've thought about it, the less likely you are to explore beyond the confines of the "known" (i.e. your probably dodgy/incomplete assumptions).
I haven't solved System 3 yet, but a few months later found myself in an even more harrowing situation which could have been avoided if I had a System 3.
The solution turned out to be trivial, but I missed it for weeks... In this case, I had incorrectly named the project, and thus doomed it to limbo. Turns out naming things is just as important in real life as it is in programming!
So I joked "if being pedantic didn't solve the problem, you weren't being pedantic enough." But it's not a joke! It's about clear thinking. (The negative aspect of pedantry is inappropriate communication. But the positive aspect is "seeing the situation clearly", which is obviously the part you want to keep!)
"Time pressure (Study 2) and per-item incentives and feedback (Study 3) shifted baseline performance but did not eliminate this pattern: when accurate, AI buffered time-pressure costs and amplified incentive gains; when faulty, it consistently reduced accuracy regardless of situational moderators."
Have been curious what it could look like (and whether it might be an interesting new type of “post” people make) if readers could see the human prompts and pivots and steering of the LLM inline within the final polished AI output.
A major problem with LLM AIs is that their core nature is not understood by the vast majority of people - developers included. They are an embodiment of literature, and if that confuses you, you're probably operating on an incorrect definition of them.
I like to think of them as idiot savants with exponentially more savant than your typical fictional idiot savant. They pivot on every word you use, each word in your sequence activating areas of training knowledge, until your prompt completes and the LLM is logically located at some biased perspective of the topic you seek (assuming your wording was not vague and did not rely on implied references). Few seem to realize there is no "one topic" for each topic an LLM knows; there are numerous perspectives on every topic. Those perspectives reflect the reasons one person/group uses that topic, and their technical seriousness within it. How you word your prompts dictates which of these perspectives your ultimate answer is generated from.
When people say their use of AI reflects a mid-level understanding of whatever they prompted, that is because the prompt is worded in the language used by people with a mid-level understanding. If you want the LLM to respond with expert guidance, you have to prompt it using the same language and terms that the expert you want would use. That is how you activate the area of its training that generates such a response.
This goes further when using coding AI. If your code has the coding structure of a mid-level developer, that creates a strong preference for mid-level developer guidance - because that is what is relevant to your code structure. It takes a well-written prompt using PhD/professorial computer-science terminology to work with a mid-level code base and get advice that would lift that code above its mid-level architecture.
In two words "book smart".
In more words, "of course it's stupid, it's as complex as a mid-sized rodent where we taught it purely by selective breeding on getting answers right while carefully preventing any mutations which made their brains any bigger".
Not to put too fine a point on your metaphor, but the different training methods deployed by ChatGPT vs Claude, for example, changes that a bit regarding who did the “selective breeding”, arguably nurture vs nature, respectively
It's basically plotting the dots of all easily accessible written text, finding the words that are answers to your words, charting a line through them no matter how scattered those points may be, and spitting that back out. It doesn't "know" anything, nor is it reasoning, even if the results are similar.
You have to come into it with the same apprehension you would have asking reddit about some hazardous thing that would make them all screech: "these people are stupid and lack the experience to answer my questions despite thinking they do; they lack the world view to even process how I arrived at the parameters of my question(s)". AI is the margarine to that butter.
It's a technology with potential to deliver great value, but there are limitations...
Therein lies the argument that SWEs could become operators (in a much reduced capacity) between AIs and the world.
AI reminds me of listening to one of those people on YouTube who seem like an intellectual authority on multiple subjects and are not afraid to wax confidently on any topic. They seem very intelligent and knowledgeable until they actually talk about something you know.
In other words, I try to learn from it whenever it does something I can't do, but when it does something I can do, or something I'm really good at, I find myself wanting to correct it because it doesn't do it that well.
It just seems like a really quick thinking and fast executing but, ultimately, mid skilled / novice person.
I think the even worse problem is that by extension now *everyone* sounds like an expert, even if they aren’t. In the past when someone wrote an RFC they would need to study to formulate it well. Now anyone can create content sounding like an expert and it becomes difficult for the reader to differentiate real expertise and depth vs shallow fancy words.
I also discussed this a bit more here: https://www.dev-log.me/everyone_is_an_expert_now/
In the last few years I have come to realize that the first impression of anything is extremely important. If your first few uses were good and wowed you, you will be positive about it; if not, you will be negative about it. The bias of the first encounter stays with us no matter what.
Don't know about that. I don't think I had any superb first experience with it, but even if I had, I got more turned on to it when I started using it for toy program/code solutions I needed on a one-off basis occasionally. If it didn't give me the code I needed to get various things done on the fly, I would maybe be more agnostic.
On non-code stuff, I think it's improved, or there are better options for making it get to the point and be concise, and I find that when I correct it, quite often we actually get somewhere. The answers I remember from my initial use of it, for basically how to do anything on most subjects, were practically a 10-pager with some weird action plan that you were never gonna go through.
Hence beginner's luck.
Most long-term gamblers will tell you that they won the first games they played. This is a real thing, yet we cannot exploit it by making one bet and then stopping, because the underlying probabilities really are fair and unbiased.
What squares these two things is that most of the people who played and lost their first games, did not get addicted to gambling.
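That selection effect is easy to simulate. A minimal sketch - the win rate and "keeps gambling" probabilities are made-up numbers chosen only to illustrate the bias:

```python
import random

random.seed(0)
N = 100_000
P_WIN = 0.45          # first bet: close to fair, but losing on average
P_STAY_IF_WON = 0.6   # winners are more likely to keep gambling
P_STAY_IF_LOST = 0.1  # losers mostly walk away

won_first = [random.random() < P_WIN for _ in range(N)]
long_term = [random.random() < (P_STAY_IF_WON if w else P_STAY_IF_LOST)
             for w in won_first]

# Among long-term gamblers, how many "remember winning their first game"?
winners_among_gamblers = sum(w for w, s in zip(won_first, long_term) if s)
gamblers = sum(long_term)
print(f"{winners_among_gamblers / gamblers:.2f}")  # ~0.83, despite P_WIN = 0.45
```

Conditioning on "became a long-term gambler" inflates the apparent first-game win rate from 45% to around 83%, with no unfairness anywhere in the dice.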
AI's mistakes are sometimes so subtle.
Just yesterday I asked Gemini Pro 3.0 this question:
> Find such colors A and B:
> A and B are both valid sRGB colors.
> Interpolating between them in CIELAB space like this
> C_cielab = (A_cielab + B_cielab) / 2
> results in a color C that can't be represented in sRGB
It gave me a correct answer, great!
...and then it proceeded to tell me to use Oklab, claiming it doesn't have this problem because the sRGB gamut is convex in Oklab.
If I didn't know Oklab does have the exact same problem I would have been fooled. It just sounds too reasonable.
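For anyone who wants to verify this class of answer by hand, here is a minimal sketch of the round trip using the standard D65 sRGB and CIELAB formulas (the red/blue pair is my own pick, not necessarily what Gemini produced). Averaging pure red and pure blue in CIELAB lands outside the sRGB gamut, as the negative green channel shows:

```python
def srgb_to_linear(c):
    # Invert the sRGB transfer function.
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def rgb_to_lab(rgb):
    r, g, b = (srgb_to_linear(c) for c in rgb)
    # Linear sRGB -> XYZ (D65).
    X = 0.4124 * r + 0.3576 * g + 0.1805 * b
    Y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    Z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
    fx, fy, fz = f(X / 0.95047), f(Y / 1.0), f(Z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def lab_to_linear_rgb(lab):
    L, a, b = lab
    fy = (L + 16) / 116
    fx, fz = fy + a / 500, fy - b / 200
    def finv(t):
        return t ** 3 if t > 6 / 29 else 3 * (6 / 29) ** 2 * (t - 4 / 29)
    X, Y, Z = 0.95047 * finv(fx), finv(fy), 1.08883 * finv(fz)
    # XYZ -> linear sRGB; any channel outside [0, 1] is out of gamut.
    return (3.2406 * X - 1.5372 * Y - 0.4986 * Z,
            -0.9689 * X + 1.8758 * Y + 0.0415 * Z,
            0.0557 * X - 0.2040 * Y + 1.0570 * Z)

A = rgb_to_lab((1.0, 0.0, 0.0))  # pure red
B = rgb_to_lab((0.0, 0.0, 1.0))  # pure blue
C = lab_to_linear_rgb(tuple((x + y) / 2 for x, y in zip(A, B)))
print(C)  # green channel ~ -0.018: not representable in sRGB
```

Per the comment above, the Oklab gamut is not convex either, so swapping in the Oklab transforms reproduces the same failure mode.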
You can sometimes run a quick second check by taking the AI's claim and asking it for an evaluation within a fresh context. It won't be misled by the surrounding text and its answer will at least be somewhat unbiased (though it might still be quite wrong).
It helps if you phrase the question openly, not obviously fishing for a yes-or-no answer. Or, if you have to ask for a yes-or-no question, make it sound like you're obviously expecting the answer that's actually less likely, so the AI will (1) either be more willing to argue against it, or (2) provide good arguments for it you might not have considered, because it "knows" the answer is unexpected and it wants to flatter your judgment.
> It helps if you phrase the question openly, not obviously fishing for a yes-or-no answer. Or, if you have to ask for a yes-or-no question, make it sound like you're obviously expecting the answer that's actually less likely,
I do this all the time and hate that I have to do it, with the additional "do not yes-man me, be critical."
Great, now I have two answers and still no clue which one is the right one.
Now get a third opinion, and marvel at all the thinking that you have accomplished
In my experience the last answer it gives is usually the right one
Ah, so the trick is to figure out which one will be the last answer. The halting problem....
Any difference that asking two humans?
Yeah, enormously. People will hedge depending on how sure they are about something. They might also have credentials in whatever you ask them, if you get legal advice from a lawyer, that can be judged to be more reliable than from a lay person.
Relationships with real people are pretty cool actually. If you talk to people that you have a longer relationship with, you might also be able to judge their areas of expertise and how prone to bullshitting they are.
And it starts showing impatience when it's about to run out of context, more like someone who wants to get out of the office exactly at 5.
Not just when running out of context; it's always. Once it fixates on a goal, all hell breaks loose and there's nothing it won't sacrifice to get there. At least that's my experience with Claude Code; I am pressing the figurative brakes all the time.
"I'm Mr. Meeseeks, look at meeee!"
Claude, I need you to help me take two strokes off my golf game.
It was sort of funny when codex switched mid-session from "patch complete, I'll now automatically run the tests and verify results" to "patch complete, if you want to run the tests, just paste this instruction into a terminal somewhere: ..."
Apparently this was caused by the context window getting full!
(At least I assume that because it went back to the old behavior after I triggered a compaction)
> In other words, I try to learn from it whenever it does something I can't do...
So you know it can be full of sh1t on all kinds of topics, and you start learning from it the moment it's 'talking' about subjects you know you don't know about? To me that sounds like the moment to stop, not the moment to start. Or am I missing something?
Sounds like an extension of https://en.wiktionary.org/wiki/Gell-Mann_Amnesia_effect, but sub newspaper for large language model.
It's a good analogy to comfort yourself. But remember AI is now being deployed on the frontline of mathematics and coming up with new theories.
The reality is much more stark than your description. Yes, in MANY instances it fails at things you know and are an expert at. But in MANY instances it also beats you at what you're good at.
People who say stuff like the parent poster are completely mischaracterizing the current situation. We are not in a place where AI is "good" but we are "better". No... we are approaching a place of we are good and AI is starting to beat us at our own game. That is the prominent topic that is what is trending and that is the impending reality.
Yet everywhere on HN I see stuff like: oh, AI fails here, or AI fails there. Yeah, AI failing is obvious. It's been failing for most of my life. What's unique about the last couple of years is that it's starting to beat us. Why the pushback? Because your typical HNer holds programming as not just a tool but an identity. Your skill in programming is also a status symbol, and when AI attacks your identity, the first thing you do to defend it is bend reality and try to reach a different conclusion by looking at everything from a different angle.
Face Reality.
I think that's a category error. Current AI is not better or worse than us but fundamentally different. Its main strength and weakness is that it knows too much about everything. It usually knows more than the user about the topic at hand, but it doesn't know what is actually relevant in that particular situation.
If you nudge the AI in the right direction, it may surprise you with what it's capable of. But if you nudge it in a wrong direction or just don't give it sufficient context, it can be very confidently wrong.
Yeah, this is another way to comfort yourself. Yes, AI is different, but again, FACE reality.
AI is different, but it is also similar; for example, it can speak language. So it is different and similar at the same time. I am obviously referring to AI beating us where we are similar to it. The best example: software engineering. This is obvious. The fact that I need to spell it out shows how deep the delusion goes.
The rest of your response is just regurgitating what the parent post said. Sure. But it doesn’t address the fact that while everything you said is true part of the time the other part of the time it beats us (that includes both nudging and no nudging it in the right direction).
Additionally, all of this completely ignores the fact that AI leapfrogged in capability in the past year, with my entire company now forgoing text editors and IDEs and having Claude write everything. If what you say is true only of the present, and we see this much velocity in improvement, then wait a couple more years and everything you said could be completely false if the trendline continues.
And here's the craziest part: everything I'm saying is obvious. I'm basically being Captain Obvious here. The question is why so many people are in total denial.
So you do not need a text editor anymore, the LLM writes everything? That is not my experience. My experience is that it is for sure much smarter than I am, but I still need to hand-hold it to ensure the code stays readable and the amount of code does not blow up relative to the actual use case. It usually does not know enough about the actual problem space. It's also not so good at keeping the correct abstractions in the correct place in the code architecture. Without hand-holding, at a certain point the codebase becomes unmaintainable by human and LLM alike.
That’s not just my experience. That’s the experience of everyone where I work. Nobody uses a text editor anymore at my company.
We still hand hold it a bit. If it makes a mistake we just tell it to fix the mistake or do it in a different way. It’s that good.
You already stated the reason you think so many people are in total denial. If the reason is indeed a sense of threat to what people take themselves to be, then I would be very surprised if the response had been any different. Whether this is actually the case remains to be seen. I for one do believe there is something different going on here than yet another technological advancement, but again - time will tell.
It's an interesting parallel to how some people, especially right-wingers, want to project intelligence onto one dimension so that all humans can be ordered from inferior to superior. That logic was already strained with humans, but with the introduction of AI the wheels are really coming off that model.
I am pretty sure this comment, like many others from the same poster, was LLM-generated with a slight prompt tweak, which is against the TOS. Check their history for proof.
Have you been actively using paid versions of the flagship models from Anthropic / OpenAI? I'm just curious whether the conclusion was formed within the last 6 months or not.
I got that experience 3 hours ago.
Gell-Mann amnesia. The things it tells you about things you don't know are things that would make a knowledgeable person go "dude, wtf? That's totally wrong."
You can really only use AI for: things that are easy to verify; things that you already know how to do but want done faster; things you're learning to do and are just one step out of your reach (so it's still comprehensible to you); or, things that just plain don't matter.
That's a lot of stuff, but it also doesn't include a lot of the stuff people claim AI can do.
> Across studies, participants with higher trust in AI and lower need for cognition and fluid intelligence showed greater surrender to System 3
So the smart get smarter and the dumb get dumber?
Well, not exactly, but at least for now, with AI "highly jagged" and unreliable, it pays to know enough NOT to trust it, and indeed to be mentally capable enough that you don't need to surrender to it and can spot the failures.
I think the potential problems come later, when AI is more capable/reliable, and even the intelligentsia perhaps stop questioning its output and stop exercising/developing their own reasoning skills. Maybe AI accelerates us towards some version of "Idiocracy", where human intelligence is even less relevant to evolutionary success (i.e. having/supporting lots of kids) than it is today and gets bred out of the human species. Maybe this is the inevitable trajectory: a species gets smarter when it develops language and tool creation, then peaks, and gets dumber after having created tools that do the thinking for it?
Pre-AI, a long time ago, I used to think/joke we might go in the other direction - evolve into a pulsating brain with eyes, genitalia and vestigial limbs as mental work took over from physical - but maybe I got that reversed!
I think everyone who believes that they can personally resist the detrimental psychological effects of exposure to LLMs by "remaining aware" or "being careful", because they have cultivated an understanding of how language models work, is falling into precisely the same fallacy as people who think they can't be conned or that marketing doesn't work on them.
Don't kid yourself. If you use this junk, it's making you dumber and damaging your critical thinking skills, full-stop. This is delegation of core competency. You may feel smarter, or that you're learning faster, or that you're more productive, but to people who aren't addicted to LLMs it sounds exactly like gamblers insisting they have a foolproof system for slots, or alcoholics insisting that a few beers make them a better driver. Nobody outside the bubble is impressed with the results.
I fully agree that it’s close to impossible to not eventually fall into the trap of overrelying on them. However, it’s also true that I was able to do things with them that I would never have done otherwise for a lack of time or skill (all sorts of small personal apps, tools, and scripts for my hobbies). Maybe it’s a bit similar to only reading the comment section in a newspaper instead of the news? They will introduce you to new perspectives but if you stop reading the underlying news you’ll harm your own critical thinking? So it’s maybe a bit more grey than black & white?
> If you use this junk, it's making you dumber and damaging your critical thinking skills, full-stop.
Arguably I've been using my critical thinking skills more now that I have a smooth talking, but ultimately not actually intelligent companion.
Every time I put undue trust in it, I regret it, so I got used to verifying what it outputs against documentation and sometimes even library code.
That being said, the worst part of this mess is that my usual sources of knowledge, like search engines and developer forums, have dried up, as everyone else is also using LLMs.
I think this is too broad. If, for example, I get Claude to set up a fine tuning pipeline for rf-detr and it one shots it for me, what have I lost? A learning opportunity to understand the details of how to go about this process, sure. But you could argue the same about relying on PyTorch. Ultimately we all have an overarching goal when engaged in these projects and the learning opportunity might be happening at an entirely different level than worrying about the nuts and bolts of how you build component A of your larger project.
> Don't kid yourself. If you use this junk, it's making you dumber and damaging your critical thinking skills, full-stop. This is delegation of core competency.
This is a good way to frame the problem. Consider the offshoring (delegation) of American manufacturing to China, followed by the realization decades later that the US has forgotten how to actually make things and the subsequent frenzied attempt to remember.
I expect the timelines and second-order (third-order...) effects to play out on a similar decadal scale - long after everybody has realized their profits and the western brain has atrophied into slop.
My mind is already going, old age. You only really try anything when you are already losing it. Especially if you had it once.
Maybe this is the solution to the Fermi paradox. Intelligent species make thinking machines, lose the capacity for thinking within a few generations, then an EMP wipes out the computers and everyone is too stupid to survive.
E. M. Forster's 1909 short story "The Machine Stops"
(Minus the Fermi paradox part)
Evolution is questionable science. I am not trying to be contrarian. It's not dogma, nor is it an established, scientifically proven theory. Proponents, usually when cornered, shrug and say: 'well, this is the best explanation we have so far'. That's not science. The best possible scenario is speculation by a group of people with mediocre thinking skills.
Mentioning this here because, just like in your comment, this 'theory' is usually slid inside arguments to make it appear to be established science or fact. Kinda like this AI debacle.
I've noticed this in my own work with financial data. I used to manually sanity-check numbers from SEC filings and catch weird stuff all the time. Started leaning on LLMs to parse them faster and realized after a few weeks I was just... accepting whatever came back without thinking about it. Had to consciously force myself to go back to spot-checking.
The "System 3" framing is interesting but I think what's really happening is more like cognitive autopilot. We're not gaining a new reasoning system, we're just offloading the old ones and not noticing.
There's a very interesting critique of Kahneman's "Thinking, Fast and Slow" from German psychologist Gerd Gigerenzer: https://www.researchgate.net/publication/397923694_The_Legac...
I suggest everyone interested in learning how these theories emerge, and how the social sciences work, give it a read. It also somewhat dismantles the whole idea of System 1 and System 2, which I guess would then call into question the theoretical foundations of this paper too.
This framing points at something important that I think the alignment evaluation literature often misses: the distinction between what a model represents internally and what it does behaviorally. Probing can tell you what's in the representations, and linear probes can be surprisingly accurate. But in experiments I've run on DeepSeek and Qwen models, high probe accuracy for a given behavior doesn't predict whether the model actually routes through that behavior at inference time. The detection layer and the routing layer are architecturally separable, and most evaluation benchmarks are measuring the former while claiming to measure the latter.
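A toy sketch of how those two layers can come apart. The "hidden states" here are synthetic vectors of my own construction (a stand-in for real activations, not data from any actual model): one coordinate encodes the probed concept while a different coordinate drives the "behavior", so a linear probe can be near-perfect at detection yet no better than chance at predicting what the model does:

```python
import math
import random

random.seed(0)
DIM, N = 4, 2000

# Synthetic "hidden states" (toy stand-ins for residual-stream activations).
X = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
concept = [1 if x[0] > 0 else 0 for x in X]   # what the representation encodes
behavior = [1 if x[1] > 0 else 0 for x in X]  # what the "model" actually does

# Train a logistic linear probe to detect the concept (plain SGD).
w, b = [0.0] * DIM, 0.0
train, test = range(1500), range(1500, N)
for _ in range(100):
    for i in train:
        z = sum(wj * xj for wj, xj in zip(w, X[i])) + b
        z = max(-30.0, min(30.0, z))          # clamp to avoid exp overflow
        g = 1 / (1 + math.exp(-z)) - concept[i]
        for j in range(DIM):
            w[j] -= 0.1 * g * X[i][j]
        b -= 0.1 * g

def probe(x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

probe_acc = sum(probe(X[i]) == concept[i] for i in test) / len(test)
routing_agreement = sum(probe(X[i]) == behavior[i] for i in test) / len(test)
print(f"probe accuracy vs concept:   {probe_acc:.2f}")         # near 1.0
print(f"probe agreement w/ behavior: {routing_agreement:.2f}")  # near 0.5
```

The gap is built in by construction here; the point is only that a benchmark measuring the first number tells you nothing about the second.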
Contrary to the general opinion, I feel that AI has IMPROVED my cognitive skills. I find myself discovering solutions to problems I've always struggled with (without asking AI about it, of course). I also find myself becoming much better at thinking on my feet during regular conversations. I believe I'm spending more time deep thinking than ever before because I can leave the boring cognitive stuff to AI, and that's giving my mind tougher workouts and making it stronger; but I could be completely wrong.
Without an empirical methodology it's hard to know how true this is. There are known and well-documented human biases (e.g., placebo effect) that could easily be involved here. And besides that, there's a convincing (but often overlooked on HN) argument to be made that modern LLMs are optimized in the same manner as other attention economy technologies. That is to say, they're addictive in the same general way that the YouTube/TikTok/Facebook/etc. feed algorithms are. They may be useful, but they also manipulate your attention, and it's difficult to disentangle those when the person evaluating the claims is the same person (potentially) being manipulated.
I'd love to see an empirical study that actually dives into this and attempts to show one way or another how true it is. Otherwise it's just all anecdotes.
I don't understand how the placebo effect is a human bias. Is it?
At least in some instances you could frame it that way: You believe that doctors and medicine are effective at treating disease, so when you are sick and a doctor gives you a bottle of sugar pills and you take them, you now interpret your state through the lens that you should feel better. A bias on how you perceive your condition
That's not all that the placebo effect is. But it's probably the aspect that best fits the framing as bias
It's much more than a bias.
You actually get better through placebo, as long as there's a pathway to it that is available to your body.
It's a really weird effect.
The fight isn't against triggering placebo, it's against letting it muddle study results.
I really love the back-and-forth in this mini-thread, I learned a lot about good thinking skills here. Thanks everyone.
I keep asking it questions, and as I dialogue about the problem, I walk right into the conclusion myself, classic rubber duck. Or occasionally it will say something back, and it’s like “of course! That’s exactly what I’ve been circling without realizing it!”
This mostly happens with things I’ve already had long cognitive loops on myself, and I’m feeling stuck for some reason. The conversation with the model is usually multiple iterations of explaining to the model what I’m working through.
Same here, I observe what AI does as a spectator and it leads me to find problems and solutions way faster than I would have done so alone and much faster than AI could do it (if it could solve the problem at all).
This in turn has given me the ability to "double" think. I am consciously thinking while another part of my brain is also thinking about it at a bigger scope than I could consciously grasp.
You are not wrong. AI is an amplifier. You chose to amplify something in particular and it works for you. That's good enough. (Give this as a prompt to your ai as I sense self-doubt here)
It's so fascinating; I feel the same, but at the same time I feel like most people are getting dumber than before AI (and most seem to struggle adapting to AI).
Because most people either don't know how to use it (for multiple reasons, which AI itself can help them solve) or don't have the right mindset going into it (deeper work needed).
This is it for me. I am doing much better high level work since I don’t have to spend much time on lower level work. I have time to think and explore reframe and reanalyse
In the technophile's future people aren't just getting dumber, not wanting to think or forgetting how - they aren't allowed to think. Maybe about anything. It's too big a liability, costs too much to support, and moreover detracts from the product. Like Sam A telling those Indian students they aren't worth the energy and water. That's what we're dealing with.
That's reminiscent of Kurt Vonnegut's vision of the future in Harrison Bergeron.
When humans have an easy way to do something that is almost as good, we choose that easy way. Call it laziness, energy conservation, coddling, etc. The hard thing then becomes hard to do even when the easy thing isn't available, because the cognitive muscle and the discipline atrophy.
Like kids who are never taught to do things for themselves.
Do you refuse to use a calculator or spreadsheet, because doing long hand division helps you exercise your mental muscle? Do you refuse to use a database, because it will make your memory weaker? Or, do you refuse to use a car, because it makes you less able to walk when the car is unavailable? No. Because the car empowers you to do something that, at the very least, takes a lot longer on foot.
People have worried with every single new technology that it will enfeeble the masses, rather than empower them, and yet in the end, we usually find ourselves better off.
The car seems like a great example of a technology with a lot of problematic side effects. Places that had a more measured adoption ended up a lot better than those that replaced all public transit with cars and routinely demolished neighborhoods to make space for bigger highways
Cars are an essential part of modern life, but the sweetspot for car adoption isn't on either of the extremes
> Cars are an essential part of modern life
In some parts of the world perhaps? They're not an essential part of life in urban areas designed to work well without them. As in, many people can live their lives never using one, let alone owning one.
Tragedy of the commons, perhaps? Good for the individual, bad for society - and the challenge is finding solutions that can balance both.
I'd call it bad on both levels. The costs imposed by car infrastructure are a tragedy of the commons. But even if you were the only person with a modern car you'd still be hit with the social effects of traveling in the isolation of your private metal box and the health effects of walking or biking less
On the other hand there are also big positives on both the societal and individual level. That's where the balance comes in. You want some individual travel and part of your logistics to run on cars, but not all of it. And probably a lot less of it than what most people in the 60s to 90s thought
> But even if you were the only person with a modern car you'd still be hit with the social effects of traveling in the isolation of your private metal box
For real, the amount of hate and vitriol I see expressed by people behind the “safety” of their steering wheel is unbelievable. Surely driving (excessively) leads to misanthropy like cigarettes to cancer.
I do refuse to use a car frequently, I’ll bike or walk because although it’s harder and sometimes scary, there are other times when it’s really great and I feel more connected to the world around me. Also more relaxed after the little bit of exercise.
Personally, I also hurt my learning of trig identities and such because the symbolic algebra engine on my TI-89 was so good that I could rely on it instead of learning the material. It caught up to me in college with harder calc and physics classes.
I aced algebra and geometry in high school. Next was trigonometry and we had a new teacher who espoused the use of a thick pink and black trig book. It was absolutely alien, as well as ugly, to me. Once I realized the sine, cosine and tangent and co-relations were defined as geometric ratios, I put my mind at rest and determined to use my geometry skills to the max to avoid memorization. The teacher accepted my somewhat odd methodology for the time being.
That was good for a half-semester but then a formidable classroom opponent arose: a "new" boy who had been educated in another state using the very same textbook! I realized I'd have to commit at least a handful of the most useful trig identities to memory to solve problems quickly and remain at the head of the class. A weekend of furious comparison and selection ensued, but that was enough to carry me across the finish line in trig class.
For about 8 years I biked for every possible local trip, usually daily. I wanted to reduce local pollution and get the exercise. It was rough in the wind and cold. I'd do it again if I could.
Sometimes I take breaks from the calculator and even review math videos because it's embarrassing when I can't help my kid with their homework.
Taking care in how and when we use AI seems very sensible. Just like we take care how often and how much refined sugar we eat, or how many hours we spend sedentary.
Long division (tilling fields, weaving cloth, whatever facile comparison this argument dredges up) doesn't define me as a creature, cognition does.
You cannot live by thinking alone.
You can only live by thinking. It's how you experience the world and how you move your limbs.
Says who? Trillionaire capitalist overlords?
Actually, yea, I do a lot of mental calculations to avoid losing my edge on thinking about numbers. I avoid gps navigators for similar reasons.
But the analogy doesn’t actually hold up anyhow because the calculator and the navigator are deterministic. I can rely on their output.
LLMs have a probabilistic output that absolutely needs verification every time. I cannot trust them the same way I can trust a calculator.
> Do you refuse to use a calculator or spreadsheet, because doing long hand division helps you exercise your mental muscle
Yeah when I was learning in school we weren't allowed electronics for division, and I think I absolutely would be dumber if I had never done that
> People have worried with every single new technology that it will enfeeble the masses, rather than empower them, and yet in the end, we usually find ourselves better off.
If you're posting this from America, you're living in a society that is fatter than ever thanks to cars. So there's surely some nuance here, not every technology upgrade is strictly better with no downsides
I calculate tips and such in my head because I can do it faster than whipping out the calculator app on my phone and poking the numbers in.
I still memorize phone numbers. Hey, today that counts as "not using a database".
I think we are, in fact, getting dumber.
I play around with adding, subtracting, or multiplying license plate numbers. Does that count?
I am so rusty that I just add and subtract.
On the other hand, my grandparents, and father, could look at financial documents and do the calculations in their head.
People I know who stayed in finance longer than me, can crunch numbers rapidly.
I am around numerate people most of the time, so the occasions where I find I am the faster calculator around are jarring.
There are many conversations that go adrift because we can’t crunch numbers fast enough.
Is it a net loss to humanity in the face of the gains we obtained? Nope.
Is mental fitness of value to me, the same way physical fitness is of value to me? Yes, very much.
The paper puts AI next to System 1 and 2, but those are ways you think. With AI the thinking still happens, you just can't see or control it anymore.
When you googled something and got five contradictory results, that told you the question was hard. A clean AI answer doesn't give you that signal. Coherence looks the same whether the answer is right or wrong.
The failure mode didn't get worse. It got quieter.
The main problem with "System 3" is that it has its own kind of "cognitive biases", like System 1, but these new cognitive biases are designed by marketing, politics, culture, and whatever censors or surfaces the original training data. And that holds even if the process and everything around it were perfect (which it is not, i.e. hallucinations).
But we still have System 1, and we survived and reached this stage because of it, because even a bad guess is better than the slowness of doing things right. It has its problems, but sometimes you must reach a compromise.
I suppose the publishing process has always existed as system 3. It's just that now we have a new way to read and write with an abstract "rest of the world".
I'm conflicted about this. As I was reading the paper, my AI detector senses were tingling all over the place.
Large parts of the paper score very high probability of being written entirely by AI in gptzero.
I'm not sure if I could trust anything written in it.
If I understand correctly, this is saying people were given a task and the option to consult an AI, and when they did, they were influenced by its answer.
Which is kind of duh? Of course. They have some cool language, like calling the AI "System 3" and calling taking advice "cognitive surrender", but I'm not sure how this differs from asking your mate Bob and taking his advice.
Anyone else get the distinct impression that parts of this paper were written by AI?
Why shouldn’t it be? You can gather and form your thoughts, create a draft, and then have a LLM rewrite it for you. You can write in the style you prefer so you can focus on thoughts and then have the LLM rewrite it in the appropriate style for the audience.
One might worry that it would increase the authors' confidence even when the LLM's rewrite introduces errors, reducing accuracy overall regardless of moderators.
You still need to review and edit.
I mean... I don't really check calculations made by a computer (e.g. by my own programs) all that often either, and I think I'm completely fine :). But I guess the difference is that we kind of know how computers work, and that they're generally super accurate and make mistakes incredibly rarely. The "AI" (although I disagree with the "I" part) is wrong incredibly often, and I don't think people appreciate that the difference from the "traditional" approach isn't just significant, it's astronomical: LLMs make things up at least 5% of the time, whereas CPUs make mistakes maybe (10^-12)% of the time or less. That's 12 orders of magnitude or so.
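The orders-of-magnitude claim checks out as rough arithmetic, taking the comment's two illustrative rates at face value (both figures are the commenter's ballpark numbers, not measurements):

```python
import math

# Back-of-envelope comparison using the comment's own illustrative rates.
llm_error_rate = 0.05    # LLM wrong "at least 5% of the time"
cpu_error_rate = 1e-14   # "(10^-12)%" expressed as a fraction

# How many orders of magnitude separate the two error rates?
orders_of_magnitude = math.log10(llm_error_rate / cpu_error_rate)
print(round(orders_of_magnitude, 1))  # → 12.7
```

So "12 orders of magnitude or so" is about right under those assumptions; the exact figure depends entirely on the rates you plug in.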
I couldn't figure out whether this was published in a journal, or only on a preprint server.
SSRN is a preprint server and that is the only published version.
The original research around Thinking, Fast and Slow (aka System 1 / System 2 thinking) failed to replicate when researchers tried.
Can it design and implement a plutonium electric fuel cell with a 24,000 year half life? We have yet to witness it. Can it automate Farming and Agriculture? These are the real questions. #Born-Crusty
blocking access to a site because you don't enable javascript is diabolical
Damn. I came up with a hypothetical "System 3" last year! I didn't find AI very helpful in that regard though.
Current status: partially solved.
Problem: System 2 is supposed to be rational, but I found this to be far from the case. Massive unnecessary suffering.
Solution (WIP): Ask: What is the goal? What are my assumptions? Is there anything I am missing?
--
So, I repeatedly found myself getting into lots of trouble due to unquestioned assumptions. System 2 is supposed to be rational, but I found this to be far from the case.
So I tried inventing an "actually rational system" that I could "operate manually", or with a little help. I called it System 3, a system where you use a Thinking Tool to help you think more effectively.
My initial attempt was a "rational LLM prompt", but those mostly devolved into unhelpful nitpicking. (Maybe it's solvable, but I didn't get very far.)
Then I realized, wouldn't you get better results with a bunch of questions on pen and paper? Guided writing exercises?
So here are my attempts so far:
reflect.py - https://gist.github.com/a-n-d-a-i/d54bc03b0ceeb06b4cd61ed173...
unstuck.py - https://gist.github.com/a-n-d-a-i/d54bc03b0ceeb06b4cd61ed173...
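The guided-questions idea reads roughly like this as a script (a hypothetical sketch built only from the three questions stated above; the actual gists may differ):

```python
# Hypothetical sketch of a "System 3" guided writing exercise.
# The questions come from the comment above; reflect.py / unstuck.py may differ.
QUESTIONS = [
    "What is the goal?",
    "What are my assumptions?",
    "Is there anything I am missing?",
]

def reflect(ask=input):
    """Pose each question in turn and collect the written answers.

    `ask` defaults to interactive input but can be any callable,
    which keeps the sketch testable and scriptable.
    """
    return {q: ask(q + " ") for q in QUESTIONS}
```

Run interactively, it just walks you through the questions in order; the point, per the comment, is forcing assumptions into writing, not any clever automation.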
--
I'm not sure what's a good way to get yourself "out of a rut" in terms of thinking about a problem. It seems like the longer you've thought about it, the less likely you are to explore beyond the confines of the "known" (i.e. your probably dodgy/incomplete assumptions).
I haven't solved System 3 yet, but a few months later found myself in an even more harrowing situation which could have been avoided if I had a System 3.
The solution turned out to be trivial, but I missed it for weeks... In this case, I had incorrectly named the project, and thus doomed it to limbo. Turns out naming things is just as important in real life as it is in programming!
So I joked "if being pedantic didn't solve the problem, you weren't being pedantic enough." But it's not a joke! It's about clear thinking. (The negative aspect of pedantry is inappropriate communication. But the positive aspect is "seeing the situation clearly", which is obviously the part you want to keep!)
"Time pressure (Study 2) and per-item incentives and feedback (Study 3) shifted baseline performance but did not eliminate this pattern: when accurate, AI buffered time-pressure costs and amplified incentive gains; when faulty, it consistently reduced accuracy regardless of situational moderators."
I LOLed.
Have been curious what it could look like (and whether it might be an interesting new type of “post” people make) if readers could see the human prompts and pivots and steering of the LLM inline within the final polished AI output.