My thesis is actually simpler. For the longest time, until the Industrial Revolution, humans did uninteresting work for the most part. There was a routine and little else. Intellectuals worked through a very terse knowledge base, and it was handed down from master to apprentice. Since the Renaissance and the industrial age, the amount of known knowledge has exploded, and so have the specializations. Most of what white-collar work is today is managing and searching through this explosion of knowledge and rules. AI (well, the LLM part) is mostly targeted towards that: making that automated. That’s all it is. Here is the problem though, and it’s a problem for the clueless. Those who are truly clueless fall victim to the hallucinations. Those who have expertise in their field will be able to be more efficient.
AI isn’t replacing innovation or original thought. It is just working off an existing body of knowledge.
I disagree that ancient work was uninteresting. If you've ever looked at truly old architecture, walls, carvings etc you can see that people really took pride in their work, adding things that absolutely weren't just pure utility. In my mind that's the sign of someone that considers their work interesting.
But in general, in the past there was much less specialization. That means each individual was responsible for a lot more stuff, and likely had a much more varied workday. The apprentice blacksmith didn't just hammer out nail after nail all day with no breaks. They made all sorts of tools, cutlery, horseshoes. But they also carried water, operated bellows, went to fetch coke etc, sometimes even spending days without actually hammering metal at all, which freed up mental energy and gave them enough separation to enjoy it when they actually got to do it.
Similarly, farm laborers had massively varied lives. Their daily tasks in a given week or month would look totally different depending on the season, with winter essentially being time off to go fix or make other stuff, because you can't do much to make plants grow faster other than wait.
People might make the criticism and say "oh but that was only for rich people/government" etc, but look at for example old street lights, bollards etc. Old works tend to be
Specialization allows us to curse ourselves with efficiency, and a curse it is indeed. Now if you're good at hammering nails, nails are all you'll get, morning to night, and you're rewarded the shittier, cheaper and faster you make your nails, which sucks away all incentive to do any more than the minimum.
I have gotten much more value out of AI tools by focusing on the process and not the product. By this I mean that I treat it as a loosely-defined brainstorming tool that expands my “zone of knowledge”, and not as a way to create some particular thing.
In this way, I am infinitely more tolerant of minor problems in the output, because I’m not using the tool to create a specific output, I’m using it to enhance the thing I’m making myself.
To be more concrete: let’s say I’m writing a book about a novel philosophical concept. I don’t use the AI to actually write the book itself, but to research thinkers/works that are similar, critique my arguments, make suggestions on topics to cover, etc. It functions more as a researcher and editor, not a writer – and in that sense it is extremely useful.
I think it's a U-shaped utility curve where abstract planning is on one side (your comment) and the chore implementation is on the other.
Your role is between the two: deciding on the architecture, writing the top-level types, deciding on the concrete system design.
And then AI tools help you zoom in and glue things together in an easily verifiable way.
I suspect that people who still haven't figured out how to make use of LLMs, assuming it's not just resentful performative complaining which it probably is, are expecting it to do it all. Which never seemed very engineer-minded.
You don’t empathize with the humane opinion “why bother?” I like to program so it resonates. I’m fortunate to enjoy my work so why would I want to stop doing what I enjoy?
Agree - I tend to think of it as offloading thinking time. Delegating work to an agent just becomes more work for me, with the quality I've seen. But conversations where I control the context are both fun and generally insightful, even if I decide the initial idea isn't a good one.
That is a good metaphor. I frequently use ChatGPT in a way that basically boils down to: I could spend an hour thinking about and researching X basic thing I know little about, or I could have the AI write me a summary that is 95% good enough but only takes a few seconds of my time.
A Danish audio newspaper host / podcaster reached the exact opposite conclusion when he used ChatGPT to write the manuscript for one of his episodes. He ended up spending as much time as he usually does because he had to fact-check everything that the LLM came up with. Spoiler: it made up a lot of stuff despite the prompt being very clear that it should not do so. To him, the part the chatbot could help with was the most fun part, that is, writing the manuscript. His conclusion about artificial intelligence was this:
“We thought we were getting an accountant, but we got a poet.”
I love this turn of phrase. It quite nicely evokes the difference between how the reader thinks vs how the LLM does.
It also invites reflections on what “sentience” means. In my experience — make of it what you will — correct fact retrieval isn’t really necessary or sufficient for there to be a lived, first-person experience.
It's not the exact opposite*, the author said that if you're doing boilerplate _code_ it's probably fine.
The thing is that since it can't think, it's absolutely useless when it comes to things that haven't been done before, because if you are creating something new, the software won't have had any chance to train on what you are doing.
So if you are in a situation in which it is a good idea to create a new DSL for your problem **, then the autocruise control magic won't work because it's a new language.
Now if you're just mashing out propaganda like some brainwashed Soviet apparatchik propagandist, maybe it helps. So maybe people who write predictable slop like this Guardian article (https://archive.is/6hrKo) would be really grateful that their computer has a cruise control for their political spam.
*) if that's what you meant
**) which you statistically speaking might not want to do, but this is about actually interesting work where it's more likely to happen
In a world where the AI can understand your function library near flawlessly and compose it into all sorts of things, why would you put the effort into a DSL that humans will have to learn and the AI will trip over? This is a dead pattern.
As a writer I find his take appalling and incomprehensible. So, apparently not all writers agree that writing with AI is fun. To me, it’s a sickening violation of integrity.
Yeah, if I were their reader, I'd most likely never read anything from them again, since nothing's stopping them from doing away with integrity altogether and just stitching together a bunch of scripts ('agents') into an LLM slop pipeline.
It's so weird how people use LLMs to automate the most important and rewarding parts of the creative process. I get that companies have no clue how to market the things, but it really shows a lack of imagination and self-awareness when a 'creative' repackages slop for their audience and calls it 'fun'.
The one thing AI is good at is building greenfield projects from scratch using established tools. If what you want to accomplish can be done by a moderately capable coder with some time reading the documentation for the various frameworks involved, then I view AI as fairly similar to the scaffolding that happened with Ruby on Rails back in the day when I typed "rails new myproject".
So LLMs are awesome if I want to say "create a dashboard in Next.js and whatever visualization library you think is appropriate that will hit these endpoints [dumping some API specs in there] and display the results to a non-technical user", along with some other context here and there, and get a working first pass to hack on.
Where they are not awesome is when I am working on adding a map visualization to that dashboard a year or two later, and I need to talk to the team that handles some of the API endpoints to discuss how to feed me the map data. Then I need to figure out how to handle large map pin datasets. Oh, and the map shows regions of activity that were clustered with DBSCAN, so I need to know that an alpha shape is a generalization of a convex hull that will allow me to perfectly visualize the cluster regions, with the alpha parameter chosen to correspond to DBSCAN's epsilon parameter. Etc, etc, etc.
I very rarely write code for greenfield projects these days, sadly. I can see how startup founders are head over heels over this stuff because that's what their founding engineers are doing, and LLMs let them get it cranking very very fast. You just have to hope that they are prudent enough to review and tweak what's written so that you're not saddled with tech debt. And when inevitable tech debt needs paying (or working around) later, you have to hope that said founders aren't forcing their engineers to keep using LLMs for decisions that could cut across many different teams and systems.
I get what point you're trying to make, and agree, but you've picked a bad example.
That boilerplate heavy, skill-less, frontend stuff like configuring a map control with something like react-leaflet seems to be precisely what AI is good at.
Yeah, it will make a map and plot some stuff on it. It might even do well at handling 20 million pins on the map gracefully. I doubt it's gonna know to use alpha shapes to complement DBSCAN quite so gracefully.
edit: Just spot checked it and it thinks it's a good idea to use convex hulls.
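For anyone wondering why the convex hull answer is a problem, here is a minimal sketch in plain Python. It is not the parent's actual setup: the L-shaped `cluster` is made-up data standing in for one DBSCAN label group, `alpha_exposed_edges` is a hypothetical brute-force helper (fine for a few dozen points, not for millions of pins), and `alpha=0.75` is an arbitrary illustrative radius, not something derived from an epsilon value.

```python
from itertools import combinations
from math import dist, sqrt


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            # Pop while the last turn is clockwise or collinear.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # the last point starts the other half

    return half(pts) + half(reversed(pts))


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    n = len(vertices)
    twice = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice) / 2


def alpha_exposed_edges(points, alpha):
    """Brute-force outline: keep edge (p, q) if some disk of radius alpha
    through p and q has no other point strictly inside it."""
    edges = []
    for p, q in combinations(points, 2):
        d = dist(p, q)
        if d == 0 or d > 2 * alpha:
            continue
        mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
        h = sqrt(max(0.0, alpha * alpha - (d / 2) ** 2))
        ux, uy = -(q[1] - p[1]) / d, (q[0] - p[0]) / d  # unit normal to pq
        for sign in (1, -1):
            center = (mx + sign * h * ux, my + sign * h * uy)
            others = (r for r in points if r not in (p, q))
            if all(dist(center, r) >= alpha - 1e-9 for r in others):
                edges.append((p, q))
                break
    return edges


# Made-up L-shaped cluster on a unit grid (pretend DBSCAN gave these one label).
cluster = [(x, y) for x in range(5) for y in range(5) if x < 2 or y < 2]

hull = convex_hull(cluster)
outline = alpha_exposed_edges(cluster, alpha=0.75)

print("convex hull area:", polygon_area(hull))
print("outline segments:", len(outline))
```

The hull here has five vertices and covers an area of 11.5, while the L itself only covers 7, so a hull-based region would paint activity over empty map. The alpha-exposed edges instead trace the L (cutting the inside corner with one diagonal segment at this radius). In a real project you would presumably use a Delaunay-based alpha shape from a geometry library rather than this cubic-time loop.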
Those kinds of thought processes are the kinds that produce value.
Deciding what to build and how to build it is often harder than building.
What LLMs of today do is basically super-autocomplete. It's a continuation of the history of programming automation: compilers, more advanced compilers, IDEs, code generators, linters, autocomplete, code insight, etc.
> Meanwhile, I feel like if I tried to offload my work to an LLM, I would both lose context and be violating the do-one-thing-and-do-it-well principle I half-heartedly try to live by.
He should use it as a Stack Overflow on steroids. I assume he uses Stack Overflow without remorse.
I used to have year-long streaks on SO; now I'm there around once or twice a week.
While I didn't agree with the "junior developer" analogy in the past, I am finding that it is beginning to be a bit more like that. The new Codex tool from OpenAI feels a lot more like this. It seems to work best if you already have a few examples of something that you want to do and now want to add another. My tactic is to spell it out very clearly in the prompt and really focus on having it consistently implement another similar thing with a narrow scope. Because it takes quite a while, I will usually just fix any issues myself as opposed to asking it to fix them. I'm still experimenting but I think a well crafted spec / AGENTS.md file begins to become quite important. For me, this + regular ChatGPT interactions are much more valuable than synchronous / Windsurf / Cursor style usage. I'd prefer to review a more meaningful PR than a million little diffs synchronously.
There are a hundred ways to use AI for any given work. For example, if you are doing interesting work and aren't using AI-assisted research tools (e.g., OpenAI Deep Research), then you are missing out on making the work that much more interesting by understanding the context and history of the subject or adjacent subjects.
This thesis only makes sense if the work is somehow interesting and you also have no desire to extend, expand, or enrich the work. That's not a plausible position.
> This thesis only makes sense if the work is somehow interesting and you also have no desire to extend, expand, or enrich the work. That's not a plausible position.
Or your interesting work wasn't appearing in training set often enough.
Currently I am writing a compiler and runtime for a niche modeling language, and every model I poked for help was rather useless, except for some obvious things I already knew.
If AI can do the easiest 50% of our tasks, then it means we will end up spending all of our time on what we previously considered to be the most difficult 50% of tasks. This has a lot of implications, but it does generally result in the job being more interesting overall.
Or, alternatively, the difficult 50% are difficult because they're uninteresting, like trying to find an obscure workaround for an unfixed bug in Excel, or re-authing for the n-th time today, or updating a Jira ticket, or getting the only person with access to a database to send you a dataset when they never so much as reply to your emails...
> we will end up spending all of our time on what we previously considered to be the most difficult 50% of tasks
Either that, or replacing the time with slacking off and not even getting whatever benefits doing the easiest tasks might have had (learning, the feeling of accomplishing something), like what some teachers see with writing essays in schools and homework.
The tech has the potential to let us do less busywork (which is great, even regular codegen for boilerplate and ORM mappings etc. can save time), it's just that it might take conscious effort not to be lazy with this freed up time.
The industry has already gone through many, many examples of software reducing developer effort. It always results in developers becoming more productive.
In my experience, the 50% most difficult part of a problem is often the most boring. E.g. writing tests, tracking down obscure bugs, trying to understand API or library documentation, etc. It's often stuff that is very difficult but doesn't take all that much creativity.
You'll potentially be building on flimsy foundations if it gets the foundational stuff wrong (see anecdote in sibling post). I fear for those who aren't so diligent, especially if there are consequences involved.
The strategy is to have it write tests, and spend your time making sure the tests are really comprehensive and correct, then mostly just trust the code. If stuff breaks down the line, add regression tests, fix the problem and continue with your day.
But "interesting" is subjective, there's no good definition for "intelligence", and AI has so much associated hype. So we could debate endlessly on HN.
Supposing "interesting" means something like coming up with a new Fast Fourier Transform algorithm. I seriously doubt an LLM could do something there. OTOH AI did do new stuff with protein folding.
I feel much more confident that I can take on a project in a domain that I'm not very familiar with. I've been digging into LLVM IR and I had no prior experience with it. ChatGPT is a much better guide to getting started than the documentation, which is very low quality.
I have been exploring local AI tools for coding (ollama + aider) with a small stock market simulator (~200 lines of Python).
First I tried making the AI extract the dataclasses representing events to a separate file. It decided to extract some extra classes, leave behind some others, and delete parts of the code.
Then I tried to make it explain one of the actors called LongVol_player_v1, around 15 lines of code. It successfully concluded that it does options delta hedging, but it jumped to the conclusion that it calculates the implied volatility. I set that as a constant, because I'm simulating specific interactions between volatility players and option dealers. It hasn't yet caught the bug where the vol player buys 3000 options but accounts for only 2000.
When asking for improvements, it is obsessed with splitting the initialization and the execution.
So far I've wasted half of Saturday trying to make the machine do simple refactors. Refactors I could do myself in half an hour.
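As an aside, a bug like "buys 3000 but accounts for only 2000" is the kind of thing a cheap consistency check can catch regardless of what the model notices. A minimal sketch, assuming nothing about the real simulator: `OptionFill`, `Ledger` and `unaccounted_fills` are hypothetical names invented here, and only the player name and the two quantities come from the comment above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OptionFill:
    """Hypothetical event: `player` bought `quantity` option contracts."""
    player: str
    quantity: int


@dataclass
class Ledger:
    """Hypothetical book of positions per player."""
    positions: dict = field(default_factory=dict)

    def book(self, player, quantity):
        self.positions[player] = self.positions.get(player, 0) + quantity


def unaccounted_fills(fills, ledger):
    """Return {player: (bought, booked)} wherever the two totals disagree."""
    bought = {}
    for fill in fills:
        bought[fill.player] = bought.get(fill.player, 0) + fill.quantity
    return {
        player: (total, ledger.positions.get(player, 0))
        for player, total in bought.items()
        if ledger.positions.get(player, 0) != total
    }


# Reproduce the described mismatch: 3000 bought, only 2000 booked.
fills = [OptionFill("LongVol_player_v1", 3000)]
ledger = Ledger()
ledger.book("LongVol_player_v1", 2000)

mismatches = unaccounted_fills(fills, ledger)
print(mismatches)
```

Run at the end of each simulation step, a non-empty result flags the discrepancy immediately, which also gives the LLM something concrete to be pointed at.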
Could you link the repo and prompts? What you described seems like the type of thing I’ve done before with no issue, so you may have an interesting code base that is presenting some issues for the LLM.
The vast majority of any interesting project is boilerplate. There's a small kernel of interesting 'business logic'/novel algorithm/whatever buried in a sea of CRUD: user account creation, subscription management, password resets, sending emails, whatever.
Yes, so in that case why would you spend tons of time and introduce a huge amount of technical debt by rewriting the boring parts, instead of just using a ready-made, off-the-shelf solution?
You'd think that there'd be someone nice enough to create a library or a framework or something that's well documented and popular enough to get support and updates. Maybe you should consider offloading the boring part to such a project, maybe even pay someone to do it?
That was a solved problem in the 00's with the advent of Rails, or so I thought. Then came the JS framework craze and everything needed to be reinvented. Not just that, but frameworks which had all these battle-tested boring parts were not trendy anymore. Micro frameworks became the new default and idiot after idiot jumped on that bandwagon, only to reimplement everything from scratch, because almost any app will grow to a point where it needs authn, user mgmt, mail, groups and so on...
At most places I've worked, that kind of boilerplate was set up a long time ago. Yes, it needs maintaining and extending. But it rarely needs building from the ground up.
This depends entirely on the type of programming you do. If all you build is CRUD apps then sure. Personally I’ve never actually made any of those things — with or without AI
You are both right. B2B for instance is mostly fairly templated stuff built from CRUD and some business rules. Even some of the niches perceived as more 'creative', such as music scoring or 3D games, are fairly rote interactions with some 'engine'.
And I'm not even sure these 'template adjacent' regurgitations are what the crude LLM is best at, as the output needs to pass some rigorous inflexible test to 'pass'. Hallucinating some non-existing function in an API will be a hard fail.
LLMs have a far easier time in domains where failures are 'soft'. This is why 'ELIZA' passed as a therapist in the 60's, long before auto-programmers were a thing.
Also, in 'academic' research, LLM use has reached nearly 100%, not just for embellishing writeups to the expected 20 pages, but in each stage of the 'game' including 'ideation'.
And if as a CIO you believe that your prohibition on using LLMs for coding because of 'divulging company secrets' holds, you are either strip searching your employees on the way in and out, or wilfully blind.
I'm not saying 'nobody' exists who is not using AI in anything created on a computer, just like some woodworker still handcrafts exclusive bespoke furniture in a time of presses, glue and CNC, but adoption is skyrocketing, and not just because the C-suite pressures their serfs into using the shiny new toy.
> "And if as a CIO you believe that your prohibition on using LLMs for coding because of 'divulging company secrets' holds, you are either strip searching your employees on the way in and out, or wilfully blind."
Right, so if you are in certain areas you'll be legally required not to send your work to whatever third party promises to handle it the cheapest.
Also, since this is about actually "interesting" work, if you are doing cutting-edge research on, let's say, military or medical applications** you definitely should take things like this seriously.
Obviously you can run LLMs locally if you don't feel like paying up for programmers who like to code, and who want to have in-depth knowledge of whatever they are doing.
Of course you should not violate company policy, and some environments will indeed have more stringent controls and measures, but there is a whole world of grey where the CIO has put in place a moratorium on LLMs but where some people will quickly crunch out the day's work at home with an AI anyway so they look more productive.
You can of course consider running your own LLM.
I suppose the problem isn't really the technology itself but rather the quality of the employees. There would've been a lot of people cheating the system before, let's say just by copy-pasting or tricking coworkers into doing the work for them.
However if you are working with something actually interesting, chances are that you're not working with disingenuous grifters and uneducated and lazy backstabbers, so that's less of a concern as well. If you are working on interesting projects hopefully these people would've been filtered out somewhere along the line.
I don't have LLM/AI write or generate any code or documents for me. Partly because the quality is not good enough, partly because I worry about copyright/licensing/academic rigor, and partly because I worry about losing my own edge.
But I do use LLM/AI as a rubber duck that talks back, and as a Google on steroids, but one who needs his work double-checked. And as a domain discovery tool when quickly trying to get a grasp of a new area.
It's just another tool in the toolbox for me. But the toolbox is like a box of chocolates - you never know what you are going to get.
In the new world that's emerging, you are losing your edge by not learning how to master and leverage AI agents. Quality not good enough? Instruct them in how you want them to code, and make sure a sufficient quantity of the codebase is loaded into their context so they can see examples of what you consider good enough.
Writing SQL, I'll give ChatGPT the schema for 5 different tables. It habitually generates solutions with columns that don't exist. So, naturally, I append, "By the way, TableA has no column FieldB." Then it just imagines a different one. Or, I'll say, "Do not generate a solution with any table-col pair not provided above." It doesn't listen to that at all.
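One workaround that doesn't depend on the model listening: compile the generated query against empty copies of the tables before trusting it. A minimal sketch using Python's built-in `sqlite3`, with big assumptions: `SCHEMA` is a made-up stand-in for the real tables (only the names TableA and FieldB come from the comment above), `compile_error` is a hypothetical helper, and the real database may speak a SQL dialect that SQLite won't parse, in which case the same idea would need that engine's own "prepare without running" mechanism.

```python
import sqlite3

# Made-up schema standing in for the real tables.
SCHEMA = """
CREATE TABLE TableA (id INTEGER PRIMARY KEY, FieldA TEXT);
CREATE TABLE TableC (id INTEGER PRIMARY KEY, a_id INTEGER, amount REAL);
"""


def compile_error(schema_ddl, query):
    """Compile `query` against empty tables built from `schema_ddl`.

    Returns None if SQLite can prepare the statement, otherwise the error
    text (for example a "no such column" message). EXPLAIN only produces the
    query plan, so the query itself is never run against data.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute("EXPLAIN " + query)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


good = "SELECT a.FieldA, SUM(c.amount) FROM TableA a JOIN TableC c ON c.a_id = a.id GROUP BY a.FieldA"
bad = "SELECT a.FieldB FROM TableA a"

print(compile_error(SCHEMA, good))
print(compile_error(SCHEMA, bad))
```

The error text for the second query can be pasted straight back into the chat, which tends to be more specific than "TableA has no column FieldB" typed by hand, and the loop can be automated until the query compiles.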
Thesis: Using the word “thesis” is a great way to disguise a whiny op-ed as the writings of a learned soul
> interesting work (i.e., work worth doing)
Let me guess, the work you do is interesting work (i.e., work worth doing) and the work other people do is uninteresting work (i.e., work not worth doing).
Curious to see examples of interesting non-boilerplate work that is now possible with AI. Most examples of what I've seen are a repeat of what has been done many times (i.e. probably occurs many times in the training data), but with a small tweak, or for different applications.
And I don't mean cutting-edge research like FunSearch discovering new algorithm implementations, but more like what the typical coder can now do with off-the-shelf LLM+ offerings.
Such a cool review! Thanks for posting it. Great to see that authoritative experts are sharing their time and thoughts, lots to learn from this review. Despite the caveats mentioned by Neil, I still think this is a good example of a "non trivial / not boilerplate thing done w/ LLMs". To think we got from ChatGPT's cute "looks like Python" scripts 2.5 years ago to these kinds of libraries is amazing in my book.
I'd be curious to see how the same exercise would go with Neil guiding Claude. There's no debating that LLMs + domain knowledge >>> vibe coding, and I would be curious to see how that would go, and how much time/effort an expert would "save" by using the latest models.
It's definitely real that a lot of smart productive people don't get good results when they use AI to write software.
It's also definitely real that a lot of other smart productive people are more productive when they use it.
These sorts of articles and comments seem to be saying "I'm proof it can't be done", when really there's enough proof that it can be done that you're just proving you'll be left behind.
Frederik Kulager: "Jeg fik ChatGPT til at skrive dette afsnit, og testede, om min chefredaktør ville opdage det" ("I got ChatGPT to write this episode, and tested whether my editor-in-chief would notice"). https://open.spotify.com/episode/22HBze1k55lFnnsLtRlEu1?si=h...
> It made up a lot of stuff despite it being very clear in the prompt, that it should not do so.
LLMs are not sentient. They are designed to make stuff up based on probability.
Why would sentience be required for logically sound reasoning (or the reverse, for that matter)?
Unfortunately, they could have been thinking, but the designation of the training/inference separation made them all specimens.
https://news.ycombinator.com/item?id=44488126
Maybe reconsider assumptions? Maybe DSLs shouldn't be built anymore if AI agents can't easily make use of them.
It's all fine as long as you keep that fetish in your dungeon.
Yeah, if I were their reader, I'd most likely never read anything from them again, since nothing's stopping them from doing away with integrity altogether and just stitching together a bunch of scripts ('agents') into an LLM slop pipeline.
It's so weird how people use LLMs to automate the most important and rewarding parts of the creative process. I get that companies have no clue how to market the things, but it really shows a lack of imagination and self-awareness when a 'creative' repackages slop for their audience and calls it 'fun'.
The one thing AI is good at is building greenfield projects from scratch using established tools. If what you want to accomplish can be done by a moderately capable coder with some time reading the documentation for the various frameworks involved, then I view AI as fairly similar to the scaffolding that happened with Ruby on Rails back in the day when I typed "rails new myproject".
So LLMs are awesome if I want to say "create a dashboard in Next.js and whatever visualization library you think is appropriate that will hit these endpoints [dumping some API specs in there] and display the results to a non-technical user", along with some other context here and there, and get a working first pass to hack on.
Where they are not awesome is when I am adding a map visualization to that dashboard a year or two later, and I need to talk to the team that handles some of the API endpoints to discuss how to feed me the map data. Then I need to figure out how to handle large map pin datasets. Oh, and the map shows regions of activity that were clustered with DBSCAN, so I need to know that an alpha shape is a generalization of a convex hull that will allow me to perfectly visualize the cluster regions, with the alpha parameter chosen to correspond to DBSCAN's epsilon parameter. Etc, etc, etc.
I very rarely write code for greenfield projects these days, sadly. I can see how startup founders are head over heels over this stuff because that's what their founding engineers are doing, and LLMs let them get it cranking very very fast. You just have to hope that they are prudent enough to review and tweak what's written so that you're not saddled with tech debt. And when inevitable tech debt needs paying (or working around) later, you have to hope that said founders aren't forcing their engineers to keep using LLMs for decisions that could cut across many different teams and systems.
I get what point you're trying to make, and agree, but you've picked a bad example.
That boilerplate heavy, skill-less, frontend stuff like configuring a map control with something like react-leaflet seems to be precisely what AI is good at.
Yeah, it will make a map and plot some stuff on it. It might even handle 20 million pins on the map gracefully. I doubt it's gonna know to use alpha shapes to complement DBSCAN quite so gracefully.
edit: Just spot checked it and it thinks it's a good idea to use convex hulls.
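To make the convex hull versus alpha shape point concrete, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the L-shaped grid stands in for a single DBSCAN cluster, the function names are made up, and the disk radius of 0.8 is picked by hand rather than derived from any epsilon value. It uses the empty-disk definition of an alpha shape (an edge is kept if some disk of the given radius passing through both endpoints contains no other point). That is a brute-force approach; a real implementation would more likely filter a Delaunay triangulation.

```python
import math
from itertools import combinations


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            # Drop points that would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower, upper = half(pts), half(reversed(pts))
    return lower[:-1] + upper[:-1]


def alpha_boundary_edges(points, radius):
    """Brute-force alpha shape boundary (O(n^3)).

    An edge (p, q) is kept if at least one of the two disks of the given
    radius through p and q has no other input point strictly inside it.
    """
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        d = math.hypot(x2 - x1, y2 - y1)
        if d == 0 or d > 2 * radius:
            continue  # no disk of this radius passes through both points
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2
        h = math.sqrt(radius ** 2 - (d / 2) ** 2)  # midpoint-to-center distance
        ux, uy = -(y2 - y1) / d, (x2 - x1) / d     # unit normal to the edge
        for sign in (1, -1):
            cx, cy = mx + sign * h * ux, my + sign * h * uy
            if all(
                math.hypot(px - cx, py - cy) >= radius - 1e-9
                for k, (px, py) in enumerate(points)
                if k not in (i, j)
            ):
                edges.add(frozenset({points[i], points[j]}))
                break
    return edges


# Hypothetical cluster: an L-shaped block of grid points with an empty notch.
l_shape = [(x, y) for x in range(5) for y in range(5) if x < 2 or y < 2]

hull = convex_hull(l_shape)
edges = alpha_boundary_edges(l_shape, radius=0.8)

print("hull vertices:", hull)
print("alpha boundary edge count:", len(edges))
```

On this data the hull has five vertices and bridges the empty notch with the edge from (4, 1) to (1, 4). The alpha boundary can never contain an edge longer than twice the radius, so it follows the L instead of cutting across it.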
I have found it fascinating how AI has forced me to reflect on what I actually do at work and whether it has value or not.
Those kinds of thought processes are the kinds that produce value.
Deciding what to build and how to build it is often harder than building.
What LLMs of today do is basically super-autocomplete. It's a continuation of the history of programming automation: compilers, more advanced compilers, IDEs, code generators, linters, autocomplete, code insight, etc.
> Meanwhile, I feel like if I tried to offload my work to an LLM, I would both lose context and be violating the do-one-thing-and-do-it-well principle I half-heartedly try to live by.
He should use it as a Stack Overflow on steroids. I assume he uses Stack Overflow without remorse.
I used to have year-long streaks on SO; now I'm there once or twice a week.
While I didn't agree with the "junior developer" analogy in the past, I am finding that it is beginning to be a bit more like that. The new Codex tool from OpenAI feels a lot more like this. It seems to work best if you already have a few examples of something that you want to do and now want to add another. My tactic is to spell it out very clearly in the prompt and really focus on having it consistently implement another similar thing with a narrow scope. Because it takes quite a while, I will usually just fix any issues myself as opposed to asking it to fix them. I'm still experimenting but I think a well crafted spec / AGENTS.md file begins to become quite important. For me, this + regular ChatGPT interactions are much more valuable than synchronous / Windsurf / Cursor style usage. I'd prefer to review a more meaningful PR than a million little diffs synchronously.
There are a hundred ways to use AI for any given work. For example, if you are doing interesting work and aren't using AI-assisted research tools (e.g., OpenAI Deep Research), then you are missing out on making the work that much more interesting by understanding the context and history of the subject or adjacent subjects.
This thesis only makes sense if the work is somehow interesting and you also have no desire to extend, expand, or enrich the work. That's not a plausible position.
> This thesis only makes sense if the work is somehow interesting and you also have no desire to extend, expand, or enrich the work. That's not a plausible position.
Or your interesting work didn't appear in the training set often enough. I am currently writing a compiler and runtime for a niche modeling language, and every model I've poked for help has been rather useless, except for some obvious things I already knew.
Yes, asking an LLM to "think outside the box" won't work. It is the box.
If AI can do the easiest 50% of our tasks, then it means we will end up spending all of our time on what we previously considered to be the most difficult 50% of tasks. This has a lot of implications, but it does generally result in the job being more interesting overall.
Or, alternatively, the difficult 50% are difficult because they're uninteresting, like trying to find an obscure workaround for an unfixed bug in Excel, or re-authing for the n-th time today, or updating a Jira ticket, or getting the only person with access to a database to send you a dataset when they never so much as reply to your emails...
> we will end up spending all of our time on what we previously considered to be the most difficult 50% of tasks
Either that, or replacing the time with slacking off and not even getting whatever benefits doing the easiest tasks might have had (learning, the feeling of accomplishing something), like what some teachers see with writing essays in schools and homework.
The tech has the potential to let us do less busywork (which is great, even regular codegen for boilerplate and ORM mappings etc. can save time), it's just that it might take conscious effort not to be lazy with this freed up time.
The industry has already gone through many, many examples of software reducing developer effort. It always results in developers becoming more productive.
In my experience, the 50% most difficult part of a problem is often the most boring. E.g. writing tests, tracking down obscure bugs, trying to understand API or library documentation, etc. It's often stuff that is very difficult but doesn't take all that much creativity.
I disagree with all of those. Tracking down obscure bugs is interesting, and all the other examples are easy.
>This has a lot of implications, but it does generally result in the job being more interesting overall.
One implication is that when AI providers claim that "AI can make a person TWICE as productive!"
... business owners seem to be hearing that as "Those users should cost me HALF as much!"
You'll potentially be building on flimsy foundations if it gets the foundational stuff wrong (see anecdote in sibling post). I fear for those who aren't so diligent, especially if there are consequences involved.
The strategy is to have it write tests, and spend your time making sure the tests are really comprehensive and correct, then mostly just trust the code. If stuff breaks down the line, add regression tests, fix the problem and continue with your day.
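A minimal sketch of what "add regression tests" can look like in practice, using Python's standard `unittest` module. The `Ledger` class and its accounting rule are hypothetical, invented only for this example; the point is the shape of the test, which pins an invariant (the position always equals the sum of recorded fills) so that a later rewrite, whether by a person or a model, can't quietly break it.

```python
import unittest


class Ledger:
    """Hypothetical example class: tracks how many contracts were bought."""

    def __init__(self):
        self.position = 0
        self.fills = []

    def buy(self, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.fills.append(quantity)
        # A regression would be any change that makes this line disagree
        # with the recorded fills (for example, adding a stale quantity).
        self.position += quantity


class LedgerRegressionTest(unittest.TestCase):
    def test_position_matches_sum_of_fills(self):
        ledger = Ledger()
        for qty in (1000, 2000):
            ledger.buy(qty)
        self.assertEqual(ledger.position, 3000)
        self.assertEqual(ledger.position, sum(ledger.fills))

    def test_rejects_non_positive_quantity(self):
        with self.assertRaises(ValueError):
            Ledger().buy(0)


def run_tests():
    """Run the regression suite and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(LedgerRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

The time then goes into reviewing assertions like these, not the implementation; whether that is enough to "mostly just trust the code" depends on how complete the tests really are.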
> If AI can do the easiest 50% of our tasks
...But it can't, which means your inference has no implications, because it evaluates to False.
Here we go again.
But "interesting" is subjective, there's no good definition of "intelligence", and AI has so much associated hype. So we could debate endlessly on HN.
Supposing "interesting" means something like coming up with a new Fast Fourier Transform algorithm. I seriously doubt an LLM could do something there. OTOH AI did do new stuff with protein folding.
So, we can keep debating I guess.
I feel much more confident that I can take on a project in a domain that I'm not very familiar with. I've been digging into LLVM IR and I had no prior experience with it. ChatGPT is a much better guide to getting started than the documentation, which is very low quality.
Careful - if you’re not familiar with the domain how are you going to spot when the LLM gives you suboptimal or even outright wrong answers?
Just like with anything else: Stack Overflow, advice from a coworker or an expert. If it doesn't work, it will become clear that it's not fixing your problem.
Testing
Good luck with that.
I have been exploring local AI tools for coding (ollama + aider) with a small stock market simulator (~200 lines of python).
First I tried making the AI extract the dataclasses representing events to a separate file. It decided to extract some extra classes, leave behind some others, and delete parts of the code.
Then I tried to make it explain one of the actors called LongVol_player_v1, around 15 lines of code. It correctly concluded that it does options delta hedging, but it jumped to the conclusion that it calculates the implied volatility. I set that as a constant, because I'm simulating specific interactions between volatility players and option dealers. It hasn't yet caught the bug where the vol player buys 3000 options but accounts for only 2000.
When asking for improvements, it is obsessed with splitting the initialization and the execution.
So far I've wasted half of Saturday trying to make the machine do simple refactors. Refactors I could do myself in half an hour.
I'm yet to see the wonders of AI.
If you are using Ollama that suggests you are using local models - which ones?
My experience is that the hosted frontier models (o3, Gemini 2.5, Claude 4) would handle those problems with ease.
Local models that fit on a laptop are a lot less capable, sadly.
Could you link the repo and prompts? What you described seems like the type of thing I’ve done before with no issue so you may have an interesting code base that is presenting some issues for the LM.
The vast majority of any interesting project is boilerplate. There's a small kernel of interesting 'business logic'/novel algorithm/whatever buried in a sea of CRUD: user account creation, subscription management, password resets, sending emails, whatever.
Yes, so why would you spend tons of time and introduce a huge amount of technical debt by rewriting the boring parts, instead of just using a ready-made off-the-shelf solution in that case?
You'd think that there'd be someone nice enough to create a library or a framework or something that's well documented and popular enough to get support and updates. Maybe you should consider offloading the boring part to such a project, maybe even pay someone to do it?
That was a solved problem in the '00s with the advent of Rails, or so I thought. Then came the JS framework craze and everything needed to be reinvented. Not just that, but frameworks which had all these battle-tested boring parts were not trendy anymore. Micro frameworks became the new default and idiot after idiot jumped on that bandwagon, only to reimplement everything from scratch, because almost any app will grow to a point where it will need authn, user mgmt, mail, groups and so on...
Most places I worked the setting up of that kind of boilerplate was done a long time ago. Yes it needs maintaining and extending. But rarely building from the ground up.
This depends entirely on the type of programming you do. If all you build is CRUD apps then sure. Personally I’ve never actually made any of those things — with or without AI
You are both right. B2B, for instance, is mostly fairly templated stuff built from CRUD and some business rules. Even some of the niches perceived as more 'creative', such as music scoring or 3D games, are fairly rote interactions with some 'engine'.
And I'm not even sure these 'template adjacent' regurgitations are what the crude LLM is best at, as the output needs to pass some rigorous inflexible test to 'pass'. Hallucinating some non-existing function in an API will be a hard fail.
LLMs have a far easier time in domains where failures are 'soft'. This is why 'ELIZA' passed as a therapist in the '60s, long before auto-programmers were a thing.
Also, in 'academic' research, LLM use has reached nearly 100%, not just for embellishing writeups to the expected 20 pages, but in each stage of the 'game', including 'ideation'.
And if as a CIO you believe that your prohibition on using LLMs for coding because of 'divulging company secrets' holds, you are either strip searching your employees on the way in and out, or wilfully blind.
I'm not saying that 'nobody' exists who is not using AI in anything created on a computer, just like some woodworker still handcrafts exclusive bespoke furniture in a time of presses, glue and CNC, but adoption is skyrocketing and not just because the C-suite pressures their serfs into using the shiny new toy.
> "And if as a CIO you believe that your prohibition on using LLMs for coding because of 'divulging company secrets' holds, you are either strip searching your employees on the way in and out, or wilfully blind."
Right, so if you are in certain areas you'll be legally required not to send your work to whatever third party promises to handle it the cheapest.
Also, since this is about actually "interesting" work: if you are doing cutting-edge research on, let's say, military or medical applications**, you definitely should take things like this seriously.
Obviously you can run LLMs locally if you don't feel like paying up for programmers who like to code and who want to have in-depth knowledge of whatever they are doing.
** https://www.bbc.co.uk/news/articles/c2eeg9gygyno
Of course you should not violate company policy, and some environments will indeed have more stringent controls and measures, but there is a whole world of grey where the CIO has put in place a moratorium on LLMs but where some people will quickly crunch out the day's work at home with an AI anyway so they look more productive.
You can of course consider running your own LLM.
I suppose the problem isn't really the technology itself but rather the quality of the employees. There would've been a lot of people cheating the system before, let's say just by copy-pasting or tricking coworkers into doing the work for them.
However if you are working with something actually interesting, chances are that you're not working with disingenuous grifters and uneducated and lazy backstabbers, so that's less of a concern as well. If you are working on interesting projects hopefully these people would've been filtered out somewhere along the line.
I don't have LLM/AI write or generate any code or document for me. Partly because the quality is not good enough, and partly I worry about copyright/licensing/academic rigor, partly because I worry about losing my own edge.
But I do use LLM/AI, as a rubber duck that talks back, as a google on steroids - but one who needs his work double checked. And as domain discovery tool when quickly trying to get a grasp of a new area.
It's just another tool in the toolbox for me. But the toolbox is like a box of chocolates - you never know what you are going to get.
In the new world that's emerging, you are losing your edge by not learning how to master and leverage AI agents. Quality not good enough? Instruct them in how you want them to code, and make sure a sufficient quantity of the codebase is loaded into their context so they can see examples of what you consider good enough.
>Instruct them in how you want them to code
They don't always listen.
Writing SQL, I'll give ChatGPT the schema for 5 different tables. It habitually generates solutions with columns that don't exist. So, naturally, I append, "By the way, TableA has no column FieldB." Then it just imagines a different one. Or, I'll say, "Do not generate a solution with any table-col pair not provided above." It doesn't listen to that at all.
I haven't had that problem with Gemini 2.5 Pro or o3. Are you on the free tier of ChatGPT?
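One way to catch invented columns mechanically, instead of arguing with the model, is to compile each generated query against the real schema before using it. Below is a minimal sketch using Python's standard `sqlite3` module. The tables, column names and the `check_query` helper are all hypothetical stand-ins, and SQLite's dialect will not match every production database, so treat it as a cheap first filter and not as proof that the query is right.

```python
import sqlite3

# Hypothetical schema standing in for the tables given to the model.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
"""


def check_query(schema_sql, query):
    """Return None if the query compiles against the schema, else the error text.

    EXPLAIN makes SQLite prepare the statement without running it, so
    references to unknown tables or columns surface as errors.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        conn.execute("EXPLAIN " + query)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


good = check_query(
    SCHEMA,
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id GROUP BY c.name",
)
bad = check_query(
    SCHEMA,
    "SELECT c.name, o.discount FROM customers c "
    "JOIN orders o ON o.customer_id = c.id",
)

print("good query error:", good)
print("bad query error:", bad)
```

The error text from a failed check can be pasted straight back into the prompt, which tends to be more specific than "TableA has no column FieldB" typed by hand.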
I am 100% sure that horse-breeders and carriage-decorators also had very high interest in their work and craft.
Thesis: Using the word “thesis” is a great way to disguise a whiny op-ed as the writings of a learned soul
> interesting work (i.e., work worth doing)
Let me guess, the work you do is interesting work (i.e., work worth doing) and the work other people do is uninteresting work (i.e., work not worth doing).
Funny how that always happens!
But... agentic changes everything!
... for the worse. :-)
I remember I thought cars were pretty shit when I didn't know how to drive.
Curious to see examples of interesting non-boilerplate work that is now possible with AI. Most examples of what I've seen are a repeat of what has been done many times (i.e. probably occurs many times in the training data), but with a small tweak, or for different applications.
And I don't mean cutting-edge research like funsearch discovering new algorithm implementations, but more like what the typical coder can now do with off-the-shelf LLM+ offerings.
Here's a couple examples: https://lucumr.pocoo.org/2025/6/21/my-first-ai-library/ https://www.indragie.com/blog/i-shipped-a-macos-app-built-en...
> Curious to see examples of interesting non-boilerplate work that is now possible with AI.
Previously discussed on HN - OAuth library at Cloudflare - https://news.ycombinator.com/item?id=44159166
For a review of this library see https://neilmadden.blog/2025/06/06/a-look-at-cloudflares-ai-...
Upshot: though it's possible to attempt this with (heavily supervised) LLMs, it's not recommended.
Such a cool review! Thanks for posting it. Great to see that authoritative experts are sharing their time and thoughts; there's lots to learn from this review. Despite the caveats mentioned by Neil, I still think this is a good example of a "non trivial / not boilerplate thing done w/ LLMs". To think we got from ChatGPT's cute "looks like python" scripts 2.5 years ago to these kinds of libraries is amazing in my book.
I'd be curious to see how the same exercise would go with Neil guiding Claude. There's no debating that LLMs + domain knowledge >>> vibe coding, and I would like to see how much time/effort an expert would "save" by using the latest models.
Oh, it feels like crypto again. Outlandish statements but no argument. "Few Understand" as they say.
It has basically ruined this board with stupid thoughtless comments like this on every fucking article.
yes
It's definitely real that a lot of smart productive people don't get good results when they use AI to write software.
It's also definitely real that a lot of other smart productive people are more productive when they use it.
These sorts of articles and comments seem to be saying "I'm proof it can't be done", when really there's enough proof that it can be done that you're just proving you'll be left behind.
>you're just proving you'll be left behind.
... said every grifter ever since the beginning of time.