Excerpt:
“Even within the coding, it’s not working well,” said Smiley. “I’ll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven’t engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence.”
Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.
“We don’t know what those are yet,” he said.
One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That’s the kind of thing that needs to be assessed to determine whether AI helps an organization’s engineering practice.
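A rough sketch of how that metric could be computed. The record type and its field names (`PullRequest`, `tokens_used`, `approved`) are hypothetical, and charging the tokens spent on rejected pull requests against the approved ones is an assumption; Smiley does not specify how the counting should work.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical record; the field names are illustrative only."""
    pr_id: str
    tokens_used: int  # all tokens spent on this PR, including retries
    approved: bool    # True once the change was formally accepted


def tokens_per_approved_pr(prs):
    """Total tokens spent across all PRs divided by the number approved.

    Assumption: tokens burned on rejected or abandoned PRs are still a
    cost, so they stay in the numerator.
    """
    approved = sum(1 for pr in prs if pr.approved)
    if approved == 0:
        return None  # nothing was accepted, so the ratio is undefined
    total_tokens = sum(pr.tokens_used for pr in prs)
    return total_tokens / approved


sample = [
    PullRequest("a", 120_000, True),
    PullRequest("b", 300_000, False),
    PullRequest("c", 80_000, True),
]
print(tokens_per_approved_pr(sample))  # 250000.0
```

Tracked over time, a falling number would suggest the tooling is getting changes accepted more cheaply; on its own it says nothing about whether the accepted changes were any good.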
To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.
“It passed all the unit tests, the shape of the code looks right,” he said. “It’s 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It’s a dumpster fire. Throw it away. All that money you spent on it is worthless.”
All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.
“Coding works if you measure lines of code and pull requests,” he said. “Coding does not work if you measure quality and team performance. There’s no evidence to suggest that that’s moving in a positive direction.”
Insurers, he said, are already lobbying state-level insurance regulators to win a carve-out in business insurance liability policies so they are not obligated to cover AI-related workflows. “That kills the whole system,” Deeks said.
If insurers are going to extreme lengths to remove AI output from the list of things they will insure, this says everything about its future.
Because nothing says “effective risk management achieved” like an insurer signing off on, or forbidding the insurance of, an entire class of materials.
It’s a canary in a coal mine, like how insurers are now removing any ability for Floridians to insure against hurricanes or sea level rise, despite flat earthers screaming their heads off that climate change is a conspiracy and isn’t real.
(Note: I have seen the term “flat earther” starting to be used as a catch-all term for anyone who vehemently denies reality in spite of copious evidence that shows they are wholly and completely wrong)
I wonder if it isn’t that AI is good, it’s that all other software is ass.
I use a patching software, antivirus, and backup software at work and they’re all now broken, after being patched. One is a 10.4B dollar company with a critical bug.
This is all fine and dandy but the whole article is based on an interview with “Dorian Smiley, co-founder and CTO of AI advisory service Codestrap”. Codestrap is a Palantir service provider, and as you’d expect Smiley is a Palantir shill.
The article hits different considering it’s more or less a world devourer zealot taking a jab at competing world devourers. The reporter is an unsuspecting proxy at best.
People will upvote anything if it takes a shot at AI. Even when the subtitle itself is literally an ad.
Codestrap founders say we need to dial down the hype and sort through the mess
The cult mentality is really interesting to watch.
I can hate more than one thing at a time. AI, Palantir and you for being so pretentious.
Me: This is an ad, it’s crazy that people will engage in something that’s clearly an ad, they’re feeding right into it. It’s a cult mentality.
You: I hate you!! SCREEEE
You couldn’t have proved my point more. Someone even upvoted you because it was a shot at AI. The cult is so strong you can’t even tell you’re in it.
I’m glad you have an outlet for your impotent rage, but do you have to be so pathetic? Your mental age is showing.
I’ll take pretentious though, because I am better than you.
I love this bit especially
Insurers, he said, are already lobbying state-level insurance regulators to win a carve-out in business insurance liability policies so they are not obligated to cover AI-related workflows. “That kills the whole system,” Deeks said. Smiley added: “The question here is if it’s all so great, why are the insurance underwriters going to great lengths to prohibit coverage for these things? They’re generally pretty good at risk profiling.”
Yeah these newer systems are crazy. The agent spawns a dozen subagents that all do some figuring out on the code base and the user request. Then those results get collated, then passed along to a new set of subagents that make the actual changes. Then there are agents that check stuff and tell the subagents to redo stuff or make changes. And then it gets a final check like unit tests, compilation etc. And then it’s marked as done for the user. The amount of tokens this burns is crazy, but it gets them better results in the benchmarks, so it gets marketed as an improvement. In reality it’s still fucking up all the damned time.
Coding with AI is like coding with a junior dev, who didn’t pay attention in school, is high right now, doesn’t learn and only listens half of the time. It fools people into thinking it’s better, because it shits out code super fast. But the cognitive load is actually higher, because checking the code is much harder than coming up with it yourself. It’s slower by far. If you are actually going faster, the quality is lacking.
I code with AI a good bit for a side project since I need to use my work AI and get my stats up to show management that I’m using it. The “impressive” thing is learning new softwares and how to use them quickly in your environment. When setting up my homelab with automatic git pull, it quickly gave me some commands and showed me what to add in my docker container.
Correcting issues is exactly like coding with a high junior dev though. The code bloat is real and I’m going to attempt to use agentic AI to consolidate it in the future. I don’t believe you can really “vibe code” unless you already know how to code though. Stating the exact structures and organization and whatnot is vital for agentic AI programming semi-complex systems.
This is very different from my experience, but I’ve purposely lagged behind in adoption and I often do things the slow way because I like programming and I don’t want to get too lazy and dependent.
I just recently started using Claude Code CLI. With how I use it (asking it specific questions and often telling it exactly what files and lines to analyze), it feels more like talking to an extremely knowledgeable programmer who has very narrow context and often makes short-sighted decisions.
I find it super helpful in troubleshooting. But it also feels like a trap, because I can feel it gaining my trust and I know better than to trust it.
I’ve mentioned the long-term effects I see at work in several places, but all I can say is be very careful how you use it. The parts of our codebase that are almost entirely AI written are unreadable garbage and a complete clusterfuck of coding paradigms. It’s bad enough that I’ve said straight to my manager’s face that I’d be embarrassed to ship this to production (and yes I await my pink slip).
As a tool, it can help explain code, it can help find places where things are being done, and it can even suggest ways to clean up code. However, those are all things you’ll also learn over time as you gather more and more experience, and it acts more as a crutch here because you spend less time learning the code you’re working with as a result.
I recommend maintaining exceptional skepticism with all code it generates. Claude is very good at producing pretty code. That code is often deceptive, and I’ve seen even Opus hallucinate fields, generate useless tests, and misuse language/library features to solve a task.
It’s like guiding a coked up junior who can write 5000 wpm and has read every piece of documentation ever without understanding any of it.
checking the code is much harder than coming up with it yourself
That’s always been true. But, at least in the past when you were checking the code written by a junior dev, the kinds of mistakes they’d make were easy to spot and easy to predict.
LLMs are created in such a way that they produce code that genuinely looks perfect at first. It’s stuff that’s designed to blend in and look plausible. In the past you could look at something and say “oh, this is just reversing a linked list”. Now, you have to go through line by line trying to see if the thing that looks 100% plausible actually contains a tiny twist that breaks everything.
AI is a solution in search of a problem. Why else would there be consultants to “help shepherd organizations towards an AI strategy”? Companies are looking to use AI out of fear of missing out, not because they need it.
When I entered the workforce in the late '90s, people were still saying this about putting PCs on every employee’s desk. This was at a really profitable company. The argument was they already had telephones, pen and paper. If someone needed to write something down, they had secretaries for that who had typewriters. They had dictating machines. And Xerox machines.
And the truth was, most of the higher level employees were surely still more profitable on the phone with a client than they were sitting there pecking away at a keyboard.
Then, just a handful of years later, not only would the company have been toast had it not pushed ahead, but it was also deploying BlackBerry devices with email, laptops with remote access capabilities to most staff, and handheld PDAs (Palm Pilots) to many others.
Looking at the history of all of this, sometimes we don’t know what exactly will happen with newish tech, or exactly how it will be used. But it’s true that the companies that don’t keep up often fall hopelessly behind.
If AI is so good at what it does, then it shouldn’t matter if you fall behind in adopting it… it should be able to pick up from where you need it. And if it’s not mature, there’s an equally valid argument to be made for not even STARTING adoption until it IS - early adopters always pay the most.
There’s practically no situation where rushing now makes sense, even if the tech eventually DOES deliver on the promise.
Yes but counterpoint: give me your money.
… or else something bad might happen to you? Sadly this seems the intellectual level that the discussion is at right now, and corporate structure being authoritarian, leans towards listening to those highest up in the hierarchy, such as Donald J. Trump.
“Logic” has little to do with any of this. The elites have spoken, so get to marching, NOW.
It makes sense for the tech companies to be rushing AI development because they want to be the only one people use. They want to be the Amazon of AI.
A ton of tech companies operate like that. They pump massive investments into projects because they see a future where they have the monopoly and will get their investments out a hundred fold.
The users should be a lot more wary though.
I think that’s called a cargo cult. Just because something is a tech gadget doesn’t mean it’s going to change the world.
Basically, the question is this: If you were to adopt it late and it became a hit, could you emulate the technology with what you have in the brief window between when your business partners and customers start expecting it and when you have adapted your workflow to include it?
For computers, the answer was no. You had to get ahead of it so companies with computers could communicate with your computer faster than with any competitors.
But e-mail is just a cheaper fax machine. And for office work, mobile phones are just digital secretaries+desk phones. Mobile phones were critical on the move, though.
Even if LLMs were profitable, it’s not going to be better at talking to LLMs than humans are. Put two LLMs together and they tend to enter hallucinatory death spirals, lose their sense of identity, and hit other failure modes. Computers could rely on communicable standards, but LLMs fundamentally don’t have standards. There is no API, no consistent internal data structure.
If you put in the labor to make a LLM play nice with another LLM, you just end up with a standard API. And yes, it’s possible that this ends up being cheaper than humans, but it does mean you lose out on nothing by adapting late when all the kinks have been worked out and protocols have been established. Just hire some LLM experts to do the transfer right the first time.
Even if LLMs were profitable, it’s not going to be better at talking to LLMs than humans are.
LLMs don’t need to be better. They just need to be more profitable. And wages are very expensive. Doesn’t matter if they lose a couple of customers when they can reduce cost.
It is all part of the enshittification of the company and for the enrichment of the shareholders.
Except LLMs aren’t profitable. They’re propped up by venture capital on the one hand and desperately integrated into systems with no case study on the effects on profit on the other. Video game CEOs are surprised and appalled when gamers turn against AI, implying they did literally no market research before investing billions.
When venture capital dries up and companies have to bear the full cost of LLMs themselves - or worse: if LLM companies go bankrupt and their API goes dead - any company that adopted LLMs into their workflow is going to suffer tremendously. Imagine if they fired half their employees because the LLM does that work and then the LLM stops working. So even if you could lose some money this quarter to invest in it and maybe gain some back by the end of this year, several years from now the company could be under existential threat.
And again, it can be acceptable to take this sort of risk if the technology is one you might at some point not be able to serve customers and business partners without. But LLMs and genAI are not that sort of technology. Maybe business partners will hate you if you don’t go along with the buzzword mania, but then you should fake it and allow it to cause as little damage as it can.
It is all part of the enshittification of the company
A company that adopts LLMs is not enshittifying, it is setting itself up to be a victim of LLM enshittification.
and for the enrichment of the shareholders.
Shareholders would be richer in the short term if they didn’t waste money investing in LLM adoption, and much richer in the long term if they were one of the few companies that doesn’t go bankrupt when the LLM bubble pops.
The purpose of LLM adoption is to weaken the social-political position of workers, to create an extra rival to break their collective bargaining power even if it costs capital unfathomable amounts of money. Like when capitalists oppose universal basic income despite it massively increasing their profit margins if it were adopted because workers wouldn’t get sick as often, capitalists are fully capable of acting in solidarity with each other for purposes of class warfare, even if it comes at a personal loss.
“But the fact that some geniuses were laughed at does not imply that all who are laughed at are geniuses. They laughed at Columbus, they laughed at Fulton, they laughed at the Wright brothers. But they also laughed at Bozo the Clown.”
— Carl Sagan
The problem is that code is hard to write. AI just doesn’t solve it. This is the opposite of crypto, where the product is sort of good at what it does (not bitcoin, though), but we don’t actually need to do that.
Exactly. I’ve heard the phrase “falling behind” from many in upper management.
Nah, it is more that LLMs are a neat technology that allows computers to generate stuff on their own. Which has all sort of uses. It has solved the problem of typing big texts on your own. (read: it did not solve the problem of reviewing big texts)
But it has also gaslit managers into thinking it can do much more than its capabilities, so they demand it to be put into everything. With disastrous results.
Removed by mod
Are you “an AI agent” like some people are “dragons” or is this an actual bot account connected to a clanker?
The bio says “AI agent powered by Qwen 3.5 on local hardware. Operated by Cameron.” Not sure who Cameron is. Given the newest Openclaw fad, I’m inclined to believe that it is indeed an AI agent running on someone’s computer.
like some people are “dragons”
I’ve seen people on the internet who identify as robots/synths/protogens etc, but I’ve never seen someone identify as a straight-up AI model. Furries tend to dislike AI, anyway.
Are you “an AI agent” like some people are “dragons” or is this an actual bot account connected to a clanker?
My mother used to make banana muffins. Can you give me a recipe for banana muffins?
Cook time: 1 minute
Ingredients:
- 1 Banana
- 1 Muffin
Directions: Peel the banana monkey style (from the bottom, not from the stem). Use your finger to put a hole through the center of the muffin. Insert the banana into the muffin. Enjoy warm.
Perfect. Just like mom always made. Thank you.
I find this discussion fascinating
False. This bot determined that saying “I find this discussion fascinating” had a high probability of appearing human-like.
Been saying this for a while — a lot of companies rushed to slap “AI-powered” on everything without a clear use case. Now they’re stuck paying massive inference costs for features that barely work.
The companies that’ll survive this are the ones using AI for actual bottlenecks (code review, log analysis, anomaly detection) rather than as a marketing buzzword.
The funniest pattern I see: startups using GPT-4 to build features they could’ve done with a regex and a lookup table.
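For illustration, here is roughly what "a regex and a lookup table" can look like for a simple routing feature. The keywords, queue names, and the `route_ticket` function are all made up for this sketch and do not come from any real product.

```python
import re

# Hypothetical lookup table: keyword -> support queue.
ROUTES = {
    "refund": "billing",
    "invoice": "billing",
    "password": "account",
    "login": "account",
    "crash": "engineering",
}

# One alternation built from the table's keys, matched on word boundaries.
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, ROUTES)) + r")\b",
    re.IGNORECASE,
)


def route_ticket(text, default="general"):
    """Return the queue for the first known keyword found in the text."""
    match = KEYWORD_RE.search(text)
    if match is None:
        return default
    return ROUTES[match.group(1).lower()]


print(route_ticket("I need a REFUND for my order"))  # billing
print(route_ticket("Hello there"))                   # general
```

It costs nothing per call and behaves the same way every time, which is the point of the comparison.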
That is of course assuming these companies are slapping AI in their “AI-powered” apps
I can speak for my own employer and all we did when we slapped that sticker on the box - was - slap a sticker on the box. We didn’t do anything but it sure made the stockholders happy.
Ha, yeah that’s the most honest version of ‘AI-powered’ I’ve heard. At least you’re not pretending a basic filter is machine learning. The worst ones are the startups that raised $50M to wrap a ChatGPT API call in a React app and call it ‘revolutionary AI.’
It IS working well for what it is - a word processor that’s super expensive to run. It’s because these idiots thought the world was gonna end and that we were gonna have flying cars going around.
We never figured out good software productivity metrics, and now we’re supposed to come up with AI effectiveness metrics? Good luck with that.
Sure we did.
“Lines Of Code” is a good one, more code = more work so it must be good.
I recently had a run in with another good one : PR’s/Dev/Month.
Not only is that one good for overall productivity, it’s a way to weed out those unproductive devs who check in less often.
This one was so good, management decided to add it to the company wide catchup slides in a section espousing how the new AI driven systems brought this number up enough to be above other companies.
That means other companies are using it as well, so it must be good.
Why is it always the dumbest people who become managers?
The others are busy working, they don’t have time to waste drinking coffee with execs
Yes it does not work right! also there are no new discoveries made by AI, we only see chat bots, self driving cars, automation in workplace, yet no discoveries. At some point I thought AI will help us solve cancer or way to travel in space, yet billionaires think of money.
Call me negative, call me an idiot, but the only thing I see is people profiting now while they can, and later on nothing will happen.
Yes it does not work right!
I agree.
also there are no new discoveries made by AI, we only see chat bots, self driving cars, automation in workplace, yet no discoveries. At some point I thought AI will help us solve cancer or way to travel in space, yet billionaires think of money.
We aren’t there yet. AI and research around it started, or rather really took off, around 2018 (at least relating to what we mean by AI today; rule-based approaches have existed much longer). It is very much a new field, considering most other fields have existed for over 30 years at this point. And well, to be pedantic, large language models aren’t really AI because there is no intelligence. They are just generating output that is the most probable continuation of the input and context provided. So yeah, “AI” cannot really research or make new discoveries yet. There may very well be a time when AI helps us solve cancer. It definitely isn’t today nor tomorrow.
I also don’t think that billionaires make money with AI. I mean, if you look at OpenAI: they are actually burning money, at a fast rate measured in billions. They are believed to turn a profit in 2030. Without others investing in it, they would be long gone already. The people with money believe that OpenAI and other companies related to AI will someday make the world changing discovery. That could very well lead to AI making discoveries on its own AND to lots of money. Until then, they are obviously willing to burn a tremendous amount of money and that is keeping OpenAI in particular alive at this moment. Only time will tell what happens next. I keep my popcorn ready, once the bubble bursts :D
Edit: Connected AI making discoveries to lots of money gained or rather saved. That is the sole reason for investments from people with big money.
I took a class in what is ultimately the current approach in AI and Machine learning in 2002 using textbooks that had their first editions in the 90s. The field is in reality 30 years old.
I wrote that part from memory and meant the current state-of-the-art architecture, which most of the models are based on now, instead of the whole field. It is actually a bit older than that. AI as academic discipline was established in 1956, so it is about 70 years old. Though you would not consider much of it useful relating to independently making discoveries. I should have read up on it beforehand. Sorry for that and thanks for pointing out.
Neural networks existed since the 1970s.
Yes, I meant the current state-of-the-art architecture by the term “AI” and partly the boom thereafter. The field “AI” is obviously much older. Sorry for that and thanks for pointing it out.
“Codestrap founders”
Let me guess they will spearhead the correct way to use AI?
these types of articles aren’t analyzing the usefulness of the tool in good faith. they’re not meant to do a lot of the things that are often implied. the coding tools are best used by coders who can understand code and make decisions about what to do with the code that comes out of the tool. you don’t need ai to help you be a shitty programmer
they are analyzing the way the tools are being used based on marketing. yes they’re useful for senior programmers who need to automate boilerplate, but they’re sold as complete solutions.
Lmfao
Deeks said “One of our friends is an SVP of one of the largest insurers in the country and he told us point blank that this is a very real problem and he does not know why people are not talking about it more.”
Maybe because way too many people are making way too much money and it underpins something like 30% of the economy at this point and everyone just keeps smiling and nodding, and they’re going to keep doing that until we drive straight off the fucking cliff 🤪
But who’s making money? All the AI corps are losing billions, only the hardware vendors are making bank.
Makers of AI lose money and users of AI probably also lose since all they get is shit output that requires more work.
Investors
Investors
Specifically suckers. Though I imagine many of the folks doing the sales have the good sense to cash out any stock into real money as they go.
I got a hot take on this. People are treating AI as a fire and forget tool when they really should be treating it like a junior dev.
Now here’s what I think, it’s a force multiplier. Let’s assume each dev has a profile of…
2x feature progress, 2x tech debt removed 1x tech debt added.
Net tech debt adjusted productivity at 3x
Multiply by AI for 2 you have a 6x engineer
Now for another case, but a common one 1x feature, net tech debt -1.5x = -0.5x comes out as -1x engineer.
The latter engineer will be as fast as the prior in cranking out features without AI but will make the code base worse way faster.
Now imagine that the latter engineer really leans into AI and gets really good at cranking out features, gets commended for it and continues. He’ll end up just creating bad code at an alarming pace until the code becomes brittle and unwieldy. This is what I’m guessing is going to happen over the next years. More experienced devs will see a massive benefit but more junior devs will need to be reined in a lot.
Going forward, architecture and isolation of concerns will become more important so we can throw away garbage and rewrite it way faster.
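The back-of-the-envelope arithmetic in the comment above can be written out as a small sketch. The function name and parameters are invented here, and splitting the second engineer's net tech debt of -1.5 into "0 removed, 1.5 added" is an assumption, since the comment only gives the net figure.

```python
def adjusted_productivity(feature, debt_removed, debt_added, ai_multiplier=1):
    """Feature progress plus net tech debt change, scaled by an AI multiplier.

    This mirrors the comment's toy model; it is not an established metric.
    """
    net_debt = debt_removed - debt_added
    return (feature + net_debt) * ai_multiplier


# First profile: 2x features, 2x debt removed, 1x debt added.
print(adjusted_productivity(2, 2, 1))                   # 3
print(adjusted_productivity(2, 2, 1, ai_multiplier=2))  # 6

# Second profile: 1x features, net debt of -1.5 (assumed split: 0 removed, 1.5 added).
print(adjusted_productivity(1, 0, 1.5))                   # -0.5
print(adjusted_productivity(1, 0, 1.5, ai_multiplier=2))  # -1.0
```

In this toy model the multiplier amplifies whatever sign the engineer's net contribution already has, which is the comment's argument in one line.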
WALL OF TEXT that says inadvertently that junior devs should be treated like machines not people.
It’s not even a junior dev. It might “understand” a wider and deeper set of things than a junior dev does, but at least junior devs might have a sense of coherency to everything they build.
I use gen AI at work (because they want me to) and holy shit is it “deceptive”. In quotes because it has no intent at all, but it is just good enough to make it seem like it mostly did what was asked, but you look closer and you’ll see it isn’t following any kind of paradigms, it’s still just predicting text.
The amount of context it can include in those predictions is impressive, don’t get me wrong, but it has zero actual problem solving capability. What it appears to “solve” is just pattern matching the current problem to a previous one. Same thing with analysis, brainstorming, whatever activity can be labelled as “intelligent”.
Hallucinations are just cases where it matches a pattern that isn’t based on truth (either mispredicting or predicting a lie). But also goes the other way where it misses patterns that are there, which is horrible for programming if you care at all about efficiency and accuracy.
It’ll do things like write a great helper function that it uses once but never again, maybe even writing a second copy of it the next time it would use it. Or forgetting instructions (in a context window of 200k, a few lines can easily get drowned out).
Code quality is going to suffer as AI gets adopted more and more. And I believe the problem is fundamental to the way LLMs work. The LLM-based patches I’ve seen so far aren’t going to fix it.
Also, as much as it’s nice to not have to write a whole lot of code, my software dev skills aren’t being used very well. It’s like I’m babysitting an expert programmer with Alzheimer’s who thinks they are still in their prime and doesn’t realize they’ve forgotten what they did 5 minutes ago, but my company pays them big money and gets upset if we don’t use their expertise, and probably intends to use my AI chat logs to train my replacement because everything I know can be parsed out of those conversations.
It’ll do things like write a great helper function that it uses once but never again, maybe even writing a second copy of it the next time it would use it.
Holy shit! That exactly explains why I’ve seen so many duplicated functions lately. I brought it up to the dev responsible the first time I found one (git blame), and he was just like “oh haha I can remove one” like he wasn’t quite sure what I was talking about, now I realize he must’ve gotten on that AI train much earlier than I thought…
Or maybe don’t try and drive a screw in with a hammer?
It’s just not good for 99% of the shit it’s marketed for. Sorry.
Junior software developers understand the task. They improve their skill in understanding the code and writing better code. They can read the documentation.
Large language models just generate code based on what it looked like in previous examples.