The Karpathy Year

There is a particular sound that the AI commentariat makes when Andrej Karpathy publishes a tweet, and the sound is not subtle. It is the sound of several thousand engineers refreshing the same screen at the same time. It is the sound of an industry that has, in some specific and slightly unhealthy way, made an individual researcher into its index of consensus. It is the sound of a community that takes its cues — its vocabulary, its critiques, its embarrassments, its sense of what is and isn’t real — from a single person, and that has done so for long enough that the practice has stopped being remarked on.

Karpathy is, in some sense, an unlikely candidate for the role. He is not a chief executive. He is not, by the conventional accounting, an operator of comparable scale to Sam Altman or Dario Amodei or Demis Hassabis. He has not raised a hundred-million-dollar funding round. He has not, in the way that the figures who usually anchor the industry’s narrative do, been forced to choose what to defend on a daily basis. What he has done, instead, is teach the field how to think. He wrote the lecture notes that a generation of researchers used to learn deep learning. He led Tesla’s Autopilot vision team through the years when the Autopilot conversation went from absurd to plausible to plausible-but-difficult. He returned to OpenAI for a stint that, in his telling, was about figuring out what the post-GPT generation of work actually looked like. And then, in early 2024, he left a second time, with the kind of measured exit announcement that the AI field has come to read as a tell that something is being built.

What he was building, when he announced Eureka Labs in the summer of 2024, was the AI-native university he had been talking about for years in interviews and on Twitter. The first product was LLM101n, a deliberately structured course on building a working language model from scratch. The course was not designed to be the easy path. It was designed to be the path that produces engineers who actually understand what they have built, in a year when the dominant trade-press story was about engineers who don’t and don’t need to. Eureka Labs was, at its founding, the most public articulation Karpathy had ever made of his pedagogical position, which is, on a clean reading, that AI is too important to be left to the people who don’t bother to learn how it works.

The pedagogical position would, over the course of the next eighteen months, undergo a series of public stress tests that would change the shape of the AI conversation in ways that almost no individual researcher had previously managed. The stress tests came one at a time. They began, in retrospect, with a single tweet in February 2025, in which Karpathy coined a phrase that would, by the end of the year, be a job title.

The word

The phrase was vibe coding. The tweet, cited on Karpathy’s Wikipedia page and reproduced, by then, in every AI newsletter in active circulation, described a new mode of building software in which a developer simply prompts a language model to write the application and accepts the output without reading the code. The phrase, as Karpathy used it, was not strictly an endorsement. It was a description. He was naming a behavior he had observed, in his own work and in the work of the people around him, that had become common in 2024 and was about to become dominant in 2025. The phrase came with a specific tonal register: a half-bemused, half-impressed account of building, in a few hours, applications that a year earlier would have taken weeks.

What happened next is the part of the story that surprised even the people who had been watching the AI tooling ecosystem closely. The phrase, within thirty days, became the dominant frame for an entire category of consumer software. Lovable, the European prompt-to-app builder, branded itself around it. Cursor and Replit and Bolt and v0 all, in slightly different ways, positioned themselves as the vibe-coding stack. By the summer of 2025, the phrase was on entry-level job postings in the form of “vibe coding developer skills.” By the fall, Ramp had advertised a Vibe Growth Marketing Manager role that explicitly extended the metaphor to marketing operations. By the spring of 2026, HBR was publishing endorsement essays about the redesign of marketing organizations around vibe-style workflows. A casual phrase, coined in the way that researcher Twitter coins phrases, had become a category.

The phrase was Karpathy’s. The category was, by the end of 2025, a multibillion-dollar one. The relationship between the two facts is the thing that the AI field’s commentary class has not, in any settled way, figured out how to talk about. Karpathy did not build the category. He did not, by any standard reading, profit from it. He did not, in the months after the tweet, even particularly endorse it. What he did, by naming it, was give the category permission to exist. The naming was the thing.

"The single most consequential AI essay of the year was a thirty-word tweet. The author was not paid for it. The author did not, in any of the standard senses, sell it. The author simply named a behavior. The behavior, once named, became a market."

The critique

The second stress test came in the form of a long blog post, published in the late summer of 2025, in which Karpathy used a word for the broader category of AI-generated output that would prove almost as durable as vibe coding. The word was slop. The post — preserved in the same Wikipedia entry that catalogs the rest of his year — argued that the current generation of agentic AI systems, including the coding agents that had been the central story of the venture press for twelve straight months, was producing output that, on any clean reading, was not yet good enough to be relied on for the work it was being marketed for. The argument was specific. It was technical. It cited examples. It was, by every standard reading of the post, a thoughtful critique by a senior researcher of a category in which he had been an early and prominent participant.

The reception of the post is the part of the story that explains why Karpathy occupies the position he occupies in the industry’s commentary class. The post was not, in the standard sense, a viral one. It was a long technical essay on a blog that the AI commentariat had been reading for a decade. What the post did, in a way that a less-positioned author’s post would not have done, was license a whole category of critique that had, until then, been muted by the social pressure of the funding environment. After Karpathy used the word, every senior researcher in the field who had been privately skeptical of the agent boom had cover to be publicly skeptical. The cover was the point.

The critique was particularly cutting because it landed in the middle of the loudest revenue year in the history of coding agents. By the time Karpathy published, Claude Code was on a $2.5B run-rate. Cursor was on a $2B ARR pace. Devin’s ARR had grown from $1M to $73M in nine months. The numbers, by any reasonable accounting, were not the numbers of slop. They were the numbers of a category that, whatever its limitations, was being adopted by the customers it had been built for at a velocity the software industry had almost never seen.

This is the apparent contradiction that the post forced the field to confront, and that the field has been arguing about ever since. Karpathy said slop. The revenue said adoption. Both could be true at once: the output could be unreliable in the specific ways that Karpathy described, and the unreliability could be acceptable enough to a sufficient fraction of customers that the products would, in fact, generate billions of dollars in annual revenue. The two facts, on the most careful reading available, did not actually contradict each other. They were two sides of the same coin. The coding agents were, in fact, often producing slop. The customers, in fact, did not entirely mind.

The implication, which Karpathy’s post stopped short of stating but which the field’s most honest commentary worked out in the following months, was that the AI industry had, without quite intending to, produced a category of products whose commercial viability did not require the level of correctness that the engineering culture of the industry had previously demanded. The products worked well enough to sell. The selling was the only ground truth the market was prepared to credit. Karpathy’s critique was, in retrospect, the moment the industry’s engineering culture and the industry’s commercial culture diverged in public for the first time.

The hands that stopped writing code

The third stress test was the smallest and, in some ways, the most unsettling. In a March 2026 conversation that Karpathy later acknowledged on his own social channels, and that is recorded on his Wikipedia page, he said that he had not personally written a line of code since December 2025. The remark was offered in passing. It was not, by any reading, a celebratory remark. It was a confession. The man who had written the lecture notes that taught a generation of researchers to think in PyTorch, the man who had spent a quarter-century building the muscle memory of a working programmer, the man who had publicly endorsed neither the most enthusiastic nor the most dismissive read on the AI tooling ecosystem, had stopped writing code with his own hands.

The remark landed harder than its surface suggested. Inside the field, the question that the remark forced was not whether Karpathy was a good or bad programmer. The question was what it meant that the most-listened-to programmer of the previous decade had, by his own account, delegated his programming entirely to the systems he had once taught people to build. The most generous reading of the remark was that the systems had become good enough to do the work, and that the senior researcher’s job was now to direct them, not to perform the work itself. The least generous reading was that the senior researcher had, in a way he was not yet entirely comfortable with, conceded a position he had spent his career defending.

Both readings were, in their own way, right. Karpathy did not, in the months after the remark, retract it. He did not, in the months after the remark, walk it back into a more comfortable framing. He let it stand. He kept the position. The position was, on the most honest reading available, that the tooling had gotten good enough that even the most disciplined practitioner could no longer reasonably justify the cost of writing code by hand for the day-to-day work he was doing, and that the resulting trade (speed for muscle memory, leverage for fluency) was a trade he had, on net, decided to take. The trade was the position. The position was, by the spring of 2026, the most discussed personal position in the field.

The reason the remark mattered, beyond the personal arithmetic of Karpathy’s own work, is that it forced the field to grapple with a question that the year’s earlier conversations had managed to defer. The earlier conversations had, in roughly equal measure, treated AI tooling as either a productivity multiplier for skilled engineers or a replacement for unskilled ones. The remark forced a third reading. The third reading was that AI tooling, at its current best, was a replacement for the day-to-day work of even the most skilled engineers: the skilled-engineer category had not been spared by the tooling, only displaced upward into a role that looked less and less like the role the category had occupied for the previous fifty years. The remark was, in effect, a public admission that the displacement had reached the top of the field, and that the top of the field, on close inspection, had decided to accept it.

The choice

The fourth stress test came in May 2026, and the AI field has not, as of this writing, finished metabolizing it. On May 19, 2026, Karpathy announced, through the channels he uses for these things (a short tweet, a slightly longer blog post, and the cooperation of the Wikipedia editing community), that he was joining Anthropic to lead a pretraining research team. The announcement was, by Karpathy’s standards, terse. The terseness was the signal.

The announcement had three immediate consequences. The first was that Eureka Labs, which had been the central public project of his post-OpenAI period, became, on the available evidence, a part-time concern. The second was that the most-listened-to individual in the field had, after eighteen months of being publicly unaligned, chosen a closed lab. The third was that the closed lab he chose was not, by any reading, the obvious one.

The obvious choice, in the spring of 2026, would have been to rejoin OpenAI. Karpathy had been at OpenAI twice already. The institutional muscle memory was there. The personal relationships were there. The capital was there. The reputational gravity, on the standard reading, was there. The OpenAI of 2026 was, by every public signal, in the middle of a generational push on post-training and agentic systems, and Karpathy’s research interests had been, on every public account, perfectly aligned with that push. The obvious choice was to go home.

He did not go home. He went to Anthropic. The choice was, on its surface, a personal one — Karpathy’s choices have always been, on his telling, personal ones, and he has been consistent in describing his moves as functions of what he wants to work on rather than what the market wants him to work on. But the choice landed, in the AI commentary class, as something larger than a personal one. It landed as an endorsement.

Anthropic, in the version of the story the AI field had been telling itself for the previous two years, was the lab that had positioned itself as the responsible-actor alternative to OpenAI. The positioning was, on close inspection, a partial caricature: Anthropic ships closed-source frontier models, raises large rounds, pays competitive salaries, and operates in roughly the same commercial frame as OpenAI does. But the caricature had been doing real work in the field’s narrative. Anthropic was the lab whose safety team had not been eviscerated. Anthropic was the lab whose Responsible Scaling Policy (RSP) was being treated, in the policy press, as the de facto industry standard. Anthropic was the lab whose Claude models, in the technical community’s quiet running consensus, were the ones the senior practitioners actually used when they had to ship something they cared about.

Karpathy choosing Anthropic was, on the most cynical possible reading, simply a senior researcher choosing a senior research role. But on every other reading available, it was a vote. It was a vote for the version of the AI industry that took the post-training work seriously as a long-cycle project, that took the safety work seriously as a long-cycle project, that took the commercial work seriously as a downstream consequence of the long-cycle work and not as the work itself. The vote was, arguably, the year’s loudest one.

"The single most-listened-to individual in the field, after eighteen months of being publicly unaligned, chose a closed lab. The closed lab he chose was the one whose Claude models were the ones the senior practitioners actually used when they had to ship something they cared about. The choice was not subtle. The choice was a vote."

What the vote means

The vote means, on the most charitable reading, that the senior research class of the AI field has, in 2026, decided that the closed-frontier-lab model is the model that produces the most serious work. The vote does not, on this reading, foreclose the value of open-source efforts or of smaller-scale specialized labs. It simply asserts that, at the frontier, the closed-lab model is the one where the most difficult work is being done. The assertion, on the available evidence, is hard to refute. The frontier-scale models that the technical community uses for its hardest work are, almost exclusively, the closed ones. The closed labs have, in 2025 and 2026, attracted the senior research class at a rate that the open-source labs have not been able to match. The vote, on this reading, is a recognition of where the gravity is.

The vote means, on the most skeptical reading, something less flattering. It means that the financial incentives of the closed-lab system have become impossible to walk away from, even for the senior researchers who have spent their careers articulating a more open vision of the field’s future. The compensation packages at Anthropic and OpenAI are, by 2026, comfortably in the eight-figure range for senior research staff. The equity stakes are in the same range. The infrastructure budgets are, by orders of magnitude, beyond what any open-source effort can credibly match. The skeptical reading is that the vote is not a vote for the closed-lab model’s seriousness. It is a vote for the closed-lab model’s resources. The vote is, in effect, the resignation letter of the open-research vision of the field’s future.

Both readings are, on the available evidence, partially right. The closed-lab model is, in 2026, producing the most serious frontier work and offering the most compelling resources. The two facts, on close inspection, are downstream of each other. The serious work is the work the resources make possible. The resources are the resources the serious work justifies. The vote is, in effect, a vote for the loop.

The implications for the rest of the field are the thing the commentary class has been working out since the May announcement, and the implications are not minor. Eureka Labs, on the version of its future that Karpathy’s blog post implicitly described, will continue. But it will continue as a side project for a senior researcher who now leads a pretraining research team at Anthropic. The AI-native university (the idea that had been, in the version of the post-OpenAI story Karpathy was telling, the central project of his next decade) is now, on the most honest reading, a longer-cycle bet than the Anthropic role will allow him to fully commit to. The bet is not dead. But it is, on every available signal, a slower bet than it was in the period before May 19.

The implications for the broader pedagogical project of the field are similar. Karpathy had been, in the years since LLM101n was announced, the most prominent advocate for the position that AI’s educational stack needed a generational rewrite. The position was that the current generation of AI courses (the ones produced by the platforms, the bootcamps, the corporate certification programs) was not, on any close inspection, doing the work the field actually needed. The work the field needed was not a credential. It was a curriculum that produced people who could build the systems from first principles. Karpathy had been, more than any other individual, the voice for that position. The May announcement does not, in any direct sense, retract the position. But it concentrates Karpathy’s attention, for the foreseeable future, on the pretraining work at a closed lab. The pedagogical project will continue. The voice for the pedagogical project will, on every available signal, be slightly quieter.

What it means for the agent debate

The most immediate consequence of the May move, for the year’s loudest argument, is harder to summarize cleanly. The agent debate (the argument over whether the current generation of coding agents is producing slop or producing the future, over whether the revenue numbers vindicate the technology or merely the marketing, over whether Karpathy’s August critique was right) has been, for nine months, the central technical conversation in the field. Karpathy’s choice of Anthropic was, on the most direct reading, a choice to work alongside the team that builds Claude Code, the coding agent that has, in the same period, accumulated the largest run-rate of the category. The choice does not, in any literal sense, retract the August critique. But it is, in effect, a retraction of the framing.

The framing, in the August critique, was that the senior researcher’s role was to call out the gap between what the agents could actually do and what the marketing claimed they could do. The framing of the May move is that the senior researcher’s role is to close the gap from inside. The two framings are not, on close inspection, inconsistent. They are the two halves of the same trajectory. The August critique was the public articulation of where the work was. The May move is the public commitment to where the work needs to go. Karpathy, on the most charitable reading available, was not contradicting himself. He was, in effect, taking responsibility.

The taking of responsibility is the part of the move that has, in the months since, been least discussed in the trade press and most discussed in the technical communities. The technical communities have, with some justice, treated the move as the end of one arc and the start of another. The arc that ended was the arc of Karpathy as a public critic, free to articulate the gaps between the field’s marketing and its work without the social pressure of a corporate affiliation. The arc that started is the arc of Karpathy as a senior leader at the lab he has, in effect, endorsed as the most credible actor in the category. The two arcs, on the most honest reading, are not in tension. They are the same arc, told from different sides of a single decision.

The future the choice forecloses

The choice forecloses, by its nature, certain futures that were available to Karpathy in the period before May 19. It forecloses the future in which he remained an unaligned public commentator on the field, available to write the next big critique with the same independence he had when he wrote the August one. It forecloses the future in which Eureka Labs became, in the next five years, the AI-native university that he had described it as. It forecloses the future in which a senior researcher of his caliber demonstrated, by example, that the open-research vision of the field’s future was capable of producing not just commentary but commitments.

What the choice opens, by way of compensation, is the future in which Karpathy actually closes the gap his earlier work had described. The future in which Claude — and the post-training pipelines that produce it, and the agents that are built on top of it — does not just generate the most revenue in the category but actually produces the most reliable output. The future in which the gap between marketing and work, in the specific case of the lab he has now joined, gets smaller. The future in which the August critique is, retrospectively, the last critique that Karpathy had to write from the outside, because the inside, on the May 19 trade, is now his to fix.

Whether that future materializes is, of course, a separate question, and one that the field will be answering for years. The answer depends on the work. The work depends, in the version of the story that the field’s most honest commentary will tell itself, on whether the senior researcher who has, for a decade, taught the field how to think can, in the next decade, teach a specific lab how to ship. The two skills are related. They are not, on any close inspection, the same. The May 19 announcement is, in effect, the bet that they can be made the same. The bet is on the senior researcher. The senior researcher is the same individual who, over the preceding eighteen months, coined the year’s defining word, wrote the year’s defining critique, and made the personal decision that defined the year’s most uncomfortable conversation. The bet is, on the available track record, not a bad one.

What is clear, in any case, is that the most-listened-to individual in the field has, after a long and conspicuously public period of being unaligned, chosen a side. The side is Anthropic. The choice is, for the foreseeable future of the field’s narrative, the loudest signal Karpathy has sent. The signal is not the August critique. It is the May move. The August critique was the analysis. The May move is the bet. The two, on the cleanest possible reading, are the same arc. They are an arc that ends in a closed lab, doing the work the senior researcher had once been positioned to critique from a distance. The closed lab is now the place where the work is. The work is now the work he has chosen to do.

Imogen Reilly writes for Frontier Bylines about the people who shape the cultural narrative of the AI field. This piece drew on Karpathy’s own public posts and the Wikipedia chronology, on SiliconRepublic’s coverage of Eureka Labs, and on the revenue figures reported by Lab7AI and TechCrunch. Karpathy declined a long-form interview request. Anthropic confirmed the May 19 hire and declined further comment.