Reflections on Gen AI: 2025
2025 was quite a year in the Gen AI space. Here are my reflections on the year that's been. This isn't meant to be a comprehensive analysis, just thoughts on the areas I dabbled in.
Limitations of LLMs
For all the progress in 2025, some fundamental limitations remain.
No Continual Learning - LLMs don't have a native way to learn continually (except in context); you essentially start from scratch in every session. Genuinely new innovation is required before LLMs can learn continuously, though the expansion of context windows in some models has alleviated part of the challenge.
Vulnerability to Model Collapse - LLMs tend to overfit to their own biases. Their output lacks the richness and diversity of human data, and they keep revisiting the same few ideas (unless you explicitly force them to go broader via prompts and context).
Pre-Training Data Limitations - Output quality is higher when the LLM has seen similar data in its pre-training set. LLMs are not great at coming up with genuinely new ideas (despite ingesting most available data sets on the web, they have made no meaningful discoveries so far).
Middle to Middle - Today's AI does tasks middle to middle (not end to end), which moves all the cost to prompting and verification. Balaji has a wonderful article on this.
Hallucinations - While this is improving, hallucinations remain a challenge. In high-stakes domains where mistakes are costly, human verification is a must.
While these limitations exist today, given the pace of innovation and the talent density in the AI space I suspect all of these problems will be solved over the next few years. But we are not there yet.
So what is important as a consequence
Prompt Engineering, Context Engineering and Verification - Because AI works middle to middle, prompting, sharing the right context and verifying output are the three critical skills every practitioner should pick up. Prompt and context engineering get discussed a lot, but verification as a skill is underappreciated.
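One way to make verification concrete is to check model output mechanically instead of by inspection. Here is a minimal sketch in Python; `llm_generate` is a hypothetical stand-in for any model call, and the verification criterion (a unit test on generated code) is just one illustrative example.

```python
def llm_generate(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns pretend output.
    return "def add(a, b):\n    return a + b"

def verify(code: str) -> bool:
    # Execute the generated code in a scratch namespace and test it,
    # rather than trusting it by inspection.
    ns: dict = {}
    try:
        exec(code, ns)
        return ns["add"](2, 3) == 5
    except Exception:
        return False

candidate = llm_generate("write an add function")
accepted = verify(candidate)
print(accepted)
```

The point isn't this particular check; it's that an output with a cheap, automatic verifier is far easier to accept at scale than one that needs a human read-through.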
The lure of full autonomy - Karpathy, in his brilliant presentation, talks about building applications with autonomy sliders. Until the limitations above are solved, partially autonomous solutions (agents) are the way to go. End-to-end automation should be limited to repeatable tasks that are easy to verify (a small subset of the work we do today). Human agency will remain the biggest differentiator: we need to design workflows that keep a human in the loop at every step, let people stay in control and make verification easy.
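The human-in-the-loop pattern can be sketched as an approval gate: low-risk, easily verified steps run automatically, while anything risky pauses for explicit sign-off. Everything below is illustrative; `run_agent_step`, `Proposal` and the `risky` flag are assumptions, not any real agent framework's API.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    action: str
    risky: bool  # flagged by the agent itself or a separate classifier

def run_agent_step(task: str) -> Proposal:
    # Stand-in for a real agent call; returns a proposed action.
    return Proposal(action=f"draft reply for: {task}", risky=False)

def execute(proposal: Proposal, approve) -> str:
    # The "autonomy slider": safe steps proceed, risky ones need a human.
    if proposal.risky and not approve(proposal):
        return "rejected"
    return f"executed: {proposal.action}"

result = execute(run_agent_step("customer email"), approve=lambda p: False)
print(result)
```

Moving the slider is then just a policy change in `approve` (or in what counts as `risky`), without restructuring the workflow.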
Building in an era where models continue to improve - Models will continue to evolve, and some of the limitations above will eventually go away. With that in mind, doing the simplest thing that works is the way to go.
What excites me
- Coding agents are scary good - As I mentioned above, agents whose output is easily verifiable will continue to drastically improve, and coding agents are a great example. Opus 4.5 and GPT 5.2 are a step-change improvement; I'm amazed at how good they are. They will continue to improve, and we are already in a world where the vast majority of code will be written by AI.
- Jevons Paradox for knowledge work - Whenever technology has made a resource dramatically cheaper to use, demand for that resource has gone up, not down. As coding agents continue to improve and write the vast majority of code, this will only increase the addressable market for software. We'll build more software, we'll build software we've never built before and we'll use software to solve challenges that were unsolvable in the past. I don't see this as an existential risk to the software engineering profession - I see it as an opportunity to solve harder problems and build incredible software.
- Ideas are the bottleneck - Ideas and the ability to execute on them will become key differentiators in the era where building software becomes easier and easier.
- The experience layer of AI is in its infancy - Most AI experiences today default to the chat paradigm (or voice). This is understandable, and it's the easiest way to bring existing users into AI-first experiences. But there is a lot of opportunity for innovation in this space. I suspect that in 2026 we will see far more dynamic experiences that evolve alongside the user.
- Golden age of learning/curiosity - Learning and personalised education are the killer apps in the Gen AI space. The technology is finally there to help each individual learn based on their interests, strengths and weaknesses, at their own pace. The era of one-size-fits-all, industrialised learning is coming to an end.
There was so much innovation this year. 2026 promises to be even better. As building gets easier, the bottleneck shifts to ideas worth building. That's where I want to spend my time - finding hard problems, staying curious, and building things that matter.