From AI winter to AI spring: The great transformation
15 min read

In Part 1 of The Effortless Podcast’s two-part series on the history of AI, we explored AI’s long cycles of hype and disappointment. Compute was scarce, models were brittle, and funding disappeared as soon as reality failed to match lofty promises.
These repeated setbacks led to the AI winters we saw in Part 1, when optimism froze and progress stalled.
Then, in 2017, the arrival of transformers ended AI’s long winter, and the world saw the dawn of AI spring.
Instead of merely recognizing patterns, AI could now grasp meaning, context, and nuance. This breakthrough fueled the rise of large language models like GPT, shifting AI from rule-based automation to true learning.
How did we get from the cold, rigid systems of AI’s past to the dynamic, adaptable intelligence we see today?
In this concluding part of the miniseries, Dheeraj Pandey, Co-Founder and CEO of DevRev, and Amit Prakash, Co-Founder and CTO of ThoughtSpot, break down AI’s biggest inflection points. They’ve witnessed AI’s evolution firsthand—from the inefficiencies of early models to the moment AI finally woke up. Now, they’ll explore how embedding vectors, transformers, GPUs, and RLHF ignited the AI revolution—and what comes next.
2017: The year transformers set AI free
Before 2017, AI models had the memory of a goldfish. Recurrent Neural Networks (RNNs) were an improvement over earlier neural networks because they had a built-in mechanism for “remembering” past words in a sequence. But their memory was fragile.
They could recall a handful of words, maybe a short sentence, but as the sequence grew longer—like a paragraph or an entire document—the context faded.
The deeper flaw wasn’t forgetfulness, but instability. AI models adjust weights using backpropagation, but in RNNs, these updates became wildly unreliable. Sometimes, the signals guiding learning became too weak (vanishing gradient problem), making early words irrelevant to later predictions. Other times, they grew uncontrollably large (exploding gradient problem), throwing the entire training process into chaos.
Long short-term memory (LSTM) networks, a type of RNN, helped mitigate these issues by selectively deciding what to remember and what to forget. But even LSTMs struggled with long-range dependencies.
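To make the gradient problem concrete, here is a toy sketch (not how any real framework computes gradients): backpropagating through many time steps of a simple RNN repeatedly multiplies the error signal by the recurrent weight, so the signal either shrinks toward zero or blows up.

```python
# Toy illustration of why RNN training destabilizes over long sequences:
# backpropagation through time multiplies the gradient by the recurrent
# weight at every step, so it shrinks or grows geometrically.
def gradient_after(steps: int, recurrent_weight: float) -> float:
    grad = 1.0
    for _ in range(steps):
        grad *= recurrent_weight  # one backprop step through time
    return grad

print(gradient_after(50, 0.9))  # ~0.005 -> vanishing gradient
print(gradient_after(50, 1.1))  # ~117   -> exploding gradient
```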
Without reliable memory, AI couldn’t understand the bigger picture—context that makes language meaningful, like connecting the beginning of a story to its end. AI needed something better.
How transformers melted the long AI winter
The first-level idea is that we are going to limit the context: not infinite, like in an RNN, but a fixed context. That allows us to go quadratic, and it solves the vanishing and exploding gradient problem by limiting the context while allowing a lot more parameters and a lot more connectivity. What’s the nature of this connectivity? The nature of this connectivity is the attention mechanism.
The game-changing moment came when a team of Google researchers introduced a radically different model in their landmark 2017 paper, Attention Is All You Need.
This paper introduced the transformer architecture, which replaced sequential processing with something entirely new: self-attention. Instead of processing words one by one, AI could now scan an entire sentence at once and determine which words mattered most.
Let’s break this down.
Traditional AI models treated language like a chain—each word was connected to the one before and after it. Transformers, on the other hand, treated language like a network—every word was connected to every other word, all at once.
Imagine you’re translating the sentence: “After a long journey, the tired traveler finally reached home.”
A traditional model would process each word in sequence, from left to right, predicting the best translation for each word based on the previous words. It would interpret “After” first, then “a long journey,” and so on, step by step.
A transformer-based model, on the other hand, examines the entire sentence at once, identifying which words are most relevant to one another. For example, it might recognize that “tired traveler” and “reached home” are closely related, as they convey the traveler’s fatigue and the relief of arriving home. It also understands that “after a long journey” provides essential context for the traveler’s exhaustion. This ability to understand relationships at scale is what powers real-time translations, intelligent search engines, and AI models like GPT that can write essays, code software, and hold coherent conversations.
“Some of the researchers thought, why don’t we train a network that will tell me which word to pay attention to during translation?” Amit recalls. “And so this mechanism was called attention.”
Dot products: math that makes AI pay attention
At a technical level, self-attention is made possible by a mathematical operation called a dot product.
Dot products help the model determine how much focus to place on each word relative to every other word. If two words are closely related, their dot product will be higher, meaning the model will pay more attention to their relationship. If they are unrelated, their dot product will be lower, and the model will ignore them.
To explain how dot products work, Amit suggested to Dheeraj that if they wanted to determine how similar they were to each other, they could compare specific dimensions of their lives. For example, one such dimension could be where they grew up. Since both Amit and Dheeraj hail from the same state in India, multiplying those values would result in a high number, indicating strong similarity.
On the other hand, if they considered a dimension where they differed, such as the cars they drove, the dot product would be low because the values would not align. By summing up these values across multiple dimensions, the dot product effectively measures overall similarity.
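Here is a rough sketch of that analogy in code (the trait vectors and their values are made up purely for illustration): aligned dimensions contribute large positive terms, mismatched ones contribute little, and the sum is the dot product.

```python
# Hypothetical trait vectors: [grew up in same state, car preference, work style]
# Values on a -1..1 scale, invented for illustration only.
amit = [0.9, -0.7, 0.3]
dheeraj = [0.9, 0.6, 0.4]

similarity = sum(a * d for a, d in zip(amit, dheeraj))
print(similarity)  # 0.81 - 0.42 + 0.12 = 0.51: overall fairly similar
```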
In other words, self-attention uses dot products to determine which words reinforce each other’s meaning. This allows transformers to:
- Retain long-term dependencies in text
- Understand context more effectively
- Focus on the most relevant words in a sentence
The attention mechanism is essentially each word generating a query, and each word also generating a key. And then you’re multiplying them, and then you’re adding the values. That allows you to statistically learn that in this situation, when the input is like this, I should be paying attention here.
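Here is a minimal NumPy sketch of that query/key/value mechanism (single-head, scaled dot-product attention with made-up dimensions; a simplification of a transformer layer, not a production implementation):

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over a sequence of word vectors."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # each word emits a query, a key, and a value
    scores = q @ k.T / np.sqrt(k.shape[-1])       # dot products: how much word i attends to word j
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v                            # each word becomes a weighted mix of all values

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                           # e.g. a 5-word sentence, 8-dimensional embeddings
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)     # (5, 8): same length, now context-aware vectors
```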
Google’s grand TPU experiment—and why it fell short
By the mid-2010s, AI had outgrown traditional computing power. GPUs, originally designed for video games, had become the workhorses of deep learning, thanks to their ability to process billions of matrix multiplications in parallel. Companies like NVIDIA refined them to handle AI workloads, fueling breakthroughs in neural networks.
But as AI models grew exponentially, even GPUs had limits. Training massive neural networks consumed enormous power and took weeks to complete. Google saw an opportunity to go further. Instead of relying on repurposed gaming hardware, they built Tensor Processing Units (TPUs): custom-designed AI chips, application-specific integrated circuits (ASICs) built as AI accelerators, optimized solely for machine learning.
As Amit explains, TPUs were supposed to be the ultimate AI accelerators: faster, more efficient, and tightly integrated with Google’s software ecosystem. With TensorFlow already dominating the deep learning world, TPUs seemed like the perfect next step—hardware that worked seamlessly with Google’s AI stack.
So, what went wrong?
TPUs came with a major limitation: they were largely restricted to Google’s ecosystem. AI researchers, startups, and even other tech giants had limited access to these chips. This exclusivity made it difficult for the broader AI community to adopt TPUs.
This is where NVIDIA had a massive advantage. They didn’t force AI companies to use a specific ecosystem. They let AI researchers choose their own frameworks, tools, and cloud providers.
By the time GPT-3 and similar large models emerged, the industry had firmly settled on NVIDIA GPUs as the gold standard for AI computing. OpenAI, Meta, Microsoft, and countless startups scaled their models on NVIDIA’s hardware, not Google’s TPUs, proving that accessibility, not just innovation, drives revolutions.
Why AI needed human touch to become more reliable
By 2020, AI had reached an inflection point. The models were bigger. The algorithms were smarter. The computational speed was faster. Yet, a critical problem remained: AI didn’t understand human values.
Despite their sophistication, these models weren’t just flawed; they were unpredictable. Trained on text pulled from diverse online sources, they absorbed the best and worst of human language. Misinformation, biases, and even outright toxicity seeped into responses. Instead of assisting users, AI sometimes hallucinated facts, reinforced harmful stereotypes, or spread conspiracy theories. The technology was powerful, but it was far from safe.
The breakthrough came with Reinforcement Learning from Human Feedback (RLHF)—a method that put human oversight at the core of AI training. OpenAI realized that raw statistical learning wasn’t enough. AI needed to be trained with direct human guidance, ensuring that its responses aligned with ethical standards, factual accuracy, and social norms.
At its core, RLHF works by integrating real human judgment into AI training:
- AI generates multiple responses to the same prompt.
- Human reviewers rank them from best to worst, evaluating coherence, safety, and helpfulness.
- AI is then fine-tuned using reinforcement learning, optimizing it to prefer human-endorsed answers.
Rather than relying on predefined rules, AI now learned from how humans actually think—a shift that transformed ChatGPT from an unpredictable prototype into a usable, safe, and scalable product.
RLHF solved the last-mile problem—turning AI from an unpredictable generator into a system that could be trusted at scale. As Amit Prakash puts it, “It’s not a very interesting theoretical innovation, but there was innovation in there that caused these things to be a lot less toxic and usable.”
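As a rough sketch of the ranking step described above (toy reward scores and a pairwise preference loss; real RLHF pipelines train a reward model on human rankings and then optimize the language model against it, typically with an algorithm like PPO):

```python
import math

# Hypothetical reward-model scores for two responses to the same prompt;
# human reviewers ranked response A above response B.
score_chosen, score_rejected = 1.8, 0.4

# Pairwise preference loss: push the chosen response's score above the rejected one.
loss = -math.log(1 / (1 + math.exp(-(score_chosen - score_rejected))))
print(round(loss, 3))  # ~0.22, and it shrinks as the gap between chosen and rejected grows
```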
In December 2022, OpenAI launched ChatGPT, and the impact was immediate. What was once an experimental model became the defining AI product of the decade. Without RLHF, ChatGPT might have remained an impressive but unreliable research project. Instead, it became the AI assistant that reshaped how the world interacts with technology.
Google played it safe. OpenAI played to win.
Google’s AI teams had been building large-scale language models for years. But OpenAI made the first major consumer breakthrough because they were willing to take risks that Google wasn’t.
Google had its own chatbot, Meena, internally. It was just as powerful as OpenAI’s models—but it never saw the light of day.
Why?
The idea of an unpredictable AI chatbot, capable of generating misinformation or offensive content, was too great a reputational and legal liability. Google feared that AI-generated hate speech, misinformation, and offensive content could lead to lawsuits, regulatory scrutiny, and brand erosion.
As Amit notes, “The powers that be didn’t want to expose it because they were afraid that, what if it says something bad about the King of Thailand and our entire Thailand business is shut down?”
Google had too much to lose. One rogue AI response could spark lawsuits, damage reputations, or incite geopolitical backlash. The risk? Too high. The solution? Keep Meena locked away.
OpenAI, on the other hand, had no legacy business to protect. They weren’t a trillion-dollar ad-driven empire worried about lawsuits. They were a startup with one mission: push AI into the hands of real users and refine it in the wild. Instead of waiting for the perfect model, they shipped fast, embraced uncertainty, and turned real-world feedback into a training advantage.
By launching ChatGPT at scale, OpenAI unlocked something Google never did: millions of real-world interactions. Every question asked, every clarification requested, every mistake corrected became fuel for rapid improvement. This created a feedback loop where ChatGPT evolved at an unprecedented pace, while Google’s models sat idle. In the race to consumer AI, speed wasn’t just an advantage—it was everything.
No more rules: How AI is rewriting the playbook
The last five, seven years of conversational AI, whatever the Intercoms of the world used to call conversational AI, was not AI. It was just rule builders and decision trees. And the thing would just break because they had happy-path assumptions about what text to expect and things like that.
For decades, software systems were built on rules. If X happens, do Y. If a support ticket contained a keyword like “refund,” it would automatically be routed to billing. If an employee submitted an expense above a certain threshold, it would require an approval step.
Rules made sense. They were predictable, structured, and easy to understand. But rules also had a fatal flaw: they don’t adapt.
The moment something changed, the rules had to be rewritten. For instance, in healthcare, a rule-based system might route a critical symptom report incorrectly because it didn’t recognize new medical terminology. Humans had to intervene, rewrite the rules, and manually maintain an ever-growing list of conditions.
This worked when businesses were small, when the volume of data was manageable, and when human oversight was enough to tweak the logic. But as businesses scaled, rules began breaking down.
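To see why, here is a toy version of the kind of keyword routing described above (the keywords and queue names are hypothetical): the rule works only when users phrase things the way the rule writer expected.

```python
def route_ticket(text: str) -> str:
    """Keyword rules hold up only as long as the wording matches the rule writer's assumptions."""
    text = text.lower()
    if "refund" in text:
        return "billing"
    if "password" in text:
        return "account-security"
    return "general"  # everything unanticipated falls through

print(route_ticket("I want a refund for my last invoice"))   # billing
print(route_ticket("Please reverse the charge on my card"))  # general: same intent, but the rule misses it
```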
Now, AI is forcing a radical rethinking of how we approach automation—not with rules, but with embeddings.
Why embeddings are the death of rule-based automation
When product managers and support people sit there triaging, what are they really doing? They’re labeling, they’re deduplicating, they’re deflecting, they’re routing, they’re classifying, they’re clustering. All these ‘-ings’ I just said are basically what we’re expecting of AI now.
Embeddings are vector representations of meaning—mathematical structures that capture relationships between concepts. Rather than categorizing data manually, embeddings let AI find similarities naturally. Think of it like this:
- Instead of telling AI that “refunds” should go to the billing team, AI learns that refund-related queries share linguistic similarities and should be grouped.
- Instead of manually tagging thousands of support tickets, AI understands context and clusters similar cases together—without needing human intervention.
- Instead of assigning support agents based on rigid hierarchies, AI dynamically maps the best match based on past performance, current workload, and user sentiment.
This shift is profound. It means that businesses no longer need to micromanage every decision. Instead, AI learns through context—continuously refining itself based on data rather than static rules.
Dheeraj explains that if every concept—whether it’s a customer support ticket, a log entry, an alert, or even a developer’s skill set—can be mapped into an embedding space, then AI can automate tasks that once required meticulous human intervention. It can identify patterns, make associations, and streamline decision-making without needing explicit instructions for every possible case.
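Here is a minimal sketch of that idea (the vectors are made up, and the hypothetical embeddings stand in for the output of a real embedding model): instead of keyword rules, cosine similarity in the embedding space decides where a ticket goes.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings; in practice these come from an embedding model.
ticket = np.array([0.8, 0.1, 0.3])            # "Please reverse the charge on my card"
teams = {
    "billing":  np.array([0.9, 0.2, 0.1]),    # centroid of past billing tickets
    "security": np.array([0.1, 0.9, 0.2]),    # centroid of past security tickets
}

best_team = max(teams, key=lambda t: cosine_similarity(ticket, teams[t]))
print(best_team)  # billing: matched by meaning, not by the word "refund"
```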
This is the essence of AI’s power. The most complex, high-volume, and repetitive tasks in business no longer require human-defined rules.
“To shun rules is the hardest thing in business software. The hardest thing,” Dheeraj points out. But it’s no longer a choice. The companies still clinging to rule-based automation are already obsolete. AI-driven businesses don’t follow static rules; they evolve. And that’s the only way to win in the AI era.
Wrapping up: AI doesn’t follow rules—it learns
At every point in time, you have to figure out what’s the right amount of constraints. As compute and data increase, those hand-designed networks and hand-designed constraints and objective functions become less and less relevant. And so over time, probably we’re going to see that it’s just going to be about a lot more data and a lot more compute.
As Amit highlights, the role of constraints in AI has fundamentally changed. With compute and data scaling exponentially, the need for handcrafted constraints is diminishing. Instead, AI thrives on vast amounts of information and processing power, refining itself dynamically rather than following pre-imposed limitations.
The future, as Amit suggests, won’t be about programming intelligence; it will be about feeding AI more data and compute, allowing it to discover patterns and optimize itself in ways that human-designed constraints never could.
Dheeraj likened AI’s evolution to the discovery of fire, a pivotal moment in human history. Referencing the book Sapiens: A Brief History of Humankind by Yuval Noah Harari, he explained how early humans, reliant on raw meat, expended enormous energy on digestion. But when humans learned to cook, digestion became more efficient, and more energy was freed up for cognitive functions. This allowed the human brain to grow, setting the stage for higher intelligence.
For decades, AI was like raw meat—slow, inefficient, demanding constant human effort. Then came AI’s fire: GPUs. Just as cooking unlocked energy for brain growth, GPUs unleashed AI’s full potential.
With transformers and large-scale models trained on vast data, AI no longer needed rigid, predefined rules. AI can now infer patterns on its own. What was once a vague hypothesis has now become a vivid reality.
The takeaway for businesses is clear: If you’re still relying on rules, you’re already obsolete. The companies that win in the AI era will be the ones that embrace self-learning, adaptable systems.
Rules break. AI adapts.
Rules fossilize. AI evolves.
Rules need fixing. AI fixes itself.
And that’s why it’s winning.
In this era of AI spring where AI learns, unlearns, and evolves, the only rule that matters is to keep learning—whether you’re an employee or an executive. Stay ahead of the curve by subscribing to The Effortless Podcast Substack, where you can dive deeper into the breakthroughs shaping the future with insights from pioneers in the field of AI and tech.