Analysis · History & Fundamentals · Edition #0008

Generative AI — what it is and why it reshaped knowledge work

From machines that classify to machines that write, design, and code. How the 2017 Transformer architecture ended up as a free text box in November 2022.

Germán Falcioni April 12, 2026
From labeling to generating: three decades of research converged into a text box in November 2022.
TL;DR

Generative AI creates new content rather than labeling what already exists. Underneath it is a 2017 architecture (Transformer) and a simple training objective (predict the next token) scaled to billions of examples. What broke the market wasn't the technology — it already existed — it was the wrapper: in November 2022, ChatGPT made it accessible and hit 100 million users in two months.

✦ Summarized with Claude at publish time

The AI you used to know: labeling machines

Before 2022, almost every AI you encountered in a shipped product — in your inbox, your bank app, your feed — was a variant of one pattern: take something that exists and label it.

A spam filter reads ten thousand real emails, learns which words and structures signal suspicion, and hangs a tag on each new arrival: spam, not spam.

A social network studies your history and scores your next posts by priority: this will hold your attention, this won't.

A radiograph runs through a model trained on a million images and comes out labeled: anomaly, no anomaly.

Useful. Often essential.

But none of it creates anything. These are classification machines. They sort the world into buckets. They don't invent new buckets.
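To make the labeling pattern concrete, here is a minimal sketch of a spam filter in the naive Bayes style. The four training emails are invented for illustration, and a real filter would learn from thousands of messages with far richer features. What matters is the shape of the output: one label from a fixed set.

```python
import math
from collections import Counter

# Hypothetical miniature training set, invented for this sketch.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting notes for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts

def classify(text, counts):
    """Pick the label whose word counts best explain the text."""
    vocab = set(counts["spam"]) | set(counts["not spam"])
    scores = {}
    for label, words in counts.items():
        total = sum(words.values())
        # Sum of log-probabilities with add-one smoothing.
        # Both labels are equally frequent here, so priors are omitted.
        scores[label] = sum(
            math.log((words[w] + 1) / (total + len(vocab)))
            for w in text.split()
        )
    return max(scores, key=scores.get)

counts = train(TRAIN)
print(classify("free prize now", counts))      # spam
print(classify("notes for the team", counts))  # not spam
```

However much data you feed it, this function can only ever return one of two strings. That is the bucket-sorting limit the section describes.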

The pivot: from labeling to composing

Generative AI flips the equation. It doesn't receive something and stick a label on it; it receives an instruction and produces a new artifact.

Ask Claude for a client email and there is no stored email that matches yours. The model composes one in that moment — token by token — drawing on patterns from the millions of real emails it read during training.
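"Token by token" can be sketched with a toy model that only looks at the previous word. This is a deliberate simplification: a real language model conditions on the whole preceding context with a neural network and usually samples from a probability distribution instead of always taking the top choice. The corpus and function names below are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical miniature corpus standing in for the training data.
CORPUS = (
    "thank you for your email . "
    "thank you for your time . "
    "thank you for your email ."
)

def train_bigrams(text):
    """For each token, count which tokens follow it in the corpus."""
    tokens = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_tokens=10):
    """Greedy decoding: keep appending the most frequent next token."""
    out = [start]
    for _ in range(max_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # nothing was ever seen after this token
        nxt = candidates.most_common(1)[0][0]
        out.append(nxt)
        if nxt == ".":
            break  # treat the period as an end-of-text marker
    return " ".join(out)

model = train_bigrams(CORPUS)
print(generate(model, "thank"))  # thank you for your email .
```

The loop is the same one a large model runs: predict the next token, append it, repeat. Only the predictor is different.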

Ask Midjourney for "a nineteenth-century physician, oil painting style, golden light" and the image isn't in any archive. It's generated by a diffusion model: the model starts from random noise and, having learned during training how to "remove noise", refines the whole image step by step until a coherent picture emerges.
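The noise-removal idea can be sketched with the blending formula used in DDPM-style diffusion models. Everything here is an assumption for illustration: the five-value "image", the noise schedule, and above all the oracle that hands back the true noise. In a real system a trained neural network estimates that noise, imperfectly, which is why sampling takes many small steps instead of one.

```python
import math
import random

random.seed(0)

T = 10
# Hypothetical linear schedule: more noise is mixed in at later steps.
betas = [0.02 + 0.03 * t for t in range(T)]
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)  # fraction of the original signal left at step t

def add_noise(x0, t, eps):
    """Forward process: blend the clean signal with Gaussian noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1 - a) * e for x, e in zip(x0, eps)]

def remove_noise(xt, t, eps_hat):
    """Invert the blend, given an estimate of the noise that was added."""
    a = alpha_bars[t]
    return [(x - math.sqrt(1 - a) * e) / math.sqrt(a) for x, e in zip(xt, eps_hat)]

clean = [0.0, 0.5, 1.0, 0.5, 0.0]               # a five-"pixel" image
eps = [random.gauss(0, 1) for _ in clean]        # the noise actually added
noisy = add_noise(clean, T - 1, eps)
# Oracle stand-in for the trained network: it returns the true noise.
recovered = remove_noise(noisy, T - 1, eps)
print([round(v, 6) for v in recovered])
```

With a perfect noise estimate the clean signal comes back exactly. Training a diffusion model amounts to making that estimate good enough, conditioned on the text prompt.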

Ask GitHub Copilot for a Python function and there's no Stack Overflow in the loop. The model writes the code, respecting the syntax conventions it absorbed from billions of lines of public code.

Three new artifacts. Two families of technique underneath: next-token prediction for text and code, diffusion for images. One shared principle: combine learned patterns to produce something that didn't exist before.

The architecture that made it possible

Much of this rests on a 2017 paper. Vaswani and seven co-authors, most of them then at Google Brain or Google Research, published Attention Is All You Need, a paper that rearranged the field. Today it has over 140,000 citations on Google Scholar, placing it among the most influential papers in the recent history of machine learning.

The core idea is called self-attention. Instead of processing language word by word in order — the way older networks (RNNs and LSTMs) did — the Transformer processes all tokens in a sequence simultaneously, letting each one "query" all the others.
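A minimal sketch of that "query all the others" step, in plain Python. It assumes the simplest possible setup: the queries, keys, and values are the input vectors themselves, with no learned projection matrices, no multiple heads, and no positional information, all of which the real architecture adds. The three two-dimensional "token" vectors are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product attention with queries = keys = values = inputs."""
    d = len(vectors[0])
    outputs, weights = [], []
    for q in vectors:
        # Each token scores every token in the sequence, itself included.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        weights.append(w)
        # The output is a weighted average of all the value vectors.
        outputs.append([sum(wi * v[j] for wi, v in zip(w, vectors)) for j in range(d)])
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical embeddings
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

The loop here runs one token at a time only because it is plain Python. No row depends on any other row, so real implementations compute all of them at once as matrix multiplications.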

That eases an old problem: on long texts, older networks tended to forget the opening by the time they reached the close. In a Transformer, a word at the start of a paragraph can directly influence one at the end, as long as both fit inside the model's context window.

And it has a practical advantage: it parallelizes. Training runs that once took weeks could finish in days. That unlocked scale.

The scaling effect

With Transformer in hand, what was missing was compute and data. Both arrived:

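  • BERT, 2018 (Devlin et al., Google): 340 million parameters in its largest version. Strong context understanding, but as an encoder-only model it was built for understanding tasks such as classification and question answering, not for open-ended text generation.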
  • GPT-2, 2019 (Radford et al., OpenAI): 1.5 billion parameters. The first model that could write coherent paragraphs on arbitrary topics.
  • GPT-3, 2020 (Brown et al., OpenAI): 175 billion parameters, more than a hundred times larger than GPT-2. The paper documents what its authors called few-shot learning: show the model a handful of examples in the prompt, and it generalizes to new cases without retraining.
  • ChatGPT, 2022: not a new architecture. A model fine-tuned from the GPT-3.5 series, wrapped in a chat interface, with RLHF (reinforcement learning from human feedback) applied to align responses.
  • GPT-4, 2023: estimated at roughly 1.8 trillion parameters according to SemiAnalysis reporting, never confirmed by OpenAI.
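The few-shot learning mentioned in the GPT-3 entry happens entirely in the prompt: no weights change. A minimal sketch of how such a prompt could be assembled follows. The task, the reviews, and the function name are hypothetical, and the resulting string would be sent to whatever model you use.

```python
def build_few_shot_prompt(examples, query):
    """Assemble an instruction, a few worked examples, and the open case."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final entry is left unfinished so the model completes the pattern.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Hypothetical examples, invented for this sketch.
EXAMPLES = [
    ("Arrived on time and works perfectly.", "positive"),
    ("Broke after two days.", "negative"),
    ("Exactly what I was looking for.", "positive"),
]

prompt = build_few_shot_prompt(EXAMPLES, "The battery barely lasts an hour.")
print(prompt)
```

The model sees three solved cases and one unsolved one, and its next-token prediction fills in the missing label.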

Pattern: as parameter counts grew by orders of magnitude, qualitatively new capabilities appeared. Not "better". Different.
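The size of each jump is easy to check from the parameter counts listed above. GPT-4 is left out of this sketch because its figure is an unconfirmed estimate.

```python
# Parameter counts as cited in the list above.
PARAMS = {
    "BERT": 340e6,
    "GPT-2": 1.5e9,
    "GPT-3": 175e9,
}

names = list(PARAMS)
ratios = {
    f"{a} -> {b}": PARAMS[b] / PARAMS[a]
    for a, b in zip(names, names[1:])
}

for step, ratio in ratios.items():
    print(f"{step}: {ratio:.1f}x")
# BERT -> GPT-2: 4.4x
# GPT-2 -> GPT-3: 116.7x
```

The jumps are uneven, roughly 4x and then roughly 117x, so the order-of-magnitude framing is best read as a rule of thumb, not a fixed law.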

The inflection: November 2022

In November 2022, OpenAI shipped ChatGPT with a minimal interface: text box, enter, reply.

Five days to the first million users. About two months to the first hundred million. At the time, it was the fastest consumer adoption ever measured: TikTok took nine months to hit that mark, Instagram two and a half years.
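Those timelines translate into very different daily averages. The arithmetic below uses the figures quoted above and assumes 30-day months, so the results are rough averages, not measured daily sign-ups.

```python
TARGET_USERS = 100_000_000
DAYS_PER_MONTH = 30  # simplifying assumption

# Months to reach 100 million users, as quoted in the text.
MONTHS_TO_TARGET = {"ChatGPT": 2, "TikTok": 9, "Instagram": 30}

per_day = {
    product: TARGET_USERS / (months * DAYS_PER_MONTH)
    for product, months in MONTHS_TO_TARGET.items()
}

for product, rate in per_day.items():
    speedup = per_day["ChatGPT"] / rate
    print(f"{product}: {rate:,.0f} new users per day on average "
          f"(ChatGPT was {speedup:.1f}x this pace)")
```

On these assumptions ChatGPT averaged about 1.7 million new users per day, 4.5 times TikTok's pace and 15 times Instagram's.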

The technology underneath was not new: GPT-3.5-series models could already be reached through OpenAI's API. What was new was the wrapper: a box in a browser, free, no code.

That wrapper broke the barrier. Generative AI stopped being an engineering project and became a tool your accountant opened between two meetings.

What actually changed at work

The adoption curve matters less than what people started doing once the product was one click away.

McKinsey estimated in The Economic Potential of Generative AI (2023) that generative AI, combined with other technologies, could automate work activities that absorb 60% to 70% of employees' time today. The Stanford AI Index 2024 documents that, at the individual productivity level, workers using Copilot or Claude report 20% to 40% less time on routine text tasks.

It isn't that the AI writes your email and you sign it. It's that you go from drafting three versions to reviewing one. From searching Stack Overflow to inline autocomplete. From building a deck from scratch to editing a reasonable draft.

It's a change in speed, not in nature. But at that scale, speed becomes nature.

What's worth thinking about

Are you delegating tasks where the AI is faster and double-checking where it hallucinates, or are you accepting what it produces without looking? Do you understand how it works under the hood — enough to know where to trust it and where to verify — or are you treating it as a black box?

These are decisions you make once and live with for years.

If you want the full arc of how AI went from an academic field in crisis to a product with a hundred million users in two months, the timeline is in Timeline: AI's defining moments. If you want to know how it predicts token by token on the inside, that's told in How an AI thinks.
