AI History · EP 08 · FINAL

A buried 2020 paper
opened AI for every company

In May 2020, Patrick Lewis, then at Facebook AI Research (now Meta AI), submitted a paper to NeurIPS. As far as industry was concerned, it sat dormant for two years. After ChatGPT arrived in November 2022, every company ran into the same problem: "ChatGPT doesn't know our company's data." Then someone went back and pulled that buried paper out.

7 min read · 2026.05.05 · 2020 → 2026 · FINAL

01 · May 2020, a paper buried in NeurIPS

📚
Patrick Lewis
UCL PhD → Meta AI Research → Cohere AI Lab · NeurIPS 2020

Title: "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Acronym: RAG. The core idea, in one sentence — "before the model answers, retrieve the relevant documents from an external knowledge base and use those as context for the answer."

Why did this matter? In 2020, GPT-3 had 175 billion parameters, but it was helpless on anything outside its training data: events after its training cutoff, internal company documents, and so on. It also had a "hallucination" problem: when it didn't know, it would make up plausible-sounding falsehoods. RAG went after both at once: "don't answer from what you memorized in training; read the actual retrieved documents and answer."

But only the research field paid attention. Industry didn't. The reason was simple: LLMs hadn't yet reached ordinary users. In 2020, GPT-3 was something only OpenAI API beta users knew about. So RAG sat on the shelf as an "interesting academic result."

02 · After November 30, 2022, every company asked the same question

The day from EP04. ChatGPT reached 1 million users in 5 days and 100 million in 2 months. CEOs started opening it during meetings. And the same question popped up at every company at the same time.

"ChatGPT is genuinely smart… but it knows nothing about our HR rules or our travel reimbursement policy. Could we feed it our internal docs?"

— Practically every company's IT meeting in 2023

The first attempt was fine-tuning: train GPT-3 on, say, 10,000 pages of internal documents. The result was expensive (tens of thousands of GPU-hours), slow (2+ weeks), and had to be redone every time new documents arrived. And fine-tuned models still hallucinate. They confidently invent things that aren't actually in the company policy.

That's when someone pulled out Patrick Lewis's 2020 paper. "You don't have to train it. Just retrieve and show it."

03 · How RAG works, in one diagram

① User question → ② Vector search → ③ Pull relevant docs → ④ Inject into LLM → ⑤ Generate answer

① User question: "What's the travel expense limit?"
② Vector search: encode the question as an embedding vector → find the semantically closest documents in the internal database
③ Pull relevant docs: top 3-5 documents (e.g., "Travel Expense Policy" doc, Chapter 2)
④ Inject into LLM: "Using the following document, answer this question. [doc text]. Question: What's the travel expense limit?"
⑤ Generate answer: the LLM reads the retrieved doc and produces an answer grounded in it, such as "Per company policy §2.3, domestic travel is capped at $150 per person per day…"
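
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design. The document names and text in `DOCS` are invented, `embed` is a toy word-count stand-in for a real embedding model (so, unlike a real system, it only matches shared words), and `llm` is any callable you supply to wrap a model.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: names and text are invented for this sketch.
DOCS = {
    "travel-expense-policy-ch2": "Domestic travel expense limit is 150 dollars per person per day.",
    "leave-policy": "Employees accrue 15 days of paid leave per year.",
    "it-security": "Laptops must use full disk encryption and a screen lock.",
}

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, k=2):
    """Steps 2-3: embed the question, rank documents, keep the top k names."""
    q = embed(question)
    ranked = sorted(DOCS, key=lambda name: cosine(q, embed(DOCS[name])), reverse=True)
    return ranked[:k]

def build_prompt(question, names):
    """Step 4: inject the retrieved text into the prompt."""
    context = "\n\n".join(f"[{name}]\n{DOCS[name]}" for name in names)
    return (
        "Using the following documents, answer this question.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

def answer(question, llm):
    """Step 5: hand the augmented prompt to any callable that wraps an LLM."""
    return llm(build_prompt(question, retrieve(question)))

# Usage with a stub in place of a real model: it simply echoes the prompt.
print(answer("What's the travel expense limit?", llm=lambda prompt: prompt))
```

The model itself never changes here. Only the prompt does, which is why adding a new document means updating `DOCS` rather than retraining anything.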

🔑 The magic of RAG is step ② vector search
This isn't keyword search. Documents are embedded as high-dimensional vectors (1024 dimensions is a common size, though it depends on the embedding model) and matched by semantic similarity. A search for "travel expense limit" also matches "Travel reimbursement policy" or "Trip cost guidelines": anything semantically close. That's a direct outcome of EP03's Transformer learning to represent words as vectors of meaning.
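
To make "semantically close" concrete, here is a small sketch comparing keyword overlap with cosine similarity. The 3-dimensional vectors are made up by hand for illustration. A real embedding model learns its values from data and uses far more dimensions.

```python
import math

# Made-up 3-dimensional "embeddings", chosen by hand for this illustration.
VECTORS = {
    "travel expense limit": [0.9, 0.8, 0.1],
    "trip cost guidelines": [0.8, 0.9, 0.2],
    "office parking map": [0.1, 0.2, 0.9],
}

def shared_words(a, b):
    """Keyword view: which words do the two phrases have in common?"""
    return set(a.split()) & set(b.split())

def cosine_similarity(u, v):
    """Vector view: cosine of the angle between the two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

query = "travel expense limit"
for phrase, vec in VECTORS.items():
    if phrase != query:
        score = cosine_similarity(VECTORS[query], vec)
        print(phrase, shared_words(query, phrase), round(score, 2))
```

With these invented numbers, "trip cost guidelines" shares no words with the query yet scores roughly 0.99, while "office parking map" scores roughly 0.30. A keyword search would treat both as equally irrelevant.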

04 · 2023, an entire new industry detonated

For RAG to work, you need a database that can search millions of document vectors quickly. Standard databases (PostgreSQL, MongoDB) weren't designed for vector similarity search and are slow at it out of the box. So a new category appeared: the vector database.
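
A rough sketch of why this is hard: the simplest exact search scores the query against every stored vector, so the work grows with the number of documents times the vector dimension. The function names and sizes below are illustrative assumptions. Vector databases typically avoid this full scan with approximate nearest-neighbor indexes (HNSW is one common example), trading a little accuracy for speed.

```python
import heapq
import math
import random

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec]

def brute_force_top_k(query, vectors, k=3):
    """Exact search: one dot product per stored vector, so O(N * d) per query."""
    scored = (
        (sum(q * x for q, x in zip(query, vec)), idx)
        for idx, vec in enumerate(vectors)
    )
    return [idx for _, idx in heapq.nlargest(k, scored)]

def multiply_adds_per_query(num_vectors, dim):
    """Back-of-envelope cost of one full scan."""
    return num_vectors * dim

# Small random store for demonstration; real stores hold millions of vectors.
random.seed(0)
store = [normalize([random.gauss(0, 1) for _ in range(16)]) for _ in range(1000)]

print(brute_force_top_k(store[42], store))      # index 42 ranks first: it matches itself
print(multiply_adds_per_query(1_000_000, 1024))  # cost of scanning 1M vectors of size 1024
```

At a million 1024-dimensional vectors, a single query already costs about a billion multiply-adds under this scheme, which is the gap dedicated vector indexes were built to close.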

📦
Edo Liberty
Pinecone founder (2019) · ex-Yahoo Research / Amazon AWS · first cloud vector DB

When Liberty founded Pinecone in 2019, almost nobody knew what a "vector DB" even was. In April 2023 the company raised a $100M Series B at a valuation of roughly $750M. The same year, Weaviate, Chroma, Qdrant, and Milvus all grew explosively, and PostgreSQL's pgvector extension became a de facto standard for teams that wanted vector search inside their existing database. The vector DB market is rapidly expanding into the multi-billion-dollar range.

05 · 2024, every tool became a "Copilot"

One of the first companies to apply RAG cleverly was GitHub. GitHub Copilot had been a code autocomplete tool since 2021. In April 2024, GitHub announced the "Copilot Workspace" technical preview. It indexes your entire codebase and, when you write new code, retrieves the relevant functions and classes as context: RAG applied to code.

Microsoft 365 Copilot had already become generally available to enterprise customers in November 2023. Word, Excel, PowerPoint, and Outlook all search the documents in your OneDrive and SharePoint, RAG-style. Tell it "summarize last month's marketing report" and it finds the file and summarizes it. The way office workers actually work began to change.

And in-house builds followed. Through 2024, McKinsey and Bain consulting reports consistently noted that most large enterprises were deploying their own internal LLM copilots. In Korea, Samsung blocked external ChatGPT and then announced its own GAUSS model (Nov 2023); LG has GenAI Studio and SK Telecom has A.X. Almost every chaebol shows the same pattern.

06 · So what RAG really means

The ChatGPT shock from EP04 brought AI to the public. But the thing that actually changed how work gets done inside companies wasn't ChatGPT itself — it was the RAG version of it. Same GPT-4, but a GPT-4 that knows your company's documents is an entirely different tool. New-hire onboarding from 6 months down to 6 weeks. Internal policy lookup from 5 minutes down to 5 seconds. First-draft reports from 3 hours down to 30 minutes.

And one more thing: Lewis's paper came out in 2020. The fact that it was picked up the moment ChatGPT launched in 2022 was no accident. EP01's 1986 backprop, EP03's 2017 Transformer, EP04's 2020 GPT-3: every one of them sat through a quiet incubation, often 2-7 years, before exploding into industry. Some paper from 2026 that's currently buried will be the standard of 2030.

📖 At the end of the 8-part series

It started in EP01 with Frank Rosenblatt's 1958 perceptron. That cabinet-sized machine — twice killed, twice revived — has been the throughline of this series.

Hinton's 1986 backprop, 1997 LSTM, 2012 AlexNet, 2017 Transformer, 2022 ChatGPT, 2024 Sora, 2025 NVIDIA Blackwell, and the 2020 RAG that was buried until it exploded. All of it is one current. One paper, one insight, one quiet incubation, one detonation.

If this series leaves you with one thing — it's that AI didn't appear suddenly; it's the accumulation of seventy years. And every step of those seventy years is simultaneously running, right now, in your phone's camera, in your company's copilot, in the virtual metrology of a semiconductor fab.

07 · The whole 8-part series at a glance

EP01 · 1958-1986 · AI died twice: Perceptron to backprop
EP02 · 1989-2020 · How computers got eyes: LeNet → AlexNet → ResNet → ViT
EP03 · 1997-2017 · The day one paper unified all of AI: Transformer
EP04 · 2018-2026 · 1 million users in 5 days: the ChatGPT era
EP05 · 2014-2026 · The bar-napkin idea: GAN, Diffusion, Sora
EP06 · 1993-2026 · The Denny's company that owns AI: NVIDIA, CUDA, TPU
EP07 · Industry · The factories that make AI run on AI: Panoptes, cuLitho
EP08 · FINAL · AI for every company: RAG and the copilot era
🧪
Try it · AI Lab
Run RAG yourself: question → retrieve → answer
Answer questions over 6 fictional company documents. See the matched docs, similarity scores, and the generated answer all on one page. This is how an enterprise copilot works.