
Retrieval-Augmented Generation

Infobox

Headline: Retrieval-Augmented Generation
Description: Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure.

Retrieval-augmented generation (RAG) is a pattern in which a language model answers a question by first retrieving relevant passages from an external corpus at query time, often through similarity search over vector embeddings, and then conditioning its reply on those passages.
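The retrieve-then-condition loop can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the `embed` function here is a toy bag-of-words stand-in for a real embedding model, and `retrieve`, `build_prompt`, and the sample `corpus` are hypothetical names chosen for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble the text a language model would be conditioned on."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical three-chunk corpus for illustration.
corpus = [
    "GraphRAG indexes a corpus into a knowledge graph",
    "Vector RAG stores chunk embeddings for similarity lookup",
    "Obsidian is a markdown editor",
]

print(build_prompt("how does vector RAG use embeddings", corpus, k=1))
```

In a real system the prompt string would be passed to a language model, and `embed` would call an embedding model backed by a vector index rather than recomputing term counts on every query.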

RAG is flexible for ad hoc document sets, but it can be opaque: similarity search does not guarantee explicit structure, stable cross-links, or inspectable memory. The LLM Wiki pattern, popularized by Andrej Karpathy, argues instead for compiling sources into a persistent, interlinked markdown wiki that agents maintain and traverse. Farzapedia is one example.

Both approaches support Personal Knowledge workflows; many practitioners combine compiled wikis with targeted retrieval when corpora grow very large.

GraphRAG

GraphRAG is a RAG variant that first indexes a corpus into a knowledge graph—entities, relationships, and hierarchical community summaries—then retrieves through that structure at query time. It was popularized by Microsoft’s GraphRAG project for questions that need both local detail (“what did this document say about X?”) and global synthesis (“what are the main themes across the whole collection?”).
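The difference between local and global questions can be illustrated with a toy graph index. This sketch does not use Microsoft's GraphRAG library or its API: the triples, the `community_summaries` dictionary, and the helper names are all hypothetical, and a real pipeline would typically extract entities and generate summaries with an LLM.

```python
from collections import defaultdict

# Hypothetical index contents: (subject, relation, object) triples.
triples = [
    ("GraphRAG", "variant_of", "RAG"),
    ("GraphRAG", "popularized_by", "Microsoft"),
    ("RAG", "uses", "vector embeddings"),
]

# Hypothetical precomputed summaries, one per graph community.
community_summaries = {
    "c0": "Retrieval patterns: RAG and its graph-based variant.",
    "c1": "Tooling and vendors behind the indexing pipeline.",
}


def build_graph(triples):
    """Adjacency list with inverse edges so hops work in both directions."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse:{rel}", subj))
    return graph


def local_context(graph, entity, hops=1):
    """Local query: breadth-first collection of facts within `hops` edges."""
    seen, frontier, facts = {entity}, [entity], []
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for rel, other in graph.get(node, []):
                facts.append((node, rel, other))
                if other not in seen:
                    seen.add(other)
                    next_frontier.append(other)
        frontier = next_frontier
    return facts


def global_context(summaries):
    """Global query: answer from community summaries, not individual chunks."""
    return list(summaries.values())


graph = build_graph(triples)
print(local_context(graph, "GraphRAG", hops=1))
print(global_context(community_summaries))
```

A local question starts from a named entity and walks its neighborhood, while a global question reads the summaries that were written at index time. That split is why the indexing pipeline is heavier than embedding chunks alone.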

Compared with embedding-only RAG:

| Aspect | Vector RAG | GraphRAG |
| --- | --- | --- |
| Index | Chunk embeddings | Graph + community reports |
| Strength | Fast similarity lookup | Cross-document themes, structured hops |
| Tradeoff | Weak explicit structure | Heavier indexing pipeline; graph can be opaque |

GraphRAG sits between opaque vector stores and human-inspectable wikis. You get explicit declarative knowledge in the graph, but the artifact is usually generated and queried through tooling—not necessarily a folder of markdown you edit in Obsidian. The LLM Wiki pattern pushes further toward files as the source of truth: interlinked pages, visible WikiLinks, and optional RDF / SPARQL validation in wikis like this one.

For agent memory, the practical spectrum is: vector RAG (retrieve chunks) → GraphRAG (retrieve graph neighborhoods and summaries) → compiled wiki (traverse linked articles the agent maintains). Farzapedia and HydraDB-style substrates (see WikiThon) mix file-first wikis with managed recall APIs.
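The compiled-wiki end of that spectrum can be sketched as link traversal over pages. This is an illustration under stated assumptions: it assumes Obsidian-style `[[Target]]` and `[[Target|alias]]` link syntax, keeps pages in an in-memory dictionary instead of markdown files on disk, and uses hypothetical page contents and helper names.

```python
import re
from collections import deque

# Matches [[Target]], [[Target|alias]], and [[Target#heading]]; captures Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

# Hypothetical wiki: page title -> markdown body.
pages = {
    "RAG": "See [[GraphRAG]] and the [[LLM Wiki|wiki pattern]].",
    "GraphRAG": "A variant of [[RAG]].",
    "LLM Wiki": "Exemplified by [[Farzapedia]].",
    "Farzapedia": "A compiled wiki.",
}


def links(text):
    """Extract link targets from a page body, in order of appearance."""
    return [target.strip() for target in WIKILINK.findall(text)]


def traverse(pages, start, max_depth=2):
    """Breadth-first walk over existing pages, up to max_depth link hops."""
    order, seen, queue = [], {start}, deque([(start, 0)])
    while queue:
        title, depth = queue.popleft()
        order.append(title)
        if depth == max_depth:
            continue
        for target in links(pages.get(title, "")):
            if target in pages and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


print(traverse(pages, "RAG", max_depth=2))
```

Because the links are plain text in the page bodies, the path an agent follows to reach a page stays visible to a human reader, which is the inspectability argument made for file-first wikis.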
