TechArticle

Retrieval-Augmented Generation

Infobox

headline: Retrieval-Augmented Generation
description: Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure.

Retrieval-augmented generation (RAG) is a pattern where a language model answers questions by retrieving relevant passages from an external corpus—often via vector embeddings—at query time, then conditioning its reply on those chunks.
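The retrieve-then-condition loop described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the word-count `embed` function is a stand-in for a learned embedding model, and `CHUNKS`, `retrieve`, and `build_prompt` are hypothetical names introduced here.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Condition the model's reply on the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical corpus, already split into chunks.
CHUNKS = [
    "GraphRAG indexes a corpus into a knowledge graph with community summaries.",
    "Vector RAG stores chunk embeddings and retrieves by similarity.",
    "Obsidian is a markdown editor for local notes.",
]

question = "How does vector RAG retrieve chunks?"
top = retrieve(question, CHUNKS, k=1)
prompt = build_prompt(question, top)  # this string would be sent to the language model
```

Nothing in the ranking step knows how the chunks relate to one another, which is the structural gap the rest of this page discusses.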

RAG is flexible for ad hoc document sets, but its retrieval step can be opaque: similarity search ranks passages by closeness in embedding space and does not guarantee explicit structure, stable cross-links, or inspectable memory. The LLM Wiki pattern popularized by Andrej Karpathy argues instead for compiling sources into a persistent, interlinked markdown wiki that agents maintain and traverse, as exemplified by Farzapedia.

Both approaches support Personal Knowledge workflows; many practitioners combine compiled wikis with targeted retrieval when corpora grow very large.

GraphRAG

GraphRAG is a RAG variant that first indexes a corpus into a knowledge graph—entities, relationships, and hierarchical community summaries—then retrieves through that structure at query time. It was popularized by Microsoft’s GraphRAG project for questions that need both local detail (“what did this document say about X?”) and global synthesis (“what are the main themes across the whole collection?”).
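The index-then-retrieve idea can be illustrated with a toy graph. This sketch assumes the entity and relationship triples have already been extracted. It uses connected components as a simple stand-in for hierarchical community detection, and it lists community members where a real pipeline would store a generated summary. All names (`TRIPLES`, `build_graph`, `local_context`, `global_context`) are hypothetical and are not the API of Microsoft's GraphRAG project.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; a real pipeline would
# extract these from the corpus rather than hard-code them.
TRIPLES = [
    ("GraphRAG", "variant_of", "RAG"),
    ("GraphRAG", "popularized_by", "Microsoft"),
    ("RAG", "uses", "vector embeddings"),
    ("LLM Wiki", "exemplified_by", "Farzapedia"),
]


def build_graph(triples):
    """Adjacency list with an inverse edge so every entity can reach its neighbors."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
        graph[obj].append((f"inverse_{rel}", subj))
    return graph


def communities(graph):
    """Connected components (assumption: a stand-in for real community detection)."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(other for _, other in graph[node])
        seen |= members
        groups.append(sorted(members))
    return groups


def local_context(graph, entity):
    """Local query: facts in the immediate neighborhood of one entity."""
    return [f"{entity} {rel} {other}" for rel, other in graph.get(entity, [])]


def global_context(groups):
    """Global query: one line per community, standing in for a community report."""
    return [f"Community {i}: " + ", ".join(g) for i, g in enumerate(groups)]


graph = build_graph(TRIPLES)
groups = communities(graph)
local = local_context(graph, "GraphRAG")
overview = global_context(groups)
```

A local question reads from `local`, while a corpus-wide question reads from `overview`. Supporting both views is what distinguishes this variant from chunk-only retrieval.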

Compared with embedding-only RAG:

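|          | Vector RAG              | GraphRAG                                       |
| -------- | ----------------------- | ---------------------------------------------- |
| Index    | Chunk embeddings        | Graph + community reports                      |
| Strength | Fast similarity lookup  | Cross-document themes, structured hops         |
| Tradeoff | Weak explicit structure | Heavier indexing pipeline; graph can be opaque |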

GraphRAG sits between opaque vector stores and human-inspectable wikis. You get explicit declarative knowledge in the graph, but the artifact is usually generated and queried through tooling—not necessarily a folder of markdown you edit in Obsidian. The LLM Wiki pattern pushes further toward files as the source of truth: interlinked pages, visible WikiLinks, and optional RDF / SPARQL validation in wikis like this one.

For agent memory, the practical spectrum is: vector RAG (retrieve chunks) → GraphRAG (retrieve graph neighborhoods and summaries) → compiled wiki (traverse linked articles the agent maintains). Farzapedia and HydraDB-style substrates (see WikiThon) mix file-first wikis with managed recall APIs.
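The compiled-wiki end of that spectrum replaces similarity search with link traversal. The sketch below assumes pages link to each other with Markdown links ending in `.md`, as the source of this page does. The `PAGES` dictionary, its page texts, and the `traverse` helper are hypothetical stand-ins for files on disk and for whatever traversal an agent actually performs.

```python
import re
from collections import deque

# Matches Markdown links such as [LLM Wiki](LLM_Wiki.md) and captures the target.
LINK = re.compile(r"\[[^\]]+\]\(([^)#]+\.md)\)")

# Hypothetical in-memory wiki; in practice these would be markdown files.
PAGES = {
    "Retrieval_Augmented_Generation.md": (
        "See [LLM Wiki](LLM_Wiki.md) and [Farzapedia](Farzapedia.md)."
    ),
    "LLM_Wiki.md": "Compiled wikis support [Personal Knowledge](Personal_Knowledge.md).",
    "Farzapedia.md": "An example of the [LLM Wiki](LLM_Wiki.md) pattern.",
    "Personal_Knowledge.md": "Notes a person curates over time.",
}


def traverse(pages, start, max_depth=2):
    """Breadth-first walk over wiki links, visiting each page at most once."""
    order, seen = [], {start}
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        order.append(name)
        if depth == max_depth:
            continue  # do not follow links beyond the depth budget
        for target in LINK.findall(pages.get(name, "")):
            if target in pages and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


one_hop = traverse(PAGES, "Retrieval_Augmented_Generation.md", max_depth=1)
two_hops = traverse(PAGES, "Retrieval_Augmented_Generation.md", max_depth=2)
```

Because every hop follows a link a person can read in the file, the path an agent took through the wiki can be inspected afterwards, unlike a similarity ranking.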

Categories:

TechArticle

View Source: Retrieval-Augmented Generation

Raw Markdown source code of the document

# Retrieval-Augmented Generation

**Retrieval-augmented generation (RAG)** is a pattern where a language model answers questions by retrieving relevant passages from an external corpus—often via vector embeddings—at query time, then conditioning its reply on those chunks.

RAG is flexible for ad hoc document sets, but its retrieval step can be opaque: similarity search ranks passages by closeness in embedding space and does not guarantee explicit structure, stable cross-links, or inspectable memory. The [LLM Wiki](LLM_Wiki.md) pattern popularized by Andrej Karpathy argues instead for **compiling** sources into a persistent, interlinked markdown wiki that agents maintain and traverse, as exemplified by [Farzapedia](Farzapedia.md).

Both approaches support [Personal Knowledge](Personal_Knowledge.md) workflows; many practitioners combine compiled wikis with targeted retrieval when corpora grow very large.

## GraphRAG

**GraphRAG** is a RAG variant that first **indexes a corpus into a knowledge graph**—entities, relationships, and hierarchical **community summaries**—then retrieves through that structure at query time. It was popularized by [Microsoft’s GraphRAG project](https://github.com/microsoft/graphrag) for questions that need both local detail (“what did this document say about X?”) and **global synthesis** (“what are the main themes across the whole collection?”).

Compared with embedding-only RAG:

|          | Vector RAG              | GraphRAG                                       |
| -------- | ----------------------- | ---------------------------------------------- |
| Index    | Chunk embeddings        | Graph + community reports                      |
| Strength | Fast similarity lookup  | Cross-document themes, structured hops         |
| Tradeoff | Weak explicit structure | Heavier indexing pipeline; graph can be opaque |

GraphRAG sits between opaque vector stores and human-inspectable wikis. You get explicit [declarative knowledge](Declarative_Knowledge.md) in the graph, but the artifact is usually generated and queried through tooling—not necessarily a folder of markdown you edit in [Obsidian](Obsidian_Integration.md). The [LLM Wiki](LLM_Wiki.md) pattern pushes further toward **files as the source of truth**: interlinked pages, visible WikiLinks, and optional [RDF](RDF.md) / [SPARQL](SPARQL.md) validation in wikis like this one.

For agent memory, the practical spectrum is: **vector RAG** (retrieve chunks) → **GraphRAG** (retrieve graph neighborhoods and summaries) → **compiled wiki** (traverse linked articles the agent maintains). [Farzapedia](Farzapedia.md) and [HydraDB](https://docs.hydradb.com/get-started/introduction)-style substrates (see [WikiThon](WikiThon.md)) mix file-first wikis with managed recall APIs.

Metadata: Retrieval-Augmented Generation

RDF representation compiled from frontmatter

{
  "@context": {
    "schema": "https://schema.org/",
    "wiki": "https://wazootech.github.io/wiki/"
  },
  "@id": "wiki:Retrieval_Augmented_Generation",
  "@type": "schema:TechArticle",
  "schema:description": "Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure.",
  "schema:headline": "Retrieval-Augmented Generation"
}
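The serializations in this section all encode the same three statements. As a rough illustration of how the compact JSON-LD form relates to the fully expanded triples, the sketch below expands prefixed names using the `@context` map. It handles only this flat shape of document and is not a general JSON-LD processor; `expand` and `to_triples` are hypothetical helper names.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The JSON-LD document from this page, written as a Python dict.
DOC = {
    "@context": {
        "schema": "https://schema.org/",
        "wiki": "https://wazootech.github.io/wiki/",
    },
    "@id": "wiki:Retrieval_Augmented_Generation",
    "@type": "schema:TechArticle",
    "schema:description": (
        "Pattern where an LLM retrieves external chunks at query time "
        "instead of relying on a precompiled knowledge structure."
    ),
    "schema:headline": "Retrieval-Augmented Generation",
}


def expand(term, context):
    """Expand a prefixed name such as 'schema:headline' into a full IRI."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def to_triples(doc):
    """Turn one flat JSON-LD node into (subject, predicate, object) tuples."""
    context = doc["@context"]
    subject = expand(doc["@id"], context)
    triples = []
    for key, value in doc.items():
        if key in ("@context", "@id"):
            continue
        if key == "@type":
            # The type is an IRI, so it is expanded like the subject.
            triples.append((subject, RDF_TYPE, expand(value, context)))
        else:
            # Other values here are plain string literals and stay as-is.
            triples.append((subject, expand(key, context), value))
    return triples


triples = to_triples(DOC)
```

Each resulting tuple corresponds to one line of the fully expanded triple listing further down.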

@prefix schema: <https://schema.org/> .
@prefix wiki: <https://wazootech.github.io/wiki/> .

wiki:Retrieval_Augmented_Generation a schema:TechArticle ;
    schema:description "Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure." ;
    schema:headline "Retrieval-Augmented Generation" .

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:schema="https://schema.org/"
>
  <rdf:Description rdf:about="https://wazootech.github.io/wiki/Retrieval_Augmented_Generation">
    <rdf:type rdf:resource="https://schema.org/TechArticle"/>
    <schema:headline>Retrieval-Augmented Generation</schema:headline>
    <schema:description>Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure.</schema:description>
  </rdf:Description>
</rdf:RDF>

<https://wazootech.github.io/wiki/Retrieval_Augmented_Generation> <https://schema.org/headline> "Retrieval-Augmented Generation" .
<https://wazootech.github.io/wiki/Retrieval_Augmented_Generation> <https://schema.org/description> "Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure." .
<https://wazootech.github.io/wiki/Retrieval_Augmented_Generation> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://schema.org/TechArticle> .

@prefix schema: <https://schema.org/> .
@prefix wiki: <https://wazootech.github.io/wiki/> .

_:N3b8d7b46ef3346eaad343ecf968a16af {
    wiki:Retrieval_Augmented_Generation a schema:TechArticle ;
        schema:description "Pattern where an LLM retrieves external chunks at query time instead of relying on a precompiled knowledge structure." ;
        schema:headline "Retrieval-Augmented Generation" .
}
