# Knowledge

**Agents use knowledge to supplement their training data with domain expertise**.

Knowledge is stored in a vector database and provides agents with business context at query time, helping them respond in a context-aware manner. The general syntax is:

```python theme={null}
from phi.agent import Agent, AgentKnowledge

# Create a knowledge base for the Agent
knowledge_base = AgentKnowledge(vector_db=...)

# Add information to the knowledge base
knowledge_base.load_text("The sky is blue")

# Add the knowledge base to the Agent and
# give it a tool to search the knowledge base as needed
agent = Agent(knowledge=knowledge_base, search_knowledge=True)
```

## Vector Databases

While any type of storage can act as a knowledge base, vector databases offer the best solution for retrieving relevant results from dense information quickly. Here's how vector databases are used with Agents:

<Steps>
  <Step title="Chunk the information">
    Break down the knowledge into smaller chunks to ensure our search query
    returns only relevant results.
  </Step>

  <Step title="Load the knowledge base">
    Convert the chunks into embedding vectors and store them in a vector
    database.
  </Step>

  <Step title="Search the knowledge base">
    When the user sends a message, we convert the input message into an
    embedding and "search" for nearest neighbors in the vector database.
  </Step>
</Steps>
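
The three steps above can be sketched with the standard library alone. This is a toy illustration, not how phidata or PgVector work internally: the sentence-based chunker, the word-count "embedding", and the in-memory `index` list are all assumptions made for the example. A real setup would use a learned embedding model and a vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text: str) -> list[str]:
    """Step 1: split the text into sentence-sized chunks (a deliberately simple strategy)."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, index: list[tuple[str, Counter]], limit: int = 1) -> list[str]:
    """Step 3: embed the query and return the nearest chunks by cosine similarity."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:limit]]


document = (
    "Tom kha gai is a soup made with chicken, galangal and coconut milk. "
    "Pad thai is a stir-fried noodle dish with tamarind, peanuts and egg."
)
chunks = chunk_text(document)
# Step 2: store each chunk next to its vector (a list stands in for the vector database).
index = [(chunk, embed(chunk)) for chunk in chunks]
results = search("coconut milk soup", index)
print(results)
```

Smaller chunks keep unrelated text out of the result: here the query matches only the soup sentence, so the noodle sentence never reaches the prompt.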

## Example: RAG Agent with a PDF Knowledge Base

Let's build a **RAG Agent** that answers questions from a PDF.

### Step 1: Run PgVector

Let's use `PgVector` as our vector db as it can also provide storage for our Agents.

Install [Docker Desktop](https://docs.docker.com/desktop/install/mac-install/) and run **PgVector** on port **5532** using:

```bash theme={null}
docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  phidata/pgvector:16
```
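
The container settings above determine the connection URL that the agent examples on this page use (`postgresql+psycopg://ai:ai@localhost:5532/ai`). As a quick sanity check, the sketch below splits that URL with the standard library and shows which part corresponds to which `docker run` flag. It only parses the string and does not connect to the database.

```python
from urllib.parse import urlsplit

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"
parts = urlsplit(db_url)

print(parts.scheme)            # SQLAlchemy dialect and driver
print(parts.username)          # matches POSTGRES_USER
print(parts.password)          # matches POSTGRES_PASSWORD
print(parts.hostname)          # machine running the container
print(parts.port)              # host side of `-p 5532:5432`
print(parts.path.lstrip("/"))  # matches POSTGRES_DB
```

If you change the published port or any of the `POSTGRES_*` values, the URL has to change to match.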

### Step 2: Traditional RAG

Retrieval Augmented Generation (RAG) means **"stuffing the prompt with relevant information"** to improve the model's response. This is a two-step process:

1. Retrieve relevant information from the knowledge base.
2. Augment the prompt to provide context to the model.
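
The two steps can be sketched as follows. The retrieval step is stubbed with a hard-coded result, and the prompt layout (including the `<references>` wrapper and the `augment_prompt` helper) is an assumption for illustration, not the template phidata actually uses.

```python
import json


def augment_prompt(question: str, references: list[dict]) -> str:
    """Build a user prompt that carries the retrieved references alongside the question."""
    context = json.dumps(references, indent=2)
    return (
        f"{question}\n\n"
        "Use the following references from the knowledge base if they help:\n"
        f"<references>\n{context}\n</references>"
    )


# Step 1 (retrieve): stubbed here; a real agent would query the vector database.
references = [
    {"name": "ThaiRecipes", "content": "Tom kha gai: simmer chicken with galangal in coconut milk."}
]
# Step 2 (augment): place the retrieved text in the prompt sent to the model.
prompt = augment_prompt("How do I make chicken and galangal in coconut milk soup?", references)
print(prompt)
```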

Let's build a **traditional RAG** Agent that answers questions from a PDF of recipes.

<Steps>
  <Step title="Install libraries">
    Install the required libraries using pip:

    <CodeGroup>
      ```bash Mac theme={null}
      pip install -U pgvector pypdf "psycopg[binary]" sqlalchemy
      ```

      ```bash Windows theme={null}
      pip install -U pgvector pypdf "psycopg[binary]" sqlalchemy
      ```
    </CodeGroup>
  </Step>

  <Step title="Create a Traditional RAG Agent">
    Create a file `traditional_rag.py` with the following contents:

    ```python traditional_rag.py theme={null}
    from phi.agent import Agent
    from phi.model.openai import OpenAIChat
    from phi.knowledge.pdf import PDFUrlKnowledgeBase
    from phi.vectordb.pgvector import PgVector, SearchType

    db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"
    knowledge_base = PDFUrlKnowledgeBase(
        # Read PDF from this URL
        urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
        # Store embeddings in the `ai.recipes` table
        vector_db=PgVector(table_name="recipes", db_url=db_url, search_type=SearchType.hybrid),
    )
    # Load the knowledge base: Comment out after first run
    knowledge_base.load(upsert=True)

    agent = Agent(
        model=OpenAIChat(id="gpt-4o"),
        knowledge=knowledge_base,
        # Enable RAG by adding references from AgentKnowledge to the user prompt.
        add_context=True,
        # Set as False because Agents default to `search_knowledge=True`
        search_knowledge=False,
        markdown=True,
        # debug_mode=True,
    )
    agent.print_response("How do I make chicken and galangal in coconut milk soup")
    ```
  </Step>

  <Step title="Run the agent">
    Run the agent (it takes a few seconds to load the knowledge base).

    <CodeGroup>
      ```bash Mac theme={null}
      python traditional_rag.py
      ```

      ```bash Windows theme={null}
      python traditional_rag.py
      ```
    </CodeGroup>

    <br />
  </Step>
</Steps>

<Accordion title="How to use local PDFs" icon="file-pdf" iconType="duotone">
  If you want to use local PDFs, use a `PDFKnowledgeBase` instead:

  ```python agent.py theme={null}
  from phi.knowledge.pdf import PDFKnowledgeBase

  ...
  knowledge_base = PDFKnowledgeBase(
      path="data/pdfs",
      vector_db=PgVector(
          table_name="pdf_documents",
          db_url=db_url,
      ),
  )
  ...
  ```
</Accordion>

### Step 3: Agentic RAG

With traditional RAG above, `add_context=True` always adds information from the knowledge base to the prompt, regardless of whether it is relevant to the question or helpful.

With Agentic RAG, we let the Agent decide **if** it needs to access the knowledge base and what search parameters it needs to query the knowledge base.

Set `search_knowledge=True` and `read_chat_history=True`, giving the Agent tools to search its knowledge and chat history on demand.
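
The control flow behind this can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: `stub_model` stands in for the LLM's decision to call a tool, and `search_knowledge_base` stands in for the search tool the Agent receives. It shows only the branching, not phidata's implementation.

```python
from typing import Callable, Optional


def search_knowledge_base(query: str) -> str:
    """Hypothetical search tool; a real one would query the vector database."""
    recipes = {"tom kha gai": "Simmer chicken with galangal in coconut milk."}
    return next((text for name, text in recipes.items() if name in query.lower()), "No results")


def stub_model(message: str) -> Optional[str]:
    """Stand-in for the model: return a search query, or None to answer without searching."""
    return "tom kha gai" if "recipe" in message.lower() else None


def run(message: str, model: Callable[[str], Optional[str]] = stub_model) -> str:
    query = model(message)
    if query is None:
        # No tool call was requested, so nothing from the knowledge base enters the prompt.
        return f"[answered without search] {message}"
    # The model requested the tool and chose the search query itself.
    return f"[answered using: {search_knowledge_base(query)}] {message}"


print(run("Hello!"))
print(run("Give me the tom kha gai recipe"))
```

Compare this with traditional RAG, where the retrieval step runs for every message, including the greeting.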

<Steps>
  <Step title="Create an Agentic RAG Agent">
    Create a file `agentic_rag.py` with the following contents:

    ```python agentic_rag.py theme={null}
    from phi.agent import Agent
    from phi.model.openai import OpenAIChat
    from phi.knowledge.pdf import PDFUrlKnowledgeBase
    from phi.vectordb.pgvector import PgVector, SearchType

    db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"
    knowledge_base = PDFUrlKnowledgeBase(
        urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
        vector_db=PgVector(table_name="recipes", db_url=db_url, search_type=SearchType.hybrid),
    )
    # Load the knowledge base: Comment out after first run
    knowledge_base.load(upsert=True)

    agent = Agent(
        model=OpenAIChat(id="gpt-4o"),
        knowledge=knowledge_base,
        # Add a tool to search the knowledge base which enables agentic RAG.
        search_knowledge=True,
        # Add a tool to read chat history.
        read_chat_history=True,
        show_tool_calls=True,
        markdown=True,
        # debug_mode=True,
    )
    agent.print_response("How do I make chicken and galangal in coconut milk soup", stream=True)
    agent.print_response("What was my last question?", markdown=True)
    ```
  </Step>

  <Step title="Run the agent">
    Run the agent:

    <CodeGroup>
      ```bash Mac theme={null}
      python agentic_rag.py
      ```

      ```bash Windows theme={null}
      python agentic_rag.py
      ```
    </CodeGroup>

    <Note>
      Notice how it searches the knowledge base and chat history only when needed.
    </Note>
  </Step>
</Steps>

## Attributes

| Parameter                  | Type                                  | Default | Description                                                                                                                                                                                                 |
| -------------------------- | ------------------------------------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `knowledge`                | `AgentKnowledge`                      | `None`  | Provides the knowledge base used by the agent.                                                                                                                                                              |
| `search_knowledge`         | `bool`                                | `True`  | Adds a tool that allows the Model to search the knowledge base (aka Agentic RAG). Enabled by default when `knowledge` is provided.                                                                          |
| `add_context`              | `bool`                                | `False` | Enable RAG by adding references from AgentKnowledge to the user prompt.                                                                                                                                     |
| `retriever`                | `Callable[..., Optional[list[dict]]]` | `None`  | Function to get context to add to the user message. This function is called when `add_context` is `True`.                                                                                                   |
| `context_format`           | `Literal['json', 'yaml']`             | `json`  | Format used for the references added to the prompt, either "json" or "yaml".                                                                                                                                |
| `add_context_instructions` | `bool`                                | `False` | If True, add instructions for using the context to the system prompt (if knowledge is also provided). For example: add an instruction to prefer information from the knowledge base over its training data. |
