> ## Documentation Index
> Fetch the complete documentation index at: https://docs.phidata.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Document Chunking

Document chunking is a method of splitting documents into smaller chunks based on document structure like paragraphs and sections. It analyzes natural document boundaries rather than splitting at fixed character counts. This is useful when you want to process large documents while preserving semantic meaning and context.

## Usage

```python theme={null}
from phi.agent import Agent
from phi.document.chunking.document import DocumentChunking
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    vector_db=PgVector(table_name="recipes_document_chunking", db_url=db_url),
    chunking_strategy=DocumentChunking(),
)
knowledge_base.load(recreate=False)  # Comment out after first run

agent = Agent(
    knowledge_base=knowledge_base,
    search_knowledge=True,
)

agent.print_response("How to make Thai curry?", markdown=True)

```

## Params

| Parameter    | Type  | Default | Description                                         |
| ------------ | ----- | ------- | --------------------------------------------------- |
| `chunk_size` | `int` | `5000`  | The maximum size of each chunk.                     |
| `overlap`    | `int` | `0`     | The number of characters to overlap between chunks. |
