- Write Python scripts for processing data and creating charts.
- Analyze CSV, Parquet, or JSON data using DuckDb.
- Write complex queries including joins and filtering.
- Run data transformations and export the results.
- Explain analysis step-by-step.
Setup
1. Create a virtual environment
2. Install phidata
3. Install docker
Install Docker Desktop to run your app locally.
4. Export your OpenAI key
You can get an API key from OpenAI.
Create your codebase
Create your codebase using the junior-de template. This creates a folder named junior-de with the following structure:
Set OpenAI Key
Set your OPENAI_API_KEY as an environment variable. You can get one from OpenAI.
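Before starting the app, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch using only the standard library; the helper name `require_openai_key` is hypothetical and is not part of phidata.

```python
import os


def require_openai_key(env=None):
    """Return the OpenAI key from the environment, or raise if it is missing."""
    # `env` defaults to the real process environment; pass a dict to test the helper.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before starting your Junior DE."
        )
    return key
```

Calling `require_openai_key()` at the top of a script fails early with a readable message instead of a later authentication error.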
Run Junior DE locally
Start your Junior DE using:
DuckGPT: Automate Data Analysis using DuckDb
- Open localhost:8501 to access your Junior DE.
- Your first Junior DE is DuckGPT, which can write and run SQL queries using DuckDb.
- Click on DuckGPT and enter a username.
- Message “Show me revenue over time”
- See your Junior DE work through the problem.
- Message “Save it” to save the query to the ai/duckgpt/scratch folder.
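For a request like “Show me revenue over time”, the agent has to produce an aggregation query of roughly the following shape. This is only an illustrative sketch: the `orders` table, its columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` module stands in for DuckDb so the example runs without extra dependencies. The SQL that DuckGPT actually generates depends on your tables.

```python
import sqlite3

# Hypothetical sample data: (order_date, amount). Not taken from the junior-de template.
SAMPLE_ORDERS = [
    ("2023-01-05", 100.0),
    ("2023-01-20", 50.0),
    ("2023-02-03", 75.0),
]

# Group revenue by calendar month (the first 7 characters of an ISO date string).
REVENUE_OVER_TIME_SQL = """
SELECT substr(order_date, 1, 7) AS month,
       SUM(amount) AS revenue
FROM orders
GROUP BY month
ORDER BY month
"""


def revenue_over_time(orders):
    """Load the rows into an in-memory database and return (month, revenue) tuples."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE orders (order_date TEXT, amount REAL)")
        con.executemany("INSERT INTO orders VALUES (?, ?)", orders)
        return con.execute(REVENUE_OVER_TIME_SQL).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for month, revenue in revenue_over_time(SAMPLE_ORDERS):
        print(month, revenue)
```

If the date column has a real date type rather than text, the month expression would need a date function instead of `substr`.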

Add your data
DuckGPT tables are defined in the ai/duckgpt/knowledge/tables.json file.
- You can add csv, json, or parquet files stored locally or on S3.
- You can also add txt files to provide more information to the Agent.
- Click Update Knowledge Base to load the knowledge base.
Message us on Discord if you need help.
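Before adding entries to the knowledge base, you may want to check that each file is one of the supported types. The sketch below is a hypothetical pre-flight helper, not phidata's own validation: it sorts paths into data files (csv, json, parquet) and supporting notes (txt), and flags anything else. The example paths and the `s3://` prefix check are assumptions for illustration.

```python
import os

# File types named in the list above.
DATA_EXTENSIONS = {".csv", ".json", ".parquet"}
NOTE_EXTENSIONS = {".txt"}


def classify_source(path):
    """Return 'data', 'notes', or 'unsupported' based on the file extension."""
    suffix = os.path.splitext(path)[1].lower()
    if suffix in DATA_EXTENSIONS:
        return "data"
    if suffix in NOTE_EXTENSIONS:
        return "notes"
    return "unsupported"


def is_s3_path(path):
    """Assumption: remote files are referenced with an s3:// URL."""
    return path.startswith("s3://")


if __name__ == "__main__":
    # Hypothetical paths, shown only to exercise the helper.
    for source in ["data/orders.csv", "s3://my-bucket/sales.parquet", "notes/schema.txt"]:
        location = "s3" if is_s3_path(source) else "local"
        print(source, classify_source(source), location)
```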
How DuckGPT works
DuckGPT uses the DuckDbAgent defined in the ai/duckgpt/duckgpt.py file. You can customize your agent and adapt the Junior DE to your workflow.
PyGPT: Automate Data Analysis using Python
- Your next Junior DE is PyGPT, which can write Python scripts to process data, create charts, and more. Click on PyGPT.
- Message “Show me a chart of revenue per year”
- Each script that PyGPT creates and runs is saved to the ai/pygpt/scratch folder for reference.
- See your Junior DE work through the problem.
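A script answering “Show me a chart of revenue per year” would typically aggregate the data first and then plot it. The following is a rough sketch of that shape, not PyGPT's actual output: the column names `order_date` and `amount` and the sample rows are hypothetical, and a plain-text bar chart replaces a plotting library so the example needs only the standard library.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; a real script would read one of your knowledge-base files.
SAMPLE_CSV = """order_date,amount
2022-03-01,120.0
2022-11-15,80.0
2023-01-10,300.0
"""


def revenue_per_year(csv_text):
    """Sum the amount column per year, taking the year from an ISO date string."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["order_date"][:4]] += float(row["amount"])
    return dict(sorted(totals.items()))


def render_bars(totals, width=20):
    """Render one text bar per year, scaled so the largest value fills `width`."""
    if not totals:
        return ""
    peak = max(totals.values())
    lines = []
    for year, value in totals.items():
        bar = "#" * max(1, round(width * value / peak))
        lines.append(f"{year} {bar} {value:.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_bars(revenue_per_year(SAMPLE_CSV)))
```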

Add your data
PyGPT files are defined in the ai/pygpt/knowledge/files.json file.
- You can add csv, json, or parquet files stored locally or on S3.
- You can also add txt files to provide more information to the Agent.
- Click Update Knowledge Base to load the knowledge base.
Message us on Discord if you need help.
How PyGPT works
PyGPT uses the PythonAgent defined in the ai/pygpt/pygpt_streamlit.py file. You can customize your agent and adapt the Junior DE to your workflow.
Optional: Run JupyterLab
A Jupyter notebook is a must-have for AI development, and your junior-de comes with a notebook pre-installed with the required dependencies. To start your notebook:
1. Enable Jupyter
Update the workspace/settings.py file and set dev_jupyter_enabled=True.
2. Start Jupyter
3. View JupyterLab UI
- Open localhost:8888 to view the JupyterLab UI. Password: admin
- Open notebooks/duckgpt to play with DuckGPT.

Delete local resources
Play around and stop the workspace using:
Upcoming Upgrades
Junior DE is a v0 release, meaning there will be plenty of bugs and plenty of upgrades. Here’s what we have in the works:
- Add Snowflake Agent.
- Ask questions via Slack.
- Add Airflow Agent.
- Add ability to write and test data pipelines.
Next
Congratulations on running your Junior DE locally. Next steps:
- Run your Junior DE on AWS
- Read how to update workspace settings
- Read how to create a git repository for your workspace
- Read how to manage the development application
- Read how to format and validate your code
- Read how to add python libraries
- Chat with us on Discord