LLM-OS v1: Add a streamlit application example for LLM-OS. A application that uses an LLM with advanced operating-system level capabilities.
GeoBuddy: An AI-powered geography agent that analyzes images to predict locations based on visible cues like landmarks, architecture, and cultural symbols.
This release introduces new capabilities, such as image-to-image generation, MongoDB vector support, and enhanced chunking methods, alongside significant bug fixes and usability enhancements.
OpenAI rejecting functions with too large description: This issue was discovered with Composio tools, but affected all tools. Resolved and made tool definitions better match JSON Schema.
Google Embedder Compatibility: Fixed an issue with Pydantic 2.10.x that caused the rejection of the Google embedder, restoring full functionality.
ChromaDB Upsert Issue: Resolved bugs that caused errors during upsert operations in ChromaDB.
Async Error Messaging: Improved error messages for unsupported async operations in models.
Gemini Tool Parameters: Fixed an issue with parameter handling in Gemini tools.
Cassandra Integration as a Vector Database: Introduced support for Apache Cassandra as a Vector Database, leveraging CassIO for vector storage and retrieval.
ClickHouse as a Vector Database: Added support for ClickHouse as a Vector Database.
DuckDuckGo Modifier: Added a modifier parameter to the DuckDuckGo tool, allowing users to refine searches with site-specific queries, file type filters, and safe search toggles.
Enhanced Error Handling for python-docx: Improved error messaging for scenarios where the python-docx library is not installed, providing clearer guidance and smoother debugging.
Confluence Tool: Added a new tool using the Atlassian Confluence SDK, enabling operations such as listing pages in a space, creating and updating pages, retrieving page content by title, and getting space details.
Tool Compatibility: Enhanced older custom functions with manually specified descriptions and parameters to align with the updated tool-building system.
Deep Copy for Ollama Chat Agents: Addressed an issue where manually set clients caused errors during agent model copying, ensuring all properties are properly handled.
This release enhances multimodal capabilities with audio support, improves session page performance, and fixes various bugs for better stability and usability.
Version Checker for OpenAI: Added a warning for users with OpenAI versions below 1.52.0 to ensure compatibility with features like audio in ChatCompletionMessage.
Agent Response Handling: Enhanced processing of agent responses to support lists, improving handling of multi-item outputs.
AWS Bedrock Tool Descriptions: Fixed an issue where the transfer tool description was missing, causing incompatibility with AWS Bedrock Claude.
Response Content Handling: Resolved crashes on the session page caused by non-string response content.
Deep Copy Agent Memory: Addressed deep copy errors when using agent memory on the playground.
Session Page Enhancements: Fixed the refresh button.
Fix Tool Parsing for Ollama: Fixed JSON schema tool parsing by transforming ['string', 'null'] parameters to 'string' for compatibility.
Response Parsing for Gemini Tool: Improved response parsing to handle unserializable objects in tool_calls for Gemini on the playground.
Memory Handling for Google Provider: Fixed an issue in monitoring_data where memory was removed for all providers, causing blank titles on Phidata.app; now only modifies memory for Google provider.
RecursiveChunking ID Conflict: Resolved an issue in RecursiveChunking where processing large files with multiple chunks caused duplicate chunk record IDs, leading to psycopg.errors.UniqueViolation.
This update introduces new multimodal tools, enhances the multimodal capabilities of existing models, and includes several quality-of-life improvements.
This update introduces image and video multi-modal support for the agent playground and adds image and video generation tools like FAL, replicate, and ModelLabs.
Show Reasoning in the Playground: Provides users with insights into the agent’s thought process and decision-making, offering a behind-the-scenes look at how the agent operates within the Playground.
Show Tool Call in the Playground: Display tool call results and metrics on hover within the Playground.
Sessions Page Enhancements: Added context, reasoning, and tools integration to the Sessions page.
Delete Custom Endpoint for Playground: Introduced the ability to delete endpoints directly within the Playground.
Show References in the Playground: You can now view the sources used for RAG.
Chunking Strategies in RAG: Introduced five levels of chunking strategies in RAG: Semantic Chunking, Fixed Chunking, Agentic Chunking, Document Chunking, and Recursive Chunking.
Unified Reranker for Vector Databases: Implemented a unified reranking feature across various vector databases, including LanceDB, with support for CohereReranker.
Milvus Vector Database Integration: Added support for Milvus as a vector database option.
Support for Multi-Modalities: Added support for processing audio, video, and image data to enhance agent capabilities.
Context Injection: Improved context handling to enhance agent responses and usability.
Pre-hook and post-hook for function calls: enabling users to validate arguments, add human-in-the-loop flows and validate results of tool calls.
Agent.add_context is now Agent.add_references as the context terminology is now used for context injection. Similarly, context_format is now references_format.
This update is packed with new integrations, significant improvements to existing tools, and crucial fixes to enhance functionality and user experience.