# Advanced RAG Tool for smolagents

This repository contains an improved Retrieval-Augmented Generation (RAG) tool built for the `smolagents` library from Hugging Face. This tool allows you to:

- Create vector stores from various document types (PDF, TXT, HTML, etc.)
- Choose different embedding models for better semantic understanding
- Configure chunk sizes and overlaps for optimal text splitting
- Select between different vector stores (FAISS or Chroma)
- Share your tool on the Hugging Face Hub

## Installation

```bash
pip install smolagents langchain-community langchain-text-splitters faiss-cpu chromadb sentence-transformers pypdf2 gradio
```

## Basic Usage

```python
from rag_tool import RAGTool

# Initialize the RAG tool
rag_tool = RAGTool()

# Configure with custom settings
rag_tool.configure(
    documents_path="./my_document.pdf",
    embedding_model="BAAI/bge-small-en-v1.5",
    vector_store_type="faiss",
    chunk_size=1000,
    chunk_overlap=200,
    persist_directory="./vector_store",
    device="cpu"  # Use "cuda" for GPU acceleration
)

# Query the documents
result = rag_tool("What is attention in transformer architecture?", top_k=3)
print(result)
```
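To build intuition for what `chunk_size` and `chunk_overlap` control, here is a minimal character-level chunker. This is an illustration only, not the tool's actual splitter (which presumably uses LangChain's text splitters): consecutive chunks share `chunk_overlap` characters so that sentences straddling a boundary are not lost.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows; consecutive chunks share
    chunk_overlap characters so context survives chunk boundaries."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# With the settings from the example above, a 2500-character document
# yields four chunks that each start 800 characters after the previous one.
chunks = chunk_text("a" * 2500, chunk_size=1000, chunk_overlap=200)
print(len(chunks), [len(c) for c in chunks])  # → 4 [1000, 1000, 900, 100]
```

Larger overlaps improve recall for answers that span chunk boundaries, at the cost of storing and embedding more redundant text.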
## Using with an Agent

```python
import warnings

# Suppress LangChain deprecation warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

from smolagents import CodeAgent, InferenceClientModel
from rag_tool import RAGTool

# Initialize and configure the RAG tool
rag_tool = RAGTool()
rag_tool.configure(documents_path="./my_document.pdf")

# Create an agent model
model = InferenceClientModel(
    model_id="mistralai/Mistral-7B-Instruct-v0.2",
    token="your_huggingface_token"
)

# Create the agent with our RAG tool
agent = CodeAgent(tools=[rag_tool], model=model, add_base_tools=True)

# Run the agent
result = agent.run("Explain the key components of the transformer architecture")
print(result)
```

## Gradio Interface

For an interactive experience, run the Gradio app:

```bash
python gradio_app.py
```

This provides a web interface where you can:

- Upload documents
- Configure embedding models and chunk settings
- Query your documents with semantic search

## Customization Options

### Embedding Models

You can choose from various embedding models:

- `sentence-transformers/all-MiniLM-L6-v2` (fast, smaller model)
- `BAAI/bge-small-en-v1.5` (good balance of performance and speed)
- `BAAI/bge-base-en-v1.5` (better performance, slower)
- `thenlper/gte-small` (good for general text embeddings)
- `thenlper/gte-base` (larger GTE model)

### Vector Store Types

- `faiss`: Fast, in-memory vector database (better for smaller collections)
- `chroma`: Persistent vector database with metadata filtering capabilities
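Whichever store you pick, the core operation is the same: rank stored chunk embeddings by similarity to the query embedding and return the `top_k` best matches. The dependency-free sketch below illustrates that ranking with cosine similarity; it is not how FAISS or Chroma are implemented (both use optimized index structures), and the chunk names and vectors are made up for the example.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product of the vectors over the
    product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    return sorted(index, key=lambda cid: cosine(query, index[cid]), reverse=True)[:k]

# Toy 3-dimensional "embeddings" for three stored chunks
index = {
    "chunk-attention": [0.9, 0.1, 0.0],
    "chunk-license":   [0.0, 0.2, 0.9],
    "chunk-training":  [0.6, 0.6, 0.1],
}
print(top_k([1.0, 0.0, 0.0], index, k=2))  # → ['chunk-attention', 'chunk-training']
```

Real embeddings have hundreds of dimensions, so exact ranking like this gets expensive at scale; that is what the approximate index structures in FAISS and Chroma are for.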
### Document Types

The tool supports multiple document types:

- PDF documents
- Text files (.txt)
- Markdown files (.md)
- HTML files (.html)
- Entire directories of mixed document types
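When `documents_path` points at a directory, a loader along these lines can select the supported files; this is an illustrative sketch (the tool's actual loader logic may differ, and `SUPPORTED` here is just the extension list from above):

```python
from pathlib import Path

# Extensions from the supported-types list above (assumed set)
SUPPORTED = {".pdf", ".txt", ".md", ".html"}

def find_documents(root: str) -> list[Path]:
    """Recursively collect files under root whose extension is supported,
    in a stable sorted order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

Unsupported files (source code, images, archives) are simply skipped rather than raising an error.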
## Sharing Your Tool

You can share your tool on the Hugging Face Hub:

```python
rag_tool.push_to_hub("your-username/rag-retrieval-tool", token="your_huggingface_token")
```

## Limitations

- The tool does not currently extract image content from PDFs
- Very large documents may require additional memory
- Some embedding models can be slow in CPU-only environments

## Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request.

## License

MIT