🧠 Spe Knowledge Assistant

An intelligent RAG pipeline that understands and answers questions about your documents and images.

Project Overview

Spe is a complete Retrieval-Augmented Generation (RAG) system built from the ground up. It lets users upload documents (PDFs) and images (screenshots, photos) and then ask questions about them in natural conversation. The system uses an OCR engine to read text from images, a vector database to search the extracted text, and a large language model (Meta-Llama 13b) to generate context-aware answers.

💡 From Pixels to Answers: OCR-Powered RAG

Key Features

📄 PDF & Image Processing

Extracts text from both PDF documents and images (PNG, JPG) using the Tesseract OCR engine.
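As a rough illustration of the extraction step, the sketch below routes an upload by file extension. It is not the project's actual code: the function names `detect_kind` and `extract_text` are hypothetical, and reading a PDF's text layer with `pypdf` is an assumption (the project may instead run OCR on rendered PDF pages). Image OCR goes through `pytesseract`, which requires the Tesseract binary to be installed.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def detect_kind(filename: str) -> str:
    """Classify an upload as 'pdf' or 'image' from its file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    raise ValueError(f"unsupported file type: {suffix or filename}")


def extract_text(path: str) -> str:
    """Return the text of a PDF or image file.

    Third-party imports are deferred so detect_kind() works without them.
    """
    if detect_kind(path) == "pdf":
        from pypdf import PdfReader  # assumption: pypdf reads the PDF text layer

        reader = PdfReader(path)
        # extract_text() can come back empty for scanned pages with no text layer
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    import pytesseract  # Python wrapper around the Tesseract OCR binary
    from PIL import Image

    return pytesseract.image_to_string(Image.open(path))
```

Keeping the type check separate from the extraction makes it possible to reject unsupported uploads before any heavy dependency is loaded.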

🧠 Meta-Llama 13b Integration

Runs Meta's Llama 13b model locally for text generation and understanding.

Isolated Sessions

Each document upload session is isolated, so answers draw only on the documents uploaded in that session.
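One way to picture session isolation is a store that partitions chunks by session ID, so retrieval never sees another session's documents. The sketch below is an in-memory stand-in, not the project's implementation; `SessionStore` and its methods are hypothetical names. With Pinecone, a similar effect could plausibly be achieved by giving each session its own namespace, though the source does not say how Spe does it.

```python
import uuid
from collections import defaultdict


class SessionStore:
    """In-memory stand-in for a vector store partitioned by session."""

    def __init__(self) -> None:
        self._chunks = defaultdict(list)  # session_id -> list of chunk dicts

    def new_session(self) -> str:
        """Create a fresh, unguessable session identifier."""
        return uuid.uuid4().hex

    def add(self, session_id: str, text: str, source: str) -> None:
        """Store one text chunk together with the file it came from."""
        self._chunks[session_id].append({"text": text, "source": source})

    def chunks(self, session_id: str) -> list[dict]:
        """Return only the chunks that belong to this session."""
        return list(self._chunks.get(session_id, []))

    def reset(self, session_id: str) -> None:
        """Drop everything for a session (a 'fresh start')."""
        self._chunks.pop(session_id, None)
```

For example, after `store.add(a, "alpha", "a.pdf")`, a query issued under a different session `b` has nothing to retrieve from, because `store.chunks(b)` is empty.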

🚀 High-Performance Backend

Built with FastAPI and exposed through an ngrok tunnel for a responsive API backend.

💬 Interactive Chat UI

A dynamic, mobile-responsive interface with streaming responses, a stop-generation control, and a fresh-start option.
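Streaming with a stop control can be sketched as a generator that checks a stop flag before emitting each token. This is a minimal illustration under assumptions: `stream_tokens` is a hypothetical name, and the token source here is a plain list rather than a running model. In a FastAPI backend, a generator like this could be wrapped in a `StreamingResponse`.

```python
import threading
from typing import Iterable, Iterator


def stream_tokens(tokens: Iterable[str], stop: threading.Event) -> Iterator[str]:
    """Yield tokens one at a time until the source ends or stop is set."""
    for token in tokens:
        if stop.is_set():
            break  # the user pressed "stop"; abandon the rest of the answer
        yield token


# Usage: simulate the user stopping generation after two tokens arrive.
stop = threading.Event()
received = []
for i, token in enumerate(stream_tokens(["The", "answer", "is", "42"], stop)):
    received.append(token)
    if i == 1:
        stop.set()
```

Checking the flag between tokens means a stop request takes effect at the next token boundary, not in the middle of one.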

📚 Source Citation

Every answer cites the filename of the source document it was drawn from, so users can check where the information came from.
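Citing sources by filename can be as simple as collecting the `source` field of each retrieved chunk and appending the unique names to the answer. The sketch below assumes chunks are dicts with a `source` key; that shape and the name `format_answer` are illustrative, not taken from the project.

```python
def format_answer(answer: str, retrieved: list[dict]) -> str:
    """Append de-duplicated source filenames in first-seen order."""
    # dict.fromkeys keeps insertion order while dropping repeats
    sources = list(dict.fromkeys(chunk["source"] for chunk in retrieved))
    if not sources:
        return answer
    return f"{answer}\n\nSources: {', '.join(sources)}"
```

If two retrieved chunks come from the same file, the filename is listed once.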

Interactive RAG Pipeline

Document Upload: PDFs & Images

OCR & Embedding: Text Extraction

Pinecone Vector Store: Indexing

User Query & RAG: Context Retrieval

Meta-Llama 13b LLM: Answer Generation
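The retrieval and prompting stages of the pipeline can be sketched end to end with standard-library stand-ins. Everything here is a simplifying assumption: `embed` is a toy bag-of-words counter in place of a neural embedding model, the in-memory list replaces the Pinecone index, and the prompt wording is invented for illustration. The resulting prompt would then be passed to the LLM for answer generation.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a neural encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[dict]) -> str:
    """Assemble the retrieved context and the question into one prompt."""
    lines = [f"[{c['source']}] {c['text']}" for c in context]
    return (
        "Answer using only the context below.\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {query}\nAnswer:"
    )


# Usage: two chunks from two uploads, one question.
chunks = [
    {"text": "The invoice total is 420 dollars.", "source": "invoice.png"},
    {"text": "The meeting starts at noon.", "source": "notes.pdf"},
]
question = "What is the invoice total?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
```

Tagging each context line with its filename in the prompt is one way to let the model's answer be traced back to a source document.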

Technology Stack

Python, FastAPI, LangChain, Pinecone, Llama 13b, PyTorch, Tesseract (OCR), Hugging Face

Implementation Journey