Multilingual RAG System - Bangla Text Book Pipeline

Published:

Built a complete Retrieval-Augmented Generation (RAG) system that handles both English and Bengali. It features a dual embedding strategy (Cohere + OpenAI), a reranking stage, OCR processing for Bengali text, and a FastAPI backend with a Streamlit GUI. The system ingests HSC Bangla textbooks and answers questions about their content. It is deployed on AWS EC2 behind an Nginx reverse proxy.

Technologies Used: RAG, Multilingual AI, FastAPI, Streamlit, Docker, AWS EC2, ChromaDB, OCR, Bengali NLP
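
The description does not say how results from the two embedding models are combined before reranking. As a minimal sketch of one common option, the snippet below merges two ranked lists with reciprocal rank fusion (RRF). RRF is an assumption here, not necessarily what this project uses, and the chunk IDs, the variable names, and the constant k=60 are hypothetical placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top-3 hits from each embedding index (assumed data).
cohere_hits = ["chunk_12", "chunk_07", "chunk_31"]
openai_hits = ["chunk_07", "chunk_44", "chunk_12"]

fused = reciprocal_rank_fusion([cohere_hits, openai_hits])
print(fused)
```

In a pipeline like the one described, the fused list would then be passed to the reranking stage, which reorders the candidates against the original question before the top chunks are handed to the language model.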

View Project | Live Demo