GitHub repository for a tool that detects and filters malicious prompts before they are entered into a Retrieval-Augmented Generation (RAG) database, ensuring data integrity and security.
This filter is currently for Text data. Image data can be handled after Stephanalysis and converted to text to check for malicious prompts.
Model 1 - 98.4% accuracy.
Random Forest based model - 0.9954359274429491 accuracy