Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API. You can choose from a wide range of foundation models to find the model that is best suited for your use case. Amazon Bedrock also offers a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top foundation models for your use cases, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources.
Large Language Models (LLMs) inevitably exhibit hallucinations, since the accuracy of generated text cannot be ensured solely by the parametric knowledge they encapsulate. Although Retrieval Augmented Generation (RAG) is a practical complement to LLMs, it relies heavily on the relevance of the retrieved documents, raising concerns about how the model behaves if retrieval goes wrong.
Advanced RAG techniques such as Corrective Retrieval Augmented Generation (CRAG) were proposed to improve the robustness of generation. In CRAG, a lightweight retrieval evaluator assesses the overall quality of the documents retrieved for a query and returns a confidence degree, based on which different knowledge retrieval actions are triggered. Because retrieval from a static and limited corpus can only return sub-optimal documents, large-scale web searches are used as an extension to augment the retrieval results. CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches.
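
At its core, CRAG wraps a small decision loop around that retrieval evaluator. The sketch below is only an illustration of the idea, not the notebook's exact implementation; the thresholds and the `evaluate` and `web_search` callables are assumptions supplied by the caller.

```python
def corrective_retrieve(query, retrieved_docs, evaluate, web_search,
                        upper=0.7, lower=0.3):
    """Illustrative CRAG-style control flow.

    `evaluate` scores the retrieved documents against the query in [0, 1];
    `web_search` returns documents for the query. The thresholds are assumptions.
    """
    confidence = evaluate(query, retrieved_docs)  # lightweight retrieval evaluator

    if confidence >= upper:
        # "Correct": trust the retrieved documents and refine them for generation.
        return retrieved_docs
    if confidence <= lower:
        # "Incorrect": discard the documents and fall back to large-scale web search.
        return web_search(query)
    # "Ambiguous": combine the retrieved documents with web search results.
    return retrieved_docs + web_search(query)
```
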
This repository contains code that will walk you through the process of building a simplified CRAG-based assistant. We will cover two scenarios for the retrieval phase (a minimal sketch of this flow follows the list):
- (Scenario 1) A document that closely matches the specified query is located in the Knowledge Base.
- (Scenario 2) A document that closely matches the specified query is not located in the Knowledge Base. As a result, a web search will be performed to retrieve matching document(s).
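
A minimal sketch of how these two scenarios can be handled with the Bedrock Knowledge Bases Retrieve API is shown below. The knowledge base ID, the relevance threshold, and the `web_search` callable are placeholders supplied by the caller; the notebook's actual implementation may differ.

```python
import boto3

bedrock_agent_runtime = boto3.client("bedrock-agent-runtime")

def retrieve_with_fallback(query, knowledge_base_id, web_search, min_score=0.5):
    """Return Knowledge Base hits (Scenario 1) or fall back to web search (Scenario 2)."""
    response = bedrock_agent_runtime.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
    )
    # Keep only results whose relevance score clears the (assumed) threshold.
    documents = [
        result for result in response["retrievalResults"]
        if result.get("score", 0.0) >= min_score
    ]
    if documents:
        return documents          # Scenario 1: closely matching documents were found
    return web_search(query)      # Scenario 2: no close match, search the web instead
```
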
1. Choose an AWS Account to use and make sure to create all resources in that Account.
2. Identify an AWS Region that has Amazon Bedrock with Anthropic Claude 3 and Titan Embeddings G1 - Text models.
3. In that Region, copy the following file to a new or existing Amazon S3 bucket of your choice. Make sure that this bucket can be read by AWS CloudFormation.
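
   If you prefer to script this upload, a short boto3 call along the following lines works; the bucket and file names below are placeholders for the ones you chose, not values from this repository.

   ```python
   import boto3

   s3 = boto3.client("s3")  # use the same Region identified in step 2

   # Placeholders: substitute your bucket name and the artifact file referenced above.
   s3.upload_file("<artifact-file>", "<your-deployment-bucket>", "<artifact-file>")
   ```
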
4. Create the Lambda layer file named `py312_opensearch-py_requests_and_requests-aws4auth.zip` using the following procedure and upload it to the same Amazon S3 bucket as in step 3.
   - On Windows 10 or above:
     1. Make sure Python 3.12 and pip are installed and set in the user's PATH variable.
     2. Download 7-zip and install it in `C:/Program Files/7-Zip/`.
     3. Open the Windows command prompt.
     4. Create a new directory and `cd` into it.
     5. Run the `lambda_layer_file_create.bat` from inside of that directory.
     6. This will create the Lambda layer file named `py312_opensearch-py_requests_and_requests-aws4auth.zip`.
   - On Linux:
     1. Make sure Python 3.12 and pip are installed and set in the user's PATH variable.
     2. Open the Linux command prompt.
     3. Create a new directory and `cd` into it.
     4. Run the `lambda_layer_file_create.sh` from inside of that directory.
     5. This will create the Lambda layer file named `py312_opensearch-py_requests_and_requests-aws4auth.zip`.
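
   As an optional sanity check, the generated zip should keep its packages under the `python/` prefix that Lambda layers for Python expect; the snippet below is illustrative and not part of the provided scripts.

   ```python
   import zipfile

   # A Python Lambda layer is unpacked from the "python/" prefix inside the zip.
   with zipfile.ZipFile("py312_opensearch-py_requests_and_requests-aws4auth.zip") as layer:
       names = layer.namelist()
       assert any(name.startswith("python/") for name in names), "unexpected layer layout"
       print(f"{len(names)} files packaged under python/")
   ```
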
5. Take the provided AWS CloudFormation template `simplified-corrective-rag-cfn.yaml` and update the following parameter:
   - `DeploymentArtifactsS3BucketName` - set this to the name of the Amazon S3 bucket from step 3.
6. Create an AWS CloudFormation stack with the updated template.
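
   If you prefer the AWS SDK over the console for this step, a stack can be created along these lines; the stack name is a placeholder, and the capability flag is an assumption based on the template creating IAM resources.

   ```python
   import boto3

   cloudformation = boto3.client("cloudformation")

   with open("simplified-corrective-rag-cfn.yaml") as template:
       template_body = template.read()

   cloudformation.create_stack(
       StackName="simplified-corrective-rag",  # placeholder stack name
       TemplateBody=template_body,
       Parameters=[
           {
               "ParameterKey": "DeploymentArtifactsS3BucketName",
               "ParameterValue": "<your-deployment-bucket>",  # bucket from step 3
           }
       ],
       Capabilities=["CAPABILITY_NAMED_IAM"],  # assumed: the template creates IAM roles
   )
   ```
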
7. Open the Jupyter notebook named `simplified-corrective-rag.ipynb` by navigating to the Amazon SageMaker notebook instances console and clicking on the Open Jupyter link on the instance named `simplified-crag-instance`.
- An assets folder that contains the AWS CloudFormation template and the dependent artifacts.
- The Python code for an AWS Lambda function that will be invoked by the Bedrock Agent to perform the web search; this code is also zipped and included with the dependent artifacts in the assets folder (a minimal handler sketch follows this list).
- A notebooks folder that contains all the artifacts related to the Jupyter notebook that you will be working on.
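
To illustrate the shape of that web-search Lambda function, here is a minimal, hypothetical handler written against the function-details request/response format used by Bedrock Agents. The parameter name `search_query` and the `search_web` stub are assumptions; the actual handler in the assets folder may differ (for example, it may call a specific search API, or use an OpenAPI-schema action group, whose event and response shapes are different).

```python
import json

def search_web(query):
    """Placeholder for the real web search call (e.g., an external search API)."""
    return [{"title": "example", "snippet": f"results for: {query}"}]

def lambda_handler(event, context):
    """Hypothetical sketch of the web-search action invoked by the Bedrock Agent."""
    # The agent passes action-group parameters as a list of {name, type, value} dicts.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    results = search_web(params.get("search_query", ""))

    # Respond in the function-details format that Bedrock Agents expect.
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {
                "responseBody": {"TEXT": {"body": json.dumps(results)}}
            },
        },
    }
```
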
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.