This repository contains the code samples that will let participants explore how to use the Retrieval Augmented Generation (RAG) architecture with Amazon Bedrock and Amazon OpenSearch Serverless (AOSS) to quickly build a secure chat assistant that uses the most up-to-date information to converse with users. Participants will also learn how this chat assistant will use dialog-guided information retrieval to respond to users.
Amazon Bedrock is a fully managed service that offers a choice of high-performing Foundation Models (FMs) from leading AI companies accessible through a single API, along with a broad set of capabilities you need to build generative AI applications, simplifying development while maintaining privacy and security.
Large Language Models (LLMs) are a type of Foundation Model that can take natural language as input, process and understand it, and produce natural language as output. LLMs can also perform tasks such as classification, summarization, simplification, and entity recognition.
LLMs are usually trained offline on data that was available up to a cutoff date. As a result, they have no knowledge of the world after that date. Additionally, LLMs are trained on very general domain corpora, making them less effective for domain-specific tasks. Finally, LLMs have a tendency to hallucinate, that is, to generate text that is incorrect, nonsensical, or not real. A Retrieval Augmented Generation (RAG) mechanism can help mitigate all of these issues. A RAG architecture retrieves data that closely matches the text in the user's prompt from an external data source and uses it to augment the prompt before sending it to the LLM. This prompt augmentation provides the context that the LLM can use to respond to the prompt.
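The retrieve-then-augment flow described above can be illustrated with a small, self-contained sketch. It is not the code used in this repository: the `DOCUMENTS` list, the bag-of-words `embed` function, and the prompt wording are all hypothetical stand-ins for a real embedding model and vector index.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory "data source"; a real deployment queries a vector index.
DOCUMENTS = [
    "Amazon Bedrock offers foundation models through a single API.",
    "Amazon OpenSearch Serverless can store vector embeddings for semantic search.",
    "AWS CloudFormation provisions resources from a template.",
]


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (stand-in for an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list, top_k: int = 2) -> list:
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def augment_prompt(query: str, passages: list) -> str:
    """Prepend the retrieved passages to the user's question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Use only the context below to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "Where are vector embeddings stored?"
    print(augment_prompt(question, retrieve(question, DOCUMENTS, top_k=1)))
```

The augmented prompt, rather than the raw question, is what would be sent to the LLM.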
This repository contains code that will walk you through the process of building a chat assistant using a Large Language Model (LLM) hosted on Amazon Bedrock and using Knowledge Bases for Amazon Bedrock for vectorizing, storing, and retrieving data through semantic search. Amazon OpenSearch Serverless will be used as the vector index.
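As a rough illustration of the retrieval step, the sketch below queries a knowledge base through the boto3 `bedrock-agent-runtime` client's `retrieve` operation. It is a minimal sketch rather than the notebook's code: the helper name `retrieve_passages` and the knowledge base ID `EXAMPLEKBID` are placeholders, and the client is passed in so the function can be exercised without AWS access.

```python
from typing import Any


def retrieve_passages(
    client: Any, knowledge_base_id: str, query: str, top_k: int = 3
) -> list[str]:
    """Run a semantic search against a knowledge base and return passage texts."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [result["content"]["text"] for result in response["retrievalResults"]]


if __name__ == "__main__":
    import boto3  # needs AWS credentials and a Region where Amazon Bedrock is available

    runtime = boto3.client("bedrock-agent-runtime")
    # "EXAMPLEKBID" is a placeholder; use the ID of your own knowledge base.
    for passage in retrieve_passages(runtime, "EXAMPLEKBID", "What is Amazon Bedrock?"):
        print(passage)
```

The returned passages are the context that gets added to the prompt before it is sent to the LLM.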
- Choose an AWS Account to use and make sure to create all resources in that Account.
- Identify an AWS Region that has Amazon Bedrock with Anthropic Claude 3 and Titan Embeddings G1 - Text models.
- In that Region, create a new or use an existing Amazon S3 bucket of your choice. Make sure that this bucket can be read by AWS CloudFormation.
- Create the Lambda layer file named `py312_opensearch-py_requests_and_requests-aws4auth.zip` using the following procedure and upload it to the same Amazon S3 bucket as in step 3.
  - On Windows 10 or above:
    - Make sure Python 3.12 and pip are installed and set in the user's PATH variable.
    - Download 7-zip and install it in `C:/Program Files/7-Zip/`.
    - Open the Windows command prompt.
    - Create a new directory and `cd` into it.
    - Run the lambda_layer_file_create.bat from inside of that directory.
    - This will create the Lambda layer file named `py312_opensearch-py_requests_and_requests-aws4auth.zip`.
  - On Linux:
    - Make sure Python 3.12 and pip are installed and set in the user's PATH variable.
    - Open the Linux command prompt.
    - Create a new directory and `cd` into it.
    - Run the lambda_layer_file_create.sh from inside of that directory.
    - This will create the Lambda layer file named `py312_opensearch-py_requests_and_requests-aws4auth.zip`.
- Take the provided AWS CloudFormation template standard-rag-cfn.yaml and update the following parameter:
  - DeploymentArtifactsS3BucketName - set this to the name of the Amazon S3 bucket from step 3.
- Create an AWS CloudFormation stack with the updated template.
- Open the Jupyter notebook named rag-router.ipynb by navigating to the Amazon SageMaker notebook instances console and clicking on the Open Jupyter link on the instance named rag-router-instance.
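For readers who prefer to script the upload and stack creation steps, the following is a minimal sketch using boto3. It makes several assumptions that are not confirmed by this repository: the layer file is uploaded to the root of the bucket under its own name, the template may create IAM resources (hence the `Capabilities` argument), and the template is small enough to pass as `TemplateBody` (CloudFormation limits this to 51,200 bytes). It also supplies `DeploymentArtifactsS3BucketName` as a parameter override at stack creation instead of editing the template. The bucket and stack names are placeholders.

```python
from pathlib import Path
from typing import Any

LAYER_FILE = "py312_opensearch-py_requests_and_requests-aws4auth.zip"
TEMPLATE_FILE = "standard-rag-cfn.yaml"


def stack_parameters(bucket_name: str) -> list[dict]:
    """Build the parameter override for the deployment artifacts bucket."""
    return [
        {
            "ParameterKey": "DeploymentArtifactsS3BucketName",
            "ParameterValue": bucket_name,
        }
    ]


def deploy(
    s3: Any,
    cfn: Any,
    bucket_name: str,
    stack_name: str,
    template_body: str,
    layer_path: str = LAYER_FILE,
) -> str:
    """Upload the Lambda layer file, create the stack, and return the stack ID."""
    # Assumption: the object key is simply the layer file's name.
    s3.upload_file(layer_path, bucket_name, Path(layer_path).name)
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=stack_parameters(bucket_name),
        # Assumption: the template may create (named) IAM resources.
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    )
    return response["StackId"]


if __name__ == "__main__":
    import boto3  # needs AWS credentials for the chosen Account and Region

    bucket = "my-deployment-artifacts-bucket"  # placeholder bucket name
    stack_id = deploy(
        boto3.client("s3"),
        boto3.client("cloudformation"),
        bucket,
        "rag-router-stack",  # placeholder stack name
        Path(TEMPLATE_FILE).read_text(),
    )
    print(stack_id)
```

Stack creation is asynchronous, so the returned ID only confirms that the request was accepted; check the AWS CloudFormation console for completion before opening the notebook.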
This repository contains:
- A Jupyter Notebook to get started.
- Architecture diagrams that show the various components used in this session along with their interactions.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.