This guide walks you through configuring Azure API Management (APIM) to work with a new consumer or a new Azure OpenAI deployment.
- Prerequisites
- Step-by-Step Configuration: Onboarding a New Azure OpenAI Resource
- Step-by-Step Configuration: Onboarding a New Consumer
## Prerequisites

Before starting, make sure you have:
- An operational AI Hub Gateway deployment.
- Access to the Azure OpenAI service if you are adding a new deployment.
- Azure Portal access.
## Step-by-Step Configuration: Onboarding a New Azure OpenAI Resource

Steps:
- Azure Portal:
  - Navigate to your Virtual Network (VNet) where APIM is deployed.
  - Go to DNS Servers and ensure you have the correct DNS settings for resolving OpenAI endpoints.
- DNS Configuration:
  - If you're using custom DNS, ensure the DNS server can resolve OpenAI service endpoints.
  - You may need to add custom DNS entries to your DNS server for OpenAI services.
- Network Configuration:
  - Ensure that network connectivity is available between API Management and the Azure OpenAI resource. If your Azure OpenAI resource does not allow public network access, you may need to add a private endpoint in your Virtual Network. See: Use private endpoints.
The managed identity of the Azure API Management instance needs access to perform inference calls against the AI models.
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to Managed identities under Security and ensure it is enabled.
- Role Assignment:
  - Navigate to your Azure OpenAI resource.
  - Go to Access Control (IAM) and click Add role assignment.
  - Select the Cognitive Services OpenAI User role. See: Role-based access control.
  - Assign this role to the APIM managed identity.
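Once the role is assigned, the gateway authenticates to Azure OpenAI with that managed identity from within the policy. A minimal sketch of the relevant policy fragment (your AI Hub Gateway deployment may already contain an equivalent; the token variable name is illustrative):

```xml
<!-- Acquire a token for Azure Cognitive Services using APIM's managed identity,
     then pass it to the backend as a bearer token. -->
<authentication-managed-identity resource="https://cognitiveservices.azure.com" output-token-variable-name="msi-access-token" ignore-error="false" />
<set-header name="Authorization" exists-action="override">
    <value>@("Bearer " + (string)context.Variables["msi-access-token"])</value>
</set-header>
```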
Steps:
- Azure Portal:
  - Navigate to your Azure OpenAI resource.
  - Under Deployments, note down the names of all the deployments you have created.
> **Tip:** Ensure that your backend URL ends with `/openai`.
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to Backends under APIs.
  - Click + Add to create a new backend.
  - Configure the backend with the OpenAI endpoint URL (which should end with `/openai`) and name it appropriately.
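After the backend is created, policies refer to it by its backend ID rather than a hard-coded URL. A minimal sketch (the backend name `openai-backend-0` matches the sample configuration later in this guide, but is an assumption for your deployment):

```xml
<!-- Route the request to the named APIM backend; the backend resource
     stores the actual Azure OpenAI endpoint URL ending in /openai. -->
<set-backend-service backend-id="openai-backend-0" />
```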
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to APIs, select the OpenAI API, and navigate to Design.
  - Open the menu on the OpenAI API and select Add Revision to create a new revision (to avoid downtime during implementation).
  - Under Inbound processing, update the policy to include the new routes and clusters for OpenAI deployments.
Sample Configuration:
```xml
<set-variable name="oaClusters" value="@{
    // Each route is an Azure OpenAI API endpoint
    JArray routes = new JArray();
    JArray clusters = new JArray();

    routes.Add(new JObject()
    {
        { "name", "EastUS" },
        { "location", "eastus" },
        { "backend-id", "openai-backend-0" },
        { "priority", 1 },
        { "isThrottling", false },
        { "retryAfter", DateTime.MinValue }
    });

    clusters.Add(new JObject()
    {
        { "deploymentName", "chat" },
        { "routes", new JArray(routes[0]) }
    });

    return clusters;
}" />
```
Ensure that the backend is linked with all available deployments for that endpoint by updating the clusters variable accordingly.
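As a hedged sketch of what that looks like, here is the sample extended with a second deployment on the same endpoint (the `embedding` deployment name is illustrative):

```xml
<set-variable name="oaClusters" value="@{
    JArray routes = new JArray();
    JArray clusters = new JArray();

    // Single Azure OpenAI endpoint, registered once as a route
    routes.Add(new JObject()
    {
        { "name", "EastUS" },
        { "location", "eastus" },
        { "backend-id", "openai-backend-0" },
        { "priority", 1 },
        { "isThrottling", false },
        { "retryAfter", DateTime.MinValue }
    });

    // One cluster entry per deployment hosted on that endpoint
    clusters.Add(new JObject()
    {
        { "deploymentName", "chat" },
        { "routes", new JArray(routes[0]) }
    });
    clusters.Add(new JObject()
    {
        { "deploymentName", "embedding" },
        { "routes", new JArray(routes[0]) }
    });

    return clusters;
}" />
```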
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to APIs and select the OpenAI API.
  - Under Test, select the new revision and test the API endpoints to ensure they are working as expected.
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to APIs, select the OpenAI API, and navigate to Revisions.
  - Select the new revision and click Make current.
## Step-by-Step Configuration: Onboarding a New Consumer

In some cases, you might want to restrict access to specific models based on the business unit or team using the OpenAI endpoint.
The following policy can be implemented at a product level to restrict access to specific model deployments. For more details, refer to the Model-based RBAC guide.
> **Caution:** This policy will restrict access to only two deployments (`gpt-4` and `embedding`). Any other model deployment will get a 401 Unauthorized response.
Sample Policy:
```xml
<inbound>
    <base />
    <!-- Restrict access for this product to specific models -->
    <choose>
        <when condition="@(!new [] { &quot;gpt-4&quot;, &quot;embedding&quot; }.Contains(context.Request.MatchedParameters[&quot;deployment-id&quot;] ?? String.Empty))">
            <return-response>
                <set-status code="401" reason="Unauthorized" />
            </return-response>
        </when>
    </choose>
</inbound>
```
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to Products and click + Add.
  - Configure the product with the appropriate settings for token throughput capacity and access to specific models (using product-level policies).
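For the token throughput part, APIM's `azure-openai-token-limit` policy can be applied at the product level. A minimal sketch (the 5000 tokens-per-minute figure is an illustrative assumption):

```xml
<!-- Product-level policy: cap each subscription at 5000 tokens per minute. -->
<azure-openai-token-limit
    counter-key="@(context.Subscription.Id)"
    tokens-per-minute="5000"
    estimate-prompt-tokens="true" />
```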
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to Products, select the newly created product, and navigate to Subscriptions.
  - Click + Add to create a new subscription.
  - Provide the necessary details and generate a subscription key.
Steps:
- Azure Portal:
  - Navigate to your Azure API Management instance.
  - Go to APIs, select the OpenAI API, and copy the endpoint URL.
  - Share the endpoint URL, subscription key, and list of available models with the team.
Sample Configuration for Sharing:
> **Caution:** A subscription key is like a password. Ensure you share it securely.
```text
API Endpoint: https://apim-your-instance.azure-api.net/openai
Subscription Key: {YourSubscriptionKey}
Available Models: gpt-3.5-turbo, gpt-4, dall-e
```
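For the consuming team, the shared details can be exercised with a short client-side sketch. The endpoint, API version, deployment name, and the `api-key` header mapping below are assumptions; confirm what your gateway actually expects:

```python
import json
import urllib.request

# Assumed values from the shared configuration above.
API_ENDPOINT = "https://apim-your-instance.azure-api.net/openai"
API_VERSION = "2024-02-01"  # assumed Azure OpenAI API version

def build_chat_request(deployment, subscription_key, messages):
    """Build (but do not send) a chat-completion request routed through the gateway."""
    url = (f"{API_ENDPOINT}/deployments/{deployment}"
           f"/chat/completions?api-version={API_VERSION}")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Many gateway setups map the APIM subscription key to the
            # standard Azure OpenAI "api-key" header; confirm with your gateway.
            "api-key": subscription_key,
        },
    )

req = build_chat_request("gpt-4", "{YourSubscriptionKey}",
                         [{"role": "user", "content": "Hello"}])
```

The helper only builds the request; sending it with `urllib.request.urlopen(req)` requires network access to the gateway.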