🧪 AI Gateway labs

What's new ✨

➕ AI Gateway workshop provides a comprehensive learning experience using the Azure Portal

➕ Refactor most of the labs to use the new LLM built-in logging that supports streaming completions.
➕ Realtime API (Audio and Text) with Azure OpenAI 🔥 experiments with the AOAI Realtime
➕ Realtime API (Audio and Text) with Azure OpenAI + MCP tools 🔥 experiments with the AOAI Realtime + MCP
➕ Model Context Protocol (MCP) ⚙️ experiments with the client authorization flow
➕ the FinOps Framework lab to manage AI budgets effectively 💰
➕ Agentic ✨ experiments with Model Context Protocol (MCP).
➕ Agentic ✨ experiments with OpenAI Agents SDK.
➕ Agentic ✨ experiments with AI Agent Service from Azure AI Foundry.

The rapid pace of AI advances demands experimentation-driven approaches for organizations to remain at the forefront of the industry. With AI steadily becoming a game-changer for an array of sectors, maintaining a fast-paced innovation trajectory is crucial for businesses aiming to leverage its full potential.

AI services are predominantly accessed via APIs, underscoring the essential need for a robust and efficient API management strategy. This strategy is instrumental for maintaining control and governance over the consumption of AI models, data and tools.

With the expanding horizons of AI services and their seamless integration with APIs, there is a considerable demand for a comprehensive AI Gateway pattern, which broadens the core principles of API management. Aiming to accelerate the experimentation of advanced use cases and pave the road for further innovation in this rapidly evolving field. The well-architected principles of the AI Gateway provides a framework for the confident deployment of Intelligent Apps into production.

🧠 AI Gateway

This repo explores the AI Gateway pattern through a series of experimental labs. The AI Gateway capabilities of Azure API Management plays a crucial role within these labs, handling AI services APIs, with security, reliability, performance, overall operational efficiency and cost controls. The primary focus is on Azure AI Foundry models, which sets the standard reference for Large Language Models (LLM). However, the same principles and design patterns could potentially be applied to any third party model.

Acknowledging the rising dominance of Python, particularly in the realm of AI, along with the powerful experimental capabilities of Jupyter notebooks, the following labs are structured around Jupyter notebooks, with step-by-step instructions with Python scripts, Bicep files and Azure API Management policies:

🧪 Labs with AI Agents

🧪 MCP Client Authorization

Playground to experiment the Model Context Protocol with the client authorization flow. In this flow, Azure API Management act both as an OAuth client connecting to the Microsoft Entra ID authorization server and as an OAuth authorization server for the MCP client (MCP inspector in this lab).

Name		Name	Last commit message	Last commit date
Latest commit History 377 Commits
.devcontainer		.devcontainer
.github		.github
.infracost		.infracost
.vscode		.vscode
docs		docs
images		images
labs		labs
modules		modules
scripts		scripts
shared		shared
tools		tools
workshop		workshop
.editorconfig		.editorconfig
.gitignore		.gitignore
.markdownlint.yaml		.markdownlint.yaml
AI-GATEWAY.md		AI-GATEWAY.md
AI-GATEWAY.pptx		AI-GATEWAY.pptx
AI-Gateway.sln		AI-Gateway.sln
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE.md		LICENSE.md
README.md		README.md
pyrightconfig.json		pyrightconfig.json
requirements.txt		requirements.txt

Lab	Security	Reliability	Performance	Operations	Costs
Access controlling	⭐
Backend pool load balancing	⭐	⭐	⭐
Semantic caching			⭐		⭐
Token rate limiting		⭐	⭐
Built-in LLM logging				⭐
FinOps framework				⭐	⭐

License

Azure-Samples/AI-Gateway

Folders and files

Latest commit

History

Repository files navigation

🧪 AI Gateway labs

What's new ✨

Contents

🧠 AI Gateway

🧪 Labs with AI Agents

🧪 Labs with the Inference API

🧪 SLM self-hosting (Phi-3)

🧪 Labs based on Azure OpenAI

🧪 Backend pool load balancing - Available with Bicep and Terraform

Backlog of Labs

🚀 Getting Started

Prerequisites

Quickstart

🔨 Supporting Tools

🏛️ Well-Architected Framework

🥇 Other resources

Disclaimer

About

Topics

Resources

License

Code of conduct

Uh oh!

Stars

Watchers

Forks

Uh oh!

Uh oh!

Languages