
Intra-vendor Model Routing Support AI APIs #3353

Open · 7 of 12 tasks

AnuGayan opened this issue Nov 8, 2024 · 2 comments

AnuGayan commented Nov 8, 2024

Problem

AI API designers need to be able to route incoming requests across models within a single vendor’s ecosystem.

With the API Manager 4.4.0 release, AI API support was introduced, allowing API consumers to specify the model they wish to consume from the LLM vendor. This leaves us with the following concerns:

  • API designers have no control over which models are used, how frequently they are accessed, or the costs incurred.
  • Risks such as model exhaustion and model misuse could lead to the API being throttled out.
  • API designers cannot enforce, at design time, a model routing strategy that accounts for the use cases of the AI applications consuming the API.

Proposed Solution

Support intra-vendor model routing for AI APIs.

The proposed solution is to enforce the model routing strategy as a policy in the API request flow. Policies can be categorised as follows (a minimal sketch of the policy shape is given after the list):

  • Static routing techniques (rely only on the incoming request)
  • Dynamic routing techniques (based on metrics from previous invocations)
  • Failover
  • Custom routing strategy enforcement
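
To make the policy-based approach concrete, the following is a minimal sketch of what a routing policy could look like. It is purely illustrative: the interface name, method signature, and parameters are hypothetical and do not reflect the actual carbon-apimgt implementation.

```java
import java.util.List;

/**
 * Hypothetical sketch of a model routing policy applied in the API request
 * flow. All names here are illustrative; the real carbon-apimgt interfaces
 * may differ.
 */
public interface ModelRoutingPolicy {

    /**
     * Chooses the target model for an incoming request.
     *
     * A static policy would look only at the incoming request (e.g. the model
     * the consumer asked for), while a dynamic policy would also consult
     * metrics from previous invocations (latency, error rate, cost).
     *
     * @param requestedModel the model named by the API consumer, or null if none
     * @param allowedModels  the models the API designer has configured for this API
     * @return the model the gateway should route the request to
     */
    String selectModel(String requestedModel, List<String> allowedModels);
}
```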

Targeting the APIM 4.5.0 release, we are shipping the following policies (a sketch of weighted round-robin selection follows the list):

  • Model Round-robin Policy
  • Model Weighted Round-robin Policy
  • Model Failover Policy
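
As a rough illustration of how a Model Weighted Round-robin Policy could distribute traffic, the sketch below cycles through the configured models in proportion to their weights. The class, model names, and weights are made up for the example and are not the shipped implementation; a failover policy would instead retry the request against the next candidate model when the current one fails.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch of weighted round-robin selection across models of a
 * single AI vendor. Illustrative only; not the actual carbon-apimgt code.
 */
public class WeightedRoundRobinSelector {

    /** A candidate model and its relative weight (share of traffic). */
    public record ModelEntry(String model, int weight) { }

    private final List<ModelEntry> entries;
    private final int totalWeight;
    private final AtomicLong counter = new AtomicLong();

    public WeightedRoundRobinSelector(List<ModelEntry> entries) {
        this.entries = List.copyOf(entries);
        this.totalWeight = entries.stream().mapToInt(ModelEntry::weight).sum();
    }

    /** Picks the next model; weights control how often each model is chosen. */
    public String next() {
        long slot = counter.getAndIncrement() % totalWeight;
        for (ModelEntry entry : entries) {
            slot -= entry.weight();
            if (slot < 0) {
                return entry.model();
            }
        }
        throw new IllegalStateException("weights must be positive");
    }

    public static void main(String[] args) {
        // Illustrative model names and weights only.
        WeightedRoundRobinSelector selector = new WeightedRoundRobinSelector(List.of(
                new ModelEntry("gpt-4o", 1),
                new ModelEntry("gpt-4o-mini", 3)));
        for (int i = 0; i < 8; i++) {
            // gpt-4o-mini receives 3 of every 4 requests.
            System.out.println(selector.next());
        }
    }
}
```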

Task Breakdown

  • Feature design and UI wireframes
  • Admin REST API changes for AI Vendor model list maintenance
  • Publisher REST API changes for multi-endpoint support
  • Revamp the existing flow to handle AI APIs via a separate velocity template
  • Onboard round-robin policy
  • Onboard failover policy
  • Admin UI changes to handle model list under the AI vendor
  • Handle APICTL flows: import/export APIs with multiple endpoints
  • Write unit tests
  • Write integration tests
  • Write CTL tests
  • Write feature documentation

Version

4.5.0

ashera96 changed the title from "Multiple Backend Support for APIs" to "Intra-vendor Model Routing Support AI APIs" on Feb 3, 2025
ashera96 self-assigned this on Feb 3, 2025
ashera96 commented:

Progress Update:

The task breakdown in the issue description has been updated to reflect the completed tasks.

The following PRs were merged targeting the alpha release:

  1. Carbon-apimgt changes: Intra-vendor model routing support for AI APIs carbon-apimgt#12871
  2. Apim-apps changes: Intra-vendor model routing support for AI APIs apim-apps#883
  3. Product-apim changes: Intra-vendor model routing support for AI APIs product-apim#13661

ashera96 commented:

Progress Update:

The following PRs were merged targeting the beta release:

  1. Carbon-apimgt changes: Intra-vendor model routing feature enhancements carbon-apimgt#12987
  2. Apim-apps changes: Intra-vendor model routing feature enhancements apim-apps#929
  3. Product-apim changes: Intra-vendor model routing feature enhancements product-apim#13695
