What is a corporate AI governance layer?

It is a software layer that sits between the company's tools and the LLM providers and applies the organization's rules to every request: it blocks or masks sensitive data, controls spend through budgets, restricts which models can be used, isolates access per department and records everything in an audit trail. HorseLabs implements this layer across 5 fronts — data, cost, models, access and connectivity — behind a single key, for any provider.

How does HorseLabs stop sensitive data from reaching the LLM provider?

Every prompt passes through the gateway before the model. The DLP Shield inspects the content pre-call: deterministic rules detect bank data — credit-card numbers (Luhn) —, CPF, e-mails and credentials; an NLP layer detects person names and PII that rules can't reach. Depending on the team's policy, the data is masked or the request is blocked before it leaves — and the violation is recorded in the trail, with the data already masked.

Does using HorseLabs solve my LGPD compliance?

Not on its own — and be wary of anyone who promises that. Compliance involves process, legal basis, contracts and people; no tool "solves" it. What HorseLabs delivers are the technical controls that sustain your posture: blocking and masking of personal data before the provider, a violation audit trail and access isolation. And let's say it plainly: every relevant LLM provider is foreign, so international data transfer is intrinsic to using AI. We give you control, reduction and proof over what leaves — not the illusion that nothing does.

Do I need to replace the tools my team already uses?

No. The layer speaks the OpenAI-compatible standard: anything that already works with that standard — IDEs, agents, scripts, internal tools — starts pointing at the gateway by swapping the base_url and using the team's virtual key. The employee keeps their workflow; the company gains control.

What happens if the data detector goes down?

Fail-closed. When the team's policy is set to block and the detector becomes unavailable, the layer stops the request instead of letting it through. Protecting the data is the architectural default — not a setting someone forgets to turn on.

How does AI cost control work?

Each team uses a virtual key with its own budget. Spend shows up in real time per user, key, team and cost center. When consumption crosses the threshold you set, the layer fires an alert (webhook into your workflow); when it overruns, it cuts off. And every request is logged: who, which model, how many tokens, how much it cost.

Which providers and models are supported?

Claude (Anthropic), GPT (OpenAI), Gemini (Google) and Grok (xAI), behind the same key and the same API standard. The catalog is fed by each provider's live models and governed by an allowlist: everything starts off, and only what an administrator approves goes into use. A non-approved model gets a 403.

How is access isolated between departments and companies?

Each organization and each department lives in its own tenant, with strictly scoped roles (operator, admin, member). Provider credentials live in a vault and never reach the end user. Sensitive actions require a second factor, and every access lands in the audit trail.

How much does it cost?

It depends on scope — operation size, number of departments/tenants and request volume. The investment structure is at horse-labs.dev/pricing; scope and metric are defined before we start, with no surprises.

By requesting access through the form on this page — a corporate e-mail and your team size are enough. We are in a validation phase with selected companies: the founder replies within 1 business day.

AI model governance

AI model governance means deciding and enforcing which models the organization may use and routing each task to the right model, without being tied to a single provider. This guide details the four mechanisms: an allowlist that starts off, a catalog that comes live from each provider's API, wildcard routing that avoids lock-in, and approval enforced at the gateway itself.

Default-OFF allowlist

A default-OFF allowlist is a list of approved models that starts empty: nothing is usable until the organization explicitly approves each model.

When access to models follows the provider's default, everything it exposes is available out of the box — the expensive models, the experimental ones, the just-released ones, and the ones your organization never evaluated. The default is "allowed," and it falls to someone to remember to turn off what shouldn't run, one by one, always chasing what the provider publishes. The default-OFF allowlist flips that logic: the list starts empty and nothing is usable until it's explicitly approved. The default stops being permissive and becomes denied by omission — the organization decides what may run, not the vendor's catalog.

In practice this means adopting a new model is a deliberate decision, not a side effect of the provider having shipped it. The approval is stored as governance state — not baked into the code that calls the gateway — so the list of what's allowed is a single, reviewable, auditable source. Approving or revoking a model is editing that list, without touching any application. The result is real control over the model surface: the organization knows exactly what's enabled and on whose decision.

At Horse Labs, the allowlist is global and default-OFF, stored in the governance-db: nothing is usable until it's approved, and approval is toggled in batches (admin + second factor) — the organization decides what may run.

Live catalog

A live catalog means discovering the available models by querying each provider's API in real time, instead of keeping a static list that ages.

A model list hand-written in the code has a short shelf life. Providers ship new models, deprecate old ones, and rename versions at a pace no static list keeps up with: before long it points at models that no longer exist and ignores ones that just appeared. Keeping that list current becomes recurring, error-prone work, and every lag is either a model wrongly unavailable or a decision made on stale information. The live catalog removes that maintenance: the available models come from each provider's own API, queried when the list is needed.

With that, what the organization sees to approve is always the current state of what each provider actually offers — including the name and date of each model, pulled from the API's own response. New models surface for evaluation as soon as the provider publishes them; deprecated ones drop off. Governance stays the allowlist, which decides what's approved; the catalog only guarantees that decision is made against present reality, not a stale snapshot someone forgot to revisit.

At Horse Labs, the model catalog comes live from the providers' /models APIs (with name and date per model), instead of a static list in the code — new models surface for approval and deprecated ones drop off on their own.

Routing without lock-in

Routing without lock-in means dispatching each call wildcard per provider, so that adding or swapping a model or provider is configuration, not a code rewrite.

Lock-in is born when the application talks directly to a single vendor's API: that provider's client, its formats, and its quirks end up scattered through the code. Switching providers, in that scenario, is a rewrite project, and that friction is exactly what ties an operation to one vendor even when another would be better or cheaper. Wildcard-per-provider routing breaks that coupling: the gateway understands a wildcard route for each provider, so enabling a new provider or a new model is adjusting configuration, not touching the application.

For this to work, whoever calls the gateway uses a model identifier prefixed by the provider, not a loose alias — the prefix makes explicit which provider the call is destined for and lets wildcard routing resolve the target without a manual entry per model. The application talks to a single stable interface, in the OpenAI-compatible standard, while the choice of provider becomes a reversible configuration decision. Control over which models run and portability across vendors stop being conflicting goals: both live in the same layer.

At Horse Labs, the gateway's model_list is wildcard per provider (anthropic/* openai/* gemini/* xai/*) and the caller uses the prefixed id (e.g. anthropic/claude-opus-4-8) — adding or swapping a provider is configuration, not a rewrite.

Enforcement at the gateway

Enforcement at the gateway means checking the allowlist at the single point every call passes through: a non-approved model returns 403, before it reaches the provider.

An allowlist only counts if it's impossible to bypass. If approval were a convention — a list in a document each team is expected to respect — one direct call, a forgotten snippet of code, or a misconfigured agent would be enough to run a never-approved model, and no one would notice until the spend or the incident showed up. Enforcement at the gateway closes that gap by placing the check at the one point every call necessarily passes through: before forwarding to the provider, the gateway verifies that the model is on the allowlist for that access profile. It isn't, the call doesn't proceed — it gets a 403.

The important detail is that the governance decision (what's approved) and the enforcement point (the gateway) stay separate but connected: the allowlist is the source of truth, and what the gateway permits is the already-reconciled result of that decision. So approving or revoking a model in the allowlist takes effect at enforcement without rewriting any rule. The end result is that "which models may run" stops being an intention and becomes a technical guarantee: what wasn't approved simply doesn't execute.

At Horse Labs, approval is enforced at the gateway via LiteLLM's team.models (the already-reconciled result of the allowlist): a non-approved model returns 403, before it reaches the provider.

FAQ

What is AI model governance?

It's the set of controls that decides and enforces which models the organization may use — a default-OFF allowlist fed by the providers' live catalog — and routes each task to the right model wildcard per provider, with approval enforced at the gateway, without lock-in.

How do I switch model providers without rewriting the code?

By routing wildcard per provider: the application talks to a single OpenAI-compatible interface using a provider-prefixed model id, and enabling or swapping a provider becomes a configuration change — not a rewrite.

Talk about model governance