HORSE LABS · AI ENGINEERING
HL-CORE-2026-001
-23.5505 / -46.6333
PRIMORDIAL EDITION
— ARTIFICIAL INTELLIGENCE ENGINEERING

Your company already uses AI.
The real question: is it accelerating results — or accelerating chaos?

Most teams end up hostage to a single LLM provider — and now to the governance platforms that just got bought by big vendors, too. We build the infrastructure layer that separates your operation from all of them: no lock-in, real cost control, and governance with a roadmap of its own.

— CORE PULSE · REAL TIME

AI that senses before the gesture — and that you can actually measure.

— AI INFRASTRUCTURE, OPERATED

AI without process doesn't accelerate the business. It accelerates the chaos.

We build the infrastructure layer that separates your operation from the LLM providers: gateway, multitenancy, and governance. Switch models, optimize cost, and never get held hostage.


SYSTEM
ARTIFICIAL INTELLIGENCE ENGINEERING
FREQUENCY
432 HZ — ACTIVE PULSE
MARKET 2026
THE FIELD OPENED UP

The field opened up.

Acquired · May/2026

Portkey

Portkey was acquired by Palo Alto Networks in May 2026. Folded into Prisma AIRS. Its roadmap is now enterprise security.

Maintenance · Mar/2026

Helicone

Helicone was acquired by Mintlify in March 2026. In maintenance mode — no new features. Security patches only.

Independent · Active

Horse Labs

Our own roadmap. Self-hosted. No big vendor deciding what ships and what doesn't. We're the alternative for teams that don't want governance-platform lock-in.

01  THE MARKET PROBLEM
SOURCE: DEPENDENCY × EXPOSURE
0%
OF COMPANIES HAVE THEIR AI OPERATION COUPLED TO A SINGLE PROVIDER

It's not a lack of technology.
It's a lack of an exit.

Claude Code, Codex, Copilot. Any company can sign up in five minutes. The trouble shows up later: your team adapts to the tool's workflow, your processes get built around it — and before you notice, your operation belongs to the provider.

AI without governance doesn't scale the business — it scales the risk.

RISK 01 · UNPREDICTABLE COST

The provider deprecates the cheap model. You move to the expensive one with no say. No warning. No negotiation.

RISK 02 · UNCONTROLLED USAGE

Your people burn the company's tokens on tasks that have nothing to do with the business. You find out on the invoice.

RISK 03 · CASCADING DOWNTIME

If the provider goes down, your operation goes down with it. No fallback. No continuity.

RISK 04 · ZERO GOVERNANCE

Who used it. For what. How much it cost. For which project. Nobody can answer.

RISK 05 · UNOPTIMIZED COST

Claude Opus running a FAQ chatbot. You're paying 20× more than you need to.

02  WHAT CHANGES WITH HORSE LABS
THE THESIS · TWO ANGLES

AI is volatile infrastructure — not a business domain.

Our job is to keep that volatility away from your operation. We build the layer that separates the LLM providers from your day-to-day — switch models, optimize cost, and govern usage without rewriting a thing.

FROM DEPENDENCY TO CONTROL — THE SAME OPERATION, TWO WORLDS
STACK · WITHOUT HORSE LABS
COUPLED · EXPOSED
YOUR TEAM → CLAUDE CODE
hard lock-in · single provider
  • price: the provider decides
  • downtime: the operation stops
  • model: no choice
  • usage: invisible
  • governance: no one

STACK · WITH HORSE LABS
DECOUPLED · CONTROLLED
YOUR TEAM → LLM GATEWAY → Claude · GPT · Gemini
free routing
  • price: you decide
  • downtime: it switches models on its own
  • model: the best one per task
  • usage: tracked per operation
  • governance: you, in real time

→ ANGLE 01

AI as Infrastructure
For those who are going to build on AI seriously.

The technical base that decouples your operation from the providers — provisioned in your own house, under your own domain, ready to scale.

  • Gateway that abstracts the LLM providers
  • Provisioning with Ansible + Terraform
  • On-premise or in your own VPC — AWS · Azure · GCP
  • Per-tenant isolation and native observability
→ ANGLE 02

AI as a Service
For those who just want to buy a finished result.

You don't need to understand the technology. We deliver the result up and running — report, automation, or agent — with the metric agreed on from day one.

  • A delivered result, not a tool
  • Metric defined before we start
  • AI cost measured per operation
  • Continuous operation tracked by data
— PRODUCT DIFFERENTIATOR

Governance-first, not observability-first.

LLM observability tools focus on analyzing what already happened. Horse Labs focuses on controlling what's allowed to happen: budgets per team, permissions per operation, a hard stop before the overrun.

It's the same difference as between a security camera and an access-control door.
03  HOW IT WORKS
ARCHITECTURE AS AN ARGUMENT FOR TRUST

Structure is what separates results from improvisation.

Interchangeable LLM providers

The gateway abstracts away whoever sits behind it. Claude, GPT, Gemini, or a local model — switch without touching the operation.

AI cost monitored per operation

Every process carries a visible cost, with a budget ceiling per client. No surprise at the end of the month.
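
As a rough illustration of per-operation cost tracking, the sketch below prices each model call from its token counts and accumulates the total per client and operation. The model names, the prices, and the CostLedger class are hypothetical placeholders, not Horse Labs' implementation or any provider's real pricing.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical prices in USD per million tokens: (input, output).
# Placeholders for illustration only, not real provider pricing.
PRICE_PER_MTOK = {
    "model-large": (10.0, 30.0),
    "model-small": (0.5, 1.5),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single model call."""
    price_in, price_out = PRICE_PER_MTOK[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


@dataclass
class CostLedger:
    """Accumulates spend per (client, operation) pair."""

    totals: dict = field(default_factory=lambda: defaultdict(float))

    def record(self, client: str, operation: str, model: str,
               input_tokens: int, output_tokens: int) -> float:
        """Price one call, add it to the ledger, and return its cost."""
        cost = call_cost(model, input_tokens, output_tokens)
        self.totals[(client, operation)] += cost
        return cost

    def client_total(self, client: str) -> float:
        """Sum every operation recorded for one client."""
        return sum(v for (c, _), v in self.totals.items() if c == client)
```

With a ledger like this, the monthly figure for a client is a lookup over recorded operations rather than a reconstruction from the provider's invoice.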

Isolated environment per tenant

One client's data never mixes with another's. Isolation from the very first line.

Ready to scale with no rework

The same base serves one client or a hundred. Growing is configuration, not reconstruction.

STACK · HL-CORE
REAL-TIME FLOW
  • CLOUDFLARE TUNNEL: edge · secure ingress
  • API GATEWAY: routing · auth · tenant
  • LLM GATEWAY: budget per tenant · cost/op
  • MCP SERVERS: connectors · tools
  • AGENTS: support · sales · reports

Those who read architecture will see it and trust it. Those who don't can move on without losing the thread. The diagram above is the real structure, the same one behind every deliverable below.

04  THE PRODUCT · LLM GATEWAY
ONE LAYER · ALL THE CONTROL

The LLM Gateway is the product. Everything runs through it.

— THE RIGHT MODEL FOR EACH TASK

Opus for the dev. Haiku for the chatbot. Gemini for the documents.

You set which model handles each activity — based on what each does best — and the gateway routes it. Cost stops being premium by default.
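
In its simplest form, routing by activity can be pictured as a lookup table. This is a minimal sketch: the task names, the table, and the default tier are assumptions made for illustration, not the gateway's actual configuration format.

```python
# Hypothetical task-to-model table; the pairings mirror the examples above.
ROUTES = {
    "development": "opus",
    "chatbot": "haiku",
    "documents": "gemini",
}

# Assumption: a task with no explicit route falls back to the cheapest tier,
# so premium models are only used where someone asked for them.
DEFAULT_MODEL = "haiku"


def route(task_type: str) -> str:
    """Return the model configured for a task type, or the default tier."""
    return ROUTES.get(task_type, DEFAULT_MODEL)
```

Changing which model handles an activity then means editing one entry in the table, and the code that calls route() stays the same.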

— RESILIENCE

Provider down? Another model takes over.

No single point of failure. If one provider goes dark, the gateway reroutes to another and the operation keeps running.
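
The rerouting idea can be sketched as an ordered list of providers tried in turn. Everything here is hypothetical: the ProviderError type and the provider callables stand in for real client libraries, and a production gateway would also need timeouts, retries, and health checks.

```python
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Raised by a provider callable when it cannot serve the request."""


def complete_with_fallback(
    prompt: str,
    providers: List[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, response).

    A provider that raises ProviderError is skipped and the next one is
    tried. If every provider fails, one error listing all failures is raised.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

The caller only sees which provider answered. The order of the list is the fallback policy, so it can be changed without touching the calling code.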

— GOVERNANCE

Budget ceiling per tenant, an audit trail per person.

Every cost center has a limit. Every bit of usage has an owner. The intern's side project no longer lands on the company's bill.

— COST CENTERS

Cost Centers per team or client.

Each team, squad, or client runs with its own budget, allowed model, and independent rate limit. A budget overrun blocks automatically. Admin sees everything; dev sees only what's theirs.
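
The three rules described above (allowed model, budget ceiling, independent rate limit) can be sketched as a check that runs before a request is forwarded. The CostCenter class, its field names, and the one-minute sliding window are assumptions for illustration, not the product's data model.

```python
import time
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when a request breaks a cost center rule."""


@dataclass
class CostCenter:
    name: str
    budget_usd: float
    allowed_models: frozenset
    max_requests_per_minute: int
    spent_usd: float = 0.0
    _timestamps: list = field(default_factory=list)

    def authorize(self, model: str, estimated_cost_usd: float, now=None) -> None:
        """Raise PolicyViolation unless the request passes all three rules."""
        now = time.monotonic() if now is None else now
        if model not in self.allowed_models:
            raise PolicyViolation(f"{self.name}: model '{model}' is not allowed")
        if self.spent_usd + estimated_cost_usd > self.budget_usd:
            raise PolicyViolation(f"{self.name}: budget would be exceeded")
        # Keep only requests from the last 60 seconds (sliding window).
        self._timestamps = [t for t in self._timestamps if now - t < 60.0]
        if len(self._timestamps) >= self.max_requests_per_minute:
            raise PolicyViolation(f"{self.name}: rate limit reached")
        self._timestamps.append(now)

    def settle(self, actual_cost_usd: float) -> None:
        """Record the real cost once the provider has answered."""
        self.spent_usd += actual_cost_usd
```

Because authorize() runs before the call is made, an overrun is refused up front instead of showing up later on the invoice.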

● MULTITENANT / COST CENTER
EXAMPLE · AGENCY
AGENTS · SHARED POOL: copy · media · reporting · seo
LLM Infrastructure Layer: routing + budget
  • Client A · large project · model: opus · budget: high · 78%
  • Client B · small project · model: haiku · budget: lean · 44%
  • Client C · small project · model: sonnet · budget: medium · 60%

The same agents serve three clients. Budgets, models, and cost centers stay separate, according to each client's contract, all enforced through multitenancy.

05  OPERATIONAL CONNECTIVITY
YOUR OPERATION IN THE PALM OF YOUR HAND

Your entire operation fits in a WhatsApp conversation.

No dashboard nobody opens. You ask in the channel you already use and get, in real time, any report on the health of your own company.

06  SECURITY & GOVERNANCE
SECOPS · RBAC · VAULT

Infrastructure is also where AI stays secure.

Treating AI as infrastructure opens up a layer no provider delivers: credentials that never touch the model, access restricted by team and role, and security policies enforced in real time.

D-01 · VAULT + RBAC
CREDENTIALS + ACCESS
— SHIELDED CREDENTIALS

Vault + RBAC

Credentials never exposed in the agents. Each team accesses only what it needs — and the reporting agent bridges the two, with explicit permission.

D-02 · n8n + PERMISSIONS
REPORTING BY ROLE
— RIGHT DATA, RIGHT PERSON

Permission per report

Support can't pull the financial report. Leadership doesn't have to ask — it arrives on its own. Granular control over every piece of data the AI delivers.
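
Permission per report comes down to a role-to-report mapping that is checked before anything is delivered. The roles and report names below are hypothetical examples. How the real permissions are stored in n8n is not shown here.

```python
# Hypothetical role-to-report permissions. "support" and "leadership" come
# from the text above; "finance" and the report names are illustrative.
REPORT_PERMISSIONS = {
    "support": {"tickets"},
    "finance": {"financial", "costs"},
    "leadership": {"tickets", "financial", "costs"},
}


def can_view(role: str, report: str) -> bool:
    """Return True only if the role is explicitly granted the report."""
    return report in REPORT_PERMISSIONS.get(role, set())


def visible_reports(role: str) -> list:
    """List the reports a role may receive, in a stable order."""
    return sorted(REPORT_PERMISSIONS.get(role, set()))
```

An unknown role gets nothing by default, so every new report has to be granted explicitly.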

D-03 · GATEWAY SecOps
THREAT INTERCEPTED
— POLICY IN REAL TIME

SecOps at the gateway

A credential typed into the prompt is blocked before it reaches the model — and the manager gets the alert on WhatsApp instantly. The window to react exists because the infrastructure creates it.
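
Blocking a credential before it reaches the model can be sketched as a pattern scan on the prompt. The three patterns are illustrative only, and the alert callback is a hypothetical stand-in for the WhatsApp notification. A production filter would need a broader, maintained rule set.

```python
import re

# Illustrative patterns only, not an exhaustive secret-detection rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "inline_password": re.compile(
        r"(?i)\b(?:password|passwd|secret|api[_-]?key)\s*[:=]\s*\S+"
    ),
}


def find_secrets(prompt: str) -> list:
    """Return the names of every pattern that matches the prompt."""
    return [name for name, rx in SECRET_PATTERNS.items() if rx.search(prompt)]


def guard_prompt(prompt: str, alert) -> bool:
    """Return True if the prompt may be forwarded to the model.

    On a match the prompt is blocked and `alert` (a hypothetical callback,
    e.g. a message to the manager) is called with the matched rule names.
    The prompt text itself is never included in the alert.
    """
    hits = find_secrets(prompt)
    if hits:
        alert("blocked prompt: matched " + ", ".join(hits))
        return False
    return True
```

The alert carries only the rule names, so the notification does not leak the credential it just blocked.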

07  WHAT WE DELIVER
04 FRONTS · ONE INFRASTRUCTURE

The problem first. The solution after.

C-01

Branded reports, right in WhatsApp

Your data becomes a finished report — in your own visual identity — delivered where the client already is. No lost PDF, no dashboard nobody opens.

DELIVERY · RECURRING · AUTOMATIC
C-02

Automation of operational processes

Repetitive tasks that drain the team move off manual and start running on their own — with every step logged and a human checkpoint where it matters.

OPERATIONS · LESS REWORK
C-03

Conversational agents for support and sales

Support that qualifies, answers, and sells — 24/7, in your company's voice. Not a decision-tree chatbot: an agent that understands context and acts.

SUPPORT · SALES · 24/7
C-04

AI infrastructure implementation

We build the full base: isolated, monitored, and ready to scale. You get big-company AI capability — running under your own domain.

FOUNDATION · ISOLATED · SCALABLE
08  WHO IT'S FOR
03 PROFILES · 01 STRUCTURE

If you recognize yourself here, Horse Labs was built for you.

PROFILE 01

Engineering teams

With 5 to 50 devs actively using AI. You approved the AI budget, but you don't know who spends what. Horse Labs gives visibility and control by team, by project, by operation.

PROFILE 02

Technical leaders

CTOs and VPs of Engineering who have to answer to finance: how much did we spend on AI this month, and why. Without relying on a spreadsheet or OpenAI's raw billing.

PROFILE 03

Companies with AI in production

That are past the experiment phase and need real governance: rate limits, audit trail, automatic blocking, self-hosted deploy. No big-vendor platform lock-in.

09  ILLUSTRATIVE SCENARIOS
THE MECHANICS · NOT THE MARKETING

How the problem becomes a solution, in practice.

Illustrative scenarios — the mechanics of each situation, not real cases. When there's a client result, it goes here with a number and context.

Marketing agency

Same agents, three clients

THE PROBLEM
One provider, one account. Impossible to separate what each client consumed, or to bill the right cost to each contract.
THE INFRA SOLVES IT
Multitenant at the gateway: budget, model, and cost center per client. Each project's cost is visible right in WhatsApp.
Services SMB

Support without paying premium

THE PROBLEM
A FAQ chatbot running on the most expensive model, usage with no ceiling, and zero visibility into who spends what.
THE INFRA SOLVES IT
The gateway routes support to a lightweight model, applies a budget ceiling, and attributes every bit of usage to a person.
Startup · product team

Build without getting held hostage

THE PROBLEM
A product coupled to a provider's proprietary features: any price or model change hits the operation directly.
THE INFRA SOLVES IT
Infra provisioned in the startup's VPC with Ansible and Terraform. Switching models becomes configuration, not migration.
10  WHAT IT'S LIKE TO WORK WITH HORSE LABS
04 STEPS · NO SURPRISES

From diagnosis to continuous operation.

STEP 01

Diagnosis

We map the current operation: where time leaks, what can be measured, and where AI creates real impact.

STEP 02

Proposal

Scope and metrics defined before we start. You know what will be delivered and how it will be measured.

STEP 03

Implementation

We build and integrate on top of the isolated, monitored, ready-to-scale infrastructure.

STEP 04

Continuous operation

We don't vanish after delivery. We operate on data, adjust, and keep the result alive.

11  FREQUENTLY ASKED QUESTIONS
STRAIGHT ANSWERS

What people ask before getting started.

What is Horse Labs' LLM Gateway?

It's the infrastructure layer between your operation and the LLM providers. It routes each task to the best model — Claude, GPT, Gemini, or a local model — applies a budget ceiling per client, and records the cost of every operation.

How does Horse Labs avoid AI provider lock-in?

Your operation talks to the gateway, not directly to the provider. Switching models becomes configuration, not migration — if one provider raises its price or goes down, another takes over without rewriting anything.

Does the infrastructure run inside my own company?

Yes. We provision on-premise or in your own VPC — AWS, Azure, or GCP — with Ansible and Terraform, isolated per tenant and under your own domain.

How does per-client cost control work?

The model is multitenant: each client or cost center has its own budget, model, and usage trail. You track cost per operation — including over WhatsApp.

Do I need to understand the technology to hire you?

No. In the AI as a Service model we deliver the finished result — report, automation, or agent — with the metric agreed before we start. You hire the result, not the tool.

What's the difference between Horse Labs and Portkey or Helicone?

Portkey was acquired by Palo Alto Networks in May 2026 and folded into Prisma AIRS — an enterprise security product, with a roadmap outside the original team's control. Helicone was acquired by Mintlify and has been in maintenance mode since March 2026. Horse Labs is independent, with its own roadmap and a focus on organizational governance — not just observability. On top of that: unlimited logging with no charge per record volume, and self-hosted as a first-class citizen, not an enterprise afterthought.

Is Horse Labs a platform or a consultancy?

Both, in stages: we deliver the infrastructure implementation (a service) and operate the platform that governs it continuously (a product). The client buys the implementation once and uses the platform indefinitely. For those who already have a technical team: you can implement it in-house using our docs and use just the governance layer as SaaS.

— WHITEPAPER · EXECUTIVE EDITION

The operational layer of intelligence.

Why the company that wins won't be the one that owns the best model — but the one that can switch models without stopping the operation. Three stages of maturity, from no governance to governed infrastructure.

  • The eight symptoms of an exposed AI operation.
  • Why centralizing on a single vendor is false governance.
  • The target architecture: the LLM Gateway as the single operational layer.
FOR CTOs · ARCHITECTS · TRANSFORMATION LEADERS
PDF · 13 pages · Executive Edition
— A CONVERSATION, NOT A PITCH

No long presentation.
A 30-minute conversation.

Enough to tell whether it makes sense — and if it doesn't, you walk away clear on what you actually need.

Wellington Nascimento
FOUNDER · HORSE LABS