Managed Agents

We keep your AI working in production.

Most AI projects ship and then drift. Accuracy slips. Models change. Edge cases pile up. Managed Agents is the recurring service that keeps your workflows running well — monitoring, tuning, model updates, exception handling. Monthly retainer per agent.

30–60 days
Without tuning before accuracy starts to slip
3–6 months
Between major model releases that change the math
One Slack channel
With the engineers who actually run your agents
What We Run

The AI Work That Has to Keep Working After Launch

Five agent types, one team. The prompts, the models, the exception queue, the integrations — all of it.

Document & Image Workflows
Extraction pipelines that pull data from documents and images: claims forms, bills of lading, ACORD forms, medical records, invoices. We run accuracy monitoring, exception review, model upgrades, and upkeep of the integrations into the systems your team uses.
Content & Drafting Agents
Research agents, blog drafting workflows, brief generators, video script writers, ambient meeting capture. We tune the prompts to your voice, keep the guardrails in place, and update the models as new ones are released.
Search & Knowledge Agents
RAG chatbots, knowledge-base Q&A, internal search agents over your docs. We watch retrieval quality, re-index on a cadence, tune chunking and reranking, and fix the answers that quietly go wrong.
Internal Workflow Agents
Sales-call summarizers, inbox triagers, ticket classifiers, lead scorers, contract reviewers. The "boring" agents that save real hours when they keep working — and fail quietly when they do not.
Custom App Agents
The agents inside Apps you have built on Alpha Agent or any custom AI app. We treat the agent layer as a managed surface separate from the UI and run it like any other production system.
Monthly Retainer

What's Included Every Month

The work that has to happen on a running agent whether anyone is asking for it or not.

Accuracy & quality monitoring
Continuous evaluation against a golden set, drift detection, regression alerting. We catch slips before your team or your customers do.
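As a rough illustration of what evaluating against a golden set with regression alerting can look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `GoldenCase` type, the exact-match scoring, the toy agent, the 0.95 baseline, and the 0.02 tolerance are hypothetical and do not describe the service's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenCase:
    """One labeled example: an input and the output a reviewer approved."""
    input_text: str
    expected: str


def accuracy(agent: Callable[[str], str], golden_set: list[GoldenCase]) -> float:
    """Fraction of golden cases where the agent output matches exactly."""
    if not golden_set:
        raise ValueError("golden set is empty")
    hits = sum(1 for case in golden_set if agent(case.input_text) == case.expected)
    return hits / len(golden_set)


def regressed(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """True when accuracy fell more than `tolerance` below the baseline."""
    return (baseline - current) > tolerance


def toy_agent(text: str) -> str:
    """Hypothetical stand-in for a real agent call."""
    return text.strip().upper()


GOLDEN = [
    GoldenCase(" inv-001 ", "INV-001"),
    GoldenCase("inv-002", "INV-002"),
    GoldenCase("Inv-003", "INV-003"),
    GoldenCase("inv_004", "INV-004"),  # the toy agent gets this one wrong
]

score = accuracy(toy_agent, GOLDEN)
alert = regressed(score, baseline=0.95)  # assumed baseline from an earlier run
```

In practice exact match is often too strict for free-text outputs, so the scoring function is the part most likely to be swapped out per workflow.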
Cost & latency tracking
Per-call cost, p50/p95 latency, token volume by workflow. Right-sizing model selection and prompt length so spend stays predictable.
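The metrics named above can be computed from ordinary call logs. The sketch below assumes a hypothetical `CallRecord` log shape and made-up token prices; it uses a nearest-rank percentile, which is one of several common percentile definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical log entry for one model call."""
    workflow: str
    latency_ms: float
    input_tokens: int
    output_tokens: int


# Assumed prices in USD per 1,000 tokens; real prices vary by model and vendor.
PRICE_PER_1K_INPUT = 0.001
PRICE_PER_1K_OUTPUT = 0.002


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def call_cost(record: CallRecord) -> float:
    """Cost of a single call in USD under the assumed prices."""
    return (record.input_tokens / 1000 * PRICE_PER_1K_INPUT
            + record.output_tokens / 1000 * PRICE_PER_1K_OUTPUT)


def tokens_by_workflow(records: list[CallRecord]) -> dict[str, int]:
    """Total token volume (input + output) per workflow."""
    totals: dict[str, int] = {}
    for r in records:
        totals[r.workflow] = totals.get(r.workflow, 0) + r.input_tokens + r.output_tokens
    return totals


# Ten synthetic calls with latencies of 100, 200, ..., 1000 ms.
records = [CallRecord("invoices", 100.0 * i, 1000, 500) for i in range(1, 11)]
latencies = [r.latency_ms for r in records]

p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
cost = call_cost(records[0])
volume = tokens_by_workflow(records)
```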
Prompt & rail tuning
Ongoing prompt revisions against the edge cases that surface in production. Guardrails added where the model keeps trying to go wrong.
Model upgrades & migrations
When a new model ships that changes the math — cheaper, smarter, faster — we evaluate it against your golden set, migrate when it wins, and roll back if it doesn't.
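One way to make "migrate when it wins, roll back if it doesn't" concrete is a pair of explicit decision rules. The sketch below is an assumption about how such rules could be written, not the service's actual policy: the `EvalResult` fields, the model names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical summary of one model's run over the golden set."""
    model: str
    accuracy: float        # fraction correct on the golden set
    cost_per_call: float   # USD
    p95_latency_ms: float


def should_migrate(incumbent: EvalResult, candidate: EvalResult,
                   max_accuracy_drop: float = 0.0) -> bool:
    """Candidate wins if accuracy holds and it improves accuracy, cost, or latency."""
    if candidate.accuracy < incumbent.accuracy - max_accuracy_drop:
        return False
    return (candidate.accuracy > incumbent.accuracy
            or candidate.cost_per_call < incumbent.cost_per_call
            or candidate.p95_latency_ms < incumbent.p95_latency_ms)


def should_roll_back(pre_migration_accuracy: float, live_accuracy: float,
                     tolerance: float = 0.01) -> bool:
    """Roll back when live accuracy falls more than `tolerance` below the old level."""
    return live_accuracy < pre_migration_accuracy - tolerance


incumbent = EvalResult("model-a", accuracy=0.92, cost_per_call=0.010, p95_latency_ms=1800)
cheaper = EvalResult("model-b", accuracy=0.93, cost_per_call=0.004, p95_latency_ms=1200)
weaker = EvalResult("model-c", accuracy=0.85, cost_per_call=0.001, p95_latency_ms=600)

migrate_to_cheaper = should_migrate(incumbent, cheaper)
migrate_to_weaker = should_migrate(incumbent, weaker)
```

Gating on accuracy first means a cheaper or faster model never wins on price alone, which matches the idea of evaluating against the golden set before migrating.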
Exception queue & escalations
A queue for the cases the agent flags as uncertain. We staff the review (or train your team to do it), feed corrections back into the eval set, and tune until the exception rate drops.
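The loop described above (flag uncertain cases, review them, feed corrections back into the eval set) can be sketched in a few lines of Python. This assumes the agent reports a confidence score per case, which not every agent does; the `Prediction` and `ExceptionQueue` types and the 0.8 threshold are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    """Hypothetical agent output with a self-reported confidence in [0, 1]."""
    case_id: str
    input_text: str
    output: str
    confidence: float


@dataclass
class ExceptionQueue:
    threshold: float = 0.8  # assumed cutoff below which a case goes to review
    pending: dict[str, Prediction] = field(default_factory=dict)
    eval_set: list[tuple[str, str]] = field(default_factory=list)  # (input, corrected output)
    total_seen: int = 0
    total_flagged: int = 0

    def route(self, pred: Prediction) -> bool:
        """Queue low-confidence predictions for review; return True if queued."""
        self.total_seen += 1
        if pred.confidence < self.threshold:
            self.pending[pred.case_id] = pred
            self.total_flagged += 1
            return True
        return False

    def resolve(self, case_id: str, corrected_output: str) -> None:
        """Record the reviewer's correction and add it to the eval set."""
        pred = self.pending.pop(case_id)
        self.eval_set.append((pred.input_text, corrected_output))

    def exception_rate(self) -> float:
        """Share of all routed cases that were flagged for review."""
        return self.total_flagged / self.total_seen if self.total_seen else 0.0


queue = ExceptionQueue(threshold=0.8)
queue.route(Prediction("c1", "Invoice 1001", "total=120.00", 0.95))
queue.route(Prediction("c2", "Invoice 1002 (blurry scan)", "total=12.00", 0.41))
queue.resolve("c2", "total=120.00")
```

Each resolved case becomes a new labeled example, so the eval set grows in exactly the areas where the agent was least sure.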
Slack channel with engineers
A shared Slack channel with the engineers who actually run your agents. Not a ticket queue. Not a bot. The same people every time.
Vendor & API management
OpenAI / Anthropic / Google / open-weight model accounts, rate limits, quota negotiations, breaking-change handling, key rotation. We own the integrations.
Monthly status report
Accuracy trend, cost trend, exception rate, what shipped, what broke and was fixed, what is on the radar. The receipts your stakeholders ask for.
Per-Outcome SOWs

The Bigger Pushes — Scoped and Quoted Up Front

The retainer covers the routine. Work that ships a new outcome — a new capability, a new integration, a major migration — gets its own SOW with milestones and a fixed price.

New agent capabilities

A new document type the workflow handles, a new entity to extract, a new language to support, a new decision path inside the agent.

New integrations

Push into a new system of record (Salesforce, Encompass, Workday, ServiceNow). New event sources. New webhook destinations.

Model platform migrations

OpenAI → Anthropic, hosted → open-weight, single-model → router, cloud → VPC. Scoped against your golden eval set so accuracy stays measured.

Eval-set buildouts

A scoped engagement to build a real evaluation harness against historic data — so accuracy is a number you trust, not a feeling.

Compliance & audit work

SOC 2 / HIPAA / SOX evidence, prompt logging, PII redaction, model-card documentation, vendor security reviews. The artifacts your auditor will ask for.

Throughput scaling

10× the volume on the same accuracy budget. Batching, caching, model routing, queue redesign — scoped against measured cost and latency targets.

Build vs. Run

Why AI Agents Need Active Management

An AI agent is not software you ship and walk away from. The thing that worked on day one slips on day 60. Here is why.

What gets you to production

Building

Building the agent is the easy part — the part everyone budgets for and the part AI coding tools help with. It is also the part that does not compound.

  • A working prompt against representative samples
  • A clean integration into the systems of record
  • A demo that passes a stakeholder review
  • A pilot batch with acceptable accuracy
What keeps it working

Running the agent

Running the agent is the work that compounds. It is the difference between an AI workflow that delivers ROI for years and one that quietly degrades until somebody notices the numbers stopped looking right.

  • The edge cases that only show up at scale
  • Drift as your data, workflows, and intake change
  • New models that change the cost-quality frontier
  • Vendor API changes, deprecations, rate-limit hits
  • Compliance artifacts and evidence over time
Side by Side

vs. In-House ML Team, vs. DIY, vs. Big-Consultancy Retainer

Four ways to run AI agents in production. The trade-offs are real — here is where each one lands.

Cost structure
  • In-house ML team: $250K–$400K/yr per ML engineer × 2+
  • DIY / one engineer: One overloaded engineer wearing many hats
  • Big-consultancy retainer: $300–$500/hr blended, six-figure retainers
  • Last Rev Managed Agents: Monthly retainer per agent + per-outcome SOWs

Time to start
  • In-house ML team: 6–12 months hiring
  • DIY / one engineer: Same day, until they leave
  • Big-consultancy retainer: 6–12 weeks scoping & SOW
  • Last Rev Managed Agents: 2 weeks to onboarded

Who notices when accuracy slips
  • In-house ML team: Your team, when they have time
  • DIY / one engineer: A customer or auditor
  • Big-consultancy retainer: Best-effort, billed by the hour
  • Last Rev Managed Agents: Our monitoring, before your team does

When a new model ships
  • In-house ML team: Whoever read the most posts wins the argument
  • DIY / one engineer: Skipped until something breaks
  • Big-consultancy retainer: New SOW, rebid the project
  • Last Rev Managed Agents: Evaluated against your golden set, migrated when it wins

Exception handling
  • In-house ML team: Built once, then drifts
  • DIY / one engineer: No queue; issues pile up in email
  • Big-consultancy retainer: Out of scope unless you pay extra
  • Last Rev Managed Agents: Continuously reviewed, fed back into eval set

Vendor & API management
  • In-house ML team: Your team owns every vendor relationship
  • DIY / one engineer: Whoever set it up has the keys
  • Big-consultancy retainer: Variable, depends on partner
  • Last Rev Managed Agents: We own the integrations and the keys, with you
Common Questions

What Teams Ask Before Handing Us Their Agents

We already built an agent in-house. Can you just take it over?
Yes. The most common starting point is an agent already in production whose original build team has rotated off. Onboarding is typically 2 weeks: code audit, prompt audit, eval-set review, vendor review, on-call handoff. Then we run it.
What model platforms do you support?
OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, and the major open-weight models on Together / Fireworks / Groq / self-hosted vLLM. We will not take over an agent on a platform we cannot responsibly run — if it is on something we have not vetted, we say so up front.
Do we need a 'golden eval set' before you can run it?
Eventually yes, but you do not need one to start. Most agents we take over do not have one. Building it is usually one of the first per-outcome SOWs after onboarding, against your historic data and your team's corrections.
How is this different from an MLOps platform we already pay for?
MLOps platforms (LangSmith, Helicone, Arize, Braintrust, etc.) give you observability. We sit on top of those tools and do the actual work — review the dashboards, tune the prompts, migrate the models, work the exception queue. The platform tells you the agent is slipping. We make it stop slipping.
How is this different from a big-consultancy AI retainer?
Traditional consultancies bill hours and rebid every change. We bill a fixed monthly retainer per agent, keep the same engineers on it month over month, and scope per-outcome SOWs for the bigger pushes. We are also opinionated: we tell you when an agent should be retired instead of billing to keep running it.
We're worried about handing over our prompts and data.
We sign mutual NDAs by default. Customer-owned IP stays customer-owned — the prompts and the eval data are yours, not ours. We work inside your accounts (your OpenAI / Anthropic keys, your data warehouse, your vector store) — we do not proxy your data through ours.
What does pricing actually look like?
Monthly retainer scales with agent count, volume, and accuracy SLA. Typical range is $5K–$25K/month per agent at moderate volume, with larger ranges for high-volume / regulated workloads. Per-outcome SOWs are scoped and quoted up front, with milestone-based payments.
Will you also build new agents, or only run existing ones?
Both — but the framing is the same as Managed Web. Net-new builds run as a per-outcome SOW that flows into the retainer once the agent goes live. We do not do build-only engagements that hand off to your team. The work that compounds is running the agent after launch.

Two Ways to Start

Take the AI assessment for a structured read on whether your agents are a fit. Or send us the agent or workflow and we'll come back with an audit and a retainer proposal.