How much can I expect to save?
Typical clients reduce AI spend by 30-60% in the first engagement. Model and token optimisation alone can cut costs by 40-70% without affecting output quality. Infrastructure changes (self-hosted inference, spot instances, cross-cloud arbitrage) often add another 20-40% on top. Every engagement starts with a baseline audit so savings are measured, not estimated.
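As a rough illustration of how the percentages above combine: if infrastructure savings are read as applying to the spend that remains after model and token optimisation (an assumption; a simple sum would exceed 100% at the top of both ranges), the combined effect can be sketched as below. The function name `combined_saving` is hypothetical, and the figures are the range endpoints quoted above for a workload where both kinds of change apply, not a forecast for any particular client.

```python
def combined_saving(*stage_savings: float) -> float:
    """Overall fractional saving when each stage cuts the spend left by the previous one.

    Assumption: stages compound multiplicatively rather than adding up.
    """
    remaining = 1.0  # fraction of the original spend still being paid
    for saving in stage_savings:
        if not 0.0 <= saving <= 1.0:
            raise ValueError(f"saving must be between 0 and 1, got {saving}")
        remaining *= 1.0 - saving
    return 1.0 - remaining


# Low end of both ranges: 40% from model/token work, then 20% from infrastructure.
low = combined_saving(0.40, 0.20)
# High end of both ranges: 70%, then 40%.
high = combined_saving(0.70, 0.40)

print(f"Combined saving on an affected workload: {low:.0%} to {high:.0%}")
```

Under this reading, 40% followed by 20% gives 52% rather than 60%, because the second saving applies to a smaller bill. Actual results depend on how much of total spend each change touches, which is what the baseline audit measures.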
How long does an AI cost optimisation engagement take?
A focused spend audit and optimisation roadmap typically takes 2-4 weeks. Implementation time for high-impact changes depends on scope: model routing and prompt optimisation can roll out in days, while infrastructure migration (e.g., moving to self-hosted inference) may take 2-4 weeks. We also offer a rapid 1-week assessment for organisations that need quick visibility into their spending.
Do you work with any AI platform or provider?
Yes. We are vendor- and platform-agnostic. We optimise costs across OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Together AI, self-hosted models, and any other provider in your stack. Our recommendations are based purely on cost, quality, and latency trade-offs for your specific workloads, not on partnerships or commissions.
What is shadow AI and why does it matter?
Shadow AI refers to AI tools, API subscriptions, and model usage that teams adopt without IT or procurement oversight. In most organisations, this accounts for 30-50% of total AI spend. It creates security, compliance, and cost risks — especially under regulations like the Vietnam AI Law. Our audit process discovers shadow AI and brings it under governance without disrupting team productivity.
Do you also implement the optimisation changes?
Yes. Many clients start with an audit and roadmap, then engage us to implement the highest-impact changes: model routing systems, prompt optimisation, caching infrastructure, or self-hosted inference deployment with Rust engines (Candle, xinfer). The audit phase stands on its own, and implementation is always optional; there is no obligation to continue after the audit.
How is this different from your AI Strategy service?
AI Strategy focuses on identifying where to apply AI for maximum business value — use case selection, roadmap development, build-vs-buy analysis. AI Cost Optimization focuses on maximising efficiency of existing AI investments — reducing waste, right-sizing models, and optimising infrastructure. They complement each other: Strategy tells you what to build, Cost Optimization tells you how to run it efficiently. Many clients do both.