How the product flow works
Free AI Cost Firewall → VCAL Server
The products are designed to work as a progression, not as competing choices.
Teams typically start with AI Cost Firewall because it is free, fast to deploy,
and immediately useful. As semantic caching proves its value, VCAL Server becomes
the production cache layer for deeper optimization and stronger operational control.
Step 1
Start with the gateway
Put AI Cost Firewall in front of your LLM application to reduce duplicate requests,
observe traffic, and understand where spend is being wasted.
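To make the idea concrete, here is a minimal sketch of what a semantic-caching gateway does before forwarding a request to the model: embed the prompt, compare it to previously answered prompts, and serve the stored answer when similarity clears a threshold. All names here are illustrative, and the toy bag-of-words embedding stands in for the real embedding model a production gateway would use; this is not AI Cost Firewall's actual API.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector.
    A real gateway would use a sentence-embedding model instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Minimal semantic cache: return a stored answer when a new
    prompt is similar enough to one already answered."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def lookup(self, prompt):
        vec = embed(prompt)
        for stored_vec, answer in self.entries:
            if cosine(vec, stored_vec) >= self.threshold:
                return answer  # cache hit: the model call is skipped
        return None  # cache miss: forward to the LLM as usual

    def store(self, prompt, answer):
        self.entries.append((embed(prompt), answer))

cache = SemanticCache(threshold=0.8)
cache.store("what is the capital of france", "Paris")
hit = cache.lookup("what is the capital of france please")   # near-duplicate -> "Paris"
miss = cache.lookup("summarize this contract")               # unrelated -> None
```

The key design point the gateway adds over an exact-match cache is the similarity threshold: paraphrased duplicates still hit, while unrelated prompts fall through to the model.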
Step 2
Validate semantic reuse
Use real traffic and metrics to confirm where semantic caching meaningfully reduces
latency and model usage.
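Validation at this step is simple arithmetic over the gateway's metrics. A sketch, with illustrative numbers rather than real product metrics: every cache hit is a model call that was never made, so avoided spend is hits times average cost per call.

```python
def cache_savings(requests, hits, avg_cost_per_call):
    """Estimate the spend avoided by semantic cache hits.
    Inputs are illustrative placeholders, not product metrics."""
    hit_rate = hits / requests               # fraction of traffic served from cache
    saved = hits * avg_cost_per_call         # model calls avoided x cost per call
    return hit_rate, saved

# Example: 10,000 requests, 3,200 served from cache, $0.002 per model call.
rate, saved = cache_savings(requests=10_000, hits=3_200, avg_cost_per_call=0.002)
# rate = 0.32, saved = 6.40 (dollars)
```

If the measured hit rate on real traffic is negligible, semantic caching is not paying for itself on that workload, and there is no case for scaling it up in Step 3.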
Step 3
Scale with VCAL Server
Add VCAL Server when semantic caching becomes a core infrastructure function that
needs persistence, operational maturity, and enterprise deployment options.
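One way to picture the persistence requirement: an in-process cache is lost on every restart or deploy, while a production cache layer keeps its entries in durable storage. The sketch below uses SQLite purely to illustrate the idea; it is not VCAL Server's actual storage engine or API.

```python
import sqlite3

class PersistentCache:
    """Sketch of a cache whose entries survive process restarts,
    backed by SQLite. Illustrative only; a production cache layer
    would add eviction, TTLs, and similarity search."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (prompt TEXT PRIMARY KEY, answer TEXT)"
        )

    def store(self, prompt, answer):
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?)", (prompt, answer)
        )
        self.db.commit()  # durable once committed

    def lookup(self, prompt):
        row = self.db.execute(
            "SELECT answer FROM cache WHERE prompt = ?", (prompt,)
        ).fetchone()
        return row[0] if row else None

cache = PersistentCache()
cache.store("what is the capital of france", "Paris")
```

With a file path instead of `:memory:`, the same entries are available to a fresh process, which is the property that distinguishes an infrastructure-grade cache from an in-memory one.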