The Gateway Flat-Stack Built

A doctrine has to ship

A few weeks ago we published the Flat-Stack Manifesto, the purist architecture model that aims for Zero Compute. To be honest, this kind of rigor is not easy to maintain, but it's the only way we could build the Airbrx Intelligent Data Gateway and deliver on the promise to cut large data warehouse compute WITHOUT changing the warehouse itself.

I almost hate to say the Airbrx Gateway is what a native flat-stack application looks like ("native" is rapidly heading to the banned word list), but this isn't a clever feature stapled onto a warehouse. It is not a caching tier someone glued in front of Snowflake or Databricks. It is the doctrine itself, in production, doing the one job the rest of the industry forgot, or was unwilling, to do.

Reduce the compute. Reduce the cost.

The data industry's instinct, for decades, has been the opposite of zero compute: better answers cost more compute. Bigger warehouse. More tiers. More vendors. Faster GPUs. Our instinct is the inversion — the fastest, cheapest, safest query is the one that never had to run at all. The gateway is the thing that makes that bet real.

What the gateway actually does

I know… architecturally this sounds counterintuitive: we're going to put ANOTHER layer into the data path. But in reality we are pulling the cache out of the giant compute beast. We're not building more to save more; we're bringing the important data closer to your BI and reporting tools so we don't have to wake the beast of a data warehouse.

You point your BI tools at the gateway instead of your warehouse. Same SQL, same drivers, same credentials. The gateway looks at the query, decides whether the answer is already known, and either hands it back instantly or forwards the query downstream. When the answer is known, the warehouse never wakes. When it isn't, the warehouse runs exactly the query it would have run anyway.
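
To make that decision concrete, here is a minimal sketch of the hit-or-forward path. It is not the gateway's code: ToyGateway, fake_warehouse, and the (user, sql) key are hypothetical names invented for illustration, and a plain dict stands in for wherever the real answers live.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple[str, int]]


class ToyGateway:
    """Illustrative only: every name in this class is hypothetical."""

    def __init__(self, run_on_warehouse: Callable[[str], Rows]) -> None:
        self._run_on_warehouse = run_on_warehouse
        # A dict stands in for the answer store so the sketch runs anywhere.
        self._answers: Dict[Tuple[str, str], Rows] = {}
        self.warehouse_calls = 0

    def query(self, user: str, sql: str) -> Rows:
        key = (user, sql)  # assumption: same user + same SQL text = same question
        if key in self._answers:
            # Seen before: hand the answer back, the warehouse stays asleep.
            return self._answers[key]
        # Not seen: forward the query unchanged and remember the result.
        self.warehouse_calls += 1
        rows = self._run_on_warehouse(sql)
        self._answers[key] = rows
        return rows


def fake_warehouse(sql: str) -> Rows:
    """Stand-in for the real warehouse; returns canned rows."""
    return [("emea", 1200), ("apac", 950)]


gateway = ToyGateway(fake_warehouse)
sql = "SELECT region, SUM(total) FROM orders GROUP BY region"
first = gateway.query("ana", sql)   # miss: forwarded to the warehouse
second = gateway.query("ana", sql)  # hit: answered without the warehouse
print(gateway.warehouse_calls)
```

Two identical questions, one warehouse call. Everything else in this post is about making that second path safe, precise, and cheap.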

"Ah," you say, "But you MUST need compute to evaluate the query!" To which I say, not really. That compute was already spent at the warehouse, we just tagged it and bagged it. You come in with the same request and our infinitesimal compute (compared to a XXXXL Databricks warehouse) says, "oh, I've seen this before…" and knows exactly where to go to get exactly the same response without having to spin up a data center.

The gateway accepts a query and either returns a cached answer or forwards the query to your warehouse. That is the entire surface area. One sentence. It hides a list of properties that most "data acceleration" products quietly cannot deliver:

It scales to zero. When no one is querying, nothing is running, and nothing is billing.
It keeps the warehouse asleep. Repeat questions get repeat answers without a single warehouse credit burned.
It busts cache surgically. When the underlying data changes, only the affected answers go — not the whole cache, and not by guesswork.
It is the security boundary. Auth, scope, masking, and audit happen at one place: the place the traffic already passes through.
It runs anywhere. AWS, Azure, on prem, in a Docker container on a laptop. Same binary, same behavior.

Every one of those properties is possible only because we follow the Flat-Stack Manifesto and make every choice inside its constraints. Take any of them away and the gateway starts to sprawl and becomes the thing it was built to replace.

Why only a flat-stack gateway could do this

I think it's worth walking through these one at a time, because each is a place where the enterprise-stack instinct would have killed the property before it shipped.

It scales to zero — because it is stateless

The gateway holds no configuration of its own. It carries no session that has to be warm. It owns no in-memory state another replica would have to coordinate with. Every rule it applies is fetched on demand; every cached answer lives in open storage, not in process memory. So when traffic stops, the gateway stops too — no drain, no warm pool, no idle compute keeping a seat warm in case someone shows up.
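
A rough sketch of what "stateless" means in practice, under loud assumptions: a temporary directory plays the role of open storage, answers are stored as JSON files, and handle_query and fake_warehouse are made-up names. The point it illustrates is that the handler is a plain function, so nothing in process memory has to survive between calls.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, List

calls: List[str] = []


def fake_warehouse(sql: str) -> List[list]:
    """Stand-in for the real warehouse; records each time it is woken."""
    calls.append(sql)
    return [["emea", 1200]]


def handle_query(storage_dir: Path, key: str, sql: str,
                 forward: Callable[[str], List[list]]) -> List[list]:
    """Hypothetical stateless handler: all state lives in storage_dir."""
    answer_path = storage_dir / f"{key}.json"
    if answer_path.exists():
        # The answer is read from storage, not from process memory.
        return json.loads(answer_path.read_text())
    rows = forward(sql)
    answer_path.write_text(json.dumps(rows))
    return rows


with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp)
    sql = "SELECT region, total FROM sales"
    first = handle_query(storage, "k1", sql, fake_warehouse)
    # A "second replica" is just another call: there is nothing to hand over.
    second = handle_query(storage, "k1", sql, fake_warehouse)
```

Because the function keeps nothing between calls, there is no instance to keep warm and no replica to synchronize; stopping it costs nothing.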

The enterprise-stack version of this would have been a cluster with sticky sessions, a coordination layer to keep replicas in sync, a warm pool sized for peak load, and an autoscaler that argued with the warm pool. None of that scales to zero. All of that bills you while you sleep. We deliberately chose not to build that money pit.

It keeps the warehouse asleep — because keys are predictable

The cheapest query is the one that never ran. The gateway can answer from a cache because the cache key is reproducible: the same user, the same tenant, the same statement, deterministically becomes the same key. No probabilistic match. No fingerprinting heuristic. No "close enough" semantic search. If we have seen this question before from someone allowed to ask it, we have the answer — and the warehouse stays asleep.
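
One way to get a reproducible key is to hash a canonical encoding of the three attributes named above. This is a sketch, not the gateway's actual scheme: the cache_key function, the JSON encoding, and the choice of SHA-256 are all assumptions, and it treats the statement byte for byte, with no normalization.

```python
import hashlib
import json


def cache_key(tenant: str, user: str, statement: str) -> str:
    """Hypothetical key derivation: same inputs always give the same key."""
    # Sorted keys and fixed separators give one unambiguous byte string
    # per (tenant, user, statement) triple.
    payload = json.dumps(
        {"tenant": tenant, "user": user, "statement": statement},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = cache_key("acme", "ana", "SELECT COUNT(*) FROM orders")
key_b = cache_key("acme", "ana", "SELECT COUNT(*) FROM orders")
key_c = cache_key("acme", "bo", "SELECT COUNT(*) FROM orders")
```

Here key_a and key_b are identical and key_c is not: there is no scoring and no threshold, just an exact lookup that either matches or doesn't.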

A sleeping warehouse doesn't bill. The enterprise-stack instinct would have layered a query accelerator in front of the warehouse and still required the warehouse to be warm enough to handle the misses. We chose the colder, cheaper path: assume the warehouse is asleep, and only wake it when we truly must.

It busts cache surgically — because the keys come from the rules

Caches are easy. Cache invalidation is where most systems quietly fall over. The standard playbook is brutal: when something might have changed, throw the whole cache away and start over. Or worse — keep serving stale answers and hope nobody notices.

Our model lets us be strategic about what we throw away. Because cache keys are derived from the same attributes the rules are written against, invalidation rules can name exactly the slice of the cache they affect and leave everything else untouched. Fine-grained caching with fine-grained, rule-based busting is a combination most data platforms do not offer, and it falls out naturally from keeping keys predictable. Predictable keys don't just make the cache cheap; they are the very property that makes precise control possible.
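
Here is a small sketch of what naming a slice can look like. The attribute set is an assumption: tenant and user come from the key description above, while table is a hypothetical extra attribute added so a rule has something data-shaped to target. EntryAttrs and invalidate are illustrative names, not the gateway's API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EntryAttrs:
    """Attributes a cached answer is keyed by (hypothetical set)."""
    tenant: str
    user: str
    table: str  # assumption: the table the cached statement reads


def invalidate(cache: Dict[EntryAttrs, List[tuple]], **match: str) -> int:
    """Drop only the entries whose attributes equal every given pair."""
    doomed = [
        attrs for attrs in cache
        if all(getattr(attrs, name) == value for name, value in match.items())
    ]
    for attrs in doomed:
        del cache[attrs]
    return len(doomed)


cache: Dict[EntryAttrs, List[tuple]] = {
    EntryAttrs("acme", "ana", "orders"): [(42,)],
    EntryAttrs("acme", "bo", "orders"): [(17,)],
    EntryAttrs("acme", "ana", "customers"): [(8,)],
    EntryAttrs("globex", "cy", "orders"): [(99,)],
}

# Rule: acme's orders data changed, so bust exactly that slice.
removed = invalidate(cache, tenant="acme", table="orders")
```

Only the two acme/orders answers go. The customers answer and the other tenant's answers are never touched, because the rule and the key speak the same vocabulary.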

It is the security boundary — because there is nothing behind it to defend

Every dependency you don't take on is an attack surface you don't have to defend. The gateway is the one place traffic enters; behind it sits storage and a warehouse. There is no plugin runtime, no in-process extension model, no vendor agent phoning home.

We don't keep your credentials, but we do verify them in a zero compute model. Auth happens at the gateway. Scope happens at the gateway. The audit log is written at the gateway. The smallest stack is also the safest one — and security stops being a feature you bolt on after the fact and starts being a property of the shape.
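
As a sketch of one enforcement point, the function below runs the auth check, the scope check, and the audit write in a single place. Everything in it is hypothetical: enforce, Decision, the verify_token callable, and the per-user table scopes are stand-ins, and the post does not say how the real gateway verifies credentials. The one detail it is meant to show is that the check can be delegated and the log can record the outcome without ever storing the credential.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Decision:
    allowed: bool
    reason: str


def enforce(user: str, token: str, table: str,
            verify_token: Callable[[str, str], bool],
            allowed_tables: Dict[str, Set[str]],
            audit_log: List[dict]) -> Decision:
    """Hypothetical single checkpoint: auth, then scope, then audit."""
    if not verify_token(user, token):
        decision = Decision(False, "auth failed")
    elif table not in allowed_tables.get(user, set()):
        decision = Decision(False, "out of scope")
    else:
        decision = Decision(True, "ok")
    # The audit entry records the outcome, never the credential itself.
    audit_log.append({"user": user, "table": table,
                      "allowed": decision.allowed, "reason": decision.reason})
    return decision


def toy_verifier(user: str, token: str) -> bool:
    """Stand-in for a real credential check."""
    return token == f"good-{user}"


scopes = {"ana": {"orders"}}
audit_log: List[dict] = []
ok = enforce("ana", "good-ana", "orders", toy_verifier, scopes, audit_log)
out_of_scope = enforce("ana", "good-ana", "payroll", toy_verifier, scopes, audit_log)
bad_auth = enforce("ana", "wrong", "orders", toy_verifier, scopes, audit_log)
```

Three requests, three audit entries, one code path. There is no second place where a request could have been allowed.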

It runs anywhere — because nothing about it is special

Stateless services are hard to write, but that is all the gateway is: a stateless service that reads open formats out of object storage and speaks the same wire protocol your warehouse speaks. There is no proprietary kernel module, no managed control plane it has to phone home to, no cloud-specific API it depends on. So it runs on AWS. It runs on Azure. It runs in a Docker container on a laptop, identically. Portability is not a roadmap item; it is the consequence of refusing to take dependencies that would have prevented it.

The inversion

When we put those five properties next to each other, the picture is what the manifesto promised: a system that is cheaper, safer, and faster because it is smaller, not in spite of it.

This is the part the industry has been quietly avoiding. The entire warehouse economy is built on the assumption that the answer to "the data bill is too high" is "buy a bigger plan, buy a different vendor, buy a faster tier." Every vendor in the category has an incentive to sell you more compute, or to force your compute into one of their tiers. None of them has an incentive to sell you less.

We do. The Airbrx Gateway sells you less compute. That is the product. The reason we can ship it — and the reason a Snowflake or a Databricks structurally cannot — is that we built the doctrine first and the code second.

What the customer gets

The doctrine matters because it cashes out in four things at the same time — the Four Pillars the gateway exists to deliver:

The bill goes down. A stack with almost nothing running has almost nothing to bill. The warehouse credits you don't burn are the cleanest line item on the invoice.

The reads get faster. The fastest query is the one that never woke the warehouse at all. Cached answers return at network speed, with no queue, no warm-up, no cold-start penalty.

Control gets precise. Predictable keys are governable keys. Rules and invalidations name the same attributes, so you can reason about exactly what is cached, who it's cached for, and exactly what changes when something upstream changes.

Security tightens. One enforcement point. Fewer dependencies. Less to defend. The audit trail is the gateway log; the boundary is the gateway address.

And none of it asks the data consumer to change a single workflow. Same tools, same SQL, same drivers, same credentials — pointed one hop to the left.

The doctrine, shipped

So this is what the Flat-Stack Manifesto was for. Not a vibe. Not a values page. This is a set of constraints strict enough that the only product they let you build is the one the industry needs and nobody is building: a gateway that reduces the compute instead of feeding it.

Reduce the compute. Reduce the cost. Reduce the surface. Reduce the parts. What's left is the gateway — and that is the entire point.