May 06, 2026


Save as You Transact: How we built a savings product that reacts to millions of events

An engineering note from the team behind Save-As-You-Transact at Moniepoint.

What SAYT does, in one paragraph

SAYT lets a customer set a rule, such as "save N500 every time I make a transfer" or "save 5% of every card transaction." Spending is the trigger. Saving happens as a by-product. The customer sets the rule once and forgets about it. Every time they transact through any of our channels, a small sliver of money moves from their main account into a SAYT wallet. They withdraw whenever they want.

That's the product. The interesting half of this article is the engineering required to make it feel like a single, simple promise to the customer, when, in reality, it sits atop millions of events flowing through a distributed system every minute.

The architecture, at a glance

Figure 1: SAYT producer-bus-consumer architecture

Producer services emit events when transactions complete. The event bus stores them in a durable, partitioned log. SAYT consumes them, decides what matters, and hands off the work that actually moves money. Nobody calls SAYT directly, and SAYT doesn't call its own internals directly either. Even the hop from the event service to the core service goes through the bus.

Why didn't we just integrate directly?

The intuitive approach is for every transactional service to call SAYT after it finishes its work. It works in a demo and collapses in production. Every producer team ends up with SAYT in its dependency graph, having to handle SAYT being down, slow, or in the middle of an API change. New transaction types only flow into SAYT if someone remembers to wire them in. SAYT, which was meant to be a small thing on the side, becomes part of the critical path of every transaction.

So we inverted it. Producers publish events. SAYT subscribes. The bus is the contract between them. The event itself is small and structured:

{
  "event_id": "evt_01HX7K...",
  "customer_id": "cus_8a4f...",
  "event_type": "transfer.completed",
  "amount_minor": 250000,
  "currency": "NGN",
  "produced_at": "2026-05-05T09:14:22.118Z"
}

It's a statement of fact. Anyone who cares can listen. Anyone who doesn't can ignore it.
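As a rough illustration of how a consumer might turn that payload into a typed value, here is a minimal Python sketch. The `TransactionEvent` class and `parse_event` function are hypothetical names for this example, not our production types, and the field names simply mirror the sample above.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TransactionEvent:
    """Hypothetical typed view of the sample event payload."""
    event_id: str
    customer_id: str
    event_type: str
    amount_minor: int       # amount in minor currency units, as in the sample
    currency: str
    produced_at: datetime   # timezone-aware timestamp set by the producer


def parse_event(raw: str) -> TransactionEvent:
    """Parse a JSON event. Malformed input raises ValueError or KeyError."""
    data = json.loads(raw)
    # Replace the trailing "Z" so fromisoformat works on older Python 3 versions.
    produced_at = datetime.fromisoformat(data["produced_at"].replace("Z", "+00:00"))
    return TransactionEvent(
        event_id=data["event_id"],
        customer_id=data["customer_id"],
        event_type=data["event_type"],
        amount_minor=int(data["amount_minor"]),
        currency=data["currency"],
        produced_at=produced_at,
    )
```

Keeping the event immutable (`frozen=True`) fits the idea that it is a fact: consumers read it, they do not change it.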

We split SAYT into two services on purpose:

The event service is CPU-bound and lookup-heavy. It reads events, checks matches, and either drops them or publishes a credit instruction to an internal SAYT topic.
The core service is database-bound and transaction-heavy. It consumes that internal topic, evaluates rules, and credits the customer's SAYT pot.

Two things to notice here. First, those workloads scale on different curves. Second, and more importantly, the hand-off between them is also event-driven. The event service does not call the core service over HTTP. It publishes to a topic that the core service consumes at its own pace.

This is a deliberate backpressure decision. The event service runs hot. If we let it call the core service synchronously, a sudden spike on the bus would translate directly into a flood of HTTP requests to a service that does transactional writes against a finite-throughput store. With an event-driven hand-off, the bus absorbs the spike. Credit instructions accumulate on the topic. The core service drains them at the rate its database can sustain. The general rule we've internalised: any time a service produces work much faster than another service consumes it, the hand-off should be a topic, not a call.
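To make the backpressure argument concrete, here is a toy simulation in Python. It is a sketch under simplifying assumptions: time moves in discrete ticks, a `deque` stands in for the internal SAYT topic, and `simulate_handoff` is a name invented for this example.

```python
from collections import deque


def simulate_handoff(arrivals_per_tick, core_capacity_per_tick):
    """Simulate a topic-based hand-off between a fast producer and a slower consumer.

    arrivals_per_tick: credit instructions published by the event service in each tick.
    core_capacity_per_tick: instructions the core service can write per tick.
    Returns (processed_per_tick, backlog_per_tick).
    """
    topic = deque()  # stand-in for the internal topic
    processed, backlog = [], []
    next_id = 0
    for arrivals in arrivals_per_tick:
        # The event service publishes without waiting for the core service.
        for _ in range(arrivals):
            topic.append(next_id)
            next_id += 1
        # The core service pulls only what its database can sustain this tick.
        drained = min(core_capacity_per_tick, len(topic))
        for _ in range(drained):
            topic.popleft()
        processed.append(drained)
        backlog.append(len(topic))
    return processed, backlog
```

With arrivals of `[2, 10, 2, 0, 0]` and a capacity of 4 per tick, the core service never processes more than 4 in a tick, the backlog peaks at 6, and the topic is empty again two ticks after the spike. A synchronous call would have delivered all 10 requests to the database in the same tick.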

Scalability: why event-driven, and why horizontal

The argument that mattered for picking event-driven wasn't elegance. It was horizontal scalability.

Vertical scaling has a hard ceiling: once you are on the biggest available instance, you have nowhere to go. Horizontal scaling has no such hardware ceiling. You add instances as load grows, and the loss of any single one is absorbed by the others.

Event-driven architecture is what makes this work at our volume. The bus splits each topic into a fixed number of partitions. Consumers join a consumer group, and the bus assigns each partition to exactly one consumer in the group at a time. Add a consumer, and the bus rebalances. Lose a consumer, and the bus reassigns. The whole thing is elastic, with no coordination code on our side.

The practical ceiling on consumer parallelism is the partition count:

max_parallel_consumers = number_of_partitions

If a topic has N partitions, you can run up to N consumers in parallel and no more. So the partition count is the most consequential configuration we set on each topic. It's the basic knob that governs how far the SAYT event service can scale horizontally for that stream. We pick partition counts for the peak we can plausibly project, not the average. Throughput grows. Partitions are awkward to change after the fact.
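The effect of the partition ceiling can be sketched in a few lines of Python. This is an illustration only: real buses use their own assignment strategies, and the round-robin `assign_partitions` function below is a hypothetical stand-in, not the bus's algorithm.

```python
def assign_partitions(num_partitions, consumers):
    """Give each partition exactly one owner, round-robin over the consumer list."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def active_consumers(assignment):
    """Consumers that own at least one partition and therefore do any work."""
    return [consumer for consumer, parts in assignment.items() if parts]
```

With 4 partitions and 2 consumers, each consumer owns 2 partitions. With 4 partitions and 6 consumers, only 4 consumers receive a partition and the other 2 sit idle, which is why adding instances past the partition count buys nothing.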

The base requirement for the SAYT event service: be fast

The event service has milliseconds, not seconds, to decide whether a given event matters. A database round-trip is a few milliseconds at p50 and tens at p99, which is already over budget on the rejection path. Doing one per event would crush the underlying store and push the consumer into permanent lag.

So the per-event filter is cache-resident, and the whole point of that cache is to avoid the database round-trip altogether on the hot path. The cache holds the SAYT plan configuration we need for filtering, keyed by the customer: which event types each plan covers, the amount or percentage to save, and the withdrawal preference. Every incoming event hits the cache first. The vast majority of events fail the filter at this stage and are dropped without ever touching a database. Only the small subset that passes the cache check proceeds to rule evaluation and topic publication.

Figure 2: Event filtering funnel

The cache is invalidated whenever a plan is created, edited, or cancelled, with the core service's database as the source of truth. We use a two-tier setup: a short-TTL local in-memory map per instance, backed by a shared Redis cluster, so a cold-starting instance doesn't stampede the database to rebuild its view. Hit ratio sits in the high nineties even under autoscaling churn, which means the database barely sees the firehose.
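A minimal sketch of the two-tier lookup, in Python, might look like the following. Several things here are assumptions made for illustration: a plain dict stands in for the shared Redis tier, `load_from_db` is a hypothetical callback to the source of truth, the plan is modelled as a dict with an `event_types` set, and caching "no plan" results is a choice of this sketch rather than something described above.

```python
import time


class TwoTierPlanCache:
    """Local short-TTL map in front of a shared store, in front of the database."""

    def __init__(self, shared_store, load_from_db, local_ttl_seconds=5.0,
                 clock=time.monotonic):
        self._local = {}                  # customer_id -> (expires_at, plan or None)
        self._shared = shared_store       # dict-like stand-in for the shared tier
        self._load_from_db = load_from_db
        self._ttl = local_ttl_seconds
        self._clock = clock

    def get(self, customer_id):
        now = self._clock()
        entry = self._local.get(customer_id)
        if entry is not None and entry[0] > now:
            return entry[1]               # tier 1 hit: no network call at all
        if customer_id in self._shared:
            plan = self._shared[customer_id]          # tier 2 hit
        else:
            plan = self._load_from_db(customer_id)    # miss: ask the source of truth
            self._shared[customer_id] = plan          # also caches "no plan" as None
        self._local[customer_id] = (now + self._ttl, plan)
        return plan

    def invalidate(self, customer_id):
        """Call on plan create, edit, or cancel."""
        self._local.pop(customer_id, None)
        self._shared.pop(customer_id, None)


def event_matches(cache, customer_id, event_type):
    """Hot-path filter: does this customer have a plan covering this event type?"""
    plan = cache.get(customer_id)
    return plan is not None and event_type in plan["event_types"]
```

Note that `invalidate` only clears the local map of the instance that runs it. In this sketch, other instances pick up the change when their short local TTL expires, which is one reason to keep that TTL short.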

Without this cache, the architecture doesn't work. With it, the rejection path is essentially free, and the consumer keeps up with the bus regardless of how many customers are not transacting in a way SAYT cares about.

Event deduplication: let the database do the work

Anyone who has worked with event buses long enough has stopped believing in exactly-once delivery. The honest design assumption is at least once. Events arrive twice, occasionally three times, especially when a consumer crashes mid-processing and the bus redelivers what it had already started working on.

For a service that moves money, "process the same event twice" translates directly to "credit the customer's SAYT pot twice for the same transaction, and pull the matching amount out of their main wallet twice." Not acceptable.

The naive approach is a two-step dedup: read the dedup store, and if you haven't seen the event, insert a record and proceed. That costs two database round-trips on every matched event and opens a race window between the read and the write where two consumers can both decide to proceed.

We do something simpler. Every event carries a unique event_id from its producer. We put a uniqueness constraint on that column in the database itself, and we just insert. If the event has been seen before, the database rejects the insert with a duplicate-key violation. We catch the violation, treat it as "already processed, drop", and move on.

Figure 3: Insert-first deduplication using a unique constraint

This is faster on the happy path (one round-trip instead of two), correct under concurrency (the database serialises the constraint check, so two consumers racing on the same event_id can't both win), and operationally simpler because the invariant lives in the schema, not in application code.
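Here is a small, runnable sketch of the insert-first pattern in Python. SQLite stands in for whichever database you use, and the `processed_events` table, `claim_event`, and `handle_event` are hypothetical names for this example. The only point is the shape: one insert, and a caught constraint violation means "duplicate".

```python
import sqlite3


def open_dedup_store(path=":memory:"):
    """Create the dedup table; the primary key is the uniqueness constraint."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        " event_id TEXT PRIMARY KEY,"
        " processed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def claim_event(conn, event_id):
    """Return True if this call recorded event_id first, False if it is a duplicate."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # already processed: drop


def handle_event(conn, event_id, publish):
    """Insert the dedup record first, then publish the credit instruction."""
    if not claim_event(conn, event_id):
        return False
    publish(event_id)
    return True
```

Calling `handle_event` twice with the same `event_id` publishes once: the second insert fails on the primary key and the function returns `False` without publishing.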

The ordering matters. The unique-constraint insert is the side effect that determines whether the event has been processed, so it has to happen before publishing to the internal SAYT topic. If we published first, a crash between the publish and the insert would leave an instruction on the topic with no dedup record, and a redelivery would publish a second instruction. The customer's pot gets credited twice. The reverse ordering is recoverable: insert first, crash before publish, the saving silently doesn't happen, and we replay it. A much better failure mode than a phantom double credit that a customer cannot explain.

The same pattern applies on the core service side. The credit instruction itself carries a unique ID, and the actual credit uses that ID under the same insert-first pattern. The dedup boundary travels with the work as it crosses service lines.

For events that genuinely cannot be processed (malformed payloads, unknown customers, bad rule outputs), we use a dead-letter queue. After a small number of retries, the event is pushed to the DLQ with metadata describing why. The main consumer moves on. We monitor DLQ depth; in steady state, it's near-empty, and any growth is treated as an incident.
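The retry-then-park flow can be sketched like this in Python. The function name, the metadata fields, and the default of three attempts are assumptions for illustration, and a list stands in for the real dead-letter queue.

```python
def process_with_dlq(event, handler, dead_letters, max_attempts=3):
    """Try handler up to max_attempts times; on repeated failure, park the event.

    Returns True if the handler succeeded, False if the event went to the DLQ.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:
            # A real consumer would separate retryable errors from fatal ones
            # and back off between attempts; this sketch retries everything.
            last_error = exc
    dead_letters.append({
        "event": event,
        "attempts": max_attempts,
        "error_type": type(last_error).__name__,
        "error_message": str(last_error),
    })
    return False
```

The caller moves on to the next event either way. The metadata on the parked entry is what lets someone diagnose the failure later without replaying the event blind.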

A design choice: prioritise instant events, drop stale ones

This is the decision I'm proudest of, and the one most likely to surprise other engineers. SAYT is not a system that catches up. We deliberately drop events that arrive too late.

Every event carries a produced_at timestamp. The event service computes:

event_age = now() - event.produced_at
if event_age > FRESHNESS_WINDOW:
    log_stale_event(event)
    return  # drop

The freshness window is short, on the order of minutes rather than hours. Long enough to absorb normal consumer lag and a brief outage. Short enough that a customer who looks at their app shortly after transacting still sees the savings land in real time.
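A runnable version of that check, as a hedged Python sketch: the five-minute value is an assumption chosen for the example (the real window is only described as "minutes"), and `is_stale` and `handle` are hypothetical names. Timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

# Assumed value for illustration only.
FRESHNESS_WINDOW = timedelta(minutes=5)


def is_stale(produced_at, now=None, window=FRESHNESS_WINDOW):
    """True if the event is older than the freshness window."""
    now = now or datetime.now(timezone.utc)
    return now - produced_at > window


def handle(event, process, stale_log, now=None):
    """Drop and record stale events; process fresh ones. Returns True if processed."""
    if is_stale(event["produced_at"], now):
        stale_log.append(event["event_id"])  # keep the drop visible for monitoring
        return False
    process(event)
    return True
```

Passing `now` explicitly keeps the check easy to test, and recording every drop is what makes the stale-event rate something you can alert on.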

Figure 4: Freshness window decision

The reasoning is product-driven. SAYT is a real-time savings product. If a customer bought airtime two days ago, they've moved on. They've spent that money in their head. If SAYT suddenly moves money out of their wallet now for a savings rule tied to that purchase, the experience is alarming. They check their app, see a transaction they don't recognise, justified by a rule they may not even still want. We've harmed trust in trying to be thorough.

So we draw a line. Stale-event drop rate is monitored, and a sustained spike means something upstream is broken, but the act of dropping is not, in our system, a failure. It's the correct thing to do. The retries, the DLQ, and the durable log on the bus exist to maximise the chance of timely processing and to make untimely outcomes visible. A small amount of completeness traded for a much larger amount of trust.

The architecture is a strategy

The reason this matters more than just "we built a savings product well" is that the same architecture is what lets SAYT become more than a savings product.

The event service does not know it's processing transactions. It's processing events. The fact that today every event happens to be a financial transaction is a property of what we've wired up, not a property of the system. Adding a weather event source, a sports result feed, or a calendar trigger is wiring, not redesign. The same is true on the destination side. Today, the core service credits a SAYT pot. Tomorrow, that destination becomes a parameter: Target Savings, Flexible Savings, anything else we ship.

The pattern underneath is the difference between a service that responds to commands and a service that responds to facts. Direct integration is command-driven. A finishes its work and tells B what to do next. Event-driven design is fact-driven. A publishes a fact, and B decides whether to act on it. Most of the systems I've worked with that aged badly were command-driven, where they should have been fact-driven.

If you're building something that has to react to many other things, spend an extra week resisting the temptation to wire it up directly. Build the producer-bus-consumer shape. Split your consumer into a part for filtering and a part for business logic. You'll save years on every version after.

