Webhook-Based Payment Notifications: Patterns and Pitfalls
A developer's guide to webhook payment notifications: why they beat polling, idempotency and signature patterns, retry strategies, and the common pitfalls.
PUBLISHED
April 8, 2026
AUTHOR
Bridge Research Team
READ_TIME
10 min read
CATEGORY
Research
Webhooks have become the default way for payment platforms to notify integrators of state changes — payment initiated, authorisation granted, settlement finalised, refund posted, chargeback raised. They are universally supported, easy to start with, and for well-designed systems they are cheap and reliable. They are also one of the most frequent sources of subtle bugs in payment integrations, because the contract between sender and receiver is informal, failure modes are rarely exercised in testing, and retry and ordering semantics are often under-specified.
This article sets out the patterns that make webhook-based payment notifications reliable, the pitfalls that break them in production, and the specific choices Bridge has made in its webhook design. The audience is the engineer building the receiver — the integration point in a bank, a PSP, a wallet or a fintech that has to translate webhook traffic into internal state and downstream actions. We focus on payment notifications specifically, because the cost of a dropped or duplicated notification in payments is a real financial error rather than a merely cosmetic one.
Why Webhooks Beat Polling
Polling — the client repeatedly asking the platform for the current state of a resource — is simple, but it is either slow or expensive or both. A client that polls every five minutes will see state changes with up to a five-minute delay, which is unacceptable for a payment flow that is supposed to complete in seconds. A client that polls every second solves the latency problem by trading it for server load that scales with the number of clients and the number of resources they care about. Neither trade-off is good.
Webhooks invert the cost model. The platform pushes a notification to the client when an event happens. The client learns about the change immediately, the platform avoids serving pointless "no change" responses, and traffic scales with the number of events rather than the number of poll cycles. The cost of a webhook is borne on the event, not on the wait.
For payments the latency gain matters. A stablecoin payment that finalises on Solana in fifteen seconds is useless to an integrator that learns about the finalisation five minutes later, because by then the customer has retried, the merchant has refunded, and the support ticket has been opened. A webhook that arrives at the client within a second of finalisation lets the integrator close the loop in the customer's session, which is the whole point of instant payments.
Bridge's integration surface offers three complementary mechanisms — REST for request/response, webhooks for push notifications, and a Kafka event stream for high-throughput consumers — and webhooks are the right default for most integration points. See the build pages for the broader integration surface.
Idempotency and Signatures
The core discipline of a webhook receiver is idempotency. The platform will send the same event more than once, because retries happen, network paths fail between send and acknowledgement, and distributed delivery is never exactly-once. If the receiver treats each incoming request as a fresh event, duplicate deliveries will produce duplicate side effects: double notifications to customers, double entries in the ledger, double settlement instructions to downstream systems. None of those outcomes is acceptable in a payment system.
Every webhook event carries a stable identifier (a UUID or equivalent) that is unique to the event. The receiver persists the identifier on successful processing and skips any subsequent delivery of the same identifier, while still acknowledging it so that the platform stops retrying. The persistence has to survive process restarts and application deployments; an in-memory set is not enough. The typical implementation is a table in the same database as the business state, with the event identifier as a primary key, written in the same transaction as the business effect. If the transaction commits, the event is processed; if the transaction rolls back, the event is not processed, and the platform will retry.
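The dedup-in-the-same-transaction idea can be sketched as follows. This is a minimal illustration, not Bridge's implementation: the event fields, the `Store` class and `handleOnce` are hypothetical names, and the in-memory collections only stand in for two tables in one durable database.

```typescript
// Hypothetical event shape; the field names are assumptions for illustration.
interface WebhookEvent {
  id: string; // stable event identifier
  paymentId: string;
  amount: number;
}

// In-memory stand-in for a database. In production both "tables" live in the
// same durable database and are written in one transaction.
class Store {
  processedEvents = new Set<string>(); // primary key: event id
  ledger = new Map<string, number>(); // business state: paymentId -> amount

  // Runs fn against a working copy and swaps it in only if fn returns,
  // mimicking commit and rollback.
  transaction(fn: (tx: Store) => void): void {
    const tx = new Store();
    tx.processedEvents = new Set(this.processedEvents);
    tx.ledger = new Map(this.ledger);
    fn(tx); // a throw here leaves `this` untouched (rollback)
    this.processedEvents = tx.processedEvents;
    this.ledger = tx.ledger;
  }
}

// Returns "processed" for a first delivery and "duplicate" for a repeat.
// Either way the caller can acknowledge the delivery.
function handleOnce(store: Store, event: WebhookEvent): "processed" | "duplicate" {
  if (store.processedEvents.has(event.id)) return "duplicate";
  store.transaction((tx) => {
    const previous = tx.ledger.get(event.paymentId) ?? 0;
    tx.ledger.set(event.paymentId, previous + event.amount); // business effect
    tx.processedEvents.add(event.id); // recorded in the same transaction
  });
  return "processed";
}
```

With a real database, the check-then-write shown here would normally be replaced by an insert that relies on the primary-key constraint, so that two concurrent deliveries of the same event cannot both pass the check.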
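Signature verification is the second non-negotiable. Every Bridge webhook carries a signature header computed from a shared secret and the payload body. The receiver must verify the signature before parsing the payload, because an unverified request may come from the platform or from an attacker, and the difference must be established before the content is trusted. A typical implementation uses HMAC-SHA256 on the raw body with a secret rotated via the platform's key management surface. Example, in TypeScript for Node.js, assuming a hex-encoded signature (check the webhook docs for the actual encoding):
import { createHmac, timingSafeEqual } from "node:crypto";

// signature and timestamp are the X-Bridge-Signature and X-Bridge-Timestamp header values.
function verify(signature: string | undefined, timestamp: string | undefined, rawBody: string, secret: string): boolean {
  if (!signature || !timestamp) return false;
  const expected = createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on unequal lengths, so compare lengths first
  if (a.length !== b.length || !timingSafeEqual(a, b)) return false;
  const age = Math.abs(Date.now() / 1000 - Number(timestamp));
  return Number.isFinite(age) && age <= 300; // replay window
}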
The timestamp check protects against replay. A valid signature on an old payload is still a replay risk if an attacker captured it earlier. A five-minute window is a typical default. The constant-time comparison protects against timing attacks.
Order matters. Signature verification is the first step, before any parsing, so that the parser only ever sees payloads known to have come from the platform. Malformed or hostile input from anyone else is rejected before it gets that far.
Retry Strategy
A webhook that fails to deliver has to be retried. The retry strategy is the sender's job, but the receiver's behaviour determines whether retries produce the right outcome.
Bridge's webhook sender retries on any 5xx response, any connection failure, or any 4xx response other than 400, 401 and 403, which are treated as permanent failures (the platform stops retrying and surfaces the failure on the event's status). The retry schedule is exponential with jitter: 30 seconds, then 2 minutes, then 10 minutes, then 1 hour, then 6 hours, up to 24 hours total. After 24 hours the event moves to a dead-letter status and is retrievable via an admin endpoint.
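The retry rules above can be written down as a small decision function, which is useful on the receiver side when choosing which status code to return. This is a sketch from the receiver's point of view, not the sender's actual code. Treating any 2xx as success, the handling of 1xx/3xx, and the plus or minus 20% jitter are assumptions; `classify` and `delayForAttempt` are hypothetical names.

```typescript
type Outcome = "delivered" | "retry" | "permanent-failure";

// status is the receiver's HTTP status, or null for a connection failure.
function classify(status: number | null): Outcome {
  if (status === null) return "retry";
  if (status >= 200 && status < 300) return "delivered"; // assumption: any 2xx counts
  if (status === 400 || status === 401 || status === 403) return "permanent-failure";
  if (status >= 400 && status < 600) return "retry"; // other 4xx and all 5xx
  return "permanent-failure"; // assumption: 1xx/3xx are not covered by the article
}

// Base delays listed in the article, in seconds: 30 s, 2 min, 10 min, 1 h, 6 h.
const BASE_DELAYS = [30, 120, 600, 3600, 21600];

// Returns the delay before retry number `attempt` (0-based), or null once the
// listed schedule is exhausted. What happens between the last listed delay and
// the 24-hour cutoff is not specified in the article.
// Assumption: uniform jitter of +/-20%; the article says only "with jitter".
function delayForAttempt(attempt: number, random: () => number = Math.random): number | null {
  if (!Number.isInteger(attempt) || attempt < 0 || attempt >= BASE_DELAYS.length) return null;
  const jitter = 0.8 + 0.4 * random();
  return Math.round(BASE_DELAYS[attempt] * jitter);
}
```

The practical consequence for a receiver: return 400, 401 or 403 only when a retry could never succeed, and use a 5xx for transient problems so the sender tries again.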
The receiver should respond with 200 as soon as the event is durably persisted, not after the downstream side effects are done. If the receiver does the full business processing synchronously before acknowledging, a slow downstream will cause the webhook to time out, the platform to retry, and the business logic to run twice: even if the database write is idempotent, the downstream call may not be. The right pattern is to persist the raw event with its identifier, return 200, and process the event on a background worker that reads the persisted queue. Processing then becomes internally observable and independently recoverable.
A concrete anti-pattern is to call a downstream API (for example, a bank's payment service) synchronously in the webhook handler. That call can take seconds, the webhook times out, and the platform retries. With the receiver doing non-idempotent work synchronously, the bank ends up seeing the same payment instruction repeatedly. The fix is the same every time: acknowledge the webhook on receipt, process it in the background with its idempotency key, and let the background worker handle its own retries against the downstream.
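The acknowledge-and-process pattern can be sketched as two separate functions: a handler that only verifies and persists, and a worker that owns the downstream call. All names here are hypothetical, the `Map` stands in for a durable inbox table, and returning 401 on a failed signature check is an assumption rather than a documented requirement.

```typescript
interface StoredEvent {
  id: string;
  rawBody: string;
  done: boolean;
}

// In-memory stand-in for a durable inbox table keyed by event id.
const inbox = new Map<string, StoredEvent>();

// Handler: verify, persist, acknowledge. No downstream calls happen here.
function handleDelivery(
  rawBody: string,
  verify: (rawBody: string) => boolean, // hypothetical signature check
): number {
  if (!verify(rawBody)) return 401; // assumption: reject unverified requests with 401
  const { id } = JSON.parse(rawBody) as { id: string }; // parse only after verifying
  if (!inbox.has(id)) inbox.set(id, { id, rawBody, done: false });
  return 200; // duplicates are acknowledged too
}

// Worker: runs outside the request path and handles its own retries.
async function drainInbox(
  callDownstream: (idempotencyKey: string, rawBody: string) => Promise<void>,
): Promise<void> {
  for (const event of inbox.values()) {
    if (event.done) continue;
    try {
      // The event id doubles as the idempotency key towards the downstream.
      await callDownstream(event.id, event.rawBody);
      event.done = true;
    } catch {
      // Leave done=false so the next drain picks the event up again.
    }
  }
}
```

Because the handler never waits on the downstream, its response time depends only on the local write, and a slow bank API can no longer trigger platform retries.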
Ordering and Delivery Guarantees
Webhook platforms vary on ordering guarantees. The right assumption for a receiver is that events may arrive out of order, even when they were produced in order. A payment-authorised event may arrive after a payment-settled event if the two were produced close together and the delivery paths had different latency. The receiver must be able to apply any event regardless of the order in which it arrives.
The common implementation pattern is a state machine keyed by the payment identifier. Each event carries the payment's state; the receiver updates its local state only if the incoming event's state is ahead of the current local state. An event announcing "authorised" for a payment that is already locally "settled" is discarded as old news. An event announcing "settled" for a payment that is already locally "settled" is a duplicate and is handled by the idempotency check. The state machine has to be defined for every event type the platform emits, which is tedious but unavoidable.
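A minimal sketch of such a state machine follows. The three states and their linear ranking are assumptions chosen for illustration; a real event catalogue has more states (refunds, chargebacks) and may not be strictly linear. `applyEvent` is a hypothetical name.

```typescript
// Assumed linear lifecycle; higher rank means further along.
const RANK = { initiated: 0, authorised: 1, settled: 2 } as const;
type PaymentState = keyof typeof RANK;

// Local view of each payment: paymentId -> state.
const localState = new Map<string, PaymentState>();

// Applies the event only if it moves the payment forward.
// Returns true if local state changed, false if the event was stale or a repeat.
function applyEvent(paymentId: string, incoming: PaymentState): boolean {
  const current = localState.get(paymentId);
  if (current !== undefined && RANK[incoming] <= RANK[current]) return false;
  localState.set(paymentId, incoming);
  return true;
}

applyEvent("pay_1", "settled"); // true: first event seen, even though it arrived early
applyEvent("pay_1", "authorised"); // false: old news, local state stays "settled"
```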
Some event types are non-terminal and can be superseded by later versions of themselves (for example, periodic balance-update events). Those should carry a sequence number or a timestamp that the receiver uses to detect stale events.
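For these self-superseding events, a stale check on the sequence number is enough. The sketch below assumes a per-account, monotonically increasing `sequence` field; the field and function names are hypothetical.

```typescript
// Hypothetical shape of a periodic balance-update event.
interface BalanceUpdate {
  accountId: string;
  sequence: number; // assumed to increase with every update for the account
  balance: number;
}

// Latest update applied per account.
const latest = new Map<string, BalanceUpdate>();

// Keeps the update only if it is newer than what has already been applied.
function applyBalanceUpdate(update: BalanceUpdate): boolean {
  const seen = latest.get(update.accountId);
  if (seen !== undefined && update.sequence <= seen.sequence) return false; // stale or repeat
  latest.set(update.accountId, update);
  return true;
}
```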
Delivery guarantees in payment webhooks are at-least-once, never exactly-once. Any platform that advertises exactly-once is either wrong about what it is doing or is wrapping at-least-once delivery with an idempotency check that the client still has to participate in. Treat the stream as at-least-once, and design accordingly.
Common Pitfalls
Four pitfalls recur in payment webhook integrations often enough to be worth calling out explicitly.
The first is silent signature failures. A webhook receiver that rejects requests with missing or invalid signatures without logging them will eventually find itself rejecting legitimate traffic after a secret rotation, and the failure will only show up when the business process breaks downstream. The remedy is to log every rejection with enough context (event identifier, signature header, timestamp header) to diagnose, and to alert on rejection rates above a threshold.
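One way to make rejections visible is a small sliding-window monitor that logs each rejection and reports when the rejection rate crosses a threshold. This is a sketch only: the window, threshold and minimum sample count are arbitrary assumptions, and `createRejectionMonitor` is a hypothetical helper, not part of any Bridge SDK.

```typescript
// Context worth logging for each rejected delivery.
interface RejectionContext {
  eventId: string | null;
  signatureHeader: string | null;
  timestampHeader: string | null;
}

function createRejectionMonitor(windowMs: number, threshold: number, minSamples: number) {
  let outcomes: { at: number; rejected: boolean }[] = [];
  return {
    // Call once per delivery, after the signature check.
    record(rejected: boolean, now: number, context?: RejectionContext): void {
      if (rejected) console.warn("webhook signature rejected", JSON.stringify(context ?? null));
      outcomes.push({ at: now, rejected });
      outcomes = outcomes.filter((o) => now - o.at <= windowMs); // keep the window only
    },
    // True when the share of rejected deliveries in the window exceeds the threshold.
    shouldAlert(): boolean {
      if (outcomes.length < minSamples) return false; // avoid alerting on one stray request
      const rejected = outcomes.filter((o) => o.rejected).length;
      return rejected / outcomes.length > threshold;
    },
  };
}
```

A receiver would call `record` on every delivery and wire `shouldAlert` into whatever alerting channel it already uses.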
The second is missing timeouts on the receiver side. A receiver that processes the event synchronously on a request-scoped connection will block a platform thread and eventually degrade delivery for everyone. The platform will protect itself with its own timeouts (Bridge's default is 30 seconds per delivery attempt), but a receiver that consistently hits the timeout will be marked degraded. The remedy is the acknowledge-and-process pattern described above.
The third is forgetting dead letters. Webhook platforms move events to dead-letter after a bounded retry period. A receiver that does not monitor the dead-letter queue will miss events that failed persistently, and the business state will diverge from the platform's state without a signal. The remedy is to have a scheduled job that queries the dead-letter endpoint and raises operational alerts.
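The scheduled job can be as small as the sketch below. The dead-letter endpoint's URL and response shape are not given here, so both the fetch and the alert are injected as functions; `DeadLetter`, `checkDeadLetters` and its parameters are hypothetical names.

```typescript
// Assumed minimal shape of a dead-lettered event.
interface DeadLetter {
  eventId: string;
  eventType: string;
}

// Fetches dead letters, alerts once per event, and returns how many were new.
// fetchDeadLetters wraps the platform's admin endpoint; raiseAlert wraps the
// receiver's alerting channel. Both are deployment-specific.
async function checkDeadLetters(
  fetchDeadLetters: () => Promise<DeadLetter[]>,
  raiseAlert: (message: string) => void,
  alreadyAlerted: Set<string>,
): Promise<number> {
  const letters = await fetchDeadLetters();
  const fresh = letters.filter((l) => !alreadyAlerted.has(l.eventId));
  for (const letter of fresh) {
    raiseAlert(`dead-lettered webhook ${letter.eventId} (${letter.eventType})`);
    alreadyAlerted.add(letter.eventId); // avoid re-alerting on the next run
  }
  return fresh.length;
}
```

In production `alreadyAlerted` would be persisted, and the function would be invoked from a scheduler (cron, a job queue) rather than called directly.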
The fourth is testing only the happy path. Webhook integrations fail on the edges — duplicate deliveries, out-of-order events, signature rotations, payload schema additions — and happy-path tests do not exercise any of these. The remedy is to have a fault-injection test suite that replays captured payloads with duplications, reorderings and signature variations, and verifies the receiver does the right thing.
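A fault-injection harness does not need to be elaborate. The sketch below duplicates and shuffles a captured happy-path sequence with a seeded random generator, then checks that a receiver converges to the same final state for every seed. The receiver here is a toy stand-in for the real one, all names are hypothetical, and signature variations are left out for brevity.

```typescript
type State = "initiated" | "authorised" | "settled";
interface Delivery {
  eventId: string;
  paymentId: string;
  state: State;
}

// Small linear congruential generator so a failing order can be replayed from its seed.
function seededRandom(seed: number): () => number {
  let s = seed >>> 0;
  return () => {
    s = (s * 1664525 + 1013904223) % 4294967296;
    return s / 4294967296;
  };
}

// Duplicates every delivery, then shuffles the result (Fisher-Yates).
function injectFaults<T>(deliveries: T[], random: () => number): T[] {
  const out = [...deliveries, ...deliveries];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Toy receiver under test: dedup by event id, forward-only state machine.
function makeReceiver() {
  const rank = { initiated: 0, authorised: 1, settled: 2 };
  const seen = new Set<string>();
  const state = new Map<string, State>();
  return {
    state,
    receive(d: Delivery): void {
      if (seen.has(d.eventId)) return; // duplicate delivery
      seen.add(d.eventId);
      const current = state.get(d.paymentId);
      if (current === undefined || rank[d.state] > rank[current]) state.set(d.paymentId, d.state);
    },
  };
}

const happyPath: Delivery[] = [
  { eventId: "evt_1", paymentId: "pay_1", state: "initiated" },
  { eventId: "evt_2", paymentId: "pay_1", state: "authorised" },
  { eventId: "evt_3", paymentId: "pay_1", state: "settled" },
];

// Every duplicated, reordered replay must end in the same final state.
for (let seed = 1; seed <= 20; seed++) {
  const receiver = makeReceiver();
  for (const d of injectFaults(happyPath, seededRandom(seed))) receiver.receive(d);
  if (receiver.state.get("pay_1") !== "settled") throw new Error(`seed ${seed} diverged`);
}
```

Seeding the shuffle matters more than the choice of generator: when a seed fails, the exact delivery order that broke the receiver can be reproduced.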
Bridge Webhook Reference
Bridge emits webhooks for every significant state change on the settlement, custody, identity, and compliance domains. The subscription surface is described on the build pages and the complete event catalogue is part of the developer documentation. The patterns described in this article apply to all of them, with two specific notes.
First, Bridge events carry a chain of evidence — the events that preceded the current one in the processing of the intent — so that a receiver can reconstruct the full history of a payment from a single event without polling back. This is useful for receivers that need to write audit trails, because the evidence can be persisted alongside the business effect without additional calls.
Second, the Kafka event stream is available as an alternative to webhooks for high-throughput consumers. The semantics are the same — idempotency keys, ordering, signatures (via message headers) — but the delivery model is pull rather than push, and the retention allows a consumer to replay events after an outage. Most integrators use webhooks; a minority of high-volume integrators use the Kafka stream. The choice is a scaling question, not a correctness one.
View the webhook docs on /build, compare the full integration surface on /platform, or contact the team if you want to discuss a specific receiver design.