HSM-Backed Custody Architecture Explained

A technical walk-through of HSM-backed custody for institutional digital assets: FIPS 140-3 validation, key ceremony, signing path, HSM vs MPC trade-offs and production deployment patterns.

PUBLISHED

February 24, 2026

AUTHOR

Bridge Research Team

What an HSM Is and What It Does

A hardware security module is a physical or network-attached appliance dedicated to cryptographic operations. Its defining property is that it provides a secure boundary: key material generated inside the HSM can be used for signing, encryption, decryption or key derivation without ever crossing the boundary in plaintext. An application calls the HSM across a well-defined interface — PKCS#11, JCA/JCE, KMIP, or a vendor-specific API — passing the data to be signed and a reference to the key; the HSM performs the operation and returns the result.
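
The calling pattern can be sketched in a few lines. This is a toy model only: the `ToyHsm` class, its method names and the handle format are hypothetical and do not correspond to PKCS#11 or any vendor API, and HMAC-SHA256 stands in for a real signature algorithm. What it illustrates is that the application holds a key reference and never the key itself.

```python
import hashlib
import hmac
import secrets


class ToyHsm:
    """Toy stand-in for an HSM boundary (hypothetical, not a real API).

    Key bytes live in a private dict; callers only ever hold opaque handles.
    """

    def __init__(self):
        self._keys = {}  # handle -> key bytes, never returned to callers

    def generate_key(self):
        handle = "key-" + secrets.token_hex(4)
        self._keys[handle] = secrets.token_bytes(32)
        return handle  # the caller receives a reference, not the key

    def sign(self, handle, message):
        # HMAC-SHA256 is a stand-in for a real signature scheme such as ECDSA.
        return hmac.new(self._keys[handle], message, hashlib.sha256).digest()

    def verify(self, handle, message, tag):
        return hmac.compare_digest(self.sign(handle, message), tag)


hsm = ToyHsm()
handle = hsm.generate_key()
tag = hsm.sign(handle, b"transfer 10 units")
```

A Python object offers none of the physical protection described here; the sketch shows only the shape of the interface.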

Two tamper properties underpin the security model. Tamper evidence means that physical attack on the appliance leaves visible traces; tamper response means that the appliance destroys its own key material when an attack is detected. Between them, these properties are what make it rational for an institution to treat the HSM as the ground truth for key material: an adversary who gains physical possession of the device is expected to obtain nothing useful from it.

The appliance also enforces authenticated access. Administrative functions — adding an operator, changing a policy, extracting a key for backup — require authentication by a quorum of operators using dedicated credentials (typically smart cards). Application access to keys is mediated by partitions or domains within the HSM, each with its own authentication. The intended property is that no single operator, and no single compromised application, can use a key without the appliance's consent.
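
A minimal sketch of the quorum rule, assuming an M-of-N policy over named operators. The `QuorumPolicy` class and the operator names are hypothetical; a real appliance authenticates smart-card credentials rather than comparing strings.

```python
from dataclasses import dataclass


@dataclass
class QuorumPolicy:
    operators: frozenset  # enrolled operator IDs (hypothetical names)
    threshold: int        # M in an M-of-N quorum

    def authorises(self, presented):
        # Count only distinct, enrolled operators; duplicates and unknown
        # identities do not contribute to the quorum.
        valid = set(presented) & self.operators
        return len(valid) >= self.threshold


policy = QuorumPolicy(
    operators=frozenset({"alice", "bob", "carol", "dave", "erin"}),
    threshold=3,
)
```

With this policy, `policy.authorises(["alice", "bob", "carol"])` passes, while a request presenting the same operator twice, or an identity that is not enrolled, falls short of the threshold.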

The functions an HSM performs are not exotic. Generate a key pair. Sign a message with a private key. Verify a signature with a public key. Derive a child key from a parent. Wrap and unwrap keys for backup. What distinguishes the HSM from a software implementation is not the algorithms — the algorithms are standard — but the environment in which they run. The HSM is a computer whose entire software and hardware stack is engineered, validated and certified for cryptographic work. A general-purpose server is not.

FIPS 140-3 Validation and What the Levels Mean

FIPS 140-3 is the US federal standard for cryptographic modules. NIST published it in 2019, and the validation programme began accepting submissions under it in 2020. It supersedes FIPS 140-2, whose remaining certificates are subject to a scheduled sunset. The standard specifies security requirements for cryptographic modules across four ascending levels.

Level 1 is the minimum. It requires the use of approved algorithms and some baseline controls, but it does not require physical security features beyond production-grade components. Level 1 is the level at which software cryptographic libraries can be validated. It is not, by itself, sufficient for institutional custody.

Level 2 adds requirements for tamper evidence. A validated module at Level 2 must show visible signs of physical tampering and must provide role-based authentication of operators. Level 2 is appropriate for many enterprise applications but still falls short of the physical protection that institutional key management demands.

Level 3 adds tamper response: the module must detect and respond to physical attack, typically by zeroising key material. It also requires identity-based authentication and a separation between interfaces. Level 3 is the level at which serious institutional custody begins. Most major HSM product families intended for financial services, including Entrust nShield, Thales Luna (formerly SafeNet) and Utimaco, offer Level 3 validated variants.

Level 4 adds requirements for environmental failure protection (the module must respond to environmental attack such as temperature or voltage extremes) and the highest grade of physical protection. Level 4 appears in central-bank and sovereign-grade deployments and in a small number of specialised products.

A FIPS 140-3 validation attaches to a specific firmware version of a specific product configuration; a unit running different firmware is not covered by the certificate until that firmware version is itself validated. NIST publishes the Cryptographic Module Validation Program (CMVP) list, which shows each validated module, its level, its firmware, and the date of validation. A competent evaluation of an HSM-backed custody provider checks the certificate number against the NIST database rather than accepting a marketing claim.
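
The check an evaluator performs can be expressed as a comparison between the published record and what is actually deployed. The sketch below assumes the record has already been transcribed by hand from the NIST listing; the `ValidationRecord` fields and all values are illustrative placeholders, not a real certificate or a NIST data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationRecord:
    certificate_number: str
    module_name: str
    firmware_version: str
    level: int
    status: str  # for example "Active" or "Historical"


def deployment_is_covered(record, module_name, firmware_version, required_level=3):
    """Return True only if the record matches what is actually deployed."""
    return (
        record.status == "Active"
        and record.module_name == module_name
        and record.firmware_version == firmware_version
        and record.level >= required_level
    )


# Placeholder values, not a real certificate.
record = ValidationRecord("0000", "ExampleHSM", "1.2.3", 3, "Active")
```

Under these assumptions, a unit running firmware `1.2.4` fails the check even though the product name and level match, which is exactly the gap a marketing claim tends to hide.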

The FIPS 140-3 conversation has a European counterpart in Common Criteria, where EAL4+ against a protection profile such as eIDAS QSCD or a payment HSM profile is the equivalent threshold. In practice, major HSM products carry both kinds of certification, and institutional evaluators accept either against the relevant jurisdictional regime.

The Signing Path in Production

An HSM is not a custody solution by itself. The solution is the signing path: the chain of services and policies that takes a transaction request from a client, routes it through authorisation, and finally produces a signature using key material held in the HSM. The architecture of the signing path determines whether the HSM's security properties actually survive to production.

In a representative signing path, the client initiates a transaction through an API or user interface. The orchestration layer looks up the policy applicable to the client, the asset, the amount and the counterparty. The policy may require that the transaction be approved by one or more human or system signers — the subject of our companion article on multi-signature approval workflows. Once approvals are gathered, a signing service constructs the canonical transaction payload, computes the hash to be signed, and calls the HSM across the approved interface. The HSM performs the signing operation inside its boundary and returns the signature. The signing service assembles the final transaction and submits it to the relevant ledger or venue.
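
The sequence can be sketched end to end. Everything named here is an assumption made for illustration: the approval threshold, the JSON canonicalisation, the transaction fields and the `fake_hsm_sign` stand-in are hypothetical, and a production path would call the HSM through PKCS#11 or a vendor interface instead.

```python
import hashlib
import json


def canonical_payload(tx):
    # Sorted keys and fixed separators give one byte encoding per transaction.
    return json.dumps(tx, sort_keys=True, separators=(",", ":")).encode("utf-8")


def required_approvals(tx):
    # Hypothetical policy: larger transfers need more approvers.
    return 2 if tx["amount"] >= 1_000_000 else 1


def sign_transaction(tx, approvals, key_ref, hsm_sign):
    """Walk the path: policy check, canonical payload, hash, HSM call."""
    if len(set(approvals)) < required_approvals(tx):
        raise PermissionError("not enough approvals for this transaction")
    payload = canonical_payload(tx)
    digest = hashlib.sha256(payload).digest()
    return {"payload": payload, "signature": hsm_sign(key_ref, digest)}


def fake_hsm_sign(key_ref, digest):
    # Stand-in for the HSM call; not a real signature.
    return hashlib.sha256(key_ref.encode("utf-8") + digest).digest()


tx = {"client": "c-1", "asset": "BTC", "amount": 2_000_000, "to": "addr-x"}
signed = sign_transaction(tx, ["approver-a", "approver-b"], "key-001", fake_hsm_sign)
```

The same call with a single approver raises `PermissionError` before any hash is computed, so an under-approved request never reaches the HSM.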

Three engineering properties of this path matter as much as the HSM itself. The first is that the signing service is not trusted to decide which key to use; the orchestration layer supplies the key reference, and the HSM's own access control confirms that the calling identity is entitled to use that key. A compromised signing service should not be able to elevate its access by supplying an unexpected key reference.
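
As a rough illustration of the first property, the entitlement check can be modelled as a lookup performed on the HSM side. The table, key references and service identities below are hypothetical.

```python
# Hypothetical access-control table held inside the HSM partition:
# key reference -> identities allowed to use that key.
KEY_ACL = {
    "key-client-a": {"signing-svc-a"},
    "key-client-b": {"signing-svc-b"},
}


def hsm_authorise(caller_identity, key_ref):
    # The check runs on the HSM side, so a caller cannot widen its own
    # access simply by naming a different key.
    return caller_identity in KEY_ACL.get(key_ref, set())
```

A signing service authenticated as `signing-svc-a` that supplies `key-client-b` is refused, as is any request for a key reference the partition does not know.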

The second is that the canonical payload presented to the HSM for signing is the exact payload that will be submitted to the ledger. If the signing service constructs one payload for display and a different payload for signing, the HSM's signature does not protect what the client saw. Enforcement of this property typically uses the HSM to verify certain fields — or to sign a commitment to the full payload that can be verified downstream — before the final submission. In the highest-assurance designs, the orchestration layer itself is signed into the signing authorisation, so a compromised build of the service cannot produce a signature the HSM will accept.
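
One way to picture the commitment approach is a hash taken over the payload the approvers saw, which is then compared with the payload about to be signed. This is a sketch under assumed conventions (JSON canonicalisation, SHA-256, hypothetical field names), not a description of how any particular HSM enforces the property.

```python
import hashlib
import hmac
import json


def commitment(tx):
    encoded = json.dumps(tx, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).digest()


def safe_to_sign(approved_commitment, tx_to_sign):
    # Refuse to sign unless the payload hashes to what the approvers saw.
    return hmac.compare_digest(approved_commitment, commitment(tx_to_sign))


shown = {"to": "addr-x", "amount": 100}
approved = commitment(shown)
```

Field order does not matter because the encoding sorts keys, but changing the destination from `addr-x` to any other value yields a different commitment and the check fails.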

The third is that signing throughput is capacity-planned. Institutional HSMs support modest transactions per second per partition — typical figures are in the hundreds to low thousands for ECDSA operations on curves of interest — and large custody books require either partition-per-tenant segmentation or HSM clustering with load balancing. The signing path design decides whether to queue at the HSM, to shard by client, or to scale horizontally by deploying additional HSMs.
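
The capacity arithmetic is simple enough to write down. The function below is a back-of-envelope sketch: the target utilisation, the spare count and the example figures are assumptions chosen for illustration, not measured vendor numbers.

```python
import math


def hsms_required(peak_tps, per_hsm_tps, target_utilisation=0.5, spares=1):
    """Units needed to serve peak load at the target utilisation, plus spares."""
    usable_tps = per_hsm_tps * target_utilisation
    return math.ceil(peak_tps / usable_tps) + spares


# Example: 2,500 signatures per second at peak, 1,000 per HSM, run at 50% load.
units = hsms_required(peak_tps=2500, per_hsm_tps=1000)
```

With these inputs each unit contributes 500 usable signatures per second, so five units carry the peak and one spare brings the total to six.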

Key Generation and Ceremony

How a key enters the HSM is as consequential as how it leaves. Two options exist. The key can be generated on the HSM, in which case the private component never exists outside the appliance; or the key can be generated elsewhere and imported, in which case the generation environment itself must be treated as part of the cryptographic perimeter.

For institutional custody the default is on-HSM generation. The generation ceremony is the controlled procedure in which this happens. A typical ceremony brings a quorum of authorised operators to a controlled location with the HSMs in a secure configuration. Each operator presents their authentication credentials. The HSM is initialised, its operator roles are populated by M-of-N quorum, and the key generation command is issued. The HSM generates the key pair, assigns the public component to an exportable attribute and the private component to a non-exportable attribute, and records the generation in its audit log. The ceremony is observed, minuted and typically recorded on tamper-evident media.

Backup is the corollary. A key that can never leave the HSM is a key that cannot be recovered if the HSM fails. Institutional HSMs support wrapped-key backup: the key is exported encrypted under a key-encryption key that is itself split across a quorum of operator tokens, so that no single operator (and no single stolen backup tape) can reconstruct it. The backup is stored in at least two geographies, ideally in custody of distinct institutions; recovery requires reassembling the quorum.
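
Splitting a key-encryption key across a quorum can be illustrated with Shamir secret sharing. This is a swapped-in textbook technique used here for illustration; the mechanism a given HSM uses to split material across operator tokens is vendor-specific, and the 3-of-5 parameters are an assumption.

```python
import secrets

PRIME = 2**521 - 1  # a Mersenne prime, larger than any 256-bit key


def split_secret(secret, threshold, shares):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def poly(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME

    return [(x, poly(x)) for x in range(1, shares + 1)]


def recover_secret(points):
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


kek = secrets.randbits(256)  # stands in for the key-encryption key
operator_shares = split_secret(kek, threshold=3, shares=5)
```

Any three of the five shares reconstruct the key-encryption key; two shares interpolate to an unrelated value, which is why a single stolen token or backup tape is not enough.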

The ceremony and the backup procedure are what an auditor examines when they evaluate key management. The SOC 2 Type II report on an HSM-backed custodian will include findings on ceremony operation, key rotation, backup integrity, operator onboarding and offboarding, and quorum maintenance. Our custody insurance and SOC 2 audit article covers what the report looks like in practice.

HSM vs MPC: Where Each Fits

Multi-party computation has become a prominent alternative — or complement — to HSM-backed custody. In an MPC signing scheme, the private key is never assembled at any single location: it exists as a set of shares held by distinct parties, and signing is performed by running a distributed protocol among the shareholders that produces a valid signature without ever reconstructing the underlying key. Threshold ECDSA and threshold Schnorr are the common protocol families.
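
The core idea, that partial signatures combine into a valid signature without the key being assembled, can be shown with a toy Schnorr example over additive shares. This is deliberately simplified: the group is far too small to be secure, the nonces are fixed for reproducibility, every party must take part (n-of-n rather than a true threshold), and the commitment rounds that real protocols need are omitted. All values are illustrative.

```python
import hashlib

P, Q, G = 23, 11, 2  # toy group: G has order Q modulo P; insecure by design


def challenge(r_point, public_key, message):
    data = f"{r_point}|{public_key}|".encode("utf-8") + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def verify(public_key, message, r_point, s):
    e = challenge(r_point, public_key, message)
    return pow(G, s, P) == (r_point * pow(public_key, e, P)) % P


# Each party holds one additive key share and one nonce share.
key_shares = [3, 7, 5]    # the joint key is their sum mod Q, never computed below
nonce_shares = [2, 9, 6]  # fixed for the example; must be random in practice

# Parties publish only G**share values; the products give the joint values.
public_key, r_point = 1, 1
for x_i, k_i in zip(key_shares, nonce_shares):
    public_key = public_key * pow(G, x_i, P) % P
    r_point = r_point * pow(G, k_i, P) % P

message = b"withdraw 10"
e = challenge(r_point, public_key, message)

# Each party computes a partial signature from its own shares only.
partials = [(k_i + e * x_i) % Q for x_i, k_i in zip(key_shares, nonce_shares)]
s = sum(partials) % Q
```

The combined pair `(r_point, s)` verifies against the joint public key because the Schnorr signing equation is linear in the key and the nonce, so the partial values simply add.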

The properties MPC offers differ from HSM. First, MPC is protocol-layer rather than hardware-layer, which makes it flexible: signing topologies can span cloud and on-premises deployments, shares can be distributed across jurisdictions, and signing can be performed without any one party holding sufficient authority. Second, MPC removes a class of attack — physical compromise of a single device — that even the most tamper-resistant HSM cannot fully rule out at extreme thresholds. Third, MPC's security depends on the correctness of the protocol implementation and the integrity of each participant node, and both of these are harder to evaluate than the physical tamper response of a validated HSM.

In practice the strongest institutional designs combine the two. The signing protocol is MPC; the shares themselves live inside HSMs, each validated at FIPS 140-3 Level 3. This configuration brings the flexibility of MPC to the key topology while preserving the hardware-grade protection of each share. It also means that the evaluation framework from our pillar article applies at both layers: the MPC protocol must be evidenced, and the HSMs must be validated.

For banks choosing between an HSM-only custodian and an MPC-only custodian, the right question is not which architecture is superior in the abstract but which architecture is appropriate for the specific risk the institution is underwriting. Cold-tier reserves favour HSM-heavy designs with hardened quorum ceremonies. Operational and hot-tier workloads benefit from the topology flexibility of MPC. A mature custodian runs both and segments accordingly.

Deployment Patterns and Operational Realities

An HSM in a rack does not become a custody service without a surrounding deployment. Four patterns recur.

The first is the dedicated versus shared HSM trade-off. Dedicated HSMs assigned per client provide the strongest isolation but scale linearly with client count. Shared HSMs partitioned per client scale better but require careful policy enforcement to ensure cross-partition isolation. Regulated custody generally starts with dedicated HSMs for top-tier clients and partitioned HSMs for smaller books, with the partition boundary treated as a first-class security control.

The second is network placement. Network-attached HSMs sit behind firewalls on an isolated cryptographic network segment, accessible only from a narrow set of signing services. Cloud HSMs (AWS CloudHSM, Azure Managed HSM, Google Cloud HSM) offer similar properties, with the cloud provider acting as the physical operator. Institutional custodians that operate cloud HSMs treat the cloud provider's operational security as part of their supplier risk, and apply the same evidentiary standard they would to any third-party provider. Physical on-premises HSMs are often preferred for cold-tier custody precisely because the operator is the custodian itself.

The third is clustering and high availability. HSMs in production run in clusters, with synchronised key material across cluster members and load balancing across healthy units. The cluster design must tolerate individual HSM failure without loss of key availability and without single-operator recovery. Failover rehearsals form part of the operational readiness of the custody service.
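
Load balancing across healthy units can be sketched as a round-robin that skips members marked down. The `HsmCluster` class and member names are hypothetical, and the sketch ignores key synchronisation, which the cluster itself handles.

```python
from itertools import cycle


class HsmCluster:
    """Round-robin over cluster members, skipping any marked unhealthy."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)
        self._ring = cycle(self.members)

    def mark_down(self, member):
        self.healthy.discard(member)

    def next_member(self):
        # Try each member at most once per call before giving up.
        for _ in range(len(self.members)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy HSM available")


cluster = HsmCluster(["hsm-a", "hsm-b", "hsm-c"])
cluster.mark_down("hsm-b")
picks = [cluster.next_member() for _ in range(4)]
```

After `hsm-b` is marked down, requests alternate between the two remaining units. A failover rehearsal amounts to exercising this path deliberately and confirming that signing continues.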

The fourth is key rotation and lifecycle. Long-lived keys accumulate risk: the longer a key is in use, the more signatures, operators and systems it has been exposed to, and the longer an attacker has had to work against it. Institutional custody rotates operational signing keys on schedule and maintains clean separation between archived keys (which retain signing authority over historical commitments) and active keys. Rotation procedures are ceremony-driven and audited, and the HD wallet architecture of the custody stack is what makes rotation tractable at scale.
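
A scheduled rotation check reduces to filtering the key inventory by state and age. The inventory layout, the one-year interval and the dates below are assumptions made for the example.

```python
from datetime import date, timedelta


def keys_due_for_rotation(keys, today, max_age_days=365):
    """Return IDs of active keys at or beyond the rotation interval."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        k["id"]
        for k in keys
        if k["state"] == "active" and k["created"] <= cutoff
    ]


inventory = [
    {"id": "k1", "state": "active", "created": date(2024, 1, 10)},
    {"id": "k2", "state": "active", "created": date(2025, 11, 1)},
    {"id": "k3", "state": "archived", "created": date(2023, 6, 1)},
]
due = keys_due_for_rotation(inventory, today=date(2026, 2, 24))
```

Only `k1` is flagged: `k2` is still inside the interval, and `k3` is archived and therefore outside the scope of scheduled rotation.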

Bridge's HSM Architecture

Bridge runs HSM-backed custody on validated hardware at FIPS 140-3 Level 3, with HSM clusters deployed in multiple geographies and partitioned per client book. Key generation is ceremony-based, quorum-enforced and recorded; backup is wrapped-key export to quorum-split operator custody in distinct institutions. The signing path is mediated by our orchestration layer, which looks up policy, gathers approvals, and presents the final canonical payload to the HSM for signing through a narrow, authenticated interface.

For workloads where topology flexibility matters more than single-appliance tamper response — cross-jurisdiction cold-tier arrangements, sovereign and central-bank pilots — Bridge runs threshold ECDSA and Schnorr schemes with shares stored in the same validated HSMs. The policy engine, the multi-signature workflow layer and the HD wallet hierarchy sit above the signing path and are the surfaces through which operators and clients interact with the service.

The point is not the particular vendor of the HSMs or the particular MPC protocol family. The point is that the evaluation framework (validation level, ceremony, backup, access control, throughput, rotation, recovery) is applied as the engineering specification for the custody stack, and the evidence is produced as a matter of course rather than as a diligence artefact. Institutional custody is the property that emerges when all of these controls work correctly together.

For the evaluation framework in which HSM architecture sits, return to our pillar article on how banks evaluate custody providers. For the governance layer above the HSM, see multi-signature approval workflows. To discuss a custody deployment against the architecture in this article, contact us at /custody or /contact.