Secure OTA Firmware Updates with Rollback: A/B Partitions, Bootloader Strategy and Lifecycle

An OTA firmware update can either extend the life of an embedded product or brick an entire fleet in the field. The difference is not the update transport itself, but the architecture behind it: bootloader validation, signed images, rollback strategy, version management, staged rollout and lifecycle monitoring.

This guide explains how to design secure OTA firmware updates with rollback for embedded devices, from MCU-based products to Embedded Linux systems. It is written for teams that need a production-safe update flow, not just a lab demo that downloads a binary once.

For teams working on STM32 firmware, ESP32 firmware or broader embedded firmware architecture, OTA should be designed before the flash map, bootloader and release process become hard to change.

What makes an OTA firmware update safe in production?

One of the most dangerous simplifications is to treat OTA as a simple "file transfer". In reality, a firmware update on an embedded device is a critical transaction involving at least four distinct phases, each with its own risks and requirements.

The first phase is transport: the firmware package must arrive at the device intact and authenticated, and the transfer must tolerate interruptions. The second is verification: before touching the flash, the device must make sure that the received image is legitimate, uncorrupted and compatible with the hardware. The third is atomic write: the update must be applied so that an interruption at any point does not leave the device in an inconsistent state. The fourth is post-boot validation: the system considers the update complete only after verifying that the new firmware works correctly.

Skipping or simplifying any of these steps doesn't speed up development — it simply moves the problem to production, where the cost is orders of magnitude higher.
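One way to keep the four phases explicit in code is a small state machine. The following is a minimal sketch, not a reference design: the names ota_state_t, ota_event_t and ota_next_state are hypothetical, and it assumes the write targets an inactive slot, so any failure before the trial boot leaves the running firmware untouched.

```c
/* Hypothetical states for the four-phase OTA transaction described above. */
typedef enum {
    OTA_IDLE,
    OTA_DOWNLOADING,   /* phase 1: transport             */
    OTA_VERIFYING,     /* phase 2: verification          */
    OTA_WRITING,       /* phase 3: atomic write          */
    OTA_TRIAL_BOOT,    /* phase 4: post-boot validation  */
    OTA_COMMITTED,
    OTA_ROLLED_BACK
} ota_state_t;

typedef enum {
    OTA_EV_START,      /* a new update was requested            */
    OTA_EV_OK,         /* the current phase completed           */
    OTA_EV_FAIL        /* the current phase failed or was cut   */
} ota_event_t;

/* Pick the next state for a phase that can succeed or fail. */
static ota_state_t step(ota_event_t ev, ota_state_t on_ok,
                        ota_state_t on_fail, ota_state_t stay)
{
    if (ev == OTA_EV_OK)   return on_ok;
    if (ev == OTA_EV_FAIL) return on_fail;
    return stay;
}

ota_state_t ota_next_state(ota_state_t s, ota_event_t ev)
{
    switch (s) {
    case OTA_DOWNLOADING: return step(ev, OTA_VERIFYING,  OTA_IDLE, s);
    case OTA_VERIFYING:   return step(ev, OTA_WRITING,    OTA_IDLE, s);
    case OTA_WRITING:     return step(ev, OTA_TRIAL_BOOT, OTA_IDLE, s);
    /* Only a failed trial boot needs a rollback: earlier failures never
       touched the active image (assumption: inactive-slot writes). */
    case OTA_TRIAL_BOOT:  return step(ev, OTA_COMMITTED, OTA_ROLLED_BACK, s);
    case OTA_IDLE:
    case OTA_COMMITTED:
    case OTA_ROLLED_BACK:
        return (ev == OTA_EV_START) ? OTA_DOWNLOADING : s;
    }
    return s;
}
```

Persisting the current state in non-volatile memory lets the device recognize, after an unexpected reset, which phase was interrupted.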

Practical example: Check the current OTA status on ESP32 (ESP-IDF)

Before designing anything, it's helpful to understand how the framework exposes the state of OTA partitions:

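#include "esp_log.h"
#include "esp_ota_ops.h"
#include "esp_partition.h"

void check_ota_state(void) {
    const esp_partition_t *running = esp_ota_get_running_partition();
    const esp_partition_t *boot    = esp_ota_get_boot_partition();
    esp_ota_img_states_t state = ESP_OTA_IMG_UNDEFINED;

    /* The call fails when the running partition is not an OTA slot
       (for example a factory partition) */
    if (esp_ota_get_state_partition(running, &state) != ESP_OK) {
        ESP_LOGW("OTA", "No OTA state available for %s", running->label);
    }

    ESP_LOGI("OTA", "Running: %s | Boot: %s | State: %d",
             running->label, boot->label, (int)state);
}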

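Explanation: esp_ota_get_running_partition() tells you which slot you are running from, and esp_ota_get_boot_partition() which slot will be used at the next reset. If the two differ, a slot switch is pending and takes effect at the next reset. After an update, check that the running partition is the new slot and that its state is ESP_OTA_IMG_VALID; a state of ESP_OTA_IMG_PENDING_VERIFY means the application has not yet confirmed the boot.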

A/B partitions vs single-slot OTA updates

Before choosing the transport protocol, before integrating any OTA library, the critical decision is how to organize the flash memory. It's a choice that can't be easily revisited after deployment.

The single-bank approach, where new firmware directly overwrites the active one, is simple to implement but catastrophic if the write is interrupted. If writing stops halfway, the device no longer has working firmware, and the only path to recovery is physical access.

The dual-bank approach (A/B partitioning) always maintains two full firmware slots: one active, one staging. The new firmware is written to the inactive slot, verified, and only after verification is the bootloader instructed to boot from the new partition. If something goes wrong, fallback is automatic.

Strategy	How it works	Pros	Risks	Best fit
Single-slot update with recovery mode	The active image is overwritten, with a small recovery path available separately.	Low flash usage and simple memory map.	High risk if recovery is incomplete or unreachable after a bad write.	Very constrained devices with controlled service access.
Dual-slot / A/B firmware update	The new image is written to the inactive slot, validated, then selected by the bootloader.	Strong rollback behavior and clear production semantics.	Requires enough flash for two images plus metadata.	STM32, ESP32 and industrial MCU products that must survive field failures.
Bootloader fallback image	A minimal known-good image remains available for recovery or maintenance mode.	Useful when full A/B is too expensive.	The fallback image must remain compatible with hardware and backend protocols.	Small devices that need remote recovery but cannot store two full applications.
Embedded Linux A/B root filesystem	Two root filesystems are managed by U-Boot, Mender, RAUC or SWUpdate.	Atomic system updates across kernel, rootfs and applications.	Needs careful bootcount, compatibility and storage management.	Gateways, HMI panels, routers and Linux-based IoT products.
Delta update with fallback	Only binary differences are transferred, then reconstructed and validated locally.	Reduces bandwidth and energy cost.	Patch application adds complexity and must still keep rollback safe.	Cellular, NB-IoT or bandwidth-constrained fleets.

Practical example: partition layout for STM32 with OTA dual-bank

/* memory_map.h */
#include <stdint.h>

#define BOOTLOADER_START    0x08000000U  /* 32 KB:  never updated via OTA      */
#define BOOTLOADER_SIZE     0x00008000U

#define SLOT_A_START        0x08008000U  /* 192 KB: active firmware            */
#define SLOT_A_SIZE         0x00030000U

#define SLOT_B_START        0x08038000U  /* 192 KB: update staging             */
#define SLOT_B_SIZE         0x00030000U

#define OTA_METADATA_START  0x08068000U  /* 4 KB:   flags, version, CRC        */
#define OTA_METADATA_SIZE   0x00001000U

/* OTA metadata structure, stored at OTA_METADATA_START */
typedef struct {
    uint32_t magic;           /* 0xDEADBEEF: sanity check           */
    uint8_t  active_slot;     /* 0 = Slot A, 1 = Slot B             */
    uint8_t  update_pending;  /* 1 = new firmware in Slot B         */
    uint8_t  boot_attempts;   /* boot attempt counter               */
    uint8_t  reserved;
    uint32_t slot_b_crc32;    /* CRC32 of the firmware in Slot B    */
    uint32_t slot_b_size;
    uint32_t fw_version;      /* monotonic version (anti-rollback)  */
} OtaMetadata_t;

Explanation: The bootloader reads OtaMetadata_t at every startup. If update_pending == 1, it checks the CRC of Slot B before committing the change. If the CRC fails or boot_attempts exceeds the threshold, it returns to Slot A.
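The CRC check itself can be sketched in portable C. This is an illustrative software implementation of the common CRC-32 (reflected polynomial 0xEDB88320, as used by zlib and Ethernet); the function names and the pointer-based signature are assumptions, and a real bootloader might use the MCU's hardware CRC unit instead, which may be configured differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: slow but table-free, which keeps the bootloader small. */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return ~crc;
}

/* Returns 1 when the image matches the CRC stored in the metadata. */
int verify_crc32_buf(const uint8_t *image, size_t len, uint32_t expected)
{
    return crc32_compute(image, len) == expected;
}
```

A CRC only detects accidental corruption. It does not replace the signature check, because an attacker can recompute it for a modified image.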

Practical example: recommended minimum headroom

/* Rule of thumb for sizing */

/* Slot size = current firmware size × 1.5 (minimum)          */
/* Rationale:                                                  */
/*   - 30% headroom for codebase growth                        */
/*   - space for delta update staging (if implemented)         */
/*   - buffer for metadata and alignment padding               */

/* Example: current firmware = 120 KB                          */
/*   Recommended slot size = 180–200 KB                        */

Explanation: Undersizing slots is one of the most common mistakes. A firmware that "fits" today no longer fits after six months of active development.
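The rule of thumb can be turned into a small helper that also rounds the result up to the flash erase unit, since slots normally have to start and end on erase boundaries. This is a sketch under that assumption; the function names are hypothetical and the erase unit depends on the specific MCU.

```c
#include <stdint.h>

/* Round value up to the next multiple of align (align must be non-zero). */
static uint32_t align_up(uint32_t value, uint32_t align)
{
    return ((value + align - 1u) / align) * align;
}

/* Recommended slot size: firmware size x 1.5, rounded up to the erase unit. */
uint32_t recommended_slot_size(uint32_t fw_size, uint32_t erase_unit)
{
    uint32_t with_headroom = fw_size + fw_size / 2u;
    return align_up(with_headroom, erase_unit);
}
```

For the 120 KB example above and a 2 KB erase unit, this yields exactly 180 KB, the lower bound of the recommended range.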

Bootloader requirements for rollback

The bootloader is the component that makes any reliable OTA scheme possible or impossible. A poorly designed bootloader defeats any other security and robustness measures implemented in the application firmware. If you need the broader background, see the related guide on embedded bootloaders for MCU, Linux and FPGA systems.

The minimum responsibilities of an OTA-capable bootloader are clear: verify the integrity of the boot image, manage the boot attempt counter with automatic fallback, protect its own flash region from unauthorized writes, and implement anti-rollback protection.

A rule that always applies: the bootloader must never be updateable through the same OTA channel as the application firmware. A corrupt bootloader can render your device permanently unusable without physical access.

Practical example: boot logic with automatic fallback (C pseudocode)

/* bootloader/main.c */
#define MAX_BOOT_ATTEMPTS 3

/* boot_from_slot() is assumed to jump to the application and never return. */
void bootloader_main(void) {
    OtaMetadata_t meta;
    read_ota_metadata(&meta);

    if (meta.magic != 0xDEADBEEF) {
        /* First power-on or corrupted metadata: boot from Slot A */
        boot_from_slot(SLOT_A);
    }

    if (meta.update_pending) {
        /* New firmware available in Slot B */
        if (!verify_crc32(SLOT_B_START, meta.slot_b_size, meta.slot_b_crc32)) {
            /* CRC failed: cancel the update and return to Slot A */
            meta.update_pending = 0;
            write_ota_metadata(&meta);
            boot_from_slot(SLOT_A);
        }
        if (!verify_signature(SLOT_B_START, meta.slot_b_size)) {
            /* Invalid signature: possible tampering */
            meta.update_pending = 0;
            write_ota_metadata(&meta);
            boot_from_slot(SLOT_A);
        }
        /* Verification passed: commit the switch */
        meta.active_slot    = 1;   /* Slot B */
        meta.update_pending = 0;
        meta.boot_attempts  = 0;
        write_ota_metadata(&meta);
    }

    /* Boot loop handling: count this attempt before jumping */
    uint8_t slot = meta.active_slot;
    meta.boot_attempts++;
    write_ota_metadata(&meta);

    if (meta.boot_attempts > MAX_BOOT_ATTEMPTS) {
        /* Boot loop detected: fall back to the opposite slot.
           A production bootloader should also validate that slot first. */
        slot = (slot == 0) ? 1 : 0;
        meta.active_slot   = slot;
        meta.boot_attempts = 0;
        write_ota_metadata(&meta);
    }

    boot_from_slot(slot == 0 ? SLOT_A : SLOT_B);
}

Explanation: boot_attempts is incremented before every boot and reset by the application firmware only after it confirms a correct startup. If the firmware crashes repeatedly before the counter can be reset, the bootloader detects the boot loop and returns to the previous slot.

Practical example: the application firmware must "confirm" the successful boot

/* In the application firmware: call once initialization has completed */
void confirm_successful_boot(void) {
    OtaMetadata_t meta;
    read_ota_metadata(&meta);
    meta.boot_attempts = 0;  /* Clear the counter: boot ok */
    write_ota_metadata(&meta);

    /* On ESP32 with ESP-IDF, the equivalent is: */
    /* esp_ota_mark_app_valid_cancel_rollback(); */
}

Explanation: This is the detail that distinguishes a robust OTA system from one that "usually works." If the application never reaches this call, for example because it crashes or the watchdog resets it during initialization, the counter keeps growing and the bootloader falls back once it exceeds the threshold.
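Deciding when to confirm is a design choice of its own. A possible sketch is shown here: the application confirms only after a set of health checks has passed and asks for a revert when a deadline expires. The check bits, the trial_action_t type and trial_boot_decide are hypothetical names, and the set of required checks is product-specific.

```c
#include <stdint.h>

/* Hypothetical health checks the new firmware must pass before confirming. */
#define HEALTH_STORAGE   (1u << 0)   /* non-volatile storage mounted      */
#define HEALTH_NETWORK   (1u << 1)   /* Wi-Fi or modem link is up         */
#define HEALTH_BACKEND   (1u << 2)   /* TLS session to backend succeeded  */
#define HEALTH_REQUIRED  (HEALTH_STORAGE | HEALTH_NETWORK | HEALTH_BACKEND)

typedef enum {
    TRIAL_WAIT,      /* keep running the checks                        */
    TRIAL_CONFIRM,   /* safe to call confirm_successful_boot()         */
    TRIAL_REVERT     /* deadline passed: reset and let the bootloader
                        count the failed attempt                       */
} trial_action_t;

trial_action_t trial_boot_decide(uint32_t passed_checks,
                                 uint32_t elapsed_ms,
                                 uint32_t deadline_ms)
{
    if ((passed_checks & HEALTH_REQUIRED) == HEALTH_REQUIRED) {
        return TRIAL_CONFIRM;
    }
    if (elapsed_ms >= deadline_ms) {
        return TRIAL_REVERT;
    }
    return TRIAL_WAIT;
}
```

Including a backend connectivity check in the required set protects against updates that boot correctly but can no longer reach the update server, which would otherwise make the device impossible to fix remotely.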

Firmware signing, versioning and anti-downgrade protection

An unsecured OTA channel is worse than no OTA: it becomes a direct attack vector towards the hardware. An attacker who manages to inject unauthorized firmware gains complete access to the system, without the need for application exploits.

The security of an OTA system is built on three independent and complementary levels: authenticity of the firmware (digital signature), confidentiality and integrity of the transport channel (TLS), device authentication (certificates or device-specific tokens). Removing any of these layers creates real vulnerabilities.

Practical example: Generate and verify a firmware ECDSA signature (OpenSSL + mbedTLS)

# ---- Build server side ----

# Generate the private key (keep it in an HSM, never in a repository)
openssl ecparam -name prime256v1 -genkey -noout -out fw_signing_key.pem

# Extract the public key in DER form (to be embedded in the bootloader)
openssl ec -in fw_signing_key.pem -pubout -outform DER -out fw_public_key.der

# Sign the firmware binary (produces a DER-encoded ECDSA signature)
openssl dgst -sha256 -sign fw_signing_key.pem \
    -out firmware_v2.sig firmware_v2.bin

/* ---- Bootloader side (C, verification with mbedTLS) ---- */

#include <stddef.h>
#include <stdint.h>
#include "mbedtls/md.h"
#include "mbedtls/pk.h"
#include "mbedtls/sha256.h"

/* Public key hardcoded in the bootloader (generated offline) */
static const uint8_t PUBLIC_KEY_DER[] = { /* ... bytes ... */ };

int verify_firmware_signature(const uint8_t *fw, size_t fw_len,
                              const uint8_t *sig, size_t sig_len) {
    uint8_t hash[32];
    /* Mbed TLS 3.x returns int here; on 2.x use mbedtls_sha256_ret() */
    if (mbedtls_sha256(fw, fw_len, hash, 0) != 0)
        return -1;

    /* Load the public key */
    mbedtls_pk_context pk;
    mbedtls_pk_init(&pk);
    int ret = mbedtls_pk_parse_public_key(&pk, PUBLIC_KEY_DER,
                                          sizeof(PUBLIC_KEY_DER));

    /* Verify the DER-encoded ECDSA signature over the SHA-256 digest */
    if (ret == 0)
        ret = mbedtls_pk_verify(&pk, MBEDTLS_MD_SHA256,
                                hash, sizeof(hash), sig, sig_len);

    mbedtls_pk_free(&pk);
    return (ret == 0) ? 0 : -1;  /* 0 = valid signature */
}

Explanation: The private key never leaves the build server (ideally an HSM). The bootloader contains only the public key: it can verify but not sign. Even an attacker who reads the device's flash finds no key that can sign malicious firmware. The public key and the bootloader itself must still be write-protected, otherwise the attacker could simply replace them.

Practical example: Anti-rollback protection with a monotonic version

/* In the bootloader, before accepting an update */
int check_antirollback(uint32_t new_version) {
    OtaMetadata_t meta;
    read_ota_metadata(&meta);

    if (new_version < meta.fw_version) {
        /* Downgrade attempt: reject */
        return -1;
    }
    /* Same version (explicit reinstall) or newer version: accept */
    return 0;
}

/* Call only after the new image has confirmed a healthy boot, so that a
   failed update never raises the floor above the image still running */
void commit_fw_version(uint32_t new_version) {
    OtaMetadata_t meta;
    read_ota_metadata(&meta);
    if (new_version > meta.fw_version) {
        meta.fw_version = new_version;
        write_ota_metadata(&meta);
    }
}

Explanation: The minimum accepted firmware version is stored in a protected region. Any attempt to install an older version is blocked at the bootloader level, even if the image is validly signed. This prevents downgrade attacks towards versions with known, already patched vulnerabilities.
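A single uint32_t compares monotonically only if the version fields are packed with the most significant field in the highest bits. The layout below (8 bits major, 8 bits minor, 16 bits patch) is one possible convention, not something the metadata format prescribes, and the function names are hypothetical.

```c
#include <stdint.h>

/* Pack major.minor.patch so that plain integer comparison orders versions. */
uint32_t fw_version_pack(uint8_t major, uint8_t minor, uint16_t patch)
{
    return ((uint32_t)major << 24) | ((uint32_t)minor << 16) | (uint32_t)patch;
}

/* Returns 1 when candidate is older than the stored minimum version. */
int fw_version_is_downgrade(uint32_t candidate, uint32_t minimum)
{
    return candidate < minimum;
}
```

With this layout, version 2.1.0 packs to 0x02010000 and compares greater than any 2.0.x or 1.x.y release.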

Common OTA failure modes in embedded devices

Most OTA failures in the field do not result from complex bugs or sophisticated attacks. They come from trivial scenarios that simply were not tested: the device loses power during a flash write, the connection drops mid-download, or new firmware enters a boot loop because of a batch-specific hardware initialization issue.

Another frequent pattern is OTA testing done only under ideal conditions — stable network, guaranteed power supply, identical hardware to the development bench. The field is different: borderline voltages, temperature, RF interference, undocumented hardware variants.

Finally, there is the issue of untested rollback. Many systems implement rollback on paper but never actually test it. When needed, it turns out that the bootloader never actually fell back, or that the recovery slot was overwritten by a previous update.

Failure mode	Example	Mitigation
Power loss during update	The device resets while flash is being erased or programmed.	Write to an inactive slot, verify before switching and keep a known-good image.
Corrupted image	A download completes with missing chunks or modified bytes.	Use hashes, signed images and bootloader-side validation before boot.
Wrong firmware version	A package built for another product or hardware revision is installed.	Check hardware compatibility metadata and product identifiers before accepting the image.
Failed first boot	The new firmware enters a boot loop before the application becomes ready.	Use boot attempt counters and require the application to confirm a healthy boot.
Broken connectivity after update	The update works locally but breaks Wi-Fi, modem, TLS or backend credentials.	Run health checks and staged rollout with telemetry before full deployment.
Storage wear or flash write failure	A sector fails or reaches endurance limits after repeated updates.	Track write cycles, validate erase/program results and design metadata with redundancy.
Rollback loop	The device alternates between two images without reaching a stable state.	Store rollback reason, limit retry count and keep a maintenance/recovery state.
Incompatible hardware revision	A firmware image initializes peripherals that are not present on a field variant.	Bind packages to hardware revisions and test representative production batches.
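The "wrong firmware version" and "incompatible hardware revision" mitigations both come down to checking image metadata before accepting the package. A minimal sketch follows, assuming a hypothetical image header (ImageHeader_t, IMG_MAGIC and the field layout are illustrative, not a standard format) that is covered by the firmware signature.

```c
#include <stdint.h>

#define IMG_MAGIC 0x4F544131u   /* hypothetical marker: ASCII "OTA1" */

/* Hypothetical header placed at the start of every firmware image. */
typedef struct {
    uint32_t magic;
    uint16_t product_id;
    uint16_t hw_rev_mask;   /* bit n set = image supports hardware rev n */
    uint32_t fw_version;
    uint32_t image_size;    /* bytes, header included                    */
} ImageHeader_t;

typedef enum {
    IMG_OK = 0,
    IMG_BAD_MAGIC,
    IMG_WRONG_PRODUCT,
    IMG_WRONG_HW_REV,
    IMG_BAD_SIZE
} img_check_t;

/* Reject images built for another product, hardware revision or slot size. */
img_check_t image_check_compat(const ImageHeader_t *h, uint16_t product_id,
                               uint8_t hw_rev, uint32_t slot_size)
{
    if (h->magic != IMG_MAGIC)          return IMG_BAD_MAGIC;
    if (h->product_id != product_id)    return IMG_WRONG_PRODUCT;
    if (hw_rev >= 16u || ((h->hw_rev_mask >> hw_rev) & 1u) == 0u)
        return IMG_WRONG_HW_REV;
    if (h->image_size == 0u || h->image_size > slot_size)
        return IMG_BAD_SIZE;
    return IMG_OK;
}
```

Running this check before the download is written to flash avoids wasting an erase cycle on a package that could never boot on this board.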

Practical example: Simulate a power outage during OTA (STM32)

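On STM32 with HAL, you can simulate an interrupted update by forcing a reset in the middle of the write and then observing how the bootloader behaves:

/* In the test firmware: deliberately interrupt the write halfway.
   Assumes a family with page-based erase (dual-bank parts also need
   .Banks), SLOT_B_START from memory_map.h, and that the caller provides
   `const uint64_t *data` and `uint32_t firmware_size`. */
HAL_FLASH_Unlock();
for (uint32_t addr = SLOT_B_START;
     addr < SLOT_B_START + firmware_size;
     addr += FLASH_PAGE_SIZE) {

    FLASH_EraseInitTypeDef erase = {
        .TypeErase = FLASH_TYPEERASE_PAGES,
        .Page      = (addr - FLASH_BASE) / FLASH_PAGE_SIZE,
        .NbPages   = 1
    };
    uint32_t error;
    if (HAL_FLASHEx_Erase(&erase, &error) != HAL_OK)
        break;

    /* Program the page one double word (8 bytes) at a time */
    for (uint32_t off = 0;
         off < FLASH_PAGE_SIZE && addr + off < SLOT_B_START + firmware_size;
         off += 8) {
        uint32_t idx = (addr + off - SLOT_B_START) / 8;
        HAL_FLASH_Program(FLASH_TYPEPROGRAM_DOUBLEWORD, addr + off, data[idx]);
    }

    /* Simulate power loss once about 50% of the image has been written */
    if (addr + FLASH_PAGE_SIZE >= SLOT_B_START + firmware_size / 2)
        NVIC_SystemReset();
}
HAL_FLASH_Lock();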

Explanation: If after this reset the bootloader successfully boots slot A firmware, your fallback logic works. If the device remains frozen, you found a bug before sending it into the field.

OTA on MCUs vs Embedded Linux

On MCU-based products, OTA usually means managing one or more application firmware images, a bootloader, metadata and a small amount of non-volatile state. On Embedded Linux systems, OTA is broader: the kernel, device trees, root filesystems, libraries and applications can all require coordinated updates.

This is why the same rollback principle appears in different forms: MCU A/B slots, ESP-IDF OTA partitions, U-Boot bootcount, RAUC bundles, Mender artifacts or SWUpdate manifests. For the Linux security angle, see also secure embedded Linux with CRA, secure boot and OTA updates.

Two widely used open source frameworks are SWUpdate and Mender. SWUpdate works with .swu packages containing a signed manifest (sw-description) and the images to install, and is widely used in European industry. Mender introduces the concept of the Artifact: a container with device compatibility metadata, state scripts that run before and after installation, and optionally signed payloads, with native cloud integration for fleet management.

Practical example: sw-description structure for SWUpdate

/* sw-description: signed manifest inside the .swu package */
software =
{
    version = "2.1.0";
    hardware-compatibility = ["1.0", "1.1", "2.0"];

    images: (
        {
            filename = "rootfs.ext4.gz";
            type     = "raw";
            device   = "/dev/mmcblk0p2";
            sha256   = "a1b2c3d4...";
            compressed = true;
        },
        {
            filename = "kernel.itb";
            type     = "raw";
            device   = "/dev/mmcblk0p1";
            sha256   = "e5f6a7b8...";
        }
    );

    scripts: (
        {
            filename  = "post_install.sh";
            type      = "postinstall";
            /* Runs after the images are written, before the reboot */
        }
    );
}

Explanation: the hardware-compatibility field is critical: it prevents firmware built for one hardware revision from being installed on a different revision. SWUpdate checks the sha256 of each component against the manifest during installation and aborts the update on a mismatch.

Practical example: U-Boot environment to manage A/B slots on embedded Linux

# /etc/fw_env.config: maps the U-Boot environment
/dev/mmcblk0    0x3F000   0x1000

# Read the currently active slot
fw_printenv mmcpart

# After a successful update, confirm the slot switch
fw_setenv mmcpart 2          # Slot B
fw_setenv upgrade_available 0  # Disable automatic rollback

# For a manual rollback
fw_setenv mmcpart 1          # Back to Slot A
reboot

Explanation: U-Boot reads upgrade_available at every startup. If it is 1, U-Boot increments bootcount. When bootcount exceeds bootlimit, U-Boot runs altbootcmd instead of bootcmd, which is typically scripted to switch back to the previous slot. This is the same mechanism as the boot_attempts counter seen for microcontrollers, but implemented in U-Boot environment variables.

Which processors support secure OTA updates with rollback?

Secure OTA is not a feature of the processor alone. A robust implementation depends on flash layout, bootloader control, cryptographic verification, available memory, update transport, watchdog behavior and the ability to revert to a known-good image.

MCU and SoC families such as STM32, ESP32, Nordic nRF, NXP i.MX RT, Renesas RA and Embedded Linux platforms can be used in OTA designs, but the actual rollback architecture must be validated for the specific product, memory map and boot chain.

Enough flash or external storage for a fallback image.
Bootloader can select and validate images.
Firmware image can be signed and verified before boot.
Device can detect failed boot or failed health check.
Version counters prevent accidental downgrade.
Watchdog and recovery path are tested under realistic failures.
Telemetry confirms rollout health from devices in the field.

For firmware analysis before release, tools such as SLX Firmware Explorer can help inspect firmware artifacts, while architecture work should still validate the full update path on the real target.

Update transport and payload strategy

There is no universally optimal OTA protocol. The choice depends on device characteristics, deployment environment and reliability requirements. The following table summarizes the most common patterns:

Protocol	Use case	Advantages	Limits
HTTPS/HTTP	ESP32, Wi-Fi gateway, always powered devices	Simplicity, high bandwidth, broad cloud support	Power consumption, requires stable connectivity
BLE DFU	Wearable, battery-powered sensors, healthcare	Low power, mature frameworks (Nordic DFU)	Low bandwidth (~20–244 bytes/packet), limited range
MQTT over TLS	Industrial IoT, distributed fleet	Lightweight, configurable QoS, centralized broker	Requires broker always available
LwM2M	Certified devices, telco IoT	OMA SpecWorks standard, native lifecycle management	Implementation complexity
Cellular NB-IoT/LTE-M	Remote sensors, utilities, agriculture	Global coverage, distributed deployment	SIM/data cost, critical payload optimization

Practical example: OTA download with resumption on interrupted connection (ESP32)

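#include <stdio.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_http_client.h"
#include "esp_ota_ops.h"
#include "esp_partition.h"
#include "esp_system.h"

#define OTA_URL         "https://update.example.com/firmware_v2.bin"
#define CHUNK_SIZE      4096

/* nvs_get/set/clear_download_offset() and server_cert_pem are application
   helpers that are not shown here. */
void ota_task(void *arg) {
    const esp_partition_t *update_part = esp_ota_get_next_update_partition(NULL);

    /* Read the offset already downloaded (from NVS, survives a reset) */
    uint32_t resume_offset = nvs_get_download_offset();

    /* esp_ota_begin() would erase the slot and restart from offset 0, so this
       sketch uses the partition API and erases only on a fresh start */
    if (resume_offset == 0)
        esp_partition_erase_range(update_part, 0, update_part->size);

    esp_http_client_config_t cfg = {
        .url = OTA_URL,
        .cert_pem = server_cert_pem,  /* TLS server verification */
    };
    esp_http_client_handle_t client = esp_http_client_init(&cfg);

    /* HTTP Range request to resume from the interruption point */
    char range_header[64];
    snprintf(range_header, sizeof(range_header), "bytes=%lu-",
             (unsigned long)resume_offset);
    esp_http_client_set_header(client, "Range", range_header);

    bool complete = false;
    if (esp_http_client_open(client, 0) == ESP_OK) {
        esp_http_client_fetch_headers(client);

        static uint8_t buf[CHUNK_SIZE];
        int read_len;
        while ((read_len = esp_http_client_read(client, (char *)buf, CHUNK_SIZE)) > 0) {
            if (esp_partition_write(update_part, resume_offset, buf, read_len) != ESP_OK)
                break;
            resume_offset += read_len;
            nvs_set_download_offset(resume_offset);  /* Checkpoint */
        }
        complete = esp_http_client_is_complete_data_received(client);
    }
    esp_http_client_cleanup(client);

    if (complete) {
        /* esp_ota_set_boot_partition() validates the image before selecting it */
        esp_err_t err = esp_ota_set_boot_partition(update_part);
        nvs_clear_download_offset();  /* Clear the checkpoint either way */
        if (err == ESP_OK)
            esp_restart();
    }
    vTaskDelete(NULL);  /* Incomplete or invalid: retry on the next run */
}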

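Explanation: The download offset is saved in NVS (Non-Volatile Storage) after each received chunk. After an interruption (reset, power failure, low battery) the download resumes from the last checkpoint instead of fetching the entire firmware again. Note that esp_ota_begin() erases the target partition, so a resumable flow must not call it again after a restart. Frequent NVS checkpoints also add flash wear, so production code usually checkpoints less often, for example every 64 KB.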

Practical example: delta update with bsdiff (payload reduction)

# Server side: generate a binary patch between the current and the new version
# Uses bsdiff (cross-platform, suitable for firmware binaries)

bsdiff firmware_v1.bin firmware_v2.bin firmware_v1_to_v2.patch

# Typical sizes:
# firmware_v1.bin  →  256 KB
# firmware_v2.bin  →  258 KB  (2 functions changed)
# patch            →   ~8 KB  (~97% reduction)

# The device applies the patch locally:
# bspatch firmware_v1.bin firmware_v2_reconstructed.bin firmware_v1_to_v2.patch

Explanation: delta updates matter most on NB-IoT, LTE-M or LoRaWAN links, where the energy and monetary cost of transmission is significant. A patch that is about 3% of the full image can mean the difference between a successful update and a dead battery mid-transfer. The device still has to reconstruct the full image in the inactive slot and verify it before switching.

Production rollout strategy: from lab test to field deployment

An OTA system is judged not on how it performs under ideal conditions, but on how it survives the worst conditions. Mid-flash power failure, corrupt download, boot loop after a bad update and broken connectivity after deployment need to be explicitly tested, not hopefully avoided.

Run destructive tests in the lab before enabling field updates.
Deploy first to canary devices with known hardware revisions.
Track boot success rate, rollback rate, watchdog events and app-ready time.
Pause rollout automatically when failure signals cross thresholds.

The hardware watchdog is the ultimate safety net. It must be configured to detect not only explicit crashes, but also boot loops — situations in which the firmware boots, fails during initialization, and continually reboots without ever becoming operational.

Practical example: staged rollout — don't update the entire fleet at once

# Illustrative staged rollout plan (schematic, not literal Mender syntax:
# phased deployments are defined on the server side, not in the client's mender.conf)
# Phase 1: 5% of the fleet (canary)
{
  "deployment": {
    "filter": { "tags": ["canary"] },
    "max_devices": 50,
    "phases": [
      { "batch_size": 5, "delay_before_next": "24h" },
      { "batch_size": 20, "delay_before_next": "48h" },
      { "batch_size": 75 }
    ]
  }
}

# Success metrics to monitor before advancing to the next phase:
# - boot_success_rate > 99%
# - crash_rate < 0.1%
# - watchdog_triggers == 0
# - app_ready_time within the expected threshold

Explanation: Deploying an update to 100% of the fleet in one fell swoop is the fastest way to turn a bug that escaped testing into a full-scale emergency. Canary deployment limits the blast radius: if 5% of devices show problems, the update is blocked before propagating.
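The decision to advance or pause can be automated on the backend from cohort telemetry. The sketch below encodes the boot success, crash rate and watchdog thresholds listed above with integer arithmetic; CohortStats_t and rollout_may_advance are hypothetical names, and the app-ready time check is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical telemetry summary for one rollout phase. */
typedef struct {
    uint32_t devices_updated;   /* devices that installed the update   */
    uint32_t boots_ok;          /* devices that confirmed a healthy boot */
    uint32_t crashes;           /* devices that reported a crash       */
    uint32_t watchdog_resets;   /* watchdog-triggered resets observed  */
} CohortStats_t;

/* Returns true when the next phase may start, false to pause the rollout. */
bool rollout_may_advance(const CohortStats_t *s, uint32_t min_sample)
{
    /* Not enough devices have reported yet: keep waiting */
    if (s->devices_updated < min_sample)
        return false;

    if (s->watchdog_resets != 0u)
        return false;

    /* boot success rate must be strictly above 99% */
    if ((uint64_t)s->boots_ok * 100u <= (uint64_t)s->devices_updated * 99u)
        return false;

    /* crash rate must be strictly below 0.1% */
    if ((uint64_t)s->crashes * 1000u >= (uint64_t)s->devices_updated)
        return false;

    return true;
}
```

The minimum sample size matters: a phase with only a handful of reporting devices can look perfectly healthy by chance.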

Practical example: confirm the boot, report telemetry and set a GPIO marker (same technique as the boot marker)

/* In the application firmware, once the OTA has been applied and
   the app is operational (same technique as the boot marker) */

void app_ready_after_ota(void) {
    /* 1. Confirm to the bootloader that the boot succeeded */
    confirm_successful_boot();

    /* 2. Tell the telemetry stack that the update completed */
    telemetry_send_event("ota_success", current_fw_version());

    /* 3. Physical marker (useful when testing with a logic analyzer) */
    gpio_set_level(GPIO_APP_READY, 1);
}

Explanation: confirmation to the bootloader, notification to the telemetry system and the physical marker are three distinct but complementary operations. The first avoids unwanted rollbacks, the second allows remote monitoring of the fleet, the third makes the behavior verifiable in the laboratory before deployment.

OTA and Cyber Resilience Act: What Concretely Changes

The European Cyber Resilience Act (CRA) entered into force in December 2024; its vulnerability reporting obligations apply from September 2026 and the main product requirements from December 2027. For those who develop connected devices intended for the European market, the implications for the OTA system are concrete and cannot be delegated.

The updateability obligation requires that products can receive security updates throughout their declared support period. Update logs should be verifiable: timestamps, image hashes, installation success and rollback reason all become part of the product evidence. Automatic generation of a Software Bill of Materials (SBOM) should be part of the build process, not a manual task.

This is also where OTA connects to Matter device lifecycle and OTA architecture: update management is a product responsibility, not only a firmware feature.

Practical example: Generate SBOM automatically with Yocto/BitBake

# In the Yocto project's local.conf
INHERIT += "create-spdx"

# Generates an SPDX SBOM for every build (the SPDX version depends on the Yocto release)
# Output: ${DEPLOY_DIR}/spdx/${MACHINE}/

# Pretty-print the generated SPDX JSON so that it is human-readable
SPDX_PRETTY = "1"

# Check recipe versions against known vulnerabilities (CVE check)
INHERIT += "cve-check"
# Also list already patched CVEs in the report
CVE_CHECK_REPORT_PATCHED = "1"

Explanation: Yocto automatically generates the complete SBOM of each build, including the exact versions of kernels, libraries, and applications. The CVE check compares the used versions with the database of known vulnerabilities and reports those not yet patched. This is not just compliance: it is an operational tool for deciding when and what to update via OTA.

Practical example: update log for audit trail

/* In the firmware: record every OTA event in NVS with a timestamp */
#include <stdint.h>
#include "nvs.h"

typedef struct {
    uint32_t timestamp;          /* Unix timestamp                     */
    uint32_t from_version;       /* Previous firmware version          */
    uint32_t to_version;         /* Installed firmware version         */
    uint8_t  result;             /* 0=success, 1=failed, 2=rollback    */
    uint8_t  trigger;            /* 0=automatic, 1=manual              */
    uint8_t  reserved[2];
    uint32_t image_crc32;        /* CRC of the installed firmware      */
} OtaLogEntry_t;

extern nvs_handle_t ota_nvs;     /* opened elsewhere with nvs_open()   */

void log_ota_event(const OtaLogEntry_t *entry) {
    /* Stores the most recent event; a circular log of the last N events
       would rotate the key instead of reusing a single one */
    nvs_set_blob(ota_nvs, "ota_log_latest", entry, sizeof(OtaLogEntry_t));
    nvs_commit(ota_nvs);
    /* Also send it to the telemetry backend */
    telemetry_send_ota_event(entry);
}

Explanation: To support CRA compliance, the manufacturer should be able to demonstrate when a device received an update and whether it was successful. This log, combined with backend data, forms the audit trail needed for that evidence.
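Keeping the last N events rather than only the latest one is naturally a ring buffer. The sketch below shows the indexing logic in RAM, assuming the same entry layout as the example above; OtaLog_t, OTA_LOG_CAPACITY and the ota_log_* functions are hypothetical, and persisting each slot (for example one NVS key per index) is left to the application.

```c
#include <stddef.h>
#include <stdint.h>

#define OTA_LOG_CAPACITY 8u   /* hypothetical: keep the last 8 events */

typedef struct {
    uint32_t timestamp;
    uint32_t from_version;
    uint32_t to_version;
    uint8_t  result;          /* 0=success, 1=failed, 2=rollback */
    uint8_t  trigger;         /* 0=automatic, 1=manual           */
    uint8_t  reserved[2];
    uint32_t image_crc32;
} OtaLogEntry_t;

typedef struct {
    OtaLogEntry_t entries[OTA_LOG_CAPACITY];
    uint32_t      next;       /* total number of entries ever written */
} OtaLog_t;

/* Append an entry, overwriting the oldest one when the log is full. */
void ota_log_append(OtaLog_t *log, const OtaLogEntry_t *e)
{
    log->entries[log->next % OTA_LOG_CAPACITY] = *e;
    log->next++;
}

/* Number of entries currently retained. */
uint32_t ota_log_count(const OtaLog_t *log)
{
    return (log->next < OTA_LOG_CAPACITY) ? log->next : OTA_LOG_CAPACITY;
}

/* Index 0 is the oldest retained entry; returns NULL when out of range. */
const OtaLogEntry_t *ota_log_get(const OtaLog_t *log, uint32_t index)
{
    uint32_t count = ota_log_count(log);
    if (index >= count)
        return NULL;
    uint32_t first = log->next - count;
    return &log->entries[(first + index) % OTA_LOG_CAPACITY];
}
```

Because each append touches a single slot, the same scheme maps well onto flash storage where rewriting the whole log on every event would add unnecessary wear.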

OTA firmware update checklist

Define the update unit: application, full image, root filesystem or bundle.
Reserve flash or storage for A/B slots, fallback image and metadata.
Keep bootloader validation independent from the application firmware.
Sign firmware images and verify them before boot.
Use monotonic version counters for firmware rollback protection.
Require the new firmware to confirm a healthy boot.
Test power loss, corrupt images, failed boot, watchdog reset and connectivity loss.
Roll out gradually and collect telemetry from each device cohort.
Keep update logs for support, audit and long-term lifecycle management.

Designing OTA as a late addition, a layer inserted when the firmware is already complete, is the root cause of fragile update systems. The choices that determine the quality of the OTA system are made much earlier: in the size of the flash, in the design of the bootloader, in the choice of microcontroller, in the structure of the filesystem.

The most effective engineering principle remains surprisingly simple: every component of the OTA system must be designed to fail safely. Not "designed to work" — designed to fail without leaving the device in an unrecoverable state.

FAQ: secure OTA firmware updates with rollback

What is an OTA firmware update?

An OTA firmware update is a remote update process that delivers, verifies and installs new firmware on an embedded device without physical access.

How does rollback work in embedded devices?

Rollback keeps or restores a known-good image when the new firmware fails validation, cannot boot reliably or does not pass a health check.

Is A/B partitioning required for OTA updates?

A/B partitioning is not always mandatory, but production systems need some fallback path such as a recovery image, dual slot layout or external storage.

Which processors support secure OTA updates with rollback?

Secure OTA depends on the full architecture, not the processor alone. STM32, ESP32, Nordic nRF, NXP i.MX RT, Renesas RA and Embedded Linux platforms can be used when flash layout, bootloader and verification are designed correctly.

How do you prevent firmware downgrades?

Downgrades are prevented with signed firmware, monotonic version counters and bootloader-side anti-downgrade checks before accepting an image.
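
A minimal sketch of the bootloader-side check, written as a host-side Python model, might look like the following. It assumes a hypothetical image header with a `security_version` field and treats signature verification as an already computed boolean, since the actual cryptography depends on the platform. `MonotonicCounter` is a stand-in for fuse- or OTP-backed storage; none of these names come from a real SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageHeader:
    """Hypothetical header fields; real layouts are product-specific."""
    security_version: int   # bumped on releases that must not be downgraded
    signature_valid: bool   # outcome of the signature check (not modelled here)


class MonotonicCounter:
    """Stand-in for a fuse/OTP-backed counter that can only increase."""

    def __init__(self, value: int = 0) -> None:
        self._value = value

    @property
    def value(self) -> int:
        return self._value

    def advance_to(self, new_value: int) -> None:
        if new_value < self._value:
            raise ValueError("monotonic counter cannot decrease")
        self._value = new_value


def accept_image(header: ImageHeader, counter: MonotonicCounter) -> bool:
    """Reject unsigned images and images older than the stored counter."""
    if not header.signature_valid:
        return False
    return header.security_version >= counter.value


def commit_after_healthy_boot(header: ImageHeader, counter: MonotonicCounter) -> None:
    """Advance the counter only once the new image has confirmed a healthy boot."""
    if accept_image(header, counter):
        counter.advance_to(header.security_version)
```

The design choice worth noting in this sketch is the ordering: the counter advances only after the new firmware has proven itself, so a failed trial boot can still fall back to the previous image.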

Can OTA updates be used on STM32 or ESP32 devices?

Yes. STM32 and ESP32 devices can support OTA designs, but the flash layout, bootloader behavior, watchdog strategy and rollback validation must fit the specific product.

How should OTA firmware updates be tested before production?

Test power loss, corrupt images, failed first boot, connectivity loss, hardware revision mismatch, rollback loops, watchdog behavior and staged rollout telemetry.
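
The power-loss case in particular lends itself to an exhaustive sweep: cut power at every write boundary and check that the device still boots a valid image each time. The Python model below is only a sketch of that idea under simplifying assumptions: a toy two-slot `Device`, SHA-256 digests in place of signatures, and a `PowerLoss` exception standing in for a real power cut. All names are hypothetical; on hardware the same sweep would be driven by a programmable supply or a relay.

```python
import hashlib


class PowerLoss(Exception):
    """Raised by the simulated flash to model a power cut mid-write."""


class Device:
    """Toy two-slot device: slot contents plus the digest the bootloader expects."""

    def __init__(self, good_image: bytes) -> None:
        self.slots = {"A": good_image, "B": b""}
        self.expected = {"A": hashlib.sha256(good_image).digest(), "B": None}
        self.active = "A"

    def install(self, image: bytes, chunk: int, fail_after=None) -> None:
        """Write image to slot B chunk by chunk; cut power before chunk fail_after."""
        self.slots["B"] = b""
        self.expected["B"] = None   # invalidate metadata before touching data
        for i in range(0, len(image), chunk):
            if fail_after is not None and i // chunk == fail_after:
                raise PowerLoss
            self.slots["B"] += image[i:i + chunk]
        self.expected["B"] = hashlib.sha256(image).digest()
        self.active = "B"           # switch only after the whole image is written

    def boot(self) -> str:
        """Boot the active slot if its digest matches, else try the other slot."""
        other = "A" if self.active == "B" else "B"
        for slot in (self.active, other):
            if hashlib.sha256(self.slots[slot]).digest() == self.expected[slot]:
                return slot
        raise RuntimeError("bricked: no valid slot")


def power_loss_sweep(old: bytes, new: bytes, chunk: int = 4) -> list:
    """Interrupt the install before every chunk and record which slot boots."""
    results = []
    n_chunks = -(-len(new) // chunk)  # ceiling division
    for cut in range(n_chunks):
        dev = Device(old)
        try:
            dev.install(new, chunk, fail_after=cut)
        except PowerLoss:
            pass
        results.append(dev.boot())
    return results
```

If any entry in the sweep result is not the old slot, or `boot` raises, the update sequence has a window in which an interruption leaves the device without a valid image.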

Conclusion: OTA is a lifecycle capability

An embedded system without robust OTA has a built-in expiration date. Vulnerabilities are discovered, requirements change and bugs surface in production. The ability to update in the field is not a secondary feature: it is what transforms an embedded product into a system with a real lifecycle.

As with boot time, there is no one-size-fits-all solution. But the fundamentals are clear: a dual-bank (A/B) layout or an equivalent fallback path, a bootloader with automatic rollback, cryptographically signed firmware, anti-downgrade protection and staged rollout. Implementing these elements correctly takes work early in the project, but it avoids physical interventions in the field that cost orders of magnitude more.

When Silicon LogiX can help

Need a production-safe OTA update architecture? We help embedded teams design firmware update flows with bootloader validation, rollback, staged rollout, telemetry and long-term device lifecycle management.

Talk to Silicon LogiX
