Embedded C & Data Structures
Master the essential programming techniques and data structures used
in embedded systems—from bitwise operations and memory mapping to
hardware-efficient structures like circular buffers and linked lists
optimized for resource-constrained environments.
📚 References: This guide synthesizes concepts from
ARM® Cortex™-M documentation, MISRA-C guidelines, FreeRTOS memory
management, and embedded systems design patterns. Key resources
include Barr Group's Embedded C Coding Standard, "Making Embedded
Systems" by Elecia White, and ARM® Technical Reference Manuals.
1. Embedded C Fundamentals
Embedded C is a set of language extensions for the C programming
language that supports embedded systems programming. Unlike desktop
applications, embedded systems have strict constraints on memory,
power, and real-time performance.
1.1 Key Differences from Standard C
-
No Operating System: Embedded C often runs on bare
metal or RTOS, requiring direct hardware control
-
Memory Constraints: Limited RAM (often 4KB-256KB)
and flash memory
-
Deterministic Execution: Real-time systems require
predictable timing
-
Hardware Access: Direct memory-mapped I/O and
register manipulation
-
Power Awareness: Code must optimize for power
consumption
1.2 The 'volatile' Keyword Deep Dive
The volatile keyword is crucial in embedded systems for
several scenarios:
// 1. Memory-mapped hardware registers #define STATUS_REG (*(volatile
uint32_t *)0x40021000) // 2. Variables modified in interrupts volatile
uint32_t system_tick = 0; // 3. Variables shared between tasks in RTOS
volatile bool data_ready = false; // 4. Loop control variables
optimized away volatile bool keep_running = true; // WRONG: Compiler
might optimize this to infinite loop while(keep_running) { // Without
volatile, compiler might think keep_running never changes }
💡 Expert Tip: Use volatile for any
variable that can change outside the current execution context.
However, excessive use prevents compiler optimizations. For shared
variables in RTOS, combine volatile with proper synchronization
primitives.
2. Bit Manipulation Mastery
Bit manipulation is the cornerstone of embedded programming. It
enables efficient memory usage, direct hardware control, and protocol
implementations.
2.1 Advanced Bit Operations
// Bit field manipulation for configuration registers typedef struct {
uint32_t enable : 1; // Bit 0 uint32_t mode : 2; // Bits 1-2 uint32_t
prescaler : 4; // Bits 3-6 uint32_t reserved : 25; // Bits 7-31 }
TimerConfig; // Efficient bit reversal (for SPI, cryptography) uint8_t
reverse_bits(uint8_t b) { b = (b & 0xF0) >> 4 | (b & 0x0F) << 4; b =
(b & 0xCC) >> 2 | (b & 0x33) << 2; b = (b & 0xAA) >> 1 | (b & 0x55) <<
1; return b; } // Population count (count set bits) int
popcount(uint32_t x) { x = x - ((x >> 1) & 0x55555555); x = (x &
0x33333333) + ((x >> 2) & 0x33333333); x = (x + (x >> 4)) &
0x0F0F0F0F; x = x + (x >> 8); x = x + (x >> 16); return x & 0x3F; } //
Find most significant bit (log2 approximation) int
msb_position(uint32_t x) { int position = 0; if (x & 0xFFFF0000) {
position += 16; x >>= 16; } if (x & 0xFF00) { position += 8; x >>= 8;
} if (x & 0xF0) { position += 4; x >>= 4; } if (x & 0xC) { position +=
2; x >>= 2; } if (x & 0x2) { position += 1; } return position; }
2.2 Bit Masking Patterns
| Operation |
Pattern |
Example |
| Set multiple bits |
reg |= (mask) |
GPIO->ODR |= (1<<5 | 1<<6) |
| Clear multiple bits |
reg &= ~(mask) |
GPIO->ODR &= ~(1<<5 | 1<<6) |
| Toggle multiple bits |
reg ^= (mask) |
GPIO->ODR ^= (1<<5 | 1<<6) |
| Extract bit field |
(reg >> offset) & mask |
(ADC->DR >> 4) & 0xFFF |
| Insert bit field |
reg = (reg & ~mask) | (value << offset) |
Insert 4-bit value at bits 5-8 |
3. Memory Management in Embedded Systems
Memory management in embedded systems requires careful consideration
of constraints, determinism, and fragmentation risks.
3.1 Memory Segments Explained
- .text: Executable code (stored in flash)
- .data: Initialized global/static variables
-
.bss: Zero-initialized global/static variables
-
Stack: Local variables, function calls (fast,
limited)
-
Heap: Dynamic allocation (often avoided in
embedded)
3.2 Advanced Memory Pool Implementation
// Fixed-size block memory pool with alignment #define POOL_SIZE 32
#define BLOCK_SIZE 64 #define ALIGNMENT 8 // For 32-bit systems
typedef struct { uint8_t data[BLOCK_SIZE]; bool allocated; size_t
block_id; } MemoryBlock; typedef struct { MemoryBlock
blocks[POOL_SIZE]; size_t free_blocks; uint8_t* start_address;
uint8_t* end_address; } MemoryPool; // Initialize pool with proper
alignment void pool_init(MemoryPool* pool, uint8_t* memory_area) {
pool->start_address = (uint8_t*)(((uintptr_t)memory_area + ALIGNMENT -
1) & ~(ALIGNMENT - 1)); pool->end_address = pool->start_address +
(POOL_SIZE * BLOCK_SIZE); for (size_t i = 0; i < POOL_SIZE; i++) {
pool->blocks[i].allocated = false; pool->blocks[i].block_id = i; }
pool->free_blocks = POOL_SIZE; } // Allocate with timeout for RTOS
void* pool_alloc_timeout(MemoryPool* pool, uint32_t timeout_ms) {
uint32_t start_time = get_system_tick(); while (pool->free_blocks ==
0) { if (get_system_tick() - start_time > timeout_ms) { return NULL;
// Timeout } // In RTOS, you would yield here // taskYIELD(); } for
(size_t i = 0; i < POOL_SIZE; i++) { if (!pool->blocks[i].allocated) {
pool->blocks[i].allocated = true; pool->free_blocks--; return
pool->blocks[i].data; } } return NULL; } // Defragmentation for
long-running systems void pool_defragment(MemoryPool* pool) { // Move
allocated blocks together to create larger free blocks // Important
for systems that run for months/years }
4. Data Structures for Embedded Systems
Choosing the right data structure is critical for embedded systems. We
need structures that are memory-efficient, deterministic, and safe for
concurrent access.
4.1 Advanced Circular Buffer with Safety Features
// Thread-safe circular buffer for RTOS typedef struct { uint8_t*
buffer; size_t capacity; volatile size_t head; // Protected by mutex
in RTOS volatile size_t tail; size_t count; bool overwrite; // Allow
overwrite when full? // For debugging/analytics size_t max_usage;
size_t overflows; size_t underflows; } SafeCircularBuffer; bool
s_cb_init(SafeCircularBuffer* cb, uint8_t* buf, size_t size, bool
overwrite) { if (size == 0 || (size & (size - 1)) != 0) { return
false; // Size must be power of two for optimization } cb->buffer =
buf; cb->capacity = size; cb->head = 0; cb->tail = 0; cb->count = 0;
cb->overwrite = overwrite; cb->max_usage = 0; cb->overflows = 0;
cb->underflows = 0; return true; } // Push with multiple safety checks
bool s_cb_push(SafeCircularBuffer* cb, uint8_t data) { // Check if
buffer is full if (cb->count == cb->capacity) { if (cb->overwrite) {
// Move tail forward (overwrite oldest) cb->tail = (cb->tail + 1) &
(cb->capacity - 1); cb->overflows++; } else { return false; // Buffer
full } } else { cb->count++; } cb->buffer[cb->head] = data; cb->head =
(cb->head + 1) & (cb->capacity - 1); // Update statistics if
(cb->count > cb->max_usage) { cb->max_usage = cb->count; } return
true; } // Pop with timeout (for RTOS) bool
s_cb_pop_timeout(SafeCircularBuffer* cb, uint8_t* data, uint32_t
timeout_ms) { uint32_t start_time = get_system_tick(); while
(cb->count == 0) { if (get_system_tick() - start_time > timeout_ms) {
cb->underflows++; return false; // Timeout } // taskYIELD(); in RTOS }
*data = cb->buffer[cb->tail]; cb->tail = (cb->tail + 1) &
(cb->capacity - 1); cb->count--; return true; }
4.2 Embedded-Friendly Linked Lists
// Doubly-linked intrusive list with sentinel node typedef struct
DListNode { struct DListNode* prev; struct DListNode* next; }
DListNode; // Initialize sentinel node (circular list) void
dlist_init(DListNode* sentinel) { sentinel->prev = sentinel;
sentinel->next = sentinel; } // Insert node after specified node void
dlist_insert_after(DListNode* node, DListNode* new_node) {
new_node->prev = node; new_node->next = node->next; node->next->prev =
new_node; node->next = new_node; } // Remove node from list void
dlist_remove(DListNode* node) { node->prev->next = node->next;
node->next->prev = node->prev; // Optional: clear pointers node->prev
= node; node->next = node; } // Iterate through list (safe for
deletion) #define dlist_foreach_safe(pos, n, head) \ for (pos =
(head)->next, n = pos->next; pos != (head); pos = n, n = pos->next) //
Application: Task scheduler queue typedef struct { DListNode node; //
Embedded list node void (*task_func)(void*); void* arg; uint32_t
priority; uint32_t period_ms; uint32_t next_run_time; }
SchedulableTask;
5. Finite State Machines (FSMs) in Embedded Systems
FSMs are fundamental for designing reliable embedded systems. They
provide predictable behavior and are easier to test and maintain.
5.1 Hierarchical State Machines
// Hierarchical FSM for complex systems typedef struct State State;
typedef struct Event Event; struct Event { uint32_t id; void* data; };
typedef void (*StateAction)(State* state, Event* event); struct State
{ const char* name; StateAction entry_action; StateAction exit_action;
StateAction action; // Regular action const struct Transition*
transitions; size_t transition_count; State* parent; // For
hierarchical FSMs State* child; // Currently active child state };
struct Transition { uint32_t event_id; State* target_state;
StateAction guard; // Optional guard condition StateAction action; //
Transition action }; // Example: IoT Device State Machine State
states[] = { {"INIT", init_entry, init_exit, init_action,
init_transitions, 2, NULL, NULL}, {"CONNECTING", connecting_entry,
connecting_exit, connecting_action, connecting_transitions, 3, NULL,
NULL}, {"CONNECTED", connected_entry, connected_exit,
connected_action, connected_transitions, 4, NULL, NULL}, {"ERROR",
error_entry, error_exit, error_action, error_transitions, 2, NULL,
NULL}, }; // FSM engine typedef struct { State* current_state; State*
previous_state; uint32_t event_count; bool logging_enabled; } FSM;
void fsm_dispatch(FSM* fsm, Event* event) { if (fsm->logging_enabled)
{ log_event(fsm->current_state->name, event->id); } // Check current
state's transitions for (size_t i = 0; i <
fsm->current_state->transition_count; i++) { const Transition* t =
&fsm->current_state->transitions[i]; if (t->event_id == event->id) {
// Check guard condition if present if (t->guard &&
!t->guard(fsm->current_state, event)) { continue; } // Execute
transition State* old_state = fsm->current_state; // Exit actions
(from leaf to root) State* exit_state = old_state; while (exit_state)
{ if (exit_state->exit_action) { exit_state->exit_action(exit_state,
event); } exit_state = exit_state->parent; } // Execute transition
action if (t->action) { t->action(old_state, event); } // Update state
fsm->previous_state = old_state; fsm->current_state = t->target_state;
// Entry actions (from root to leaf) State* entry_state =
fsm->current_state; // Find root first (simplified - need proper
implementation) if (entry_state->entry_action) {
entry_state->entry_action(entry_state, event); } fsm->event_count++;
return; } } // Event not handled in current state if
(fsm->current_state->parent) { // Propagate to parent (hierarchical
FSM) // Simplified implementation } }
6. Hardware Abstraction Layer (HAL) Patterns
Creating portable embedded code requires proper hardware abstraction.
This allows the same application logic to run on different
microcontrollers.
6.1 GPIO Abstraction Layer
// gpio.h - Hardware independent interface typedef enum {
GPIO_MODE_INPUT, GPIO_MODE_OUTPUT, GPIO_MODE_ALT_FUNC,
GPIO_MODE_ANALOG } GpioMode; typedef enum { GPIO_PULL_NONE,
GPIO_PULL_UP, GPIO_PULL_DOWN } GpioPull; typedef enum {
GPIO_SPEED_LOW, GPIO_SPEED_MEDIUM, GPIO_SPEED_HIGH,
GPIO_SPEED_VERY_HIGH } GpioSpeed; typedef struct { uint8_t port; //
Port identifier (0 for GPIOA, 1 for GPIOB, etc.) uint8_t pin; // Pin
number (0-15) } GpioPin; // Hardware independent functions void
gpio_init(GpioPin pin, GpioMode mode, GpioPull pull, GpioSpeed speed);
void gpio_write(GpioPin pin, bool value); bool gpio_read(GpioPin pin);
void gpio_toggle(GpioPin pin); // Platform specific implementation
(e.g., stm32_gpio.c) #ifdef STM32F4 void gpio_init(GpioPin pin,
GpioMode mode, GpioPull pull, GpioSpeed speed) { GPIO_TypeDef* gpio =
get_gpio_port(pin.port); // Configure mode uint32_t moder =
gpio->MODER; moder &= ~(3 << (pin.pin * 2)); moder |= (mode <<
(pin.pin * 2)); gpio->MODER = moder; // Configure pull-up/pull-down
uint32_t pupdr = gpio->PUPDR; pupdr &= ~(3 << (pin.pin * 2)); pupdr |=
(pull << (pin.pin * 2)); gpio->PUPDR = pupdr; // Configure speed
uint32_t ospeedr = gpio->OSPEEDR; ospeedr &= ~(3 << (pin.pin * 2));
ospeedr |= (speed << (pin.pin * 2)); gpio->OSPEEDR = ospeedr; } #endif
7. Embedded C Best Practices and Optimization
7.1 Performance Optimization Techniques
-
Loop Unrolling: Manually or with compiler pragmas
-
Function Inlining: Use
static inline for small functions
-
Data Alignment: Align structures to processor word
boundaries
-
Memory Access Patterns: Sequential access is faster
than random
-
Compiler Intrinsics: Use processor-specific
instructions
// Optimized memory copy for ARM Cortex-M void optimized_memcpy(void*
dest, const void* src, size_t n) { uint32_t* d = (uint32_t*)dest;
const uint32_t* s = (const uint32_t*)src; // Copy word-by-word for
speed size_t words = n / sizeof(uint32_t); for (size_t i = 0; i <
words; i++) { d[i] = s[i]; } // Copy remaining bytes uint8_t* db =
(uint8_t*)d + words * sizeof(uint32_t); const uint8_t* sb = (const
uint8_t*)s + words * sizeof(uint32_t); size_t bytes = n %
sizeof(uint32_t); for (size_t i = 0; i < bytes; i++) { db[i] = sb[i];
} } // Using compiler intrinsics for CRC calculation (ARM) uint32_t
calculate_crc32(const uint8_t* data, size_t length) { uint32_t crc =
0xFFFFFFFF; for (size_t i = 0; i < length; i++) { // ARM Cortex-M4+
has CRC hardware instructions #ifdef __ARM_FEATURE_CRC32 crc =
__crc32b(crc, data[i]); #else // Software fallback crc ^= data[i]; for
(int j = 0; j < 8; j++) { crc = (crc >> 1) ^ (0xEDB88320 & -(crc &
1)); } #endif } return ~crc; }
7.2 Safety and Reliability
🔒 Safety First: Embedded systems often control
physical devices. Always implement watchdog timers, brown-out
detection, and proper error handling. Follow MISRA-C or similar
coding standards for safety-critical systems.
// Watchdog timer implementation typedef struct { uint32_t timeout_ms;
uint32_t last_feed_time; bool enabled; } Watchdog; void
watchdog_init(Watchdog* wd, uint32_t timeout_ms) { wd->timeout_ms =
timeout_ms; wd->last_feed_time = get_system_tick(); wd->enabled =
true; // Configure hardware watchdog
configure_hardware_watchdog(timeout_ms); } void
watchdog_feed(Watchdog* wd) { if (!wd->enabled) return;
wd->last_feed_time = get_system_tick(); feed_hardware_watchdog(); }
bool watchdog_check(Watchdog* wd) { if (!wd->enabled) return true;
uint32_t current_time = get_system_tick(); uint32_t elapsed =
current_time - wd->last_feed_time; if (elapsed > wd->timeout_ms) { //
System has hung! emergency_shutdown(); return false; } return true; }
// Stack overflow protection #ifdef DEBUG void stack_usage_check(void)
{ extern uint32_t __StackTop; extern uint32_t __StackLimit; uint32_t
stack_pointer; asm volatile ("mov %0, sp" : "=r" (stack_pointer));
uint32_t used = (uint32_t)&__StackTop - stack_pointer; uint32_t total
= (uint32_t)&__StackTop - (uint32_t)&__StackLimit; uint32_t percent =
(used * 100) / total; if (percent > 80) { log_warning("Stack usage at
%d%%", percent); } } #endif
8. Debugging and Testing Embedded C
8.1 Embedded Debugging Techniques
-
ITM (Instrumentation Trace Macrocell): ARM's debug
communication
-
Semihosting: Debug I/O via debugger (slow but
useful)
-
SWO (Serial Wire Output): Real-time printf without
UART
-
Hardware Breakpoints: Limited resource, use wisely
- Watchpoints: Break when variable changes
// Custom logging system with different levels typedef enum {
LOG_LEVEL_ERROR, LOG_LEVEL_WARNING, LOG_LEVEL_INFO, LOG_LEVEL_DEBUG,
LOG_LEVEL_VERBOSE } LogLevel; #ifdef ENABLE_LOGGING void
log_message(LogLevel level, const char* format, ...) { static const
char* level_strings[] = { "ERROR", "WARNING", "INFO", "DEBUG",
"VERBOSE" }; char buffer[256]; va_list args; va_start(args, format);
vsnprintf(buffer, sizeof(buffer), format, args); va_end(args);
uint32_t timestamp = get_system_tick(); // Output via SWO (if
available) #ifdef SWO_ENABLED ITM_SendString(buffer); #else //
Fallback to UART uart_send_string(buffer); #endif // Optional: Store
in circular buffer for post-mortem analysis log_buffer_push(level,
timestamp, buffer); } #else #define log_message(level, ...) ((void)0)
#endif
📖 Further Learning:
• ARM® Cortex™-M
Programming Guide
• "Embedded C Coding Standard" by Barr Group
• FreeRTOS Memory Management Documentation
•
MISRA-C:2012 Guidelines
• "Making Embedded Systems" by Elecia
White
• Embedded Systems Academy - Advanced C Course
🎯 Key Takeaway: Mastering Embedded C & Data
Structures requires understanding both low-level hardware
interactions and high-level design patterns. Focus on writing
deterministic, memory-efficient, and maintainable code. Always
consider the hardware constraints and real-time requirements of your
embedded system.