Introduction to Embedded Systems
An embedded system can generally be defined as a specialized computing device designed to perform a specific, predefined task or function. Unlike general-purpose computers, which can run a wide variety of applications, an embedded system is dedicated to a particular function, often with real-time performance requirements. These systems are typically composed of tightly integrated hardware and software components, working together to execute their task efficiently.
Hardware Components of an Embedded System
The hardware components of an embedded system include all the electronic elements necessary to perform the function for which the system was designed. While the specific structure of an embedded system can vary greatly depending on the application, there are several fundamental hardware components common to all such systems:
1. Central Processing Unit (CPU)
Executes software instructions to process system inputs and make decisions that guide its operation.
2. System Memory
Stores the programs and data required for the system's operation.
3. Input/Output (I/O) Ports
Allow for the exchange of signals between the CPU and the external world.
4. Communication Ports
Enable information exchange in serial or parallel with other devices or systems.
5. User Interfaces
Facilitate interaction between the system and human users.
6. Electromechanical Sensors and Actuators
7. Data Converters
8. Diagnostic and Redundancy Components
Ensure the system's operation is robust and reliable.
9. System Support Components
Provide essential services required for the system’s operation.
10. Application-Specific Subsystems
Include application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and other dedicated units based on the complexity of the application.
Software Components of an Embedded System
The software components of an embedded system include all the programs necessary to provide functionality to the system's hardware. These programs, often referred to as firmware, are stored in non-volatile memory. Firmware is generally not modifiable by the user, although some systems allow for updates.
The software programs of an embedded system are organized around:
1. Operating System (OS)
2. Application Routines
3. System Tasks
4. System Kernel
5. Services
The software of an embedded system must be optimized to function in constrained environments, and its structure varies depending on the specific needs of the application. The tasks, kernel, and services form the fundamental architecture that enables efficient collaboration between hardware and software.
Embedded Systems Life Cycle
An embedded system undergoes several stages throughout its life, from conception to decommissioning, and sometimes reuse. These stages are:
1. Conception or Birth
The system is initially conceived to solve a specific problem or improve an existing solution. Opportunities emerge when effective solutions are identified using embedded technology. This stage focuses on defining specifications based on the target application.
2. Design and Development
3. Growth
4. Maturity
5. Decline
Embedded System Design Constraints
Embedded systems must adhere to various constraints depending on their application, including cost, physical size, power consumption, and real-time performance.
Design Conflicts and Trade-offs
Designers often face trade-offs between conflicting constraints, such as processing speed versus power consumption, or unit cost versus reliability.
Microcomputer Organization
The minimal hardware configuration of a microcomputer system is composed of three fundamental components: a Central Processing Unit (CPU), the system memory, and some form of input/output (I/O) interface. These components are interconnected by multiple sets of lines grouped according to their functions and collectively called the system buses. An additional set of components provides the necessary power and timing synchronization for system operation.
Microprocessor Units (MPU)
A Microprocessor Unit (MPU) is a chip containing a general-purpose CPU that requires external components, such as buses, memory, and I/O interfaces, to function in a system.
MPUs are designed to optimize data and instruction processing through features like caches, queues, parallel execution, branch prediction, and numeric coprocessors. They are widely used in complex systems like PCs and mainframes, offering exceptional computational power.
From the Intel 4004 in 1971 (10 µm, 400 kHz, 2,250 transistors) to the Xeon E7 in 2011 (32 nm, 2 GHz, 2.6 billion transistors), their evolution has been remarkable.
However, for simpler embedded systems, microcontrollers (MCUs) are preferred due to their integration and lower computational power requirements.
Microcontroller Units (MCU)
A Microcontroller Unit (MCU) integrates a CPU, memory (program and data), and various peripherals into a single chip, often referred to as a computer-on-a-chip.
Unlike MPUs, MCUs typically use simpler CPUs but include essential components like timers, I/O ports, interrupt handlers, and data converters, allowing them to implement complete applications with minimal external components. This self-contained design makes MCUs highly efficient for embedded systems and tasks requiring compact, low-power solutions.
Microcontrollers share characteristics with general-purpose microprocessors, but their architectural components are simpler and tailored to specific applications. Each microcontroller belongs to a family defined by a common base architecture (data and program path widths, architectural style, register structure, instruction set, and addressing modes). Differences between family members lie in the amount of integrated memory and the peripherals available on the chip. The market includes hundreds of microcontroller families, offering a wide variety of options to meet diverse needs.
RISC vs CISC Architectures
Microcomputer systems operate using software supported by their hardware architecture, and their design focuses on optimizing either hardware or software. This leads to two main architectural styles: CISC (Complex Instruction Set Computing) and RISC (Reduced Instruction Set Computing).
While RISC architectures are generally considered faster, this depends on the specific instructions being executed and the application requirements.
Central Processing Unit
The Central Processing Unit (CPU) in a microcomputer system is typically a microprocessor unit (MPU) or core. The CPU is where instructions become signals and hardware actions that command the microcomputer's operation. The minimal list of components that defines the architecture of a CPU includes the following:
Hardware Components:
Software Components:
The instructions and addressing modes are defined by the specifications of the ALU (Arithmetic and Logic Unit) and CU (Control Unit) hardware. These components allow the CPU to access programs and data stored in memory or the input/output subsystem, enabling it to function as a stored-program computer. The sequence of instructions that makes up a program is chosen from the processor's instruction set. A memory-stored program dictates the sequence of operations to be performed by the system. In data processing, each CPU component plays a necessary role that complements the others.
The collection of hardware components within the CPU that perform data operations is called the processor's datapath. The CPU datapath includes the ALU, the internal data bus, and other functional components such as floating-point units, hardware multipliers, and so on. The hardware components responsible for system control operations are designated as the Control Path. The control unit is at the heart of the CPU's control path. The bus control unit and all timing and synchronization hardware components are also considered part of the control path.
The Control Unit (CU) manages the operation of the CPU, functioning as a finite state machine that continuously cycles through three main states: fetch, decode, and execute. This cycle is known as the fetch-decode-execute cycle, or instruction cycle. The complete cycle typically takes several clock cycles, depending on the instruction and its operands. Generally, it is assumed that the cycle requires at least four clock cycles, but if the instruction involves multiple words or intermediate steps, the execution may require more than one instruction cycle to complete.
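The fetch-decode-execute cycle described above can be sketched in software. The following Python sketch simulates the loop for a hypothetical machine with an invented four-instruction set (the opcodes, the accumulator, and the memory layout are illustrative only, not a real ISA):

```python
# Minimal fetch-decode-execute loop for a hypothetical 4-instruction machine.
# Opcodes, operand layout, and the single accumulator are invented for
# illustration; real CPUs encode instructions as binary words.

def run(memory, max_steps=100):
    """Execute instructions until HALT; returns the accumulator value."""
    pc = 0          # Program Counter: address of the next instruction
    acc = 0         # single general-purpose accumulator
    for _ in range(max_steps):
        ir = memory[pc]          # fetch: copy the instruction into the IR
        pc += 1                  # CU increments the PC past the fetched word
        opcode, operand = ir     # decode: split the instruction fields
        if opcode == "LOAD":     # execute phase
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return acc
    raise RuntimeError("no HALT reached")

# Program: acc = mem[4] + mem[5]; store the result in mem[6]
mem = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 10, 32, 0]
print(run(mem))   # 42
```

Note how the PC is incremented immediately after the fetch, so by the time an instruction executes, the PC already points to the next one, exactly as described for jump instructions that overwrite it.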
The fetch-decode-execute process involves several CPU components, including special-purpose registers such as the Program Counter (PC) and Instruction Register (IR). The process can be broken down into three distinct states:
After the execution phase, the CU commands the Bus Interface Logic (BIL) to fetch the next instruction using the updated PC value, thus restarting the cycle. If the instruction requires additional memory accesses, such as fetching operand values during the decoding phase, the cycle may involve extra steps as determined by the instruction's addressing mode.
The CU operates as a finite state machine and requires a Reset signal to start the cycle for the first time. Upon reset, the Program Counter is initialized to the address of the first instruction, referred to as the reset vector, marking the beginning of the cycle.
The Arithmetic Logic Unit (ALU) is a crucial component of the CPU responsible for performing all arithmetic and logic operations.
The Control Unit (CU) oversees the ALU by directing which operation to perform, specifying the source operands, and determining the destination for the result.
The ALU’s computational capacity is indicated by the width of the operands it can process (known as the datapath width). For example, a 16-bit microprocessor means the ALU can handle 16-bit data. This width determines the architecture of the CPU’s datapath, which in turn sets the size of the data bus and registers.
The Bus Interface Logic (BIL) refers to the CPU structures that coordinate the interaction between the internal buses and the system buses. The BIL defines how the external address, data, and control buses operate. In small embedded systems, the BIL is entirely contained within the CPU and transparent to the designer. In distributed and high-performance systems, the BIL may include dedicated peripherals devoted to establishing the CPU's interface to the system bus. Examples of such extensions are bus control peripherals, bridges, and bus arbitration hardware included in the chip sets of contemporary microprocessor systems.
CPU registers provide temporary storage for data, memory addresses, and control information in a way that allows for quick access. They are the fastest form of information storage in a computer system, while being the smallest in capacity. The content of registers is volatile, meaning it is lost when the CPU is powered off. CPU registers can be broadly classified into two main categories: general-purpose registers and specialized registers.
General-purpose registers (GPRs) are not tied to specific processor functions and can be used to store data, variables, or address pointers as needed. Depending on this usage, some authors also classify them as data or address registers. Depending on the processor architecture, a CPU can have anywhere from two to several dozen GPRs.
Specialized registers perform specific functions that add functionality to the CPU. The most basic CPU structure includes the following four specialized registers:
Instruction Register (IR) This register holds the instruction that is currently being decoded and executed by the CPU. The action of transferring an instruction from memory into the IR is called instruction fetch. In many small embedded systems, the IR holds one instruction at a time. CPUs used in distributed and high-performance systems typically have multiple instruction registers arranged in a queue, allowing for the concurrent issuing of instructions to multiple functional units. In these architectures, the IR is often referred to as the instruction queue.
Program Counter (PC) This register holds the address of the instruction to be fetched from memory by the CPU. It is sometimes also called the instruction pointer (IP). Each time an instruction is fetched and decoded, the control unit increments the value of the PC to point to the next instruction in memory. This behavior can be altered, among other things, by jump instructions, which, when executed, replace the contents of the PC with a new address. Since the PC is an address register, its width may also determine the size of the largest program memory space directly addressable by the CPU.
The PC is usually not meant to be directly manipulated by programs. This rule is enforced in many traditional architectures by making the PC inaccessible as an operand to general instructions. Newer RISC architectures have relaxed this rule in an attempt to make programming more flexible. However, this flexibility must be used with caution to maintain correct program flow.
Stack Pointer (SP) The stack is a specialized memory segment used for temporarily storing data items in a particular sequence. The operations of storing and retrieving the items according to this sequence are managed by the CPU using the stack pointer (SP) register. Few microcontroller (MCU) models have the stack hardwired. However, most models allow the user to define the stack within the RAM section, or it is automatically defined during the compilation process.
The contents of the SP are referred to as the Top of Stack (TOS). This indicates to the CPU where new data is stored (push operation) or read (pull or pop operation).
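The push and pop operations managed through the SP can be sketched as follows. This Python model assumes a descending stack (the SP is decremented on push, a common but not universal convention) and an arbitrary RAM size:

```python
# Sketch of a descending stack in RAM, as managed through the SP register.
# The RAM size and growth direction are illustrative assumptions; many
# MCUs grow the stack downward from the top of RAM.

RAM_SIZE = 16

class Stack:
    def __init__(self):
        self.ram = [0] * RAM_SIZE
        self.sp = RAM_SIZE          # SP starts just past the end of RAM

    def push(self, value):
        self.sp -= 1                # pre-decrement: SP now points at the TOS
        self.ram[self.sp] = value

    def pop(self):
        value = self.ram[self.sp]   # read the Top of Stack
        self.sp += 1                # the slot is logically freed
        return value

s = Stack()
s.push(0xAA)
s.push(0xBB)
print(hex(s.pop()))   # 0xbb -- last in, first out
print(hex(s.pop()))   # 0xaa
```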
Status Register (SR) The status register, also known as the Processor Status Word (PSW) or flag register, contains a set of indicator bits called flags, as well as other bits related to or controlling the CPU’s state. A flag is a single bit that indicates the occurrence of a particular condition.
The number of flags and conditions indicated by a status register depends on the MCU model. Most flags in the SR reflect the situation immediately after an operation has been executed by the ALU, although they can also be manipulated by software. The status of the flags also depends on the size of the ALU operands. Typically, both operands are of the same size (n bits), while the ALU operation may produce n + 1 bits. The "ALU result" refers to the n least significant bits, while the most significant bit is the Carry Flag. Here are the most common flags found in almost all MCUs:
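The n-versus-n+1-bit relationship described above can be made concrete with a small sketch. This Python model of an n-bit ALU addition shows how the Carry, Zero, and Negative flags (names common to most MCUs) fall out of the result:

```python
# How typical SR flags fall out of an n-bit ALU addition: the n low bits
# are the result, bit n is the Carry flag; Zero and Negative inspect the
# result itself. The flag names C, Z, N follow common MCU conventions.

def alu_add(a, b, n=8):
    """Add two n-bit operands; return (result, flags) like a simple ALU."""
    full = a + b                      # may need n + 1 bits
    result = full & ((1 << n) - 1)    # keep the n least significant bits
    flags = {
        "C": (full >> n) & 1,         # carry out of bit n-1
        "Z": int(result == 0),        # result is all zeros
        "N": (result >> (n - 1)) & 1, # sign bit of the result
    }
    return result, flags

res, f = alu_add(0xFF, 0x01)          # 255 + 1 overflows 8 bits
print(hex(res), f)                    # 0x0 {'C': 1, 'Z': 1, 'N': 0}
```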
MCUs may have more flags than the ones mentioned above. The user should refer to the specifications of the microcontroller or microprocessor being used to check the available flags and other bits included in the SR.
System Bus
Memory and I/O devices are accessed by the CPU through system buses. A bus is simply a group of lines that perform a similar function. Each line carries a bit of information, and the group of bits can be interpreted as a whole. The system buses are grouped into three categories: address bus, data bus, and control bus. These are described below.
The set of lines carrying data and instructions to or from the CPU is called the data bus. A read operation occurs when information is transferred into the CPU. A data transfer from the CPU to memory or to a peripheral device is called a write operation. It is important to note that the designation of a transfer on the data bus as "read" or "write" is always made with respect to the CPU. This convention applies to all system components. The data bus lines are generally bidirectional, as the same set of lines allows information to be transferred to or from the CPU. A transfer of information is called a data bus transaction.
The number of lines in the data bus determines the maximum data width the CPU can handle in a single transaction. Larger data transfers are possible but require multiple data bus transactions. For example, an 8-bit data bus can transfer at most one byte (or two nibbles) in a single transaction, so a 16-bit transfer would require two data bus transactions. Similarly, a 16-bit data bus can transfer at most two bytes per transaction; transferring more than 16 bits would require multiple transactions.
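The splitting of a wide transfer into bus-width chunks can be sketched as follows. The bus width and least-significant-first ordering in this Python model are illustrative assumptions:

```python
# Splitting a wide transfer into chunks on a narrow data bus. The 8-bit
# bus width and least-significant-chunk-first order are assumptions for
# illustration; real transfer order depends on the CPU.

BUS_WIDTH = 8                         # bits per data bus transaction

def bus_transactions(value, total_bits):
    """Break a transfer into bus-width chunks, least significant first."""
    mask = (1 << BUS_WIDTH) - 1
    chunks = []
    for shift in range(0, total_bits, BUS_WIDTH):
        chunks.append((value >> shift) & mask)
    return chunks

# A 16-bit value needs two transactions on an 8-bit bus:
print([hex(c) for c in bus_transactions(0x1234, 16)])  # ['0x34', '0x12']
```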
The CPU interacts with only one memory register or peripheral device at a time. Each register, either in memory or in a peripheral device, is identified by an identifier called an address. The set of lines carrying this address information forms the address bus. These lines are usually unidirectional and come out of the CPU. Addresses are typically written in hexadecimal notation.
The width of the address bus determines the size of the largest memory space the CPU can address. An address bus of m bits will be able to address at most 2^m different memory locations, which are referenced in hexadecimal notation. For example, with a 16-bit address bus, the CPU can access up to 2^16 = 64K locations, named 0x0000, 0x0001, ..., 0xFFFF. Note that the bits of the address bus lines work as a group, called an address word, and are not considered meaningful individually.
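The 2^m arithmetic can be checked with a short sketch (the 4-digit hex formatting assumes m ≤ 16):

```python
# Addressable space as a function of address bus width m: 2**m locations.
# The 4-digit hex formatting below assumes m <= 16.

def address_space(m):
    """Return the location count and address range for an m-bit bus."""
    last = (1 << m) - 1
    return 1 << m, f"0x{0:04X}..0x{last:04X}"

size, span = address_space(16)
print(size, span)   # 65536 0x0000..0xFFFF
```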
The control bus groups all the lines carrying signals that regulate system activity. Unlike the address and data bus lines, which are typically interpreted as a group (address or data), control bus signals function and are interpreted separately. Control signals include those used to indicate whether the CPU is performing a read or write access, those that synchronize transfers by indicating when the transaction begins and ends, those that request services from the CPU, and other tasks. Most control lines are unidirectional and enter or leave the CPU, depending on their function. The number and function of lines in a control bus vary depending on the CPU architecture and capabilities.
Memory Organization
The memory subsystem stores both instructions and data. It consists of hardware components that store one bit each, organized into n-bit words, which are often referred to as memory cells or locations. Each memory cell contains a memory word, which represents the basic unit of information.
Memory Types
Hardware memory is classified according to two main criteria: storage permanence and write ability. The first criterion refers to the ability of memory to retain its bits after they have been written. Write ability, on the other hand, concerns how easily the content of the memory can be written by the embedded system itself. All memory is readable; otherwise, it would be useless.
From the perspective of storage permanence, the two main subcategories are non-volatile and volatile memories. Non-volatile memories retain their data even when the power is turned off, whereas volatile memories lose their content when power is removed.
The non-volatile category includes various read-only memory (ROM) structures as well as ferroelectric RAM (FRAM or FeRAM). Another special memory is non-volatile RAM (NVRAM), which is actually a volatile memory with a battery backup; this dependence on a battery is why NVRAM is not used in microcontrollers. The volatile group includes static RAM (SRAM) and dynamic RAM (DRAM).
From the write ability perspective, memory is classified into read/write (or in-system programmable) and read-only memories. The first group refers to memories that can be written to by the processor in the embedded system using the memory. Most volatile memories, such as SRAM and DRAM, can be written to during program execution, and these memory sections are therefore useful for temporary data or data that will be generated or modified by the program. FRAM is non-volatile yet can be written at speeds comparable to DRAM, so microcontrollers with FRAM memory are very convenient in this respect.
Most in-system programmable non-volatile memories can only be written when loading the program, not during execution. One reason is their write speed, which is too slow to be useful while the program runs. Another aspect of Flash memory and EEPROM is that writing requires higher voltages than those used during program execution, which consumes power and requires special electronics to raise the voltage levels. FRAM, on the other hand, is writable at operational voltage levels during program execution.
FRAM is both non-volatile and writable at speeds comparable to DRAM, and at operational voltage levels. Moreover, unlike Flash or DRAM, it consumes power only during writing and reading operations, making it a very low-power device. A current disadvantage is that its operating temperature ranges are limited, so its application must be in environments where temperature is not too variable. Further research is being done to address this limitation.
As a side note, for historical reasons, in the embedded community, volatile read/write memories are referred to as RAM (random access memory), while the term ROM is used for non-volatile memory. This convention is adopted here unless specific details are needed.
Data Address: Little and Big Endian Conventions
Big Endian: In the big endian convention, data is stored with the most significant byte in the lowest address and the least significant byte in the highest address.
Little Endian: In the little endian convention, data is stored with the least significant byte in the lowest address and the most significant byte in the highest address.
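Both conventions can be demonstrated with the standard library's `struct` module; here is a sketch for the 32-bit value 0x0A0B0C0D:

```python
# Byte layout of the 32-bit value 0x0A0B0C0D under both conventions,
# using the standard library's struct module. Index 0 of the packed
# byte string plays the role of the lowest memory address.
import struct

value = 0x0A0B0C0D
big = struct.pack(">I", value)      # '>' = big endian
little = struct.pack("<I", value)   # '<' = little endian

print(big.hex())      # 0a0b0c0d -- MSB at the lowest address
print(little.hex())   # 0d0c0b0a -- LSB at the lowest address
```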
Program and Data Memory
Microcomputers have two types of memory based on their content: Program Memory and Data Memory.
Von Neumann vs Harvard Architectures
In Von Neumann Architecture, both program and data memory share the same system buses. It uses a single set of buses for accessing both types of memory. This architecture is commonly used in systems like the Texas Instruments MSP430 series.
In contrast, Harvard Architecture has separate address spaces for program and data memories, and each memory type uses its own bus. These buses may also have different widths for the program and data subsystems. Examples of MCUs using Harvard architecture include the Microchip PIC and Intel 8051.
While both architectures have their advantages, the discussion here assumes a Von Neumann organization, in which the Instruction Register (IR) and Program Counter (PC) have the same width as the other CPU registers. In a Harvard architecture, the IR and PC widths can be independent of the other registers because the buses are separate.
Memory Map
A memory map is a representation of how the addressable space of a microprocessor-based system is allocated. It shows the location of important system addresses and is an essential tool for program planning and selecting an appropriate microcontroller for an application.
In a typical Von Neumann system, the memory map organizes the system's memory as a flat array. For example, a system with a 16-bit address bus has a 64K-word addressable space, spanning addresses 0x0000 through 0xFFFF.
Memory maps can be either global or partial. A global map shows the entire addressable space, while a partial map focuses on a specific portion of the space for more detailed analysis. For instance, a partial map could highlight the distribution of I/O peripheral addresses.
I/O Subsystem Organization
The I/O subsystem consists of all devices (peripherals) connected to the system buses, excluding memory and the CPU. These devices are responsible for input, output, or both functions in a microprocessor-based system. The CPU uses a reference to designate a device as input (information flowing into the CPU) or output (information flowing out of the CPU). Some devices, like communication interfaces and storage devices, can perform both input and output tasks.
Examples of input devices include switches, keyboards, bar-code readers, and analog-to-digital converters (ADC), while output devices include LEDs, displays, buzzers, motor interfaces, and digital-to-analog converters (DAC). Some devices can handle both functions, such as serial interfaces, parallel port adapters, and mass storage devices.
Common peripherals in embedded systems include:
The I/O subsystem is organized similarly to memory, with each I/O device requiring an I/O interface to communicate with the CPU. The interface acts as a bridge between the device and system buses, with registers facilitating data exchange, control, and status information. The I/O device appears similar to hardware memory, with its interface registers connected to the data bus and addressed via the address bus. The interface may also contain address decoders, buffers, latches, and drivers depending on the application.
Anatomy of an I/O Interface
An I/O interface connects a microprocessor-based system's address, data, and control buses to external devices. It includes internal registers and lines for data communication with the I/O device. Here's how it is organized:
On the device side, there are dedicated lines connecting the interface to the actual I/O device, with the number and function of these lines varying depending on the device.
Internal registers in the I/O interface typically include three types: data registers, control registers, and status registers.
This setup allows the CPU to communicate with external devices efficiently by using a well-organized I/O interface.
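The interplay of the three register types can be sketched with a simulated interface. The bit assignments and method names in this Python model are invented for illustration:

```python
# A simulated I/O interface with the three typical register kinds: data,
# control, and status. The bit assignments (READY, ENABLE) and method
# names are invented for illustration.

READY = 0x01   # status bit: device has data / can accept data
ENABLE = 0x01  # control bit: turn the device on

class IOInterface:
    def __init__(self):
        self.data = 0      # data register: values exchanged with the CPU
        self.control = 0   # control register: CPU -> device commands
        self.status = 0    # status register: device -> CPU conditions

    def device_delivers(self, value):
        """Device side: place a value and raise the READY flag."""
        self.data = value
        self.status |= READY

    def cpu_read(self):
        """CPU side: read the data register and clear the READY flag."""
        value = self.data
        self.status &= ~READY
        return value

iface = IOInterface()
iface.control |= ENABLE            # CPU configures the device
iface.device_delivers(0x5A)        # device signals it has data
print(iface.status & READY, hex(iface.cpu_read()), iface.status & READY)
# 1 0x5a 0
```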
Parallel vs. Serial I/O Interfaces
In microcomputer-on-a-chip systems, most peripherals are connected to the data bus via a parallel interface, where all the bits making up a word are communicated simultaneously, requiring one wire per bit. However, I/O ports interacting with external devices may use either parallel or serial interfaces. These ports are referred to as parallel I/O ports and serial I/O ports. Serial interfaces use only one wire to transfer information, sending one bit at a time.
Serial interfaces were the first to be used for inter-system communications, notably RS-232 for devices requiring long-distance connections. The advantage of serial lines is the cost-saving from using only one wire (plus a ground wire). However, in early computer systems, serial ports were slow, and for fast connections over short distances, parallel interfaces were preferred. As a result, early systems made extensive use of parallel ports for peripherals such as printers, mass storage devices, and I/O subsystems. Protocols such as ISA, EISA, PCI, GPIB, SCSI, and Centronics are some examples of the parallel communication standards that flourished during this era.
With technological advancements, parallel connections have encountered issues as speeds increase, including electromagnetic interference between wires and the difficulty of synchronizing signals at high speeds. Today, there is a shift back toward highly optimized serial channels, even for short-distance connections. Improvements in the hardware and software processes of dividing, labeling, and reassembling packets have enabled much faster serial connections. Nowadays, many peripherals, such as printers, scanners, hard drives, GPS receivers, and others, use serial channels like USB, FireWire, and SATA. All forms of wireless communication are also serial. Even for board-level connections, standards like SPI, I2C, and, at a higher level, PCIe, illustrate the trend of serializing most types of inter- and intra-system communications.
I/O and CPU Interaction
Two major operations are involved when interacting with the I/O subsystem: one is data transfer, i.e., sending or receiving data, and the other is synchronization or timing management. In the first category, input and output port registers are the key components.
Data Transfer: This involves the sending or receiving of data between the I/O ports and the CPU.
Synchronization or Timing Management: This is necessary when the nature of the peripheral or external device requires the MCU to wait until the device is ready, for example, a slow display or a communication interface awaiting incoming data.
To manage this, devices use a flag to indicate their readiness to receive or deliver data. This flag is a flip-flop output set or cleared by the device when it is ready to communicate with the MCU.
Two methods of synchronizing I/O devices are used: polling and interrupts.
Introduction to Interrupts
The topic of interrupts and resets encompasses both hardware and software aspects, closely tied to the functioning of a CPU.
Peripheral Services: Polling vs. Interrupts
A microcontroller or microprocessor operates as a sequential machine. The control unit executes instructions in memory sequentially according to the stored program. This means the CPU can only execute one instruction at a time. When servicing peripherals, the CPU must access specific instructions to handle each device.
For instance, consider a keyboard. A keyboard can be seen as a set of switches, one for each key, where software assigns a meaning to each key. The CPU retrieves the code corresponding to each key pressed, a process referred to as servicing the keyboard. Similarly, the CPU services other peripherals, such as displaying data on a screen or transmitting/receiving messages via a communication interface.
To service a peripheral, two approaches are possible: polling-based servicing and interrupt-driven servicing.
Polling-Based Servicing
In polling, the CPU continuously checks the status of a peripheral to determine if it needs service. When a service condition is identified, the peripheral is attended to. This method can be illustrated with a hypothetical example:
Imagine having to answer a phone to take messages, but the phone has no ringer or vibration feature. Your only option to avoid missing a call would be to repeatedly pick up the receiver and ask, "Hello, is someone there?" This process would become tiresome, especially if you have other tasks to perform. Yet, to avoid missing a single call, you must constantly repeat this sequence: pick up, check, hang up.
This method is inefficient because it occupies all your attention, even when no call is incoming. Similarly, this is how a CPU operates in polling-based servicing.
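The busy-wait loop at the heart of polling can be sketched as follows. The device model and its timing in this Python sketch are invented, standing in for a peripheral's READY flag:

```python
# Busy-wait (polling) synchronization against a device READY flag.
# SlowDevice is an invented stub that becomes ready only after a given
# number of status checks, standing in for real peripheral latency.

class SlowDevice:
    def __init__(self, checks_until_ready):
        self.countdown = checks_until_ready
        self.data = 0x42

    def ready(self):
        self.countdown -= 1
        return self.countdown <= 0

def poll_read(device):
    """Spin on the flag, counting wasted checks, then read the data."""
    wasted = 0
    while not device.ready():    # each iteration is a wasted status check
        wasted += 1
    return device.data, wasted

data, wasted = poll_read(SlowDevice(1000))
print(hex(data), wasted)   # 0x42 999
```

The `wasted` counter makes the inefficiency visible: every check before the device is ready consumes CPU time that produces nothing.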
Interrupt-Driven Servicing
In this method, a peripheral sends a signal to the CPU, notifying it that service is required. This signal is known as an Interrupt Request (IRQ). The CPU may be performing other tasks or even in a sleep mode. When an IRQ arrives (if enabled), the CPU pauses its current activity to execute the necessary instructions to service the peripheral. These instructions form what is known as an Interrupt Service Routine (ISR).
Returning to the phone analogy: this time, the phone has a ringer. You are free to focus on other tasks while waiting for a call. When the phone rings, you stop what you’re doing to answer it, knowing for sure that a call is incoming.
Interrupts are a much more efficient way to manage peripherals, allowing the CPU to focus on other meaningful tasks instead of constantly checking the status of peripherals.
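The IRQ-to-ISR dispatch described above can be sketched in software. The class and method names in this Python model are invented; real hardware implements the masking and vectoring in logic:

```python
# Interrupt-driven servicing sketch: a peripheral asserts an IRQ line and
# the CPU runs the matching Interrupt Service Routine, but only when
# interrupts are enabled. All names here are invented for illustration.

class CPU:
    def __init__(self):
        self.interrupts_enabled = False
        self.isr_table = {}        # IRQ number -> service routine
        self.serviced = []

    def attach_isr(self, irq, routine):
        self.isr_table[irq] = routine

    def raise_irq(self, irq):
        """Peripheral asserts IRQ; the CPU runs the ISR if enabled."""
        if self.interrupts_enabled and irq in self.isr_table:
            self.isr_table[irq]()  # suspend main flow, run the ISR
            return True
        return False               # request ignored (masked)

cpu = CPU()
cpu.attach_isr(3, lambda: cpu.serviced.append("button"))
print(cpu.raise_irq(3))                  # False -- interrupts still masked
cpu.interrupts_enabled = True
print(cpu.raise_irq(3), cpu.serviced)    # True ['button']
```

Until interrupts are enabled, the request is simply ignored, mirroring the "if enabled" condition mentioned above.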
Efficiency: Polling vs. Interrupts
In many cases, interrupts offer a more efficient use of the CPU. Consider a simple example where a push button turns on an LED. In a polling system, the CPU may check the button's state 12,500 times in 100 ms only to detect a single press, an efficiency of just 0.008%. By contrast, with an interrupt, the CPU responds only when the button is pressed, saving thousands of CPU cycles.
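The efficiency figure follows directly from the numbers in the example:

```python
# Polling-efficiency arithmetic from the push-button example: one useful
# check out of 12,500 polls in the 100 ms window.
polls = 12_500
useful = 1
efficiency = useful / polls * 100   # as a percentage
print(f"{efficiency:.3f}%")         # 0.008%
```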
However, polling may be preferable in situations where ultra-fast responses are required, or where noise or interference might make interrupts unreliable.
Types of Interrupts
A CPU can handle two types of interrupt requests:
Steps for Interrupt Handling
When an interrupt request is accepted, the CPU typically follows these steps:
Conditions for Supporting Interrupts
To implement an interrupt-driven system, four conditions must be met:
What is a Reset?
A reset is an asynchronous signal that, when applied to an embedded system, causes the CPU and most sequential peripherals to start from a predefined, known state. This state is specified in the device's user guide or data sheet. In a CPU, the reset state enables fetching the first instruction to be executed after a power-up.
A reset occurs when power is applied to the system for the first time or when an event that could compromise the system's integrity happens. These events include a Power-On Reset (POR), Brown-Out Resets (BOR), and others.
A reset can also be triggered by a hardware event, such as the expiration of a watchdog timer, or by asserting the CPU's reset pin, caused either by a user or an external hardware component.
Conclusion
Embedded systems combine hardware and software components to perform specific tasks. Understanding the architecture, processing units (ALU, CU), addressing modes, and interrupt handling is crucial for designing efficient systems. The choice between polling and interrupts directly impacts the system's performance and responsiveness.
The next article will cover Assembly Language Programming, offering more precise control over embedded systems.