PCIe Evolution, Questions and Answers

Evolution of PCIe Generations

The development of PCIe from its inception in 2003 to the latest generation, focusing on improvements in speed, bandwidth, encoding, and features.

Gen 1 (2003)

Key Features:

  • Launched with a data rate of 2.5 GT/s (Giga-transfers per second).
  • Supported link widths: x1, x2, x4, x8, x16, a standard maintained across all generations.
  • Used 8b/10b encoding, which adds 25% overhead (2 extra bits per 8-bit byte), yielding an effective 2 Gb/s per lane and about 4 GB/s for an x16 link (per direction).
  • The encoding ensured DC balance and supported physical-layer functions like packet start/end markers.
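As a sanity check on these figures, the encoding arithmetic can be worked out directly (a small sketch using the numbers above):

```python
# Working the Gen1 numbers from the bullets above: 8b/10b spends 10 line bits
# per 8 data bits, so the 2.5 GT/s line rate carries 2 Gb/s of data per lane.
GEN1_RATE_GT_S = 2.5
EFFICIENCY_8B10B = 8 / 10

lane_gbps = GEN1_RATE_GT_S * EFFICIENCY_8B10B   # data rate per lane, Gb/s
lane_mb_s = lane_gbps * 1000 / 8                # per lane, MB/s (decimal units)
x16_gb_s = lane_mb_s * 16 / 1000                # x16 link, GB/s per direction
print(lane_gbps, lane_mb_s, x16_gb_s)
```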

Gen 2 (2007)

Key Improvements:

  • Doubled the data rate to 5.0 GT/s, increasing per-lane throughput from 250 MB/s to 500 MB/s.
  • Retained 8b/10b encoding.
  • Introduced features for graphics applications and higher slot power limits for performance cards.
  • Backward Compatibility: Gen 2 devices work with Gen 1 systems, reducing costs (e.g., 4 lanes at 5 GT/s match 8 lanes at 2.5 GT/s in bandwidth).

Gen 3 (2010)

Key Advancements:

  • Increased data rate to 8.0 GT/s (not 10 GT/s, to avoid design challenges like new PCB materials or shorter channel lengths).
  • Switched to 128b/130b encoding, cutting encoding overhead from 20% to about 1.5%. Combined with the 5-to-8 GT/s rate increase and further optimizations (e.g., removing K-codes), this roughly doubled effective bandwidth over Gen 2.
  • Added features like I/O virtualization, device sharing, caching hints, and atomics for virtual machines and accelerators.

Gen 4 (2017)

Key Features:

  • Doubled the data rate to 16.0 GT/s, with a longer development time to ensure cost and power efficiency.
  • Increased the channel loss budget to 28 dB (from 22 dB in Gen 3) and used advanced low-loss PCB laminates (e.g., the Megtron family).
  • Platforms compensated for the delay by raising lane counts (up to 128 per CPU socket), tripling aggregate bandwidth.
  • Enhanced RAS features for PCIe storage and deeper low-power states for mobile devices.

Gen 5 (2019)

Key Improvements:

  • Doubled the data rate to 32.0 GT/s, providing 32 Gb/s raw per lane and roughly 64 GB/s per direction (about 128 GB/s bidirectional) for an x16 link.
  • Met demands of AI, ML, cloud computing, and 400G Ethernet with improved signal integrity (better traces, thicker PCBs).
  • Supported alternate protocols (e.g., coherency) over PCIe pins.
  • Channel loss extended to 36 dB.

Gen 6 (2022)

Key Advancements:

  • Doubled the data rate to 64.0 GT/s.
  • Adopted PAM4 signaling (4 voltage levels) to maintain channel reach, but introduced a higher bit error rate (BER).
  • Added Forward Error Correction (FEC) and FLIT encoding (fixed-size packets) to manage errors efficiently with low latency.
  • Maintained backward compatibility, ensuring coexistence with prior generations.
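Putting the generations side by side, a short sketch can tabulate the approximate x16 per-direction bandwidth from the line rates and encodings above (Gen6 flit overheads are simplified to an efficiency of 1.0 here):

```python
# Approximate per-direction x16 bandwidth by generation (sketch), using the
# line rates and encodings listed above. GB/s here are decimal (10^9 bytes).
rates = {                     # gen: (GT/s per lane, encoding efficiency)
    1: (2.5,  8 / 10),        # 8b/10b
    2: (5.0,  8 / 10),
    3: (8.0,  128 / 130),     # 128b/130b
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
    6: (64.0, 1.0),           # PAM4 + flits; overheads accounted differently
}
x16_gbytes = {g: rate * eff * 16 / 8 for g, (rate, eff) in rates.items()}
for g, bw in x16_gbytes.items():
    print(f"Gen{g}: ~{bw:.1f} GB/s per direction (x16)")
```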


PCIe FAQs:

What are the key differences between a Root Complex, Endpoint, Switch, and Bridge in PCIe?

• Root Complex (RC): Connects the CPU to PCIe devices.

• Endpoint (EP): A device (GPU, SSD, NIC) that communicates with the Root Complex.

• Switch: Provides connectivity between multiple PCIe devices.

• Bridge: Connects legacy PCI/PCI-X devices to PCIe.

What is PCIe Enumeration?

PCIe enumeration is the process of discovering the devices connected to the PCIe bus.

As part of enumeration, switches and endpoint devices are allocated ranges from the host's PCIe address space.

The enumeration process includes:

– Initialization of the BAR addresses of endpoints and switches.

– Allocation and initialization of MSI/MSI-X addresses for the devices.

– Enabling bus mastering so devices can initiate transactions on the bus.

– Initialization of device capabilities such as power management and max payload size.
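The BAR-sizing handshake that enumeration software performs can be simulated in a few lines: write all 1s to a BAR, read back which bits stuck, and derive the region size from the mask. `FakeBar` and the sizes used are invented for this sketch.

```python
# BAR sizing as done during enumeration, simulated: software saves the BAR,
# writes all 1s, reads back the mask of writable bits, and derives the region
# size. FakeBar models only the address bits; real BARs also encode flags.
class FakeBar:
    """Models a 32-bit memory BAR whose region size is a power of two."""
    def __init__(self, size):
        self.size = size
        self.value = 0
    def write(self, v):
        # Hardware hardwires the low, size-implied address bits to zero.
        self.value = v & ~(self.size - 1) & 0xFFFFFFFF
    def read(self):
        return self.value

def size_bar(bar):
    original = bar.read()
    bar.write(0xFFFFFFFF)
    mask = bar.read() & 0xFFFFFFF0    # low 4 bits hold type/prefetch flags
    bar.write(original)               # restore the programmed base address
    return ((~mask) & 0xFFFFFFFF) + 1

bar_size = size_bar(FakeBar(0x1000))
print(hex(bar_size))   # a 4 KiB region
```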

How does PCIe address the challenge of clock skew in high-speed serial communication?

PCIe uses a reference clock at both ends of the link and incorporates clock data recovery mechanisms to address clock skew. This ensures that the receiving device can accurately recover the clock signal from the incoming data, even in the presence of variations in signal propagation times.

What are the layers of PCIe?

PCIe can be divided into three discrete logical layers: the Transaction Layer, the Data Link Layer, and the Physical Layer. Each of these layers is divided into two sections: one that processes outbound (to be transmitted) data and one that processes inbound (received) data.

What is PCIe Equalization?

When the PCIe link runs at Gen3 or higher speeds, signal quality can degrade (a "bad eye").

Equalization is the process of compensating for the distortion introduced by the channel. After passing through a band-limited channel, the high-frequency components of the signal are heavily attenuated, which distorts the signal and spreads it into subsequent symbol periods.

This is visible as a closed eye in the eye diagram. The process of equalization produces a sufficiently open eye, as in Figure 1, and decreases Inter Symbol Interference (ISI). This facilitates the easier recovery of transmitted symbols, ultimately reducing the Bit Error Rate (BER). The Link equalization procedure enables components to adjust the Transmitter and the Receiver setup of each lane to improve the signal quality (good eye). The equalization procedure can be initiated either autonomously or by software.
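As a toy illustration of the idea (all channel numbers invented), a 2-tap transmit FIR can cancel a single post-cursor of ISI:

```python
# Toy illustration of equalization cancelling post-cursor ISI: the channel is
# modeled as a 2-tap impulse response (numbers invented), and a 2-tap transmit
# FIR pre-distorts the pulse so the post-cursor cancels at the receiver.
def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

channel = [1.0, 0.5]      # main cursor plus a 50% post-cursor (the ISI)
pulse = [1.0]             # one transmitted symbol

raw = convolve(pulse, channel)                        # 50% ISI at the receiver
ffe = [1.0, -0.5]                                     # cancel the post-cursor
equalized = convolve(convolve(pulse, ffe), channel)   # residual pushed out, smaller
print(raw, equalized)
```

The residual energy moves one symbol later and shrinks; iterating more taps shrinks it further, which is exactly what opens the eye.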

How does PCIe address the challenge of signal degradation over longer distances in high-speed serial links?

PCIe addresses signal degradation through the use of equalization techniques. Adaptive equalization adjusts the characteristics of the transmitted signal to compensate for signal degradation over longer distances, maintaining signal integrity and allowing for reliable communication.

What is trainable equalization?

Trainable equalization refers to the ability to change the tap coefficients. Each Tx, channel, and Rx combination will have a unique set of coefficients yielding an optimum signal-to-noise ratio. The training sequence comprises adjustments to the tap coefficients while applying a quality measurement to minimize the error.
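One common way to "train" a tap coefficient is a least-mean-squares (LMS) update, which the sketch below applies to a single post-cursor tap; the channel value, step size, and data are invented for illustration.

```python
# Training a single post-cursor tap with an LMS update, minimizing the
# residual ISI error (sketch; channel value and step size are invented).
import random

random.seed(0)
channel_post = 0.4            # unknown post-cursor the equalizer must learn
tap = 0.0                     # trainable coefficient, starts untrained
mu = 0.05                     # LMS step size

prev = 0.0
for _ in range(2000):
    sym = random.choice([-1.0, 1.0])
    rx = sym + channel_post * prev        # received sample with ISI
    eq_out = rx - tap * prev              # equalizer subtracts estimated ISI
    error = eq_out - sym                  # quality measure: distance to ideal
    tap += mu * error * prev              # LMS coefficient update
    prev = sym

print(round(tap, 2))  # converges near the channel's 0.4 post-cursor
```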

What’s the role of LTSSM (Link Training and Status State Machine) in PCIe?

LTSSM (Link Training and Status State Machine) manages the link initialization, training, and recovery in PCIe.

• It ensures proper link negotiation between devices.

• It transitions through states like Detect, Polling, Configuration, Recovery, L0 (Active), etc.

• It is responsible for speed negotiation, lane alignment, and error recovery.
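The transition flow can be sketched as a tiny table-driven state machine (a simplification; the real LTSSM has many more states and substates, and the event names here are invented):

```python
# Minimal LTSSM-style state machine (sketch): only the happy path plus one
# recovery loop is modeled.
TRANSITIONS = {
    ("Detect", "receiver_found"): "Polling",
    ("Polling", "ts_exchanged"): "Configuration",
    ("Configuration", "link_configured"): "L0",
    ("L0", "link_error"): "Recovery",
    ("Recovery", "retrained"): "L0",
}

def step(state, event):
    # Unknown (state, event) pairs leave the state unchanged in this sketch.
    return TRANSITIONS.get((state, event), state)

visited = []
state = "Detect"
for event in ["receiver_found", "ts_exchanged", "link_configured",
              "link_error", "retrained"]:
    state = step(state, event)
    visited.append(state)
print(" -> ".join(visited))
```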

Describe the concept of ‘Replay Buffer’ in PCIe. How does it improve data integrity?

Each transmitter stores recently sent TLPs in a Replay Buffer. If the receiver detects an error (via LCRC check) and sends a NAK, the transmitter retransmits the stored TLPs, ensuring data integrity.
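A minimal simulation of the mechanism (sequence numbers and TLP names invented; real hardware tracks 12-bit sequence numbers and replay timers):

```python
# ACK/NAK replay sketch: the transmitter keeps copies of sent TLPs until they
# are ACKed, and replays them when the receiver reports a NAK.
from collections import deque

class ReplayingTx:
    def __init__(self):
        self.replay_buffer = deque()   # (seq, tlp) pairs awaiting ACK
        self.wire = []                 # everything that went onto the link
    def send(self, seq, tlp):
        self.replay_buffer.append((seq, tlp))
        self.wire.append((seq, tlp))
    def on_ack(self, upto_seq):
        # ACK is cumulative: drop every TLP up to and including upto_seq.
        while self.replay_buffer and self.replay_buffer[0][0] <= upto_seq:
            self.replay_buffer.popleft()
    def on_nak(self):
        # LCRC failure reported: replay all unacknowledged TLPs in order.
        self.wire.extend(self.replay_buffer)

tx = ReplayingTx()
tx.send(1, "MemWr A")
tx.send(2, "MemWr B")
tx.on_ack(1)    # TLP 1 delivered intact, freed from the buffer
tx.on_nak()     # TLP 2 corrupted on the link, replayed from the buffer
print(tx.wire)
```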

What is the role of LCRC in PCIe? At which layer is it used?

LCRC (Link CRC) is a checksum added by the Data Link Layer to detect transmission errors. If an error is found, the receiver sends a NAK to request a retransmission.
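For illustration, a 32-bit CRC check can be shown with `zlib.crc32`, which uses the same CRC-32 generator polynomial; PCIe's exact bit ordering and seed differ, so treat this as an illustration rather than a spec-accurate LCRC, and the TLP bytes are made up.

```python
# Illustrating LCRC-style error detection with a 32-bit CRC: the receiver
# recomputes the CRC over the received bytes and NAKs on a mismatch.
import zlib

tlp = bytes.fromhex("4a0000010000000f91000000")  # made-up TLP bytes
lcrc = zlib.crc32(tlp)                           # appended by the Data Link Layer

# Receiver side: recompute over the received bytes and compare.
clean_ok = zlib.crc32(tlp) == lcrc               # no corruption -> send ACK

corrupted = bytes([tlp[0] ^ 0x01]) + tlp[1:]     # a single bit flipped in transit
error_detected = zlib.crc32(corrupted) != lcrc   # mismatch -> send NAK
print(clean_ok, error_detected)
```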

Why does PCIe use credit-based flow control instead of traditional ACK/NACK?

Credit-based flow control prevents buffer overflow and eliminates the need for per-packet ACKs, reducing overhead.

What is the purpose of the PCIe configuration space?

The PCIe configuration space is a region of memory that contains configuration registers for each device on the bus. These registers store information about device capabilities, status, and other configuration details. Software can access this space to configure and query information about connected devices.
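For illustration, the fixed fields at the start of the configuration header can be parsed from raw bytes; the IDs below are made-up example values, and the offsets follow the standard Type 0 header layout.

```python
# Parsing the first bytes of a (fabricated) Type 0 configuration-space header.
# Vendor ID, Device ID, Command, and Status are little-endian 16-bit values
# at offsets 0x0, 0x2, 0x4, and 0x6 respectively.
import struct

cfg = bytes.fromhex(
    "86803410"    # Vendor ID 0x8086, Device ID 0x1034 (example values)
    "07040000"    # Command 0x0407, Status 0x0000
)

vendor_id, device_id, command, status = struct.unpack_from("<HHHH", cfg, 0)
print(hex(vendor_id), hex(device_id))
bus_master_enabled = bool(command & 0x4)   # bit 2 of the Command register
print(bus_master_enabled)
```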

Discuss the concept of peer-to-peer communication in PCIe.

PCIe supports peer-to-peer communication, allowing devices to communicate directly without involving the CPU. This feature is particularly beneficial in scenarios where data needs to be transferred efficiently between devices, reducing latency and offloading the CPU from handling data transfers.

How does PCIe address the challenges of link latency, and what mechanisms are in place to optimize latency in the communication link?

PCIe minimizes link latency through features like TLP (Transaction Layer Packet) processing hints, which allow devices to provide information about the criticality of a transaction. Additionally, features such as split transactions and completion coalescing are employed to improve overall system responsiveness.

Discuss the impact of clock and power gating on power consumption in PCIe.

Clock and power gating are power management techniques used in PCIe to reduce power consumption during idle periods. By selectively turning off clock signals or powering down specific components when not in use, PCIe devices can achieve lower power states, contributing to overall energy efficiency in the system.

How does PCIe handle congestion?

If a receiver runs out of buffer space, it stops advertising credits, preventing further data transmission.

Discuss the role of the PCIe TLP prefix in optimizing data transfers.

The TLP (Transaction Layer Packet) prefix is a field in the TLP header that provides additional information about the payload. It includes information such as the payload type, allowing devices to optimize their handling of the data. This feature helps enhance the efficiency of data transfers in PCIe.

How does PCIe handle interrupt delivery?

PCIe uses Message Signaled Interrupts (MSI) and MSI-X to handle interrupts. Instead of sharing a limited number of interrupt lines as in traditional PCI, each device generates its own interrupt request by sending a message to the interrupt controller.

Explain the role of the PCIe Extended Capability Structure in enhancing device capabilities.

The PCIe Extended Capability Structure is a mechanism that allows devices to expose additional capabilities beyond the standard PCIe configuration space. This enables devices to provide more detailed information about their features and functionalities, enhancing interoperability and facilitating advanced configuration.

What is the role of the PCIe retimer, and how does it improve signal integrity in the communication link?

A PCIe retimer is a component that helps restore and reshape signals in the communication link, improving signal integrity. It is often used in scenarios where signal degradation occurs due to factors like long trace lengths or the presence of connectors. The retimer ensures that the received signals are of sufficient quality for reliable communication.

Explain the concept of ordered sets in PCIe and their significance in link training.

Ordered sets are special sequences of bits used for link training and maintenance. They help establish and maintain synchronization between transmitting and receiving devices. Ordered sets contain specific bit patterns that indicate different link states, helping ensure reliable and error-free communication.

What is scrambling?

PCI Express utilizes data scrambling to diminish the chance of electrical resonances on the link. PCI Express specification defines a scrambling/descrambling algorithm that is carried out utilizing a linear feedback shift register.

Scrambling is a technique where a realized binary polynomial is applied to a data stream in a feedback topology. Since the scrambling polynomial is known, the data can be recovered by running it through a feedback topology using the inverse polynomial.

PCI Express accomplishes scrambling or descrambling by performing a serial XOR operation to the data with the seed output of a Linear Feedback Shift Register (LFSR) synchronized between PCI Express devices.
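A toy scrambler along these lines can be shown in a few lines. The tap positions below follow the x^16 + x^5 + x^4 + x^3 + 1 polynomial commonly cited for the 8b/10b generations; spec details such as the seed value, reset on COM symbols, and bytes that skip scrambling are omitted, so this is a sketch of the XOR-with-LFSR idea, not a conformant implementation.

```python
# Toy scrambler sketch: XOR data with the keystream of a 16-bit LFSR. Running
# the identical LFSR at the receiver undoes the XOR and recovers the data.
def lfsr_keystream_byte(state):
    out = 0
    for _ in range(8):
        fb = ((state >> 15) ^ (state >> 4) ^ (state >> 3) ^ (state >> 2)) & 1
        out = (out << 1) | (state & 1)        # low register bit as keystream
        state = ((state << 1) | fb) & 0xFFFF
    return out, state

def scramble(data, seed=0xFFFF):
    state, result = seed, bytearray()
    for b in data:
        k, state = lfsr_keystream_byte(state)
        result.append(b ^ k)
    return bytes(result)

payload = b"PCIe TLP payload"
on_the_wire = scramble(payload)
recovered = scramble(on_the_wire)   # the same LFSR run descrambles
print(recovered == payload, on_the_wire != payload)
```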

How does Flow Control work in PCIe?

Flow Control in PCIe prevents buffer overflow and ensures smooth data transmission. It operates using:

• Credit-Based Flow Control: Receivers advertise buffer credits, and transmitters consume a credit for each packet they send.

• Receivers allocate credits for different buffer types (Posted, Non-Posted, Completion).

• Data is sent only when sufficient credits are available, preventing congestion.

Flow control ensures reliable data transfer, avoiding packet drops and deadlocks.
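The credit mechanism can be sketched with a small simulation (simplified: real credits count header/data units rather than whole packets, and returns arrive via UpdateFC DLLPs; the class and credit counts here are invented):

```python
# Credit-based flow control sketch: the receiver advertises credits per buffer
# class (Posted, Non-Posted, Completion); the transmitter only sends when a
# credit is available, and stalls otherwise.
class CreditLink:
    def __init__(self, posted=4, non_posted=2, completion=2):
        self.credits = {"P": posted, "NP": non_posted, "Cpl": completion}
        self.delivered = []
    def try_send(self, kind, tlp):
        if self.credits[kind] == 0:
            return False               # stall: wait for a credit return
        self.credits[kind] -= 1
        self.delivered.append(tlp)
        return True
    def return_credit(self, kind):
        self.credits[kind] += 1        # receiver freed a buffer slot

link = CreditLink(non_posted=1)
sent_first = link.try_send("NP", "MemRd 1")    # consumes the only NP credit
stalled = not link.try_send("NP", "MemRd 2")   # NP credits exhausted
link.return_credit("NP")                       # credit comes back
sent_second = link.try_send("NP", "MemRd 2")   # transmission resumes
print(sent_first, stalled, sent_second)
```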

Can you explain the concept of virtual channels in PCIe?

Virtual channels in PCIe allow the division of the physical link into multiple virtual lanes, each with its own flow control mechanism. This feature helps prioritize and manage traffic based on different Quality of Service (QoS) requirements, enhancing the overall efficiency of data transfer.

How does PCIe handle QoS (Quality of Service)?

PCIe uses Traffic Classes (TC) and Virtual Channels (VC) to ensure high-priority data gets transmitted first.

Explain how PCIe performs error detection and correction. What are the different types of errors?

PCIe uses:

• LCRC (Link CRC): Detects transmission errors.

• ACK/NAK Mechanism: Retransmits corrupted data.

• Advanced Error Reporting (AER): Reports and handles errors at the OS level.

Types of errors:

• Correctable Errors: Automatically fixed (e.g., bit flips).

• Uncorrectable Errors: Fatal errors requiring system intervention.

How does PCIe support error reporting and recovery in the event of a link failure?

PCIe incorporates Advanced Error Reporting (AER) mechanisms to detect, report, and recover from errors in the communication link. When errors occur, devices can generate error messages, allowing the system to take appropriate actions, such as retraining the link or isolating the affected component.

What is a Non-Posted vs. Posted transaction in PCIe?

• Posted Transaction (P): No response required (e.g., Memory Write).

• Non-Posted Transaction (NP): Requires a response (e.g., Memory Read).

How does PCIe support atomic operations?

PCIe supports AtomicOps (e.g., Atomic Compare & Swap, Atomic Fetch & Add) to ensure thread-safe operations across multiple devices.

Explain the difference between DLLP and TLP in PCIe.

TLPs (Transaction Layer Packets) originate at the Transaction Layer and carry the actual traffic: memory, I/O, and configuration reads and writes, completions, and messages, routed end-to-end across the fabric. DLLPs (Data Link Layer Packets) originate at the Data Link Layer and travel only between the two ends of a single link; they carry link-management traffic such as ACK/NAK, flow-control credit updates, and power management.

Explain the concept of Max Payload Size (MPS) and Max Read Request Size (MRRS).

• MPS (Max Payload Size): Defines the largest TLP payload a device can send.

• MRRS (Max Read Request Size): Limits the maximum data a single read request can fetch.

Larger values improve throughput but may increase latency.
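The arithmetic behind these limits is straightforward; a sketch splitting transfers under assumed MPS/MRRS values (the sizes chosen are examples):

```python
# How MPS and MRRS shape transfer sizes (sketch): a 16 KiB DMA write split
# under a 256-byte MPS, and a 4 KiB read split under a 512-byte MRRS.
import math

def write_tlp_count(total_bytes, mps):
    return math.ceil(total_bytes / mps)

def read_request_count(total_bytes, mrrs):
    return math.ceil(total_bytes / mrrs)

print(write_tlp_count(16 * 1024, 256))    # number of write TLPs
print(read_request_count(4 * 1024, 512))  # number of read requests
```

More, smaller TLPs mean more header overhead per byte moved, which is why larger MPS/MRRS values improve throughput.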

What are the different power management states in PCIe, and how do they contribute to energy efficiency?

PCIe supports various power management states, including L0 (fully operational), L0s (low-power idle), L1 (lower-power idle), and L2 (power-off). Devices can transition between these states based on workload requirements, contributing to overall energy efficiency by dynamically adjusting power consumption.

How does PCIe handle latency-sensitive data, such as real-time audio/video streams?

PCIe supports Traffic Classes (TC) and Virtual Channels (VC) to prioritize latency-sensitive transactions.

What is the function of a Completion TLP?

A Completion TLP (Cpl) is used to return data from a Non-Posted Request (e.g., memory read response).

Discuss the concept of "completion timeout" in PCIe.

Completion timeout is the maximum time a device should take to respond to a transaction request. If a device doesn't respond within this time, it is considered a timeout, and the link may be reset. This mechanism ensures that devices operate within specified time limits, preventing system hangs.

What is the role of the PCIe bifurcation feature, and how does it impact system design?

PCIe bifurcation allows a single physical slot to be split into multiple logical slots, each with its own set of lanes. This feature is valuable in optimizing the use of available PCIe lanes, especially in systems where the number of physical slots is limited. It offers flexibility in accommodating different device configurations.

Explain the concept of ATS (Address Translation Services) in PCIe.

Address Translation Services (ATS) in PCIe enables a device to request address translations from a Translation Agent (typically the IOMMU associated with the Root Complex) and cache the results locally. This is particularly useful in virtualized environments, where devices operate on untranslated addresses; ATS lets translations be resolved and cached ahead of time without involving the CPU on every access.

Explain the concept of hot swapping in PCIe.

Hot swapping refers to the ability to replace or add PCIe devices while the system is running. PCIe supports hot swapping through features like surprise removal, where the system can detect the removal or addition of a device and dynamically reconfigure the PCIe topology without requiring a system reboot.

How does PCIe support hot-plugging of devices?

PCIe supports hot-plugging through the Hot-Plug capability implemented in Downstream Port slot registers (presence detect, attention button, power controller). When a device is hot-plugged, the port detects the change and notifies software, which reconfigures the PCIe topology without requiring a system reboot. This feature is particularly useful for server environments where uninterrupted operation is critical.

Explain how PCIe supports multi-function devices and the advantages of this capability.

PCIe allows a single physical device to have multiple logical functions, each with its own set of configuration space and resources. This enables more efficient use of PCIe slots and resources, as a single physical device can provide multiple functionalities without the need for separate physical slots.

What is Forward Error Correction (FEC), and how is it utilized in the PCIe 6.0 specification?

Lightweight Forward Error Correction (FEC) and a strong Cyclic Redundancy Check (CRC) are the two essential techniques the PCIe 6.0 specification uses to address errors.

With the 64 GT/s data rate enabled by PAM4 signaling, the raw bit error rate rises to around 10⁻⁶, several orders of magnitude higher than the 10⁻¹² BER of all prior NRZ generations. FEC and CRC together mitigate this higher error rate and allow the PCIe 6.0 specification to reach new performance levels.

Flit Mode is designed around this higher expected BER: fixed-size flits give FEC and CRC well-defined codeword boundaries, keeping error handling efficient and low-latency.

How does PCIe implement ordering rules? Explain the concept of relaxed ordering.

PCIe normally maintains strict ordering for transactions, but Relaxed Ordering (RO) allows certain transactions (e.g., writes) to bypass others for improved performance.

What software changes were needed to take advantage of Flit Mode?

Much care was taken to avoid significant impacts on existing software. However, some changes could not be avoided in order to take full advantage of Flit Mode. Some examples:

  • The new TLP format changes how error logs are interpreted.
  • Inter-hierarchy (a.k.a. inter-segment) routing allows devices in larger PCIe trees, spanning multiple segments, to communicate directly.
  • Defined routing behavior for all TLP type codes allows additional TLPs to be defined without needing to change switches.

Discuss the role of ECN (Explicit Congestion Notification) in PCIe.

In the PCIe specifications, ECN actually stands for Engineering Change Notice: an approved amendment that adds or modifies functionality between major specification revisions (features such as ATS were introduced this way). PCIe does not use networking-style Explicit Congestion Notification; congestion on the fabric is managed through credit-based flow control instead.

Does putting a PCIe Gen3 video card in a Gen4 slot improve performance?

No. If the graphics card itself is PCIe 3.0, placing it in a faster 4.0 slot won't give any advantage, since the link will simply run at Gen3 speed.

What’s the role of Forward Error Correction (FEC) in Gen6, and how does it impact performance?

Role of FEC in PCIe Gen6:

• FEC (Forward Error Correction) is introduced to correct transmission errors introduced by PAM4 signaling.

• Since PAM4 has a higher Bit Error Rate (BER) than NRZ, FEC helps ensure data reliability.

Impact on Performance:

Error Correction: Reduces the need for retransmissions, improving overall efficiency.

Increased Latency: FEC adds a small processing delay (~2-4 ns), but the benefits of higher bandwidth outweigh this.

Flit Mode Integration: PCIe Gen6 operates in 256-byte flits (Flow Control Units), where FEC is embedded for error handling.

What are the key differences between PCIe and NVMe, and how do they complement each other in modern storage architectures?

PCIe is a high-speed interconnect standard, while NVMe (Non-Volatile Memory Express) is a protocol designed specifically for storage devices, often using PCIe as the physical interface. PCIe provides the high-speed communication channel, and NVMe optimizes the communication protocol for low-latency access to non-volatile storage, resulting in faster storage solutions.

Why did PCIe Gen6 switch to PAM4 instead of NRZ?

PCIe Gen6 adopted PAM4 (Pulse Amplitude Modulation, 4 levels) instead of NRZ (Non-Return-to-Zero) because:

• Higher Data Rate at the Same Frequency: PAM4 packs 2 bits per unit interval instead of 1 (as in NRZ), doubling bandwidth without increasing the signaling frequency.

• Signal Integrity Concerns: Reaching 64 GT/s with NRZ would have caused excessive signal loss and increased power consumption.

• Power Efficiency: PAM4 enables lower power per transmitted bit, improving overall efficiency.

• Challenges with PAM4: It has a higher bit error rate (BER), which is why Forward Error Correction (FEC) was introduced in Gen6.

Trade-off:

PAM4 is more complex and requires additional error correction (FEC), but it allows PCIe to scale without significantly increasing power consumption.
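The bandwidth side of the trade-off is simple symbol arithmetic (a sketch; the 32 GBaud figure assumes Gen6 keeps the Gen5 unit-interval rate):

```python
# Why PAM4 doubles bits per transfer: bits per symbol = log2(signal levels).
import math

nrz_bits = math.log2(2)      # NRZ: 2 levels -> 1 bit per unit interval
pam4_bits = math.log2(4)     # PAM4: 4 levels -> 2 bits per unit interval

symbol_rate_gbaud = 32       # same symbol rate as Gen5's 32 GT/s NRZ
print(symbol_rate_gbaud * nrz_bits)    # Gb/s per lane with NRZ
print(symbol_rate_gbaud * pam4_bits)   # Gb/s per lane with PAM4
```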

How does PCIe ensure security in data transmission?

PCIe incorporates integrity features such as ECRC (End-to-End CRC), which detects data corruption across the whole path from requester to completer. Additionally, Access Control Services (ACS) help isolate and protect devices from unauthorized peer-to-peer access, contributing to a more secure PCIe ecosystem.

What is ACS?

Access Control Services (ACS) provides a mechanism by which a peer-to-peer PCIe transaction can be forced to go up through the PCIe Root Complex. ACS can be thought of as a kind of gatekeeper, preventing unauthorized transactions from occurring (for example, by disabling direct peer-to-peer transfers).

Without ACS, it is possible for a PCIe Endpoint to either accidentally or intentionally (maliciously) write to an invalid/illegal area on a peer endpoint, potentially causing problems.
