Understanding PCI in the Linux Kernel: Architecture, Registers, Speed, and Testing

Peripheral Component Interconnect (PCI) is a standard interface for connecting peripherals to a computer’s motherboard. This article looks at how PCI works in the Linux kernel, covering the bus architecture, PCI speed and generations, the configuration registers, and practical methods for testing PCI devices and measuring their speed. We’ll also explore the use of huge pages with PCI for performance optimization.


PCI Architecture

The PCI subsystem in Linux enables communication between the CPU and connected devices. Key components include:

1. PCI Host Bridge

  • Connects the PCI bus to the CPU and memory.
  • Mediates transactions between PCI devices and system memory.

2. PCI Bus

  • A hierarchical structure with a root bus and optional subordinate buses.
  • Buses can contain multiple devices.

3. PCI Devices

  • Each device exposes up to eight functions (numbered 0–7).
  • Functions are addressed by Bus, Device, and Function numbers (BDF); the kernel packs the device and function numbers into a single devfn value (see the sketch after this list).

4. PCI Configuration Space

  • Each device/function has a 256-byte configuration space (or 4 KB for PCIe).
  • Configuration registers control the device’s behavior.
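
As a quick illustration of BDF addressing, the sketch below uses the PCI_DEVFN, PCI_SLOT, and PCI_FUNC macros that the kernel exports through its UAPI <linux/pci.h> header. The address 0000:00:1f.2 is just an example device:

#include <stdio.h>
#include <linux/pci.h>   /* PCI_DEVFN, PCI_SLOT, PCI_FUNC */

int main(void)
{
    /* BDF 0000:00:1f.2 -> bus 0x00, device 0x1f, function 2 */
    unsigned int devfn = PCI_DEVFN(0x1f, 2);

    printf("devfn=0x%02x slot=0x%02x func=%u\n",
           devfn, PCI_SLOT(devfn), PCI_FUNC(devfn));
    return 0;
}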

PCI Speed and Generations

The PCI standard evolved to improve speed and bandwidth:

  • PCI: 133 MB/s (32-bit, 33 MHz).
  • PCI-X: Up to 4.3 GB/s (64-bit, 533 MHz).
  • PCIe: Scalable lane widths (x1, x2, …, x16) with per-lane rates that roughly double each generation, e.g. PCIe 3.0 at 8 GT/s and PCIe 4.0 at 16 GT/s per lane.
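
To get a rough sense of what these numbers mean in practice, the sketch below estimates the raw payload bandwidth of a PCIe 4.0 x16 link from the per-lane rate and the 128b/130b line encoding; real-world throughput is lower once TLP headers and flow control are accounted for:

#include <stdio.h>

int main(void)
{
    /* PCIe 4.0: 16 GT/s per lane with 128b/130b line encoding. */
    double transfers_per_s = 16e9;
    double encoding = 128.0 / 130.0;
    int lanes = 16;

    double bytes_per_s = transfers_per_s * encoding / 8.0 * lanes;
    printf("PCIe 4.0 x%d raw payload bandwidth: ~%.1f GB/s\n",
           lanes, bytes_per_s / 1e9);
    return 0;
}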

PCI Registers

PCI configuration space is divided into:

  1. Standard Header (64 bytes):

    • Device and vendor identification.
    • Command and status registers.
    • Base Address Registers (BARs).
  2. Device-Specific Area:

    • The remaining space holds device-specific registers and the optional capabilities list (e.g., power management, MSI, the PCIe capability).

Key Registers

  • Vendor ID and Device ID:
    • Identify the device’s vendor and model.
    • Offsets: 0x00 (Vendor ID) and 0x02 (Device ID).
  • Command Register:
    • Controls device state (e.g., enabling/disabling memory or I/O access).
    • Offset: 0x04.
  • BARs:
    • Map device memory or I/O regions into system address space.
    • Offsets: 0x10–0x24 (six 32-bit BARs in a type 0 header).
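
Inside a driver, these registers are usually read through the kernel’s config-space accessors rather than raw offsets. A minimal sketch, assuming pdev is the struct pci_dev handed to your probe routine (dump_ids is just an illustrative helper name):

#include <linux/pci.h>

/* Read the standard identification and command registers of a device. */
static void dump_ids(struct pci_dev *pdev)
{
    u16 vendor, device, command;

    pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);   /* offset 0x00 */
    pci_read_config_word(pdev, PCI_DEVICE_ID, &device);   /* offset 0x02 */
    pci_read_config_word(pdev, PCI_COMMAND, &command);    /* offset 0x04 */

    dev_info(&pdev->dev, "vendor=0x%04x device=0x%04x command=0x%04x\n",
             vendor, device, command);
}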

Reading PCI Registers in Linux

You can read PCI configuration space using lspci or setpci.

Example:

lspci -v
setpci -s 00:1f.2 VENDOR_ID   # read the Vendor ID register of a specific device

Testing PCI Speed

PCI speed tests typically involve measuring bandwidth and latency. Tools and techniques include:

1. Benchmark Tools

  • lspci: Displays PCI device information.
  • dd: Measures read/write speed on PCI storage devices.
  • Custom Programs: Use tools such as iperf for network cards.

Example: estimating read throughput of a PCIe NVMe drive (the PCIe link sets an upper bound, but the result also reflects the drive and the storage stack; iflag=direct bypasses the page cache):

dd if=/dev/nvme0n1 of=/dev/null bs=1M count=1000 iflag=direct

2. Using Kernel Drivers

  • Write a kernel module to measure read/write latency on PCI devices.
  • Use ioremap() or pci_iomap() to map BAR memory and time accesses (a sketch follows below).
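
A hedged sketch of that second approach: map BAR 0 from a probe routine and time a burst of 32-bit reads with ktime. The helper name, BAR index, and read count are placeholders, and since PCI reads are non-posted the numbers mainly reflect round-trip latency, so treat them as a relative measure:

#include <linux/pci.h>
#include <linux/io.h>
#include <linux/ktime.h>

/* Time a burst of 32-bit reads from the start of BAR 0 (illustrative only). */
static void time_bar_reads(struct pci_dev *pdev)
{
    void __iomem *bar = pci_iomap(pdev, 0, pci_resource_len(pdev, 0));
    const int reads = 1024;
    ktime_t start, end;
    u32 val = 0;
    int i;

    if (!bar)
        return;

    start = ktime_get();
    for (i = 0; i < reads; i++)
        val |= ioread32(bar);   /* non-posted read: waits for the completion */
    end = ktime_get();

    dev_info(&pdev->dev, "%d reads took %lld ns (last value 0x%08x)\n",
             reads, ktime_to_ns(ktime_sub(end, start)), val);

    pci_iounmap(pdev, bar);
}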

Testing PCI Devices

1. Using lspci

lspci lists all PCI devices and their configurations:

lspci -v

2. Using sysfs

Inspect PCI devices through /sys/bus/pci/devices:

cd /sys/bus/pci/devices/0000:00:1f.2
cat vendor
cat device
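
Each device directory also exposes a binary config file containing the raw configuration space, which can be read programmatically. A small userspace sketch, reusing the example BDF above (assumes a little-endian host):

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    uint16_t ids[2];   /* ids[0] = Vendor ID (0x00), ids[1] = Device ID (0x02) */
    int fd = open("/sys/bus/pci/devices/0000:00:1f.2/config", O_RDONLY);

    if (fd < 0 || pread(fd, ids, sizeof(ids), 0) != sizeof(ids)) {
        perror("config");
        return 1;
    }
    printf("vendor=0x%04x device=0x%04x\n", ids[0], ids[1]);
    close(fd);
    return 0;
}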

3. Using Debugfs and the Raw config File

Debugfs can be mounted for low-level kernel debugging, and the raw configuration space of each device is exposed as a binary config file under sysfs:

mount -t debugfs none /sys/kernel/debug
hexdump -C /sys/bus/pci/devices/0000:00:1f.2/config

4. Writing a Simple Driver

Create a kernel module to probe a PCI device, map its BAR, and interact with it.

Example: Simple PCI driver skeleton:

#include <linux/module.h>
#include <linux/pci.h>
#include <linux/io.h>

static int my_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    void __iomem *bar;
    int ret;

    /* Wake the device and enable access to its resources. */
    ret = pci_enable_device(pdev);
    if (ret)
        return ret;

    /* Map BAR 0 into kernel virtual address space. */
    bar = pci_iomap(pdev, 0, pci_resource_len(pdev, 0));
    if (!bar) {
        pci_disable_device(pdev);
        return -ENOMEM;
    }

    /* Interact with the device memory (example: write to offset 0). */
    iowrite32(0x1, bar);

    pci_iounmap(pdev, bar);
    return 0;
}

static void my_pci_remove(struct pci_dev *pdev)
{
    pci_disable_device(pdev);
}

static const struct pci_device_id my_pci_ids[] = {
    { PCI_DEVICE(0x1234, 0x5678) },   /* Replace with your vendor/device IDs */
    { 0, },
};
MODULE_DEVICE_TABLE(pci, my_pci_ids);

static struct pci_driver my_pci_driver = {
    .name     = "my_pci_driver",
    .id_table = my_pci_ids,
    .probe    = my_pci_probe,
    .remove   = my_pci_remove,
};

module_pci_driver(my_pci_driver);

MODULE_LICENSE("GPL");

Using Huge Pages with PCI Devices

Why Use Huge Pages?

  • Reduces TLB (Translation Lookaside Buffer) misses.
  • Improves performance for PCI devices requiring large memory regions (e.g., GPUs, NICs).

Steps to Enable Huge Pages

  1. Reserve Huge Pages:

    • Set the number of huge pages:
      echo 128 > /proc/sys/vm/nr_hugepages
  2. Allocate Huge Pages:

    • Use mmap() in userspace to allocate huge pages for DMA buffers (see the userspace sketch after the kernel example below).
  3. Modify PCI Driver:

    • Use dma_map_page() or dma_map_single() for DMA transfers (a streaming-DMA sketch follows this list).
    • Ensure physical memory alignment to huge page boundaries.
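
For step 3, a streaming mapping might look roughly like the sketch below; map_rx_buffer is an illustrative helper, and the buffer is assumed to come from a DMA-capable allocation such as kmalloc():

#include <linux/pci.h>
#include <linux/dma-mapping.h>

/* Map an existing kernel buffer for a device-to-memory DMA transfer. */
static int map_rx_buffer(struct pci_dev *pdev, void *buf, size_t len)
{
    dma_addr_t handle = dma_map_single(&pdev->dev, buf, len, DMA_FROM_DEVICE);

    if (dma_mapping_error(&pdev->dev, handle))
        return -ENOMEM;

    /* ... program the device with 'handle' and start the transfer ... */

    dma_unmap_single(&pdev->dev, handle, len, DMA_FROM_DEVICE);
    return 0;
}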

Example: allocating a 2 MB coherent DMA buffer from a kernel driver (note that dma_alloc_coherent() does not itself use hugetlbfs pages):

#include <linux/dma-mapping.h>

/* Inside the driver's probe or setup path; pdev is the device's struct pci_dev. */
dma_addr_t dma_handle;
void *dma_buffer;

dma_buffer = dma_alloc_coherent(&pdev->dev, SZ_2M, &dma_handle, GFP_KERNEL);
if (!dma_buffer)
    return -ENOMEM;

/* Use dma_buffer for PCI device communication ... */

dma_free_coherent(&pdev->dev, SZ_2M, dma_buffer, dma_handle);
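
On the userspace side, step 2’s huge-page buffer can be reserved with mmap() and MAP_HUGETLB. A minimal sketch, assuming 2 MB huge pages have been reserved as shown earlier; how the buffer is then handed to a driver for DMA is device-specific and not shown:

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

#define HUGE_BUF_SIZE (2 * 1024 * 1024)   /* one 2 MB huge page */

int main(void)
{
    /* Anonymous mapping backed by huge pages; fails if none are reserved. */
    void *buf = mmap(NULL, HUGE_BUF_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* ... fill the buffer and hand it to the driver (device-specific) ... */

    munmap(buf, HUGE_BUF_SIZE);
    return 0;
}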

Summary

1. How PCI Works in Linux

  • Hierarchical structure with host bridge, buses, and devices.
  • Kernel interfaces like lspci, sysfs, and debugfs expose device details.

2. PCI Configuration and Speed

  • Configuration space provides device control and status information.
  • PCI speed varies by standard and lane configuration (e.g., PCIe 4.0).

3. PCI Testing

  • Use tools (lspci, dd) and custom drivers to benchmark and debug devices.

4. PCI and Huge Pages

  • Huge pages optimize memory access for PCI devices requiring large DMA regions.

Understanding the Linux PCI subsystem equips developers with the knowledge to debug, optimize, and extend PCI device support. From hardware diagnostics to performance tuning, the subsystem offers robust tools and APIs for modern computing needs.