What is an IRQ?

Introduction

Have you ever seen “irq” entries in /var/log/messages? In this article, I try to understand what they are.

Asynchronous Interrupt

IRQ stands for Interrupt Request. It’s a signal from hardware devices to the kernel.

On the other hand, there are synchronous interrupts, which are the opposite of asynchronous interrupts. A synchronous interrupt is triggered by the software the processor is currently executing (for example, a system call or a page fault), so it occurs in the context of the running thread; that is why it is called “synchronous.” An asynchronous interrupt, in contrast, arrives from hardware at an arbitrary time, typically while the processor is executing some unrelated process or thread. For example, when a process initiates disk I/O and waits for its completion, a context switch occurs and the processor runs another process. When the disk I/O completes, the device raises an IRQ.

The kernel handles interrupts in two stages: the top half and the bottom half. The top half is the handler that initially receives the interrupt. Because further interrupts cannot be accepted while it is running, it is kept as short as possible and hands the remaining work over to another handler. The handlers that take over this work are called bottom halves.

A typical example of an asynchronous interrupt is data reception by a NIC: when a packet arrives, the NIC raises an IRQ, the top half acknowledges it, and the bottom half processes the received data.
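
To make this division concrete, the following is a minimal sketch of a driver-style top half registered with request_irq(), the way a NIC driver’s receive path is structured at a high level. It is only an illustration under assumptions: the IRQ number 42 and the device name “my_device” are made up, error handling is omitted, and the tasklet API shown here (the callback form) requires kernel 5.8 or later. The runnable demo later in this article emulates the top half with a kernel timer instead of a real device.

#include <linux/module.h>
#include <linux/interrupt.h>

#define MY_IRQ 42   /* hypothetical IRQ number, for illustration only */

/* Bottom half: the heavy processing runs here, outside hard-interrupt context. */
static void my_bottom_half(struct tasklet_struct *t)
{
    /* e.g. copy received data out of the device and hand it to upper layers */
}

static DECLARE_TASKLET(my_tasklet, my_bottom_half);

/* Top half: acknowledge the device quickly and defer the real work. */
static irqreturn_t my_top_half(int irq, void *dev_id)
{
    tasklet_schedule(&my_tasklet);
    return IRQ_HANDLED;
}

static int __init my_init(void)
{
    /* "my_device" is the label that would appear in /proc/interrupts */
    return request_irq(MY_IRQ, my_top_half, IRQF_SHARED, "my_device", &my_tasklet);
}

static void __exit my_exit(void)
{
    free_irq(MY_IRQ, &my_tasklet);
    tasklet_kill(&my_tasklet);
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");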

Bottom-half

I understand that there are two main types of bottom halves. One is SoftIRQ (tasklets are built on top of SoftIRQs) and the other is Workqueue. Workqueues were developed later and provide more flexibility, because their handlers run in process context and are allowed to sleep.
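
As a contrast to the tasklet-based demo at the end of this article, here is a minimal workqueue sketch (my own illustration, not taken from any particular driver). The point is that the deferred handler runs in process context, so it may sleep; a SoftIRQ or tasklet handler must not.

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/delay.h>

/* Deferred work: runs in a kernel worker thread (process context), so sleeping is allowed. */
static void my_work_handler(struct work_struct *work)
{
    msleep(5);   /* this would be forbidden in a SoftIRQ or tasklet handler */
    pr_info("workqueue demo: deferred work done\n");
}

static DECLARE_WORK(my_work, my_work_handler);

static int __init wq_demo_init(void)
{
    /* In a real driver this call would sit in the top half instead. */
    schedule_work(&my_work);
    return 0;
}

static void __exit wq_demo_exit(void)
{
    flush_work(&my_work);   /* wait until pending work has finished */
}

module_init(wq_demo_init);
module_exit(wq_demo_exit);
MODULE_LICENSE("GPL");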

How to view statistics

Interrupt statistics

cat /proc/interrupts

           CPU0       CPU1       
  1:         10          0   IO-APIC   1-edge      i8042
  4:          0       4488   IO-APIC   4-edge      ttyS0
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 12:          0         88   IO-APIC  12-edge      i8042
 24:          2         12   PCI-MSI 65536-edge      nvme0q0
 25:      25975          0   PCI-MSI 65537-edge      nvme0q1
 26:          0      23572   PCI-MSI 65538-edge      nvme0q2
 27:         25        639   PCI-MSI 81920-edge      ena-mgmnt@pci:0000:00:05.0
 28:          0      26672   PCI-MSI 81921-edge      eth0-Tx-Rx-0
 29:      36088          0   PCI-MSI 81922-edge      eth0-Tx-Rx-1
NMI:          0          0   Non-maskable interrupts
LOC:      54290      52843   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:         22          9   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:     201117     208227   Rescheduling interrupts
CAL:     133058     142956   Function call interrupts
TLB:       5636       3375   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          3          3   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
NPI:          0          0   Nested posted-interrupt event
PIW:          0          0   Posted-interrupt wakeup event

Each row is formatted as follows:

IRQ No: CPU0 CPU1 ... InterruptController TriggerMode Device

Column descriptions:

- IRQ No: the interrupt request number
- CPU0, CPU1, ...: how many times each CPU has handled this interrupt
- InterruptController: the controller that delivers the interrupt (e.g. IO-APIC, PCI-MSI)
- TriggerMode: how the interrupt is signaled (edge-triggered, or level-triggered/fasteoi)
- Device: the device registered for this IRQ

Example explanations:

- IRQ 28 (eth0-Tx-Rx-0) has been handled 26672 times, all of them on CPU1.
- IRQ 25 (nvme0q1) has been handled 25975 times, all of them on CPU0.

Special interrupts:

The rows at the bottom without an IRQ number (NMI, LOC, RES, CAL, TLB, and so on) are architecture-specific interrupts such as non-maskable interrupts, local timer interrupts, rescheduling IPIs, function call IPIs, and TLB shootdowns.

From this information, we can see which CPU handles which interrupt and how the interrupt load is balanced across CPUs.
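
When the full table is too noisy, a tiny user-space helper can filter it. The following is only an illustrative sketch (the program and its default search string "eth0" are assumptions, not an existing tool): it prints the CPU header line and only the /proc/interrupts rows matching the string given on the command line, which makes it easy to watch the per-CPU counters of one device.

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    const char *needle = (argc > 1) ? argv[1] : "eth0";
    char line[512];
    FILE *fp = fopen("/proc/interrupts", "r");

    if (!fp) {
        perror("fopen /proc/interrupts");
        return 1;
    }

    /* The first line lists the CPU columns; keep it as a header. */
    if (fgets(line, sizeof(line), fp))
        fputs(line, stdout);

    /* Print only the rows that mention the requested device. */
    while (fgets(line, sizeof(line), fp)) {
        if (strstr(line, needle))
            fputs(line, stdout);
    }

    fclose(fp);
    return 0;
}

For example, running it with "nvme" as the argument shows only the NVMe queue IRQs.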

SoftIRQ statistics

cat /proc/softirqs

                    CPU0       CPU1       
          HI:          0          0
       TIMER:      17540      15624
      NET_TX:        121          1
      NET_RX:      36284      26838
       BLOCK:          2         13
    IRQ_POLL:          0          0
     TASKLET:         26         13
       SCHED:      32870      33113
     HRTIMER:          2          0
         RCU:      77026      73895

SoftIRQ is one of the bottom-half mechanisms that takes over processing from the top-half that handles hardware interrupts. Each row shows the number of interrupts executed by each CPU, grouped by SoftIRQ type.

SoftIRQ types:

- HI / TASKLET: high-priority and normal tasklets
- TIMER / HRTIMER: timer and high-resolution timer processing
- NET_TX / NET_RX: network transmit and receive processing
- BLOCK / IRQ_POLL: block device I/O completion
- SCHED: scheduler-related processing such as load balancing
- RCU: RCU callback processing

Example to view:

NET_RX:      36284      26838

This row shows that network receive processing (NET_RX) has been executed 36284 times on CPU0 and 26838 times on CPU1.

Performance analysis usage:

Because the counters are cumulative, what matters is how fast they grow and whether the growth is skewed toward a single CPU; a CPU that absorbs most of one SoftIRQ type can become a bottleneck. The sketch below samples /proc/softirqs twice and prints the per-CPU rate for one SoftIRQ type.
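
This user-space sketch is again my own illustration; the default label "NET_RX:" and the cap of 64 CPUs are assumptions. It reads the per-CPU counters of one SoftIRQ type twice, one second apart, and prints the rate per CPU, which makes any skew between CPUs easy to see.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_CPUS 64

/* Fill out[] with the per-CPU counters of the row whose label matches, e.g. "NET_RX:". */
static int read_softirq(const char *label, unsigned long long out[], int max)
{
    char line[1024];
    FILE *fp = fopen("/proc/softirqs", "r");
    int n = 0;

    if (!fp) {
        perror("fopen /proc/softirqs");
        return -1;
    }

    while (fgets(line, sizeof(line), fp)) {
        char *p = strstr(line, label);
        if (!p)
            continue;
        p += strlen(label);
        while (n < max) {               /* parse the counter columns one by one */
            char *end;
            unsigned long long v = strtoull(p, &end, 10);
            if (end == p)
                break;
            out[n++] = v;
            p = end;
        }
        break;
    }
    fclose(fp);
    return n;
}

int main(int argc, char **argv)
{
    const char *label = (argc > 1) ? argv[1] : "NET_RX:";
    unsigned long long before[MAX_CPUS], after[MAX_CPUS];
    int i, n;

    n = read_softirq(label, before, MAX_CPUS);
    if (n <= 0)
        return 1;
    sleep(1);
    read_softirq(label, after, MAX_CPUS);

    /* Cumulative counters: the difference over one second is the per-CPU rate. */
    for (i = 0; i < n; i++)
        printf("%s CPU%d: %llu/s\n", label, i, after[i] - before[i]);
    return 0;
}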

Demo code

The following demo code demonstrates using a tasklet as a bottom half. It uses a kernel timer to emulate periodic interrupts: timer_handler acts as the top half and delegates processing via tasklet_schedule, while tasklet_handler emulates heavy processing. In my experiments, I confirmed that the top half completes its processing in about 200 ns on average.

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/timer.h>
#include <linux/jiffies.h>
#include <linux/interrupt.h>
#include <linux/delay.h>
#include <linux/ktime.h>

static struct timer_list my_timer;
static struct tasklet_struct my_tasklet;
static int interrupt_count = 0;
static ktime_t top_half_start;

/* Bottom half: runs later in softirq context, after the top half has returned. */
static void tasklet_handler(struct tasklet_struct *t)
{
    ktime_t end;
    s64 total_duration;

    /* Busy-wait for 5 ms to emulate heavy processing. */
    mdelay(5);
    
    end = ktime_get();
    total_duration = ktime_to_ns(ktime_sub(end, top_half_start));
    
    printk(KERN_INFO "TASKLET: Bottom Half completed, Total duration: %lld ns\n", 
           total_duration);
}

/* Top half (emulated here by a kernel timer): record the start time,
 * hand the heavy work to the tasklet, and return as quickly as possible. */
static void timer_handler(struct timer_list *t)
{
    ktime_t top_half_end;
    s64 top_half_duration;

    top_half_start = ktime_get();

    interrupt_count++;

    /* Defer the heavy processing to the bottom half. */
    tasklet_schedule(&my_tasklet);
    
    top_half_end = ktime_get();
    top_half_duration = ktime_to_ns(ktime_sub(top_half_end, top_half_start));
    
    printk(KERN_INFO "TASKLET: Top Half %d completed, Duration: %lld ns\n", 
           interrupt_count, top_half_duration);
    
    mod_timer(&my_timer, jiffies + HZ);   /* rearm: fire again in about one second */
}

static int __init tasklet_demo_init(void)
{
    printk(KERN_INFO "TASKLET: Module loaded\n");

    tasklet_setup(&my_tasklet, tasklet_handler);   /* register the bottom half */
    timer_setup(&my_timer, timer_handler, 0);      /* register the emulated top half */
    mod_timer(&my_timer, jiffies + HZ);            /* first firing after about one second */
    
    return 0;
}

static void __exit tasklet_demo_exit(void)
{
    del_timer_sync(&my_timer);   /* wait for a running timer handler before unloading */
    tasklet_kill(&my_tasklet);   /* wait for a scheduled tasklet to finish */
    printk(KERN_INFO "TASKLET: Module unloaded, Total IRQs: %d\n", interrupt_count);
}

module_init(tasklet_demo_init);
module_exit(tasklet_demo_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Tasklet Bottom Half Demo");

From a performance perspective

According to 4.3. Interrupts and IRQ Tuning, each IRQ has a property called smp_affinity, which defines the CPU cores that are allowed to execute the interrupt service routine (ISR) for that IRQ. Application performance can improve when it is configured properly.

By assigning both interrupt processing and the application threads to the same CPU core, you can achieve:

- cache sharing between the interrupt handler and the application thread, which improves the cache hit rate
- reduced cross-CPU data movement and communication

You can view or modify the IRQ affinity in the /proc/irq/IRQ_NUMBER/smp_affinity file. The values are represented as a hexadecimal bitmask, where each bit corresponds to a CPU core.

Example: Configuring only CPU 0 to process eth0 (IRQ 32) on a 4-core system

# confirm current settings (f = all CPU cores)
cat /proc/irq/32/smp_affinity
f

# set it to CPU0 only
echo 1 > /proc/irq/32/smp_affinity

# confirm it changed
cat /proc/irq/32/smp_affinity
1
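
The same change can also be made from a program by writing the hexadecimal bitmask into the smp_affinity file. The sketch below is simply the C equivalent of the echo command above (it must be run as root); the defaults, IRQ 32 and mask 0x1 (CPU0 only), are just the values from the example.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int irq = (argc > 1) ? atoi(argv[1]) : 32;                            /* e.g. IRQ 32 for eth0 */
    unsigned long mask = (argc > 2) ? strtoul(argv[2], NULL, 16) : 0x1;   /* bit n selects CPU n */
    char path[64];
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
    fp = fopen(path, "w");
    if (!fp) {
        perror(path);
        return 1;
    }

    /* The kernel expects a hexadecimal CPU bitmask, the same format cat shows. */
    fprintf(fp, "%lx\n", mask);
    fclose(fp);

    printf("set %s to %lx\n", path, mask);
    return 0;
}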

Closing Thoughts

In this article, I learned about the previously unfamiliar term “IRQ,” and that properly configuring IRQ affinity can provide performance benefits. However, from a network performance perspective, pinning interrupts to a single core is not always optimal: mechanisms like Receive Side Scaling (RSS) improve packet reception precisely by distributing packets across multiple cores. Additionally, NUMA topology should be considered when configuring affinity settings.

This article was written by K.Waki

Software Engineer. English Learner. Opinions expressed here are mine alone.