When a kernel or driver bug hangs the system so that it can no longer run normally, how do we find out which function, and which address in it, caused the hang?

Answer: use the kernel's system clock. It is driven by a timer interrupt that fires at regular intervals, so when the CPU is stuck in one process, we can print that process's information from the interrupt handler.

1. First, a quick recap

In the kernel interrupt flow covered previously, the following steps are performed when an interrupt occurs:

1) Compute the return address (pc-4), then save the value of each register on the stack at sp
2) Get the interrupt number and the sp address, then call asm_do_IRQ()

1.1 The prototype of asm_do_IRQ is:

asmlinkage void __exception asm_do_IRQ(unsigned int irq, struct pt_regs *regs);
// irq:  interrupt number
// regs: base address of the registers saved before the interrupt (= the sp base address)

1.2 The members of the pt_regs structure, which holds the saved contents of each register, are shown in the figure below:

[Figure: members of struct pt_regs]

2. So in this section we modify asm_do_IRQ() and add the following:

1) Check whether irq equals the system clock's irq; if so, count ticks (cnt++) for as long as the current process stays the same
2) If the current process has not changed after 10 s, print the process name, the PID, and (regs->ARM_pc)-4
(PS: Why print PC-4? Because PC is the return address, and PC-4 is the address the CPU was executing.)

3. First, find the interrupt number (irq) of the system clock

Run # cat /proc/interrupts, as shown in the figure below:
[Figure: output of cat /proc/interrupts]
The interrupt numbers come from linux-2.6.22.6\include\asm-arm\arch-s3c2410\irqs.h

S3C2410 Timer Tick is our system clock tick. In the kernel it drives the global variable jiffies, which is incremented by 1 on every tick.

So the interrupt number of S3C2410 Timer Tick is 30.

4. Next, let's modify the asm_do_IRQ() function

In asm_do_IRQ(), add the following section:

asmlinkage void __exception asm_do_IRQ(unsigned int irq, struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);
    struct irq_desc *desc = irq_desc + irq;

#if 1
    static pid_t pre_pid;       // PID seen on the previous tick
    static int cnt = 0;         // count of consecutive ticks in the same process

    if (irq == 30)              // is this the system clock interrupt?
    {
        if (pre_pid == current->pid) {
            cnt++;
        } else {
            cnt = 0;
            pre_pid = current->pid;
        }

        if (cnt == 10 * HZ)     // same process for 10 s
        {
            cnt = 0;
            printk("s3c2410_timer_interrupt : pid = %d, task_name = %s\n", current->pid, current->comm);
            printk("pc = %08x\n", regs->ARM_pc);    // saved return address
        }
    }
#endif
    ... ...
}

1) current is a macro that evaluates to a pointer to the task_struct of the currently running process. It obtains the process information through get_current(), located in include\asm-arm\current.h

**current->pid:** the PID of the current process

**current->comm:** the name of the current process

2) HZ is also a macro: the number of timer ticks per second. For example, if a tick occurs every 10 ms, HZ equals 100.

5. Test run

Next, we load a driver that contains a while(1) infinite loop and trigger it with a test program. The kernel then spins in while(1) forever and the system hangs.

Because asm_do_IRQ() has been modified, the following information is printed:
[Figure: printed output]

5.1 From the pc value (bf0000C), you can then work out which function is at fault

(Reference: http://www.cnblogs.com/lifexy/p/8006748.html)