1. Earlier, in chapter 32, we learned how to locate the faulting line of driver code from an oops

The oops output is as follows:

Unable to handle kernel paging request at virtual address 56000050   // the kernel could not handle a page request for virtual address 56000050
pgd = c3850000
[56000050] *pgd=00000000
Internal error: Oops: 5 [#1]   // internal error: oops
Modules linked in: 26th_segmentfault
// the internal error occurred in the 26th_segmentfault.ko driver module
CPU: 0    Not tainted  (2.6.22.6 #2)
PC is at first_drv_open+0x78/0x12c [26th_segmentfault]
// PC value: the address of the instruction that faulted. It lies in first_drv_open(), at offset 0x78; the function's total size is 0x12c
LR is at 0xc0365ed8             // LR value
/* Register values at the time of the error */
pc : [<bf000078>]    lr : [<c0365ed8>]    psr: 80000013
sp : c3fcbe80  ip : c0365ed8  fp : c3fcbe94
r10: 00000000  r9 : c3fca000  r8 : c04df960
r7 : 00000000  r6 : 00000000  r5 : bf000de4  r4 : 00000000
r3 : 00000000  r2 : 56000050  r1 : 00000001  r0 : 00000052
Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: c000717f  Table: 33850000  DAC: 00000015
Process 26th_segmentfau (pid: 813, stack limit = 0xc3fca258)   // the process running when the fault occurred is 26th_segmentfault (the name is truncated in the output)
Stack: (0xc3fcbe80 to 0xc3fcc000)        // stack contents, from the current stack pointer 0xc3fcbe80 up to the end of the stack area 0xc3fcc000
be80: c06d7660 c3e880c0 c3fcbebc c3fcbe98 c008d888 bf000010 00000000 c04df960
bea0: c3e880c0 c008d73c c0474e20 c3fb9534 c3fcbee4 c3fcbec0 c0089e48 c008d74c
bec0: c04df960 c3fcbf04 00000003 ffffff9c c002c044 c380a000 c3fcbefc c3fcbee8
bee0: c0089f64 c0089d58 00000000 00000002 c3fcbf68 c3fcbf00 c0089fb8 c0089f40
bf00: c3fcbf04 c3fb9534 c0474e20 00000000 00000000 c3851000 00000101 00000001
bf20: 00000000 c3fca000 c04c90a8 c04c90a0 ffffffe8 c380a000 c3fcbf68 c3fcbf48
bf40: c008a16c c009fc70 00000003 00000000 c04df960 00000002 be84ce38 c3fcbf94
bf60: c3fcbf6c c008a2f4 c0089f88 00008588 be84ce84 00008718 0000877c 00000005
bf80: c002c044 4013365c c3fcbfa4 c3fcbf98 c008a3a8 c008a2b0 00000000 c3fcbfa8
bfa0: c002bea0 c008a394 be84ce84 00008718 be84ce30 00000002 be84ce38 be84ce30
bfc0: be84ce84 00008718 0000877c 00000003 00008588 00000000 4013365c be84ce58
bfe0: 00000000 be84ce28 0000266c 400c98e0 60000010 be84ce30 30002031 30002431
Backtrace:                                        // backtrace information
[<bf000000>] (first_drv_open+0x0/0x12c [26th_segmentfault]) from [<c008d888>] (chrdev_open+0x14c/0x164)
 r5:c3e880c0 r4:c06d7660
[<c008d73c>] (chrdev_open+0x0/0x164) from [<c0089e48>] (__dentry_open+0x100/0x1e8)
 r8:c3fb9534 r7:c0474e20 r6:c008d73c r5:c3e880c0 r4:c04df960
[<c0089d48>] (__dentry_open+0x0/0x1e8) from [<c0089f64>] (nameidata_to_filp+0x34/0x48)
[<c0089f30>] (nameidata_to_filp+0x0/0x48) from [<c0089fb8>] (do_filp_open+0x40/0x48)
 r4:00000002
[<c0089f78>] (do_filp_open+0x0/0x48) from [<c008a2f4>] (do_sys_open+0x54/0xe4)
 r5:be84ce38 r4:00000002
[<c008a2a0>] (do_sys_open+0x0/0xe4) from [<c008a3a8>] (sys_open+0x24/0x28)
[<c008a384>] (sys_open+0x0/0x28) from [<c002bea0>] (ret_fast_syscall+0x0/0x2c)
Code: bf000094 bf0000b4 bf0000d4 e5952000 (e5923000)
Segmentation fault
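
The "PC is at first_drv_open+0x78/0x12c" line can be decoded mechanically: the symbol is the function, the first number is the offset of the PC inside it, and the second is the function's size. The following is a minimal user-space sketch of that arithmetic, not kernel code; the names oops_loc, parse_oops_loc and function_base are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Decoded form of an oops location such as "first_drv_open+0x78/0x12c".
 * (Hypothetical helper type, not part of the kernel.) */
struct oops_loc {
    char symbol[64];      /* function name */
    unsigned int offset;  /* offset of the PC inside the function */
    unsigned int size;    /* total size of the function in bytes */
};

/* Parse "symbol+0xOFF/0xSIZE"; returns 0 on success, -1 if malformed. */
int parse_oops_loc(const char *text, struct oops_loc *loc)
{
    memset(loc, 0, sizeof(*loc));
    if (sscanf(text, "%63[^+]+0x%x/0x%x",
               loc->symbol, &loc->offset, &loc->size) != 3)
        return -1;
    return 0;
}

/* Start address of the function: the PC minus the offset inside it. */
unsigned long function_base(unsigned long pc, const struct oops_loc *loc)
{
    return pc - loc->offset;
}
```

With the values from the oops above, pc = bf000078 and offset 0x78 give a function base of bf000000, which matches the first Backtrace entry (first_drv_open+0x0/0x12c at bf000000).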

1.1 So why did the faulty application in the previous chapter not print an oops, as shown in the figure below?

[Figure: console output of the faulty application, with no oops printed]
Next, let's configure the kernel so that it also prints an oops for application faults.

2. First, search the kernel source for the "Unable to handle kernel" message from the oops to see which function prints it

As shown in the figure below, it is found in __do_kernel_fault():
[Figure: search result showing the message inside __do_kernel_fault()]

3. Looking further, __do_kernel_fault() is called by do_bad_area()

[Figure: source of do_bad_area()]
Judging by its name, do_bad_area() is reached when code accesses a bad (invalid) memory area.

Inside it, user_mode(regs) checks the mode bits of the saved CPSR register: it returns nonzero if the fault happened in user mode and 0 otherwise. do_bad_area() calls __do_user_fault() in the first case and __do_kernel_fault() in the second.

So the faulty application from the previous chapter ends up in __do_user_fault().
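
The mode check can be tried out in user space. The sketch below assumes the check works the way the ARM kernel macro of this era does, by testing whether the low four mode bits of the CPSR are zero (USR mode is 0x10, SVC mode is 0x13). The function names cpsr_is_user_mode and fault_handler_for are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* The low nibble of the CPSR mode field is 0 only for USR mode (0x10). */
#define CPSR_MODE_LOW_MASK 0xfu

/* Nonzero when the saved CPSR says the CPU was in user mode.
 * (Illustrative stand-in for the kernel's user_mode(regs).) */
int cpsr_is_user_mode(uint32_t cpsr)
{
    return (cpsr & CPSR_MODE_LOW_MASK) == 0;
}

/* Mirrors the branch described above: user faults go to
 * __do_user_fault(), everything else to __do_kernel_fault(). */
const char *fault_handler_for(uint32_t cpsr)
{
    return cpsr_is_user_mode(cpsr) ? "__do_user_fault"
                                   : "__do_kernel_fault";
}
```

For the driver oops at the top of this article, psr is 80000013 (SVC mode), so the kernel path is taken; a CPSR such as 60000010 (USR mode) would select the user path instead.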

4. __do_user_fault() is shown below:

[Figure: source of __do_user_fault()]
From the figure above, two more things are needed before the kernel prints fault messages for an application:

4.1 Configure the kernel so that the macro CONFIG_DEBUG_USER is defined (every macro that starts with "CONFIG_" comes from the kernel configuration)

1) Search for DEBUG_USER in make menuconfig, as shown in the figure below:
[Figure: menuconfig search result for DEBUG_USER]
So set Kernel hacking -> Verbose user fault messages to Y, then rebuild the kernel and flash it again.

4.2 Make if (user_debug & UDBG_SEGV) evaluate to true

1) user_debug is defined as follows:
[Figure: definition of user_debug and user_debug_setup()]
Clearly, when the command line passed in by U-Boot contains "user_debug=", user_debug_setup()->get_option() is called, and the value that follows "user_debug=" is parsed and stored in the user_debug variable.

For example, when the command line contains "user_debug=0xff", user_debug is set to 0xff.
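
To make the parsing step concrete, here is a minimal user-space sketch of pulling the value out of a command-line string. It is not the kernel's get_option() implementation; parse_user_debug is a hypothetical helper, and it relies only on the standard strstr() and strtoul() functions.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Look for "user_debug=" in a kernel-style command line and parse the
 * number after it. Base 0 lets strtoul() accept decimal, octal and
 * 0x-prefixed hex. Returns fallback when the option is absent.
 * (Illustrative only; the kernel uses __setup() and get_option().)
 */
unsigned int parse_user_debug(const char *cmdline, unsigned int fallback)
{
    static const char key[] = "user_debug=";
    const size_t keylen = sizeof(key) - 1;
    const char *p = cmdline;

    while ((p = strstr(p, key)) != NULL) {
        /* Accept the key only at the start or right after a space. */
        if (p == cmdline || p[-1] == ' ')
            return (unsigned int)strtoul(p + keylen, NULL, 0);
        p += keylen;
    }
    return fallback;
}
```

Called on a command line that ends in "user_debug=0xff", the helper returns 0xff; on a command line without the option it returns the fallback value.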

2) UDBG_SEGV is defined as follows:

#define UDBG_UNDEFINED  (1 << 0)        // undefined instruction in user-mode code (UNDEFINED)
#define UDBG_SYSCALL    (1 << 1)        // obsolete system call made from user mode (SYSCALL)
#define UDBG_BADABORT   (1 << 2)        // bad data abort in user mode (BADABORT)
#define UDBG_SEGV       (1 << 3)        // segmentation fault in user mode (SEGV)
#define UDBG_BUS        (1 << 4)        // bus error in user mode (BUS)

From these definitions, setting user_debug to 0xff makes all of the conditions above true.

For example, when user-mode code hits an undefined instruction, bit 0 of user_debug is 1, so the fault message is printed.
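
The bit test itself is easy to check in isolation. The sketch below copies the five UDBG_* values from the listing above and wraps the test in a helper; udbg_enabled is a hypothetical name used only for this illustration.

```c
/* Values copied from the UDBG_* definitions listed above. */
#define UDBG_UNDEFINED (1 << 0)
#define UDBG_SYSCALL   (1 << 1)
#define UDBG_BADABORT  (1 << 2)
#define UDBG_SEGV      (1 << 3)
#define UDBG_BUS       (1 << 4)

/* Nonzero when the given fault class is enabled in user_debug.
 * Same test as "if (user_debug & UDBG_SEGV)" in __do_user_fault(). */
int udbg_enabled(unsigned int user_debug, unsigned int flag)
{
    return (user_debug & flag) != 0;
}
```

With user_debug = 0xff every class is enabled. The smallest value that still enables the segmentation fault message is 0x08, while 0x07 leaves it off.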

Therefore, enter U-Boot and add "user_debug=0xff" to the kernel command line (the bootargs environment variable).

4.3 Boot the kernel and test

As shown in the figure below, running the faulty application now prints the register values and the function call relationship, but no stack contents:
[Figure: fault output of the application, without stack contents]

5. Next, modify the kernel further so that the application's oops also prints the stack contents

The driver oops contains a "Stack: " field, so search for "Stack: " to see which function prints it.

5.1 As shown in the figure below, it is found in __die():

[Figure: source of __die(), including the dump_mem("Stack: ", ...) call]
__die() is called by die(), and die() is called by __do_kernel_fault(). The __do_user_fault() path taken by our application never calls die(), so the stack contents are not printed.

The dump_mem() call in the figure above is:

dump_mem("Stack: ", regs->ARM_sp, THREAD_SIZE + (unsigned long)task_stack_page(tsk)); // print the stack contents

It starts at the stack address held in the sp register and prints one 32-bit word at a time, advancing the address by 4 after each word (each address holds 8 bits, so a 32-bit word spans 4 addresses).

We can apply the same principle to modify __do_user_fault().
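
The address stepping can be illustrated with a small user-space formatter that lays words out like the "Stack:" lines of the oops: the low 16 bits of the address, then eight 32-bit words per line. This is only a sketch of the layout, not the kernel's dump_mem(); format_stack_dump is a hypothetical name.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format n 32-bit words that start at address base, eight per line,
 * each line prefixed with the low 16 bits of its address.
 * Returns the number of characters written, or -1 if out is too small.
 */
int format_stack_dump(char *out, size_t outsz, uint32_t base,
                      const uint32_t *words, size_t n)
{
    size_t used = 0;
    size_t i;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        /* One 32-bit word occupies 4 byte addresses. */
        uint32_t addr = base + (uint32_t)(i * 4);
        int len;

        if (i % 8 == 0) {   /* start a new line every 8 words */
            len = snprintf(out + used, outsz - used, "%s%04x:",
                           i ? "\n" : "", (unsigned)(addr & 0xffffu));
            if (len < 0 || (size_t)len >= outsz - used)
                return -1;
            used += (size_t)len;
        }
        len = snprintf(out + used, outsz - used, " %08x",
                       (unsigned)words[i]);
        if (len < 0 || (size_t)len >= outsz - used)
            return -1;
        used += (size_t)len;
    }
    return (int)used;
}
```

Feeding it the first words of the driver oops with base 0xc3fcbe80 produces a line that starts with "be80:", and the ninth word starts a second line labelled "bea0:", 32 bytes (8 words of 4 bytes) further on.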

5.2 In __do_user_fault(), add the following:

static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
                            unsigned int fsr, unsigned int sig, int code,
                            struct pt_regs *regs)
{
    struct siginfo si;
    unsigned long val;
    int i = 0;

#ifdef CONFIG_DEBUG_USER
    if (user_debug & UDBG_SEGV) {
        printk(KERN_DEBUG "%s: unhandled page fault (%d) at 0x%08lx, code 0x%03x\n",
               tsk->comm, sig, addr, fsr);
        show_pte(tsk->mm, addr);
        show_regs(regs);

        printk("Stack: \n");
        while (i < 1024) {
            /* copy_from_user() doubles as a validity check on the address:
             * if the address is valid it fetches the data, otherwise we break */
            if (copy_from_user(&val, (const void __user *)(regs->ARM_sp + i * 4), 4))
                break;
            printk("%08lx ", val);   // print the data
            i++;
            if (i % 8 == 0)
                printk("\n");
        }
        printk("\n END of Stack\n");
    }
#endif

    tsk->thread.address = addr;
    tsk->thread.error_code = fsr;
    tsk->thread.trap_no = 14;
    si.si_signo = sig;
    si.si_errno = 0;
    si.si_code = code;
    si.si_addr = (void __user *)addr;
    force_sig_info(sig, &si, tsk);
}

6. Flash the rebuilt kernel and test

As shown in the figure below:
[Figure: fault output of the application, now including the stack contents]
Next, let's analyze the PC value and the stack contents to work out the call chain.

7. First, analyze the PC value to identify the faulting code

1) Generate the disassembly:

arm-linux-objdump -D test_debug > test_debug.dis

2) Search for the PC value 84ac, as shown in the figure below:
[Figure: disassembly around address 84ac]
As the disassembly shows, the instruction stores 0x12 (held in r3) to address 0x00 (held in r2).

0x00 is an illegal address, so the access faults.

8. Analyze the stack contents to determine the function call chain

Reference: 37. Linux driver debugging - using the oops stack information to determine the function call chain

8.1 During the analysis, the return address of main() turns out to be LR=40034f14

The kernel's virtual addresses span c0004000~c03cebf4, and this address does not appear in the application's disassembly either, so it must belong to a dynamic library.
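
The reasoning above is a simple range check, sketched below. The kernel range is the one quoted in the text; the application's text range is not given in this article, so it is passed in as a parameter and any concrete value used for it is an assumption. The names addr_region, range and classify_address are hypothetical.

```c
#include <stdint.h>

/* Where an address found on the stack may belong. */
enum addr_region { ADDR_KERNEL, ADDR_APP, ADDR_OTHER };

/* Address range with inclusive bounds. */
struct range {
    uint32_t start;
    uint32_t end;
};

static int in_range(uint32_t addr, struct range r)
{
    return addr >= r.start && addr <= r.end;
}

/*
 * Classify addr against the kernel range and the application's own
 * text range. Anything in neither is reported as ADDR_OTHER, which for
 * a dynamically linked program usually means a shared library.
 */
enum addr_region classify_address(uint32_t addr, struct range kernel,
                                  struct range app)
{
    if (in_range(addr, kernel))
        return ADDR_KERNEL;
    if (in_range(addr, app))
        return ADDR_APP;
    return ADDR_OTHER;
}
```

With the kernel range c0004000~c03cebf4 and an assumed application range around the 8xxx addresses seen in the disassembly, 40034f14 falls into neither, which is what points to a dynamic library.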

So link the program statically instead, then recompile, disassemble, and run it again:

# arm-linux-gcc -static -o test_debug test_debug.c    // -static: static linking; the resulting file is much larger, but it runs without any shared libraries
# arm-linux-objdump -D test_debug > test_debug.dis

8.2 Finally, the return address of main() is found inside __libc_start_main()

So the call chain at the time of the fault is:
__libc_start_main()->
main()->
A()->
B()->
C() // stores 0x12 (held in r3) to address 0x00 (held in r2)