Any application or computer program, even one shipped without debug information, may contain security holes such as buffer overflows, memory leaks, and format string bugs. Today's operating systems and applications keep growing in lines of code (LOC). Windows operating systems have approximately 40 million LOC. Unix and Linux operating systems have much less, usually around 2 million LOC. A common industry estimate is that there are between 5 and 50 bugs per 1,000 lines of code, so a middle-of-the-road estimate would be that Windows 7 has approximately 1,200,000 bugs. If software did not contain 5 to 50 exploitable bugs in every 1,000 lines of code, we would not have to build the fortresses we are constructing today.
Many companies (Microsoft, for example) ship products that hide what appears to be an almost infinite number of break-in vulnerabilities. They try to hide these problems by keeping their source code secret. This makes your job harder but not impossible: analyzing machine code isn't so complex as to stop hackers for long. All you need is to think from a hacker's perspective.
But remember, stealing code or breaking code without the owner's permission is illegal. An ethical hacker is one who breaks code under controlled circumstances. If you find vulnerabilities in software, report them to the vendor. I have no intention of providing illegal techniques; this article is strictly for educational purposes.
The best strategy is to write code that has as few bugs as possible. This can be achieved by using pseudo-code and verifying the logic of the pseudo-code even before you attempt to translate it into an assembly language program. To isolate a bug, program execution should be observed in slow motion. Most debuggers provide a command to execute a program in single-step mode. Debuggers provide commands to set up breakpoints. The program execution stops at breakpoints, giving us a chance to look at the state of the program. Another helpful feature that most debuggers provide is the watch facility.
GDB, the GNU Project debugger, allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed.
GDB can do four main kinds of things (plus other things in support of these) to help you catch bugs in the act:
Start your program, specifying anything that might affect its behavior.
Make your program stop on specified conditions.
Examine what has happened, when your program has stopped.
Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
Now let's go through the GDB operations that are most useful when debugging a program.
1. To load a program into GDB:
gdb file_name
For example, to debug the HelloWorld program:
gdb HelloWorld
2. If the program also needs command-line arguments, use the --args option:
gdb --args YourProgramName arg1 arg2 arg3 … argN
Displaying Source Code :
list ; Lists the source code of the loaded executable, ten lines at a time by default. The executable must have been compiled with debug information.
Displaying Register Contents :
info registers Displays the contents of all registers except the floating-point and vector registers.
info all-registers Displays the contents of all registers, including the floating-point and vector registers.
info registers regname ... Displays the contents of the specified registers.
For example,
info registers eax ecx edx ; to check the contents of the eax, ecx, and edx registers.
Memory display commands :
x address Displays the contents of memory at address (uses defaults).
x/nfu address Displays n units of size u in format f, starting at address.
Or,
(gdb) x/FMT &Label_name To see the value of a variable (useful for integers).
(gdb) x/1s &Label_name To see a whole string in one shot (useful for strings).
(gdb) x/1s $register To see the whole string located at the address stored in a register (for example, x/1s $eax).
(gdb) x/1s 0x080000 To see the whole string at a particular address.
Breakpoint commands :
Use the “break” or “b” command at the gdb prompt to specify a location, which can be a function name, a line number, or a source file and line number.
For example, the following commands insert breakpoints at line 20 and at the function sum (assumed here to start on line 32) in a program:
(gdb) b 20
Breakpoint 1 at 0x80560a0: file HelloWorld.asm, line 20.
(gdb) b sum
Breakpoint 2 at 0x80560a5: file HelloWorld.asm, line 32.
(gdb)
Note:
* We can use info breakpoints (or simply info b) to get a summary of breakpoints and their status.
* We can use the enable and disable commands to enable or disable the breakpoints.
More breakpoint commands :
break main Sets a breakpoint at the function “main”.
break 5 Sets a breakpoint at line 5 of the current source file.
break function Sets a breakpoint at entry to the specified function.
break *_start+1 Sets a breakpoint at the address _start+1. In an assembly program, put a “nop” as the first instruction after the _start label so that the breakpoint lands on the instruction that follows it.
delete Deletes all breakpoints (when given no arguments).
Program execution commands :
run Executes the program under GDB
continue Continues execution from where the program has last stopped (e.g., due to a breakpoint).
step Single-steps execution of the program (i.e., one source line at a time).
More examine commands :
print variable_name To see the value of a variable in its natural format (decimal for integers)
print /x variable_name To see the value of a variable in hex
print /c variable_name To see the value of a variable as an ASCII character
print &Label_name To see the address of Label_name
print /x &Label_name To see the address of Label_name in hex
print /c $eax To see the value in a register as an ASCII character
print /d $eax To see the value in a register in decimal
print /x $eax To see the value in a register in hex
Now let's use these features of GDB on a real program. First we will write a simple C program:
#include <stdio.h>
int main()
{
printf("Hello World");
}
Now compile the program with debug information (the -g option) and load it in GDB:
root@r00t:~/Desktop/c_programming/blog_tutorial# gcc -g simple.c
root@r00t:~/Desktop/c_programming/blog_tutorial# gdb ./a.out
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/Desktop/c_programming/blog_tutorial/a.out...done.
(gdb) list
1 #include<stdio.h>
2
3 int main()
4 {
5 printf("Hello World");
6
7 }
(gdb) disassemble main
Dump of assembler code for function main:
0x0804841c <+0>: push %ebp
0x0804841d <+1>: mov %esp,%ebp
0x0804841f <+3>: and $0xfffffff0,%esp
0x08048422 <+6>: sub $0x10,%esp
0x08048425 <+9>: movl $0x80484d0,(%esp)
0x0804842c <+16>: call 0x8048300 <printf@plt>
0x08048431 <+21>: leave
0x08048432 <+22>: ret
End of assembler dump.
(gdb) break main
Breakpoint 1 at 0x8048425: file simple.c, line 5.
(gdb) run
Starting program: /root/Desktop/c_programming/blog_tutorial/a.out
Breakpoint 1, main () at simple.c:5
5 printf("Hello World");
(gdb) info register eip
eip 0x8048425 0x8048425 <main+9>
First we use the list command to display the source code of the executable, and then we disassemble the main() function. Next we set a breakpoint with break main and run the program. As discussed above, a breakpoint simply tells the debugger to pause execution when the program reaches that point. GDB places the breakpoint for main() just after the function prologue (at main+9 here), so the program pauses before executing the first statement in the body of main(), which is the call printf("Hello World").
Then we use the info register eip command, which displays the contents of the specified register, here EIP (the Extended Instruction Pointer). Of all the registers, EIP is the one to concentrate on: it holds the address of the next instruction to be executed. If we can somehow control the value loaded into EIP, we control what the CPU of the victim machine executes. If EIP is made to point at a buffer that we control and have filled with machine code, the processor is derailed from its normal execution and runs the code we supplied. This is how a buffer overflow attack works.
The GDB debugger provides a direct method to examine memory, using the command x, which is short for examine. Examining memory is a critical skill for any hacker. With a debugger like GDB, every aspect of a program's execution can be deterministically examined, paused, stepped through, and repeated as often as needed. Since a running program is mostly just a processor and segments of memory, examining memory is the first way to look at what's really going on.
As discussed above, the memory display commands let us look at a given memory address in a variety of ways. Now let's use them on our program. GDB can display memory in several formats, including octal, hexadecimal, decimal, and binary.
Some common format letters are as follows:
o ; Display in octal.
x ; Display in hexadecimal.
u ; Display in unsigned, standard base-10 decimal.
t ; Display in binary.
(gdb) x/o 0x8048425
0x8048425 <main+9>: 032011002307
(gdb) x/x $eip
0x8048425 <main+9>: 0xd02404c7
(gdb) x/u $eip
0x8048425 <main+9>: 3492021447
(gdb) x/t $eip
0x8048425 <main+9>: 11010000001001000000010011000111
We use these commands to examine the memory at the address held in the EIP register in various formats. The value 032011002307 in octal is the same as 0xd02404c7 in hexadecimal, which is the same as 3492021447 in base-10 decimal, which in turn is the same as 11010000001001000000010011000111 in binary.
A number can also be prepended to the format of the examine command to examine multiple units at the target address:
(gdb) x/2x $eip
0x8048425 <main+9>: 0xd02404c7 0xe8080484
(gdb) x/5x $eip
0x8048425 <main+9>: 0xd02404c7 0xe8080484 0xfffffecf 0x9090c3c9
0x8048435: 0x90909090
The default size of a single unit is four bytes, which GDB calls a word. The size of the display units for the examine command can be changed by adding a size letter to the end of the format letter.
Some size letters are as follows:
b; A single byte
h; A halfword, which is two bytes in size
w; A word, which is four bytes in size
g; A giant, which is eight bytes in size
Now let's use them on our program:
(gdb) x/8xb $eip
0x8048425 <main+9>: 0xc7 0x04 0x24 0xd0 0x84 0x04 0x08 0xe8
(gdb) x/8xh $eip
0x8048425 <main+9>: 0x04c7 0xd024 0x0484 0xe808 0xfecf 0xffff 0xc3c9 0x9090
(gdb) x/8xw $eip
0x8048425 <main+9>: 0xd02404c7 0xe8080484 0xfffffecf 0x9090c3c9
0x8048435: 0x90909090 0x90909090 0x55909090 0xc35de589
The first examine shows the first two bytes to be 0xc7 and 0x04, but when a halfword is examined at the exact same memory address, the value 0x04c7 is shown, with the bytes reversed. The same byte-reversal effect can be seen when a full four-byte word is shown as 0xd02404c7, while the first four bytes shown byte by byte are in the order 0xc7, 0x04, 0x24, and 0xd0. Why? The x86 processor is little-endian: it stores the least significant byte of a multi-byte value at the lowest address, so the bytes appear reversed when GDB combines them into a halfword or a word.
The nexti command (next instruction) executes the current instruction: the processor reads the instruction at EIP, executes it, and advances EIP to the next instruction. Let's see:
(gdb) nexti
0x0804842c 5 printf("Hello World");
(gdb) x/i $eip
=> 0x804842c <main+16>: call 0x8048300 <printf@plt>
The c format letter can be used to automatically look up a byte on the ASCII table, and the s format letter will display an entire string of character data.
(gdb) x/xw $esp
0xbffff490: 0x080484d0
(gdb) x/6cb 0x080484d0
0x80484d0: 72 'H' 101 'e' 108 'l' 108 'l' 111 'o' 32 ' '
(gdb) x/s 0x080484d0
0x80484d0: "Hello World"
These commands reveal that the string "Hello World" is stored at memory address 0x080484d0.
Looking at the full disassembly again, you should be able to tell which parts of the C code have been compiled into which machine instructions.
(gdb) disassemble main
Dump of assembler code for function main:
0x0804841c <+0>: push %ebp
0x0804841d <+1>: mov %esp,%ebp
0x0804841f <+3>: and $0xfffffff0,%esp
0x08048422 <+6>: sub $0x10,%esp
0x08048425 <+9>: movl $0x80484d0,(%esp)
0x0804842c <+16>: call 0x8048300 <printf@plt>
0x08048431 <+21>: leave
0x08048432 <+22>: ret
End of assembler dump.
(gdb) list
1 #include<stdio.h>
2
3 int main()
4 {
5 printf("Hello World");
6
7 }
(gdb)
I have discussed the GDB commands that are most helpful to hackers when examining a binary program. If you like this post or have any questions, please feel free to comment!
Reference Material :
1. Debugging with GDB
2. GDB Documentation
3. Hacking: The Art of Exploitation