Many people (programmers and hackers alike) think that high-level languages such as Java and C/C++ are more useful than assembly language. This is a myth: assembly language lets you do things you cannot do in other programming languages. Without it you will have a hard time finding a 0day in software, because a debugger shows the program only as assembly code. In reality you do not need to be able to write assembly. You need to be able to analyze malware and exploits, which is a completely different skill from coding for functionality. You do not have to be an assembly coder, but you should be able to read it and understand it. So let's start...
At the heart of every computer are two things: the CPU and memory.
The 8086 CPU was the first x86 processor. It was developed and manufactured by Intel.
The processor (CPU) is the heart of your system. Basically, it is designed to do the following things:
-> Fetch an instruction from memory.
-> Execute the instruction.
The CPU contains the following elements to accomplish this:
Program Counter
Instruction Decoder
Data bus
General-purpose registers
Arithmetic and logic unit
I will not discuss these in detail here, for the sake of simplicity.
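The fetch and execute cycle described above can be sketched as a toy machine in C. This is only an illustration under stated assumptions: the opcodes, the fixed 3-byte instruction format, and the names ToyCpu and toy_run are made up for this post and have nothing to do with real x86 encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy machine; these are NOT real x86 encodings. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_SUB = 3 };

typedef struct {
    uint32_t pc;      /* program counter: offset of the next instruction */
    uint32_t reg[4];  /* four general-purpose registers */
} ToyCpu;

/* Every toy instruction is 3 bytes: opcode, register index, immediate value.
 * Runs until OP_HALT (or an unknown opcode) and returns how many
 * instructions were fetched, including the final halt. */
size_t toy_run(ToyCpu *cpu, const uint8_t *mem, size_t len)
{
    size_t executed = 0;
    cpu->pc = 0;
    for (;;) {
        assert(cpu->pc + 3 <= len);           /* stay inside program memory */
        uint8_t op  = mem[cpu->pc];           /* fetch */
        uint8_t r   = mem[cpu->pc + 1] & 3;   /* register index 0..3 */
        uint8_t imm = mem[cpu->pc + 2];
        cpu->pc += 3;                         /* point at the next instruction */
        executed++;
        switch (op) {                         /* decode and execute */
        case OP_LOAD: cpu->reg[r] = imm;  break;
        case OP_ADD:  cpu->reg[r] += imm; break;
        case OP_SUB:  cpu->reg[r] -= imm; break;
        default:      return executed;        /* OP_HALT or unknown opcode */
        }
    }
}
```

To try it, fill a uint8_t array with a few of these 3-byte instructions ending in OP_HALT, call toy_run() on it, and inspect cpu.reg afterwards. A real CPU does the same loop in hardware, with the program counter playing the role of pc.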
Now let's write a very simple C program:
#include <stdio.h>
int main()
{
    printf("Hello World");
}
Let’s start by looking at the machine code the main() function was translated into.
The GNU development tools include a program called objdump, which can be used to examine compiled binaries.
At least one of the options -a, -d, -D, -f, -g, -G, -h, -H, -p, -r, -S, -t, -T, -V, -x must be given to tell objdump what information to show.
Usage :
objdump [options] objfiles
options :
-d, --disassemble
Display assembler mnemonic names for the machine instructions. Disassemble only sections that are expected to contain instructions.
-D, --disassemble-all
Disassemble all sections, not just those expected to contain instructions.
-f, --file-headers
Display overall header summary information.
-g, --debugging
Display debugging information.
-h, --section-headers, --headers
Display section-header summary information.
-H, --help
Display help information and exit.
-S, --source
Display source code intermixed with disassembly, if possible. Implies -d.
-V, --version
Print version information and exit.
root@r00t:~/Desktop/c_programming/blog_tutorial# gcc firstprog.c
root@r00t:~/Desktop/c_programming/blog_tutorial# objdump -D a.out | grep -A20 main.:
0804841c <main>:
804841c: 55 push %ebp
804841d: 89 e5 mov %esp,%ebp
804841f: 83 e4 f0 and $0xfffffff0,%esp
8048422: 83 ec 10 sub $0x10,%esp
8048425: c7 04 24 d0 84 04 08 movl $0x80484d0,(%esp)
804842c: e8 cf fe ff ff call 8048300 <printf@plt>
8048431: c9 leave
8048432: c3 ret
8048433: 90 nop
8048434: 90 nop
8048435: 90 nop
8048436: 90 nop
8048437: 90 nop
8048438: 90 nop
8048439: 90 nop
804843a: 90 nop
804843b: 90 nop
804843c: 90 nop
804843d: 90 nop
804843e: 90 nop
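The hex column in the listing above is the program itself. As a small sketch, the 23 bytes of main() (0x804841c through 0x8048432) can be copied into a C array and printed back. The names main_code and bytes_to_hex are made up for this example; only the byte values come from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Machine code of main(), copied byte for byte from the objdump listing. */
const uint8_t main_code[] = {
    0x55,                                      /* push %ebp */
    0x89, 0xe5,                                /* mov  %esp,%ebp */
    0x83, 0xe4, 0xf0,                          /* and  $0xfffffff0,%esp */
    0x83, 0xec, 0x10,                          /* sub  $0x10,%esp */
    0xc7, 0x04, 0x24, 0xd0, 0x84, 0x04, 0x08,  /* movl $0x80484d0,(%esp) */
    0xe8, 0xcf, 0xfe, 0xff, 0xff,              /* call printf@plt */
    0xc9,                                      /* leave */
    0xc3                                       /* ret */
};

/* Writes the bytes as space-separated hex into out.
 * Returns the number of characters written (without the NUL),
 * or 0 if the buffer is too small. */
size_t bytes_to_hex(const uint8_t *code, size_t len, char *out, size_t out_size)
{
    size_t used = 0;
    assert(code != NULL && out != NULL);
    if (out_size == 0)
        return 0;
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        if (used + 4 > out_size)               /* room for " xx" plus NUL */
            return 0;
        used += (size_t)snprintf(out + used, out_size - used,
                                 i ? " %02x" : "%02x", (unsigned)code[i]);
    }
    return used;
}
```

Calling bytes_to_hex(main_code, sizeof main_code, buf, sizeof buf) with a 128-byte buffer gives back the same hex digits objdump printed. Notice that the immediate 0x80484d0 appears as d0 84 04 08, because x86 stores multi-byte values least significant byte first.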
Programs are stored in the computer's memory. A computer program is nothing more than a collection of numbers stored in memory.
The processor also needs its own "memory" to hold its current working information locally. These memory spaces are called registers, and they are used to store data temporarily. The registers are physically located in the processor itself, so it does not have to fetch anything from RAM to use them. A register can be thought of as a sort of basic variable that holds whatever value the processor stores in it.
The x86 (Intel family) CPUs provide several general-purpose registers for application use:
EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI
The first four registers (EAX, ECX, EDX, and EBX) are known as general-purpose registers. They are called the Accumulator, Counter, Data, and Base registers, respectively.
The second four registers (ESP, EBP, ESI, and EDI) are also general-purpose registers, but they are mostly used as pointers and indexes. Their names stand for Stack Pointer, Base Pointer, Source Index, and Destination Index, respectively.
The EIP register is the Instruction Pointer register, which points to the current instruction the processor is reading. Like a child pointing his finger at each word as he reads, the processor reads each instruction using the EIP register as its finger.
GDB can show the state of the processor registers right before our code in main() runs. Let's see:
root@r00t:~/Desktop/c_programming/blog_tutorial# gcc simple.c
root@r00t:~/Desktop/c_programming/blog_tutorial# gdb ./a.out
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/Desktop/c_programming/blog_tutorial/a.out...(no debugging symbols found)...done.
(gdb) break main
Breakpoint 1 at 0x804841f
(gdb) run
Starting program: /root/Desktop/c_programming/blog_tutorial/a.out
Breakpoint 1, 0x0804841f in main ()
(gdb) info registers
eax 0xbffff554 -1073744556
ecx 0x66aa2f28 1722429224
edx 0x1 1
ebx 0xb7fc1ff4 -1208213516
esp 0xbffff4a8 0xbffff4a8
ebp 0xbffff4a8 0xbffff4a8
esi 0x0 0
edi 0x0 0
eip 0x804841f 0x804841f <main+3>
eflags 0x246 [ PF ZF IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
(gdb)
A breakpoint is set on the main() function so execution stops right before our code is executed. GDB then runs the program, stops at the breakpoint, and is told to display all the processor registers and their current state.
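GDB prints the eflags register both as a number (0x246) and as the list of flags that are set ([ PF ZF IF ]). Here is a rough sketch of how that decoding works. The bit positions are the standard x86 ones; the names FlagName, flag_names, and describe_eflags are made up for this example, and the output format only imitates what GDB prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;  /* bit of EFLAGS that holds this flag */
    const char *name;
} FlagName;

/* Status and control flags in bit order.
 * Bit 1 (0x002) is reserved and always reads as 1, so it has no name here. */
static const FlagName flag_names[] = {
    {0x001, "CF"}, {0x004, "PF"}, {0x010, "AF"},
    {0x040, "ZF"}, {0x080, "SF"}, {0x100, "TF"},
    {0x200, "IF"}, {0x400, "DF"}, {0x800, "OF"},
};

/* Formats the set flags as e.g. "[ PF ZF IF ]" and returns the string length.
 * The caller must provide a buffer of at least 64 bytes. */
size_t describe_eflags(uint32_t eflags, char *out, size_t out_size)
{
    assert(out != NULL && out_size >= 64);
    size_t used = (size_t)snprintf(out, out_size, "[ ");
    for (size_t i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++) {
        if (eflags & flag_names[i].mask)
            used += (size_t)snprintf(out + used, out_size - used,
                                     "%s ", flag_names[i].name);
    }
    used += (size_t)snprintf(out + used, out_size - used, "]");
    return used;
}
```

For the value from the session above, 0x246 = 0x200 + 0x040 + 0x004 + 0x002, which is IF, ZF, PF, and the reserved bit. Passing 0x246 to describe_eflags() therefore produces the same three names GDB showed.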
Most of the instructions in assembly use these registers to read or write data, so understanding the registers of a processor is essential to understanding the instructions.
Some of the basic instructions the x86 CPU provides are:
MOV, ADD, SUB, CMP, AND, OR, NOT, JE, JNE, JB, JBE, JA, JAE, JMP, IRET, HLT, and NOP.
Of these, MOV, ADD, SUB, CMP, AND, and OR take two operands; NOT and the jump instructions take a single operand; IRET, HLT, and NOP take no operands at all.
The assembly instructions in Intel syntax generally follow this style:
operation <destination>, <source>
The destination and source values will each be a register, a memory address, or an immediate value.
The mov operation copies a value from the source to the destination, sub subtracts the source from the destination, and inc increments its operand.
For example:
804841d: 89 e5 mov ebp,esp
8048422: 83 ec 10 sub esp,0x10
These instructions copy the value from ESP into EBP and then subtract 0x10 (16) from ESP, storing the result in ESP. The cmp operation compares two values, and jmp performs an unconditional jump.
We will discuss them in detail later.
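To make the effect of these instructions concrete, here is a small sketch that replays them on plain C variables, together with the and instruction that sits between them in main(). The Regs struct and the helper names are made up for this example, and it assumes ESP holds 0xbffff4a8 right after push ebp, which matches the esp and ebp values in the GDB output earlier.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t esp;  /* stack pointer */
    uint32_t ebp;  /* base pointer */
} Regs;

/* Each helper mirrors "operation <destination>, <source>" in Intel syntax. */
static void mov(uint32_t *dst, uint32_t src)  { *dst = src; }
static void and_(uint32_t *dst, uint32_t src) { *dst &= src; }
static void sub(uint32_t *dst, uint32_t src)  { *dst -= src; }

/* Replays the start of main() after "push ebp" has already executed. */
Regs run_prologue(uint32_t esp_after_push)
{
    Regs r = { esp_after_push, 0 };
    mov(&r.ebp, r.esp);          /* mov ebp,esp        */
    and_(&r.esp, 0xfffffff0u);   /* and esp,0xfffffff0 */
    sub(&r.esp, 0x10);           /* sub esp,0x10       */
    assert((r.esp & 0xf) == 0);  /* the and made ESP a multiple of 16 */
    return r;
}
```

With esp_after_push = 0xbffff4a8, EBP ends up as 0xbffff4a8, the and rounds ESP down to 0xbffff4a0, and the sub leaves ESP at 0xbffff490. The stack grows toward lower addresses, so subtracting from ESP is how the function reserves 16 bytes of stack space.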
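By default, objdump prints AT&T syntax, in which the source operand comes before the destination and register names are prefixed with %. The same code can be shown in Intel syntax by passing the additional command-line option -M intel to objdump. Many people find Intel syntax more readable and easier to understand.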
root@r00t:~/Desktop/c_programming/blog_tutorial# objdump -M intel -D a.out | grep -A20 main.:
0804841c <main>:
804841c: 55 push ebp
804841d: 89 e5 mov ebp,esp
804841f: 83 e4 f0 and esp,0xfffffff0
8048422: 83 ec 10 sub esp,0x10
8048425: c7 04 24 d0 84 04 08 mov DWORD PTR [esp],0x80484d0
804842c: e8 cf fe ff ff call 8048300 <printf@plt>
8048431: c9 leave
8048432: c3 ret
8048433: 90 nop
8048434: 90 nop
8048435: 90 nop
8048436: 90 nop
8048437: 90 nop
8048438: 90 nop
8048439: 90 nop
804843a: 90 nop
804843b: 90 nop
804843c: 90 nop
804843d: 90 nop
804843e: 90 nop
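One detail worth reading out of the listing above: the bytes of the call instruction (e8 cf fe ff ff) do not contain the address 0x8048300 directly. The opcode 0xe8 is followed by a 32-bit little-endian displacement that is added to the address of the next instruction. The following sketch shows that arithmetic; the function name call_target is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Computes the destination of a 5-byte x86 "call rel32" instruction.
 * insn_addr is the address of the instruction itself. */
uint32_t call_target(uint32_t insn_addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xe8);                /* opcode of call rel32 */
    uint32_t rel = (uint32_t)insn[1]        /* displacement is little-endian */
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;
    /* The displacement is relative to the NEXT instruction (address + 5).
     * Unsigned wraparound makes negative displacements work. */
    return insn_addr + 5 + rel;
}
```

For the call at 0x804842c, the displacement is 0xfffffecf, which is -0x131 as a signed number. The next instruction is at 0x8048431, and 0x8048431 - 0x131 = 0x8048300, exactly the printf@plt address objdump shows.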
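In GDB, the disassembly syntax can be set to Intel by typing set disassembly-flavor intel (or the shorter set disassembly intel). Note that set dis intel is too short for this version of GDB, which rejects it as ambiguous: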
root@r00t:~/Desktop/c_programming/blog_tutorial# gdb -q
(gdb) set dis intel
Ambiguous set command "dis intel": disable-randomization, disassemble-next-line, disassembly-flavor, disconnected-tracing...
(gdb)
Well, that is all for this basic introduction to assembly language. It should help you debug binary programs and realize that the compiled program is what actually gets executed out in the real world. I will discuss assembly language concepts in detail later. I have tried to keep things simple so that you do not have any problem understanding them.
If you like this post or have any questions, please feel free to comment.