[Book] PC Assembly Language

Background

It has been a while since my last book-notes article.

If you have not read the earlier ones: articles of this type are where I keep notes on the books I read. Unlike my other articles, such as the malware analyses, this is a “notebook”.

The content will be continuously updated.

Introduction

I started reading PC Assembly Language to strengthen my understanding of low-level execution, which is essential for analyzing bootkits (e.g., Petya), rootkits, and memory corruption vulnerabilities.

Unlike typical summaries, this article serves as a long-term notebook. However, I will also highlight concepts that are directly useful for malware analysis.

The content will be continuously updated as I read through the book.

The Book

PC Assembly Language

Why This Matters for Malware Analysis

Assembly language is not just a programming language; it is the ground truth of how programs execute.

For example:

  • Bootkits operate before the OS is loaded, so analyzing them requires understanding low-level execution
  • Shellcode directly manipulates registers and memory
  • Reverse engineering often requires reading compiler output in assembly

Therefore, understanding calling conventions, stack layout, and register usage is critical.

Reflections

I previously learned RISC-V assembly; x86 assembly in NASM syntax differs from it in several respects.

My goal in learning assembly is to better understand low-level mechanisms such as shellcode, bootkits, and implants.

This book focuses on fundamental concepts such as memory layout and instruction behavior, making it a solid starting point for assembly programming.

Chapter 1 - Introduction

Key Concept: C Calling Convention

One important concept introduced in this chapter is the cdecl calling convention.

This convention defines:

  • How arguments are passed (stack)
  • Who cleans the stack (caller)
  • How return values are passed (EAX)

This is extremely important in reverse engineering because many malware samples rely on standard calling conventions.

The skeleton program below can be used as the starting point for any program you want to develop:

; skel.asm
%include "asm_io.inc"
segment .data ; Output strings
;
; initialized data is put in the data segment here
;

segment .bss
;
; uninitialized data is put in the bss segment
;

segment .text
global asm_main
asm_main:
enter 0, 0 ; setup routine
pusha

; do something

popa
mov eax, 0 ; return back to C
leave
ret

The author of the book provides three support files that are used throughout the book:

  • cdecl.h
  • cdecl.c
  • asm_io.inc

The source code of these files is available in this GitHub repository.

The book was published years ago, and the platform the author used differs from today’s platforms, so some of the original build instructions can produce unexpected errors. After some investigation, the corrected build procedure is shown below:

nasm -f elf32 -d ELF_TYPE asm_io.asm
nasm -f elf32 -d ELF_TYPE skel.asm
gcc -m32 -o skel skel.o driver.c asm_io.o

./skel

Chapter 2 - Basic Assembly Language

2.1 - Integer Operations

The program below demonstrates integer arithmetic together with the book's I/O routines:

%include "asm_io.inc"
segment .data ; Output strings
prompt db "Enter a number: ", 0
square_msg db "Square of input is ", 0
cube_msg db "Cube of input is ", 0
cube25_msg db "Cube of input times 25 is ", 0
quot_msg db "Quotient of cube/100 is ", 0
rem_msg db "Remainder of cube/100 is ", 0
neg_msg db "The negation of the remainder is ", 0

segment .bss
input resd 1

segment .text
global asm_main
asm_main:
enter 0, 0 ; setup routine
pusha

mov eax, prompt ; eax = address of the prompt string
call print_string ; call print_string function

call read_int ; call read_int function
mov [input], eax ; store input value from eax into [input]

imul eax ; edx:eax = eax * eax
mov ebx, eax ; save answer in ebx
mov eax, square_msg
call print_string
mov eax, ebx
call print_int
call print_nl

mov ebx, eax
imul ebx, [input] ; ebx *= [input]
mov eax, cube_msg
call print_string
mov eax, ebx
call print_int
call print_nl

imul ecx, ebx, 25 ; ecx = ebx * 25
mov eax, cube25_msg
call print_string
mov eax, ecx
call print_int
call print_nl

mov eax, ebx
cdq ; initialize edx by sign extension
mov ecx, 100
idiv ecx
mov ecx, eax
mov eax, quot_msg
call print_string
mov eax, ecx
call print_int
call print_nl
mov eax, rem_msg
call print_string
mov eax, edx
call print_int
call print_nl

neg edx ; negate the remainder
mov eax, neg_msg
call print_string
mov eax, edx
call print_int
call print_nl

popa
mov eax, 0
leave
ret

2.2 - Control Structures

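The add and sub instructions set the carry flag when a carry or a borrow occurs. The adc and sbb instructions use this information in the carry flag. The adc instruction performs the following operation:

operand1 = operand1 + carry flag + operand2

The sbb instruction performs:

operand1 = operand1 - carry flag - operand2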

How are they used? Consider the sum of 64-bit integers in EDX:EAX and EBX:ECX. The following code would store the sum in EDX:EAX:

add eax, ecx ; add lower 32-bits
adc edx, ebx ; add upper 32-bits and carry from previous sum

Subtraction is very similar. The following code subtracts EBX:ECX from EDX:EAX:

sub eax, ecx ; subtract lower 32-bits
sbb edx, ebx ; subtract upper 32-bits and borrow

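For really large numbers, a loop could be used. For a sum loop, it is convenient to use the adc instruction in every iteration, including the first; clearing the carry flag with clc right before the loop starts makes that first adc behave like a plain add.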
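
To make the carry and borrow handling concrete, here is a minimal C sketch that emulates the add/adc and sub/sbb pairs on two 32-bit halves. The type pair64 and the functions add64 and sub64 are hypothetical names of mine, not something from the book:

```c
#include <stdint.h>

/* A 64-bit value split the way EDX:EAX is: a high half and a low half. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} pair64;

/* Emulates "add eax, ecx" followed by "adc edx, ebx". */
pair64 add64(pair64 a, pair64 b)
{
    pair64 r;
    uint32_t carry;

    r.lo = a.lo + b.lo;          /* add: wraps modulo 2^32 */
    carry = (r.lo < a.lo);       /* CF = 1 when the low sum wrapped around */
    r.hi = a.hi + b.hi + carry;  /* adc: upper halves plus the carry */
    return r;
}

/* Emulates "sub eax, ecx" followed by "sbb edx, ebx". */
pair64 sub64(pair64 a, pair64 b)
{
    pair64 r;
    uint32_t borrow;

    r.lo = a.lo - b.lo;          /* sub: wraps modulo 2^32 */
    borrow = (a.lo < b.lo);      /* CF = 1 when a borrow was needed */
    r.hi = a.hi - b.hi - borrow; /* sbb: upper halves minus the borrow */
    return r;
}
```

Adding 1 to the pair (hi = 1, lo = 0FFFFFFFFh) wraps the low half to 0 and carries into the high half, giving (hi = 2, lo = 0).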

Comparison

In assembly, comparison does not directly return a boolean value. Instead, the result is stored in the FLAGS register.

This is different from high-level languages like C, where comparisons return true/false.

Instead:

  • cmp performs subtraction internally
  • The result is reflected in FLAGS (ZF, CF, SF, OF)

This means that control flow depends on how we interpret these flags.

For unsigned integers, two flags matter: the zero flag (ZF) and the carry flag (CF). When the difference vleft - vright is computed, the flags are set accordingly. If the difference is zero (vleft = vright), then ZF is set (i.e., 1) and CF is unset (i.e., 0). If vleft > vright, then ZF is unset and CF is unset (no borrow). If vleft < vright, then ZF is unset and CF is set (borrow).

For signed integers, three flags are important: the zero flag (ZF), the overflow flag (OF), and the sign flag (SF). The overflow flag is set if the result of an operation overflows (or underflows). The sign flag is set if the result of an operation is negative. If vleft = vright, ZF is set (just as for unsigned integers). If vleft > vright, ZF is unset and SF = OF. If vleft < vright, ZF is unset and SF != OF.

Why does SF = OF if vleft > vright? If there is no overflow, then the difference will have the correct value and must be non-negative. Thus, SF = OF = 0. However, if there is an overflow, the difference will not have the correct value (and in fact will be negative). Thus, SF = OF = 1.
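
The flag rules above can be checked with a small emulation. This is only a sketch under my own naming (flags, cmp32, and the three predicates are hypothetical helpers); it computes the four flags the way cmp would for 32-bit operands and then evaluates the conditions that ja, jge, and jl test:

```c
#include <stdbool.h>
#include <stdint.h>

/* The four flags discussed above, as left behind by "cmp vleft, vright". */
typedef struct {
    bool zf, cf, sf, of;
} flags;

flags cmp32(uint32_t vleft, uint32_t vright)
{
    flags f;
    uint32_t diff = vleft - vright;   /* cmp subtracts and discards the result */

    f.zf = (diff == 0);               /* result is zero */
    f.cf = (vleft < vright);          /* unsigned borrow */
    f.sf = (diff >> 31) != 0;         /* sign bit of the result */
    /* Signed overflow: the operands have different signs and the sign of
       the result differs from the sign of vleft. */
    f.of = (((vleft ^ vright) & (vleft ^ diff)) >> 31) != 0;
    return f;
}

bool unsigned_above(flags f) { return !f.cf && !f.zf; }  /* condition of ja  */
bool signed_ge(flags f)      { return f.sf == f.of; }    /* condition of jge */
bool signed_less(flags f)    { return f.sf != f.of; }    /* condition of jl  */
```

Comparing 7FFFFFFFh with 0FFFFFFFFh (that is, INT_MAX with -1) is an instance of the overflow case: the difference wraps to 80000000h, so SF = OF = 1 and the signed "greater or equal" condition still holds.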

An example is shown below:

if (EAX == 0)
EBX = 1;
else
EBX = 2;

The following example demonstrates how conditional branching is implemented using FLAGS:

    cmp eax, 0 ; set flags (ZF is set if eax - 0 = 0)
jz thenblock ; if ZF is set branch to thenblock
mov ebx, 2 ; ELSE part of IF
jmp next

thenblock:
mov ebx, 1
next:

Another example is shown below:

if (EAX >= 5)
EBX = 1;
else
EBX = 2;

If EAX is greater than or equal to five, ZF may be set or unset and SF will equal OF. Using only the branches that test a single flag, the pseudo-code can be translated as follows:

    cmp eax, 5
js signon ; goto signon if SF = 1
jo elseblock ; goto elseblock if OF = 1 and SF = 0
jmp thenblock ; goto thenblock if SF = 0 and OF = 0

signon:
jo thenblock

elseblock:
mov ebx, 2
jmp next

thenblock:
mov ebx, 1

next:

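The above code is awkward. Fortunately, the 80x86 provides additional branch instructions that make these types of tests much easier. With jge (jump if greater than or equal, signed), which branches when SF = OF, the same test becomes: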

    cmp eax, 5
jge thenblock
mov ebx, 2 ; else
jmp next

thenblock:
mov ebx, 1

next:

My Takeaway

The key idea here is that assembly does not have “true/false”.
Everything is driven by FLAGS.

This explains why reverse engineering requires understanding
how conditions are implemented, not just what they mean.

Loop

The 80x86 provides several instructions designed to implement for-like loops:

  • loop: Decrements ECX; if ECX is not equal to 0, branches to label
  • loope, loopz: Decrements ECX (FLAGS register is not modified); if ECX is not equal to 0 and ZF = 1, branches to label
  • loopne, loopnz: Decrements ECX (FLAGS register is not modified); if ECX is not equal to 0 and ZF = 0, branches to label

An example is shown below:

sum = 0;
for (i = 10; i > 0; i--)
sum += i;

The pseudo-code can be translated into assembly as follows:

    mov eax, 0 ; eax is sum
mov ecx, 10 ; ecx is i
loop_start:
add eax, ecx
loop loop_start
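
One detail of loop that is easy to miss is that the body runs before ECX is tested. The C sketch below mirrors the listing above with a do/while; sum_with_loop is a hypothetical name, and the local variables are named after the registers only for readability:

```c
#include <stdint.h>

/* Mirrors: mov eax, 0 / mov ecx, count / loop_start: add eax, ecx / loop loop_start */
uint32_t sum_with_loop(uint32_t count)
{
    uint32_t eax = 0;       /* mov eax, 0 */
    uint32_t ecx = count;   /* mov ecx, count */

    do {
        eax += ecx;         /* add eax, ecx: the body runs before the test */
    } while (--ecx != 0);   /* loop: decrement ECX, branch if it is not zero */

    return eax;
}
```

Because the decrement happens first, starting with ECX = 0 wraps to 0FFFFFFFFh and the body runs 2^32 times instead of zero times. When the count can be zero, a jecxz in front of the loop skips it.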

2.4 - Example: Finding Prime Numbers

unsigned guess;
unsigned factor;
unsigned limit;

printf("Find primes up to: ");
scanf("%u", &limit);
printf("2\n");
printf("3\n");
guess = 5;
while (guess <= limit) {
factor = 3;
while (factor * factor < guess && guess % factor != 0)
factor += 2;

if (guess % factor != 0)
printf("%u\n", guess);
guess += 2;
}

This pseudo-code can be translated into assembly as follows:

%include "asm_io.inc"
segment .data
Message db "Find primes up to: ", 0

segment .bss
Limit resd 1 ; find primes up to this limit
Guess resd 1 ; the current guess for prime

segment .text
global asm_main

asm_main:
enter 0, 0
pusha

mov eax, Message
call print_string
call read_int ; scanf("%u", &limit);
mov [Limit], eax
mov eax, 2 ; printf("2\n");
call print_int
call print_nl
mov eax, 3 ; printf("3\n");
call print_int
call print_nl

mov dword [Guess], 5
while_limit: ; while ( Guess <= Limit )
mov eax, [Guess]
cmp eax, [Limit]
jnbe end_while_limit

mov ebx, 3
while_factor:
mov eax, ebx
mul eax ; edx:eax = eax * eax
jo end_while_factor
cmp eax, [Guess]
jnb end_while_factor
mov eax, [Guess]
mov edx, 0
div ebx ; eax = edx:eax / ebx, edx = edx:eax % ebx
cmp edx, 0
je end_while_factor

add ebx, 2 ; factor += 2;
jmp while_factor

end_while_factor:
je end_if
mov eax, [Guess]
call print_int
call print_nl
end_if:
add dword [Guess], 2
jmp while_limit

end_while_limit:

popa
mov eax, 0
leave
ret

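Note: This program uses the unsigned branch instructions (jnbe, jnb) because guess, factor, and limit are unsigned. Choosing between the signed and the unsigned branches is how assembly code expresses the type of the integers being compared.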

Chapter 3 - Bit Operations

Shift Operations

Logical shifts

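A logical shift moves the bits of its operand left (shl) or right (shr) and fills the vacated positions with zeros. The number of positions to shift can either be a constant or be stored in the CL register. The last bit shifted out of the data is stored in the carry flag.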

An example is shown below:

mov ax, 0C123H
shl ax, 1 ; shift 1 bit to left, ax = 8246H, CF = 1
shr ax, 1 ; shift 1 bit to right, ax = 4123H, CF = 0
shr ax, 1 ; shift 1 bit to right, ax = 2091H, CF = 1
mov ax, 0C123H
shl ax, 2 ; shift 2 bits to left, ax = 048CH, CF = 1
mov cl, 3
shr ax, cl ; shift 3 bits to right, ax = 0091H, CF = 1
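
The values in the comments above can be reproduced with a short emulation of 16-bit logical shifts. This is a sketch with hypothetical names (shift_result, shl16, shr16), and it assumes a shift count between 1 and 15:

```c
#include <stdint.h>

/* Result of a 16-bit shift: the new value and the carry flag. */
typedef struct {
    uint16_t value;
    int cf;
} shift_result;

/* Emulates "shl ax, count" for 1 <= count <= 15. */
shift_result shl16(uint16_t ax, unsigned count)
{
    shift_result r;
    r.cf = (ax >> (16 - count)) & 1;            /* last bit shifted out on the left */
    r.value = (uint16_t)((uint32_t)ax << count); /* keep only the low 16 bits */
    return r;
}

/* Emulates "shr ax, count" for 1 <= count <= 15. */
shift_result shr16(uint16_t ax, unsigned count)
{
    shift_result r;
    r.cf = (ax >> (count - 1)) & 1;             /* last bit shifted out on the right */
    r.value = (uint16_t)(ax >> count);          /* zeros enter from the left */
    return r;
}
```

For example, shl16(0xC123, 2) yields 048Ch with CF = 1, because the last bit pushed out is bit 14 of the original value, which is set.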

Arithmetic shifts

An arithmetic shift treats its operand as a signed number. The left shift is unchanged: sal is a synonym for shl. For right shifts, however, the leftmost bit (the sign bit) is replicated in the vacated positions to preserve the sign of the number.

The instructions are shown below:

  • sal: Shift arithmetic left
  • sar: Shift arithmetic right

Rotate shifts

The rotate instructions (rol and ror) work like shifts, but unlike logical or arithmetic shifts, no bits are lost, and there is no padding with zeros (or the sign bit). The bits shifted off one end “wrap around” to the opposite side.
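
To compare the arithmetic right shift with the rotates, here is a minimal C sketch on 16-bit values. The function names sar16, rol16, and ror16 are mine, the count is assumed to be between 1 and 15, and the carry flag is left out to keep it short:

```c
#include <stdint.h>

/* Emulates "sar ax, count": the vacated bits are copies of the sign bit. */
uint16_t sar16(uint16_t ax, unsigned count)
{
    uint16_t shifted = (uint16_t)(ax >> count);
    if (ax & 0x8000u)                                   /* negative value */
        shifted |= (uint16_t)(0xFFFFu << (16 - count)); /* fill the top with 1s */
    return shifted;
}

/* Emulates "rol ax, count": bits leaving on the left re-enter on the right. */
uint16_t rol16(uint16_t ax, unsigned count)
{
    return (uint16_t)(((uint32_t)ax << count) | (ax >> (16 - count)));
}

/* Emulates "ror ax, count": bits leaving on the right re-enter on the left. */
uint16_t ror16(uint16_t ax, unsigned count)
{
    return (uint16_t)((ax >> count) | ((uint32_t)ax << (16 - count)));
}
```

Starting from 0C123h, rol16 by 1 gives 8247h, and ror16 by 1 brings it back to 0C123h. Note also that sar is not the same as signed division for negative odd numbers: sar16 on 0FFF9h (-7) by one position gives 0FFFCh (-4), while -7 / 2 in C is -3.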

Observations

Bit operations are extremely efficient because each one maps directly to a single, simple CPU instruction.

Compared to multiplication or division:

  • Shifts are faster
  • Often used as low-level optimizations

For example:

  • shl eax, 1 -> equivalent to eax * 2
  • shr eax, 1 -> equivalent to eax / 2 (unsigned only)

These patterns are commonly seen in performance-critical code.

Chapter 4 - Subprograms

Simple Subprogram Example

A subprogram is an independent unit of code that can be used from different parts of a program. In other words, a subprogram is like a function in C.

A jump can be used to invoke the subprogram, but returning presents a problem. If the subprogram is to be used by different parts of the program, it must return to the section of code that invoked it. Thus, the jump back from the subprogram cannot be hard-coded to a label. The program below works around this by passing the return address to the subprogram in ECX:

; file: sub1.asm
; Subprogram example program
%include "asm_io.inc"

segment .data
prompt1 db "Enter a number: ", 0 ; don't forget null terminator
prompt2 db "Enter another number: ", 0
outmsg1 db "You entered: ", 0
outmsg2 db " and ", 0
outmsg3 db ", the sum of these is ", 0

segment .bss
input1 resd 1
input2 resd 1

segment .text
global asm_main
asm_main:
enter 0, 0 ; setup routine
pusha

mov eax, prompt1 ; print out prompt
call print_string

mov ebx, input1 ; store address of input1 into ebx
mov ecx, ret1 ; store return address into ecx
jmp short get_int

ret1:
mov eax, prompt2 ; print out prompt
call print_string

mov ebx, input2
mov ecx, $ + 7 ; ecx = this address + 7
jmp short get_int

mov eax, [input1] ; eax = dword at input1
add eax, [input2] ; eax += dword at input2
mov ebx, eax ; ebx = eax

mov eax, outmsg1
call print_string ; print out first message
mov eax, [input1]
call print_int ; print out input1
mov eax, outmsg2
call print_string ; print out second message
mov eax, [input2]
call print_int ; print out input2
mov eax, outmsg3
call print_string ; print out third message
mov eax, ebx
call print_int ; print out sum (ebx)
call print_nl ; print new-line

popa
mov eax, 0 ; return back to C
leave
ret

; subprogram get_int
; Parameters
; ebx: address of dword to store integer into
; ecx: address of instruction to return to
; Notes:
; value of eax is destroyed
get_int:
call read_int
mov [ebx], eax ; store input into memory
jmp ecx ; jump back to caller

The Stack

The SS segment register specifies the segment that contains the stack (usually the same segment that data is stored in). The ESP register contains the address of the data that would be removed from the stack, that is, the top of the stack.

The push instruction inserts a double word on the stack by subtracting 4 (a double word is 4 bytes) from ESP and then storing the double word at [ESP]. The pop instruction reads the double word at [ESP] and then adds 4 to ESP.

An example is shown below:

; Assume that ESP is initially 1000h
push dword 1 ; 1 stored at 0FFCh, ESP = 0FFCh
push dword 2 ; 2 stored at 0FF8h, ESP = 0ff8h
push dword 3 ; 3 stored at 0FF4h, ESP = 0FF4h
pop eax ; EAX = 3, ESP = 0FF8h
pop ebx ; EBX = 2, ESP = 0FFCh
pop ecx ; ECX = 1, ESP = 1000h
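
The same sequence can be traced in C with a toy model of the stack. Everything here is a hypothetical sketch of mine (machine, machine_init, push32, pop32), with no bounds checking, and it only models the memory below the initial ESP of 1000h:

```c
#include <stdint.h>
#include <string.h>

#define STACK_TOP 0x1000u   /* initial ESP, as assumed in the example above */

typedef struct {
    uint8_t  mem[STACK_TOP];  /* simulated memory below the initial ESP */
    uint32_t esp;             /* simulated stack pointer */
} machine;

void machine_init(machine *m)
{
    memset(m->mem, 0, sizeof m->mem);
    m->esp = STACK_TOP;
}

/* push: subtract 4 from ESP, then store the double word at [ESP]. */
void push32(machine *m, uint32_t value)
{
    m->esp -= 4;
    memcpy(&m->mem[m->esp], &value, sizeof value);
}

/* pop: read the double word at [ESP], then add 4 to ESP. */
uint32_t pop32(machine *m)
{
    uint32_t value;
    memcpy(&value, &m->mem[m->esp], sizeof value);
    m->esp += 4;
    return value;
}
```

Pushing 1, 2, and 3 moves esp to 0FFCh, 0FF8h, and 0FF4h; the three pops then return 3, 2, and 1 and leave esp back at 1000h, which is the last-in, first-out behavior the listing shows.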

If the stack is also used inside the subprogram to store data, the offset that must be added to ESP to reach a parameter will change. Thus, it can be very complex to use ESP when referencing parameters. To solve this problem, the 80386 supplies another register to use: EBP. This register’s only purpose is to reference data on the stack.

The C calling convention mandates that a subprogram first save the value of EBP on the stack and then set EBP equal to ESP. This allows ESP to change as data is pushed onto or popped off the stack without modifying EBP. At the end of the subprogram, the original value of EBP must be restored.

After the subprogram is over, the parameters that were pushed on the stack must be removed. The C calling convention specifies that the caller code must do this. Other conventions are different (e.g., the Pascal calling convention specifies that the subprogram must remove the parameters).

; file: sub3.asm

%include "asm_io.inc"

segment .data
sum dd 0

segment .bss
input resd 1

; pseudo-code algorithm
; i = 1;
; sum = 0;
; while (get_int(i, &input), input != 0) {
; sum += input;
; i++;
; }
; print_sum(sum);

segment .text
global asm_main
asm_main:
enter 0, 0 ; setup routine
pusha

mov edx, 1 ; edx is 'i' in pseudo-code
while_loop:
push edx ; save i in stack
push dword input
call get_int
add esp, 8 ; remove i and &input from stack (cleanup is done by the caller)

mov eax, [input]
cmp eax, 0
je end_while

add [sum], eax

inc edx ; i++
jmp short while_loop

end_while:
push dword [sum]
call print_sum
pop ecx

popa
mov eax, 0
leave
ret

; subprogram get_int
; Parameters (in order pushed on stack)
; number of input (at [ebp+12])
; address of word to store input into (at [ebp + 8])
; Notes:
; values of eax and ebx are destroyed
segment .data
prompt db ") Enter an integer number (0 to quit): ", 0

segment .text
get_int:
push ebp
mov ebp, esp ; store ESP value into EBP

mov eax, [ebp + 12]
call print_int

mov eax, prompt
call print_string

call read_int
mov ebx, [ebp + 8]
mov [ebx], eax ; store input into memory

pop ebp
ret

segment .data
result db "The sum is ", 0

segment .text
print_sum:
push ebp
mov ebp, esp

mov eax, result
call print_string

mov eax, [ebp+8]
call print_int
call print_nl

pop ebp
ret

Interfacing Assembly with C

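There are two ways to mix assembly with C: inline assembly and separately assembled subroutines called from C. The syntax of inline assembly differs between compilers: Borland and Microsoft require MASM format, while DJGPP and Linux’s gcc require GAS format. The technique of calling an assembly subroutine, by contrast, is much more standardized on the PC.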

Calculating addresses of local variables

If x is located at EBP - 8 on the stack, one cannot just use:

mov eax, ebp - 8

The value that mov stores into eax must be computed by the assembler (that is, it must in the end be a constant). However, there is an instruction that does the desired calculation:

lea eax, [ebp - 8]

Now eax holds the address of x and could be pushed on the stack when calling the function.

Note: The lea instruction never reads memory. It only computes the address that another instruction would read and stores this address in its first operand, a register. Since it does not actually read any memory, no memory size designation (e.g., dword) is needed or allowed.

Other calling conventions

The GCC compiler allows different calling conventions. For example, to declare a void function named f that takes a single int parameter and uses the standard (cdecl) calling convention, use the following syntax for its prototype:

void f(int) __attribute__((cdecl));

The function above could be declared to use the stdcall convention by replacing cdecl with stdcall. The difference between stdcall and cdecl is that stdcall requires the subroutine to remove the parameters from the stack (as the Pascal calling convention does). Thus, the stdcall convention can only be used with functions that take a fixed number of arguments (i.e., not ones like printf and scanf).

Borland and Microsoft use a common syntax to declare calling conventions. They add the __cdecl and __stdcall keywords to C. For example, the function f above would be defined as follows for Borland and Microsoft:

void __cdecl f(int);

The advantage of the stdcall convention is that it uses less memory than cdecl, because no stack cleanup code is needed after each call instruction.

Examples

//main5.c
#include <stdio.h>

void calc_sum(int, int *) __attribute__((cdecl));

int main(void)
{
int n, sum;

printf("Sum integers up to: ");
scanf("%d", &n);
calc_sum(n, &sum);
printf("Sum is %d\n", sum);

return 0;
}

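The calc_sum routine that main5.c calls is implemented in assembly below. The leading underscore in _calc_sum follows the book's naming for DJGPP and Windows targets; in ELF objects on Linux, C symbols have no underscore prefix, so the label would be plain calc_sum there: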

; subroutine _calc_sum
; finds the sum of the integers 1 through n
; Parameters:
; n: what to sum up to (at [ebp + 8])
; sump - pointer to int to store sum into (at [ebp + 12])
; pseudo C code:
; void calc_sum(int n, int *sump)
; {
; int i, sum = 0;
; for (i = 1; i <= n; i++)
; sum += i;
; *sump = sum;
; }

%include "asm_io.inc" ; defines the dump_stack macro used below

segment .text
global _calc_sum
; local variable:
; sum at [ebp-4]
_calc_sum:
enter 4, 0 ; make room for sum on stack
push ebx ; important!

mov dword [ebp-4], 0
dump_stack 1, 2, 4 ; print out stack from ebp-8 to ebp+16
mov ecx, 1 ; ecx is i in pseudo code
for_loop:
cmp ecx, [ebp+8] ; cmp i and n
jnle end_for ; if not i <= n, quit

add [ebp-4], ecx ; sum += i
inc ecx ; i++
jmp short for_loop
end_for:
mov ebx, [ebp+12] ; ebx = sump
mov eax, [ebp-4] ; eax = sum
mov [ebx], eax ; *sump = eax = sum

pop ebx ; restore ebx
leave
ret

Key Insight

The stack frame structure is one of the most important concepts in low-level programming.

By understanding how EBP and ESP are used:

  • function parameters can be identified
  • local variables can be reconstructed
  • function boundaries become clearer

This is fundamental when analyzing compiled binaries.

Chapter 5 - Arrays

Introduction

Defining arrays in the data and bss segments

segment .data
; define an array of 10 double words initialized to 1,2,...,10
a1 dd 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

; define an array of 10 words initialized to 0
a2 dw 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

; same as before using TIMES
a3 times 10 dw 0

; define an array of bytes with 200 0's and then 100 1's
a4 times 200 db 0
times 100 db 1

segment .bss
; define an array of 10 uninitialized double words
a5 resd 10

; define an array of 100 uninitialized words
a6 resw 100

Example

The example below shows how to use an array and pass it to a function.

%define ARRAY_SIZE 100
%define NEW_LINE 10

segment .data
FirstMsg db "First 10 elements of array", 0
Prompt db "Enter index of element to display: ", 0
SecondMsg db "Element %d is %d", NEW_LINE, 0
ThirdMsg db "Elements 20 through 29 of array", 0
InputFormat db "%d", 0

segment .bss
array resd ARRAY_SIZE

segment .text
extern _puts, _printf, _scanf, _dump_line
global asm_main
asm_main:
enter 4, 0 ; local dword variable at EBP - 4
push ebx ; saved here and restored by the matching pop ebx before leave
push esi

; initialize array to 100, 99, 98, 97, ...,
mov ecx, ARRAY_SIZE
mov ebx, array
init_loop:
mov [ebx], ecx ; assign value
add ebx, 4 ; next address (+4 byte)
loop init_loop ; ecx--, then jump to init_loop unless ecx is 0

push dword FirstMsg ; print out FirstMsg
call _puts
pop ecx

push dword 10
push dword array
call _print_array ; print first 10 elements of array
add esp, 8

; prompt user for element index
Prompt_loop:
push dword Prompt
call _printf
pop ecx

lea eax, [ebp-4] ; eax = address of local dword
push eax
push dword InputFormat
call _scanf
add esp, 8
cmp eax, 1 ; eax = return value of scanf
je InputOK

call _dump_line ; input invalid: discard the rest of the line
jmp Prompt_loop ; and ask again
InputOK:
mov esi, [ebp-4]
push dword [array+4*esi]
push esi
push dword SecondMsg ; print out value of element
call _printf
add esp, 12

push dword ThirdMsg ; print out elements 20-29
call _puts
pop ecx

push dword 10
push dword array + 20*4 ; address of array[20]
call _print_array
add esp, 8

pop esi
pop ebx
mov eax, 0 ; return back to C
leave
ret

; routine _print_array
; C-callable routine that prints out elements of a double word array as signed integers.
; C prototype:
; void print_array(const int *a, int n);
segment .data
OutputFormat db "%-5d %5d", NEW_LINE, 0

segment .text
global _print_array
_print_array:
enter 0, 0
push esi
push ebx

xor esi, esi ; esi = 0
mov ecx, [ebp+12] ; ecx = n
mov ebx, [ebp+8] ; ebx = address of array
print_loop:
push ecx
push dword [ebx+4*esi] ; push array[esi]
push esi
push dword OutputFormat
call _printf
add esp, 12

inc esi
pop ecx
loop print_loop

pop ebx
pop esi
leave
ret

Array access in assembly is essentially pointer arithmetic.

For example:

  • [ebx + 4*esi] means:
    • base address = ebx
    • index = esi
    • element size = 4 bytes (dword)

Thus: array[i] -> base + i * sizeof(element)
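
The same address calculation can be written out by hand in C. This is only an illustrative sketch: element_address and load_element are hypothetical helpers, and the code assumes a mainstream platform where converting between pointers and uintptr_t round-trips:

```c
#include <stddef.h>
#include <stdint.h>

/* Address formed by the operand [base + 4*index] for an array of dwords. */
uintptr_t element_address(const int32_t *base, size_t index)
{
    return (uintptr_t)base + index * sizeof(int32_t);
}

/* Counterpart of "mov eax, [ebx + 4*esi]": load the dword at that address. */
int32_t load_element(const int32_t *base, size_t index)
{
    return *(const int32_t *)element_address(base, index);
}
```

For an array initialized to 100, 99, 98, 97, ..., load_element(array, 3) returns 97, and element_address(array, 3) equals the address the compiler computes for &array[3].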

The lea instruction revisited

The lea instruction can be used for other purposes than just calculating addresses. A fairly common one is for fast computations. Consider the following:

lea ebx, [4*eax+eax]

This effectively stores the value of 5 * eax into ebx. Using lea to do this is both easier and faster than using mul. However, one must realize that the expression inside the square brackets must be a legal indirect address: the index register can only be scaled by 1, 2, 4, or 8. Thus, this instruction cannot be used to multiply by 6 quickly, because that would require a scale factor of 5.
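
As a small illustration of the addressing arithmetic, the C sketch below spells out what lea computes for 5 * eax and one possible two-instruction way to reach 6 * eax. The function names are hypothetical, and the two-step sequence in the comment is my own example rather than something taken from the book:

```c
#include <stdint.h>

/* lea ebx, [4*eax + eax]  ->  ebx = base + 4 * index = 5 * eax */
uint32_t lea_times5(uint32_t eax)
{
    return eax + 4u * eax;
}

/* 6 * eax in two steps, each of which is a legal operation:
     lea ebx, [2*eax + eax]  ; ebx = 3 * eax
     add ebx, ebx            ; ebx = 6 * eax */
uint32_t times6(uint32_t eax)
{
    uint32_t ebx = eax + 2u * eax;
    return ebx + ebx;
}
```

Like the real instruction, the arithmetic wraps modulo 2^32, so lea_times5(0xFFFFFFFF) gives 0FFFFFFFBh.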

Skipped Sections

The later chapters (e.g., classes, polymorphism, and floating point) were intentionally skipped in this note.

These topics are less relevant to my current focus on:

  • low-level execution
  • reverse engineering
  • malware analysis

They may be revisited in the future if needed.

Conclusion — Why This Book Matters for Reverse Engineering (My Perspective)

From a reverse engineering perspective, the most important takeaways so far are:

  • Calling convention -> identify function arguments
  • Stack frame -> reconstruct local variables
  • FLAGS -> understand control flow
  • Bit operations -> recognize compiler optimizations

These concepts directly appear in disassembly and are essential when analyzing malware.

THANKS FOR READING