[Book] PC Assembly Language

Background

It has been a while since my last article of book notes.

If you have never read one of those articles, here is a short introduction.

I use this type of article to keep notes on the books I read. Unlike my other articles, such as the malware analysis ones, this is a “notebook”.

The content will be continuously updated.

Introduction

I started reading PC Assembly Language to strengthen my understanding of low-level execution, which is essential for analyzing bootkits (e.g., Petya), rootkits, and memory corruption vulnerabilities.

Unlike typical summaries, this article serves as a long-term notebook. However, I will also highlight concepts that are directly useful for malware analysis.

The content will be continuously updated as I read through the book.

The Book

PC Assembly Language

Why This Matters for Malware Analysis

Assembly language is not just a programming language; it is the ground truth of how programs execute.

For example:

  • Bootkits run before the OS is loaded, so analyzing them requires an understanding of low-level execution
  • Shellcode directly manipulates registers and memory
  • Reverse engineering often requires reading compiler output in assembly

Therefore, understanding calling conventions, stack layout, and register usage is critical.

Chapter 1 - Introduction

Key Concept: C Calling Convention

One important concept introduced in this chapter is the cdecl calling convention.

This convention defines:

  • How arguments are passed (stack)
  • Who cleans the stack (caller)
  • How return values are passed (EAX)

This is extremely important in reverse engineering because many malware samples rely on standard calling conventions.
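
To make the three rules concrete, here is a toy model in C, not real compiler output. The Machine struct and the functions push, callee_add, and caller are hypothetical names invented for this sketch, and the model leaves out the return address that a real call instruction would also push:

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical model of the two pieces of state cdecl cares about. */
typedef struct {
    uint32_t stack[STACK_WORDS];
    int      esp;   /* index of the top element; grows toward index 0 */
    uint32_t eax;   /* register that carries the return value */
} Machine;

void machine_init(Machine *m)
{
    m->esp = STACK_WORDS;   /* empty stack */
    m->eax = 0;
}

static void push(Machine *m, uint32_t value)
{
    assert(m->esp > 0);     /* no room left would be a stack overflow */
    m->stack[--m->esp] = value;
}

/* Callee: reads its arguments from the stack, does not remove them,
   and leaves the result in eax. */
static void callee_add(Machine *m)
{
    uint32_t a = m->stack[m->esp];       /* first argument, pushed last   */
    uint32_t b = m->stack[m->esp + 1];   /* second argument, pushed first */
    m->eax = a + b;
}

/* Caller: pushes arguments right to left, calls, then cleans the stack. */
uint32_t caller(Machine *m, uint32_t a, uint32_t b)
{
    push(m, b);
    push(m, a);
    callee_add(m);
    m->esp += 2;            /* the caller removes both arguments */
    return m->eax;
}
```

On an initialized machine, caller(&m, 2, 3) yields 5 and leaves esp where it started, which is the "caller cleans the stack" rule in action.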

The skeleton program below can serve as the starting point for any program you want to develop:

; skel.asm
%include "asm_io.inc"
segment .data ; Output strings
;
; initialized data is put in the data segment here
;

segment .bss
;
; uninitialized data is put in the bss segment
;

segment .text
        global  asm_main
asm_main:
        enter   0, 0            ; setup routine
        pusha

        ; do something

        popa
        mov     eax, 0          ; return back to C
        leave
        ret

The book's author wrote three support files that are included or linked throughout the book:

  • cdecl.h
  • cdecl.c
  • asm_io.inc

The source code of these files is available in this GitHub repository.

The book was published years ago, and the platform the author used differs from today's platforms. As a result, some of the original build instructions can produce unexpected errors. After some investigation, the following procedure worked for me (the -f elf32 and -m32 options build 32-bit code):

nasm -f elf32 -d ELF_TYPE asm_io.asm
nasm -f elf32 -d ELF_TYPE skel.asm
gcc -m32 -o skel skel.o driver.c asm_io.o

./skel

Chapter 2 - Basic Assembly Language

2.1 - Integer Operations

The program below demonstrates the I/O routines from asm_io.inc together with the integer instructions imul, idiv, and neg:

%include "asm_io.inc"

segment .data                           ; Output strings
prompt          db      "Enter a number: ", 0
square_msg      db      "Square of input is ", 0
cube_msg        db      "Cube of input is ", 0
cube25_msg      db      "Cube of input times 25 is ", 0
quot_msg        db      "Quotient of cube/100 is ", 0
rem_msg         db      "Remainder of cube/100 is ", 0
neg_msg         db      "The negation of the remainder is ", 0

segment .bss
input           resd    1

segment .text
        global  asm_main
asm_main:
        enter   0, 0                ; setup routine
        pusha

        mov     eax, prompt         ; eax = address of the prompt string
        call    print_string        ; print the string eax points to

        call    read_int            ; read an integer into eax
        mov     [input], eax        ; store the value from eax into [input]

        imul    eax                 ; edx:eax = eax * eax
        mov     ebx, eax            ; save answer in ebx
        mov     eax, square_msg
        call    print_string
        mov     eax, ebx
        call    print_int
        call    print_nl

        mov     ebx, eax            ; eax still holds the square
        imul    ebx, [input]        ; ebx *= [input]
        mov     eax, cube_msg
        call    print_string
        mov     eax, ebx
        call    print_int
        call    print_nl

        imul    ecx, ebx, 25        ; ecx = ebx * 25
        mov     eax, cube25_msg
        call    print_string
        mov     eax, ecx
        call    print_int
        call    print_nl

        mov     eax, ebx
        cdq                         ; initialize edx by sign extension
        mov     ecx, 100
        idiv    ecx                 ; eax = edx:eax / ecx, edx = remainder
        mov     ecx, eax            ; save quotient in ecx
        mov     eax, quot_msg
        call    print_string
        mov     eax, ecx
        call    print_int
        call    print_nl
        mov     eax, rem_msg
        call    print_string
        mov     eax, edx
        call    print_int
        call    print_nl

        neg     edx                 ; negate the remainder
        mov     eax, neg_msg
        call    print_string
        mov     eax, edx
        call    print_int
        call    print_nl

        popa
        mov     eax, 0              ; return back to C
        leave
        ret
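
For comparison, the same sequence of operations can be sketched in C. This is an illustrative equivalent, not code from the book: MathResult and math_steps are made-up names, and the sketch assumes the input is small enough that the cube times 25 fits in 32 bits (signed overflow is undefined in C, whereas imul simply truncates). Since C99, integer division truncates toward zero, which matches idiv:

```c
#include <stdint.h>

/* Hypothetical container for every value the assembly program prints. */
typedef struct {
    int32_t square, cube, cube25, quot, rem, neg_rem;
} MathResult;

MathResult math_steps(int32_t input)
{
    MathResult r;
    r.square  = input * input;     /* imul eax              */
    r.cube    = r.square * input;  /* imul ebx, [input]     */
    r.cube25  = r.cube * 25;       /* imul ecx, ebx, 25     */
    r.quot    = r.cube / 100;      /* idiv: quotient (eax)  */
    r.rem     = r.cube % 100;      /* idiv: remainder (edx) */
    r.neg_rem = -r.rem;            /* neg edx               */
    return r;
}
```

For an input of -5 the remainder is -25, not 75: like idiv, C gives the remainder the sign of the dividend.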

2.2 - Control Structures

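The add and sub instructions set the carry flag when a carry or borrow occurs. The adc and sbb instructions use this information in the carry flag. The adc instruction performs the following operation:

    operand1 = operand1 + carry flag + operand2

The sbb instruction performs:

    operand1 = operand1 - carry flag - operand2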

How are they used? Consider the sum of 64-bit integers in EDX:EAX and EBX:ECX. The following code would store the sum in EDX:EAX:

add eax, ecx ; add lower 32-bits
adc edx, ebx ; add upper 32-bits and carry from previous sum

Subtraction is very similar. The following code subtracts EBX:ECX from EDX:EAX:

sub eax, ecx ; subtract lower 32-bits
sbb edx, ebx ; subtract upper 32-bits and borrow

For larger numbers, a loop can be used. In a sum loop, it is convenient to use adc for every iteration, including the first one. Clearing the carry flag with clc right before the loop makes that safe.
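
The loop idea can be sketched in C by carrying the overflow of each 32-bit "limb" into the next one. This is only a model of what adc does, under the assumption that numbers are stored least significant limb first; add_limbs is a hypothetical helper name:

```c
#include <stddef.h>
#include <stdint.h>

/* dst += src for n-limb numbers stored least significant limb first.
   Returns the carry out of the top limb (0 or 1). */
uint32_t add_limbs(uint32_t *dst, const uint32_t *src, size_t n)
{
    uint32_t carry = 0;                     /* plays the role of clc */
    for (size_t i = 0; i < n; i++) {
        /* adc: limb + limb + carry, computed in 64 bits so the
           carry out of bit 31 is not lost */
        uint64_t t = (uint64_t)dst[i] + src[i] + carry;
        dst[i] = (uint32_t)t;               /* low 32 bits of the sum */
        carry  = (uint32_t)(t >> 32);       /* new carry flag */
    }
    return carry;
}
```

With n = 2 this matches the add/adc pair above: adding 1 to the two-limb value {0xFFFFFFFF, 1} carries into the upper limb and gives {0, 2}.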

Comparison

In assembly, comparison does not directly return a boolean value. Instead, the result is stored in the FLAGS register.

This is different from high-level languages like C, where comparisons return true/false.

Instead:

  • cmp performs subtraction internally
  • The result is reflected in FLAGS (ZF, CF, SF, OF)

This means that control flow depends on how we interpret these flags.

For unsigned integers, two flags matter: the zero flag (ZF) and the carry flag (CF). The cmp instruction computes the difference vleft - vright and sets the flags accordingly. If the difference is zero (vleft = vright), ZF is set (i.e. 1) and CF is unset (i.e. 0). If vleft > vright, both ZF and CF are unset (no borrow). If vleft < vright, ZF is unset and CF is set (borrow).

For signed integers, three flags are important: the zero flag (ZF), the overflow flag (OF), and the sign flag (SF). The overflow flag is set if the result of an operation overflows (or underflows). The sign flag is set if the result of an operation is negative. If vleft = vright, ZF is set (just as for unsigned integers). If vleft > vright, ZF is unset and SF = OF. If vleft < vright, ZF is unset and SF != OF.

Why does SF = OF if vleft > vright? If there is no overflow, then the difference will have the correct value and must be non-negative. Thus, SF = OF = 0. However, if there is an overflow, the difference will not have the correct value (and in fact will be negative). Thus, SF = OF = 1.
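
These rules can be checked with a small C model of cmp on 32-bit operands. It is a sketch for experimenting, not how the CPU is implemented; Flags, cmp32, unsigned_below, and signed_less are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four flags discussed above. */
typedef struct { bool zf, cf, sf, of; } Flags;

/* Model of `cmp left, right`: compute left - right, keep only the
   flags, and discard the difference. */
Flags cmp32(uint32_t left, uint32_t right)
{
    uint32_t diff = left - right;   /* wraps modulo 2^32 */
    Flags f;
    f.zf = (diff == 0);
    f.cf = (left < right);          /* unsigned borrow */
    f.sf = (diff >> 31) & 1;        /* top bit of the result */
    /* Signed overflow: the operands have different signs and the
       result's sign differs from the left operand's sign. */
    f.of = (((left ^ right) & (left ^ diff)) >> 31) & 1;
    return f;
}

/* Unsigned "below" and signed "less" expressed as flag tests. */
bool unsigned_below(Flags f) { return f.cf; }
bool signed_less(Flags f)    { return f.sf != f.of; }
```

Comparing 0x7FFFFFFF with 0xFFFFFFFF is the overflow case: SF = OF = 1, so as signed values (the largest positive number versus -1) the left side is greater, while CF is set because as unsigned values the left side is below the right.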

An example is shown below:

if (EAX == 0)
    EBX = 1;
else
    EBX = 2;

The following example demonstrates how conditional branching is implemented using FLAGS:

        cmp     eax, 0          ; set flags (ZF is set if eax - 0 = 0)
        jz      thenblock       ; if ZF is set branch to thenblock
        mov     ebx, 2          ; ELSE part of IF
        jmp     next

thenblock:
        mov     ebx, 1          ; THEN part of IF
next:

Another example is shown below:

if (EAX >= 5)
    EBX = 1;
else
    EBX = 2;

If EAX is greater than or equal to five, ZF may be set or unset and SF will equal OF. Assuming EAX is signed, the pseudo code can be translated by testing these flags directly:

        cmp     eax, 5
        js      signon          ; goto signon if SF = 1
        jo      elseblock       ; goto elseblock if OF = 1 and SF = 0
        jmp     thenblock       ; goto thenblock if SF = 0 and OF = 0

signon:
        jo      thenblock       ; goto thenblock if SF = 1 and OF = 1

elseblock:
        mov     ebx, 2
        jmp     next

thenblock:
        mov     ebx, 1

next:

The above code is awkward. Fortunately, the 80x86 provides additional branch instructions that make these types of tests much easier. With jge (jump if greater than or equal, signed), the same test becomes:

        cmp     eax, 5
        jge     thenblock
        mov     ebx, 2          ; else
        jmp     next

thenblock:
        mov     ebx, 1

next:

Loop

The 80x86 provides several instructions designed to implement for-like loops:

  • loop: Decrements ECX; if ECX is not equal to 0, branches to label
  • loope, loopz: Decrements ECX (FLAGS register is not modified); if ECX is not equal to 0 and ZF = 1, branches to label
  • loopne, loopnz: Decrements ECX (FLAGS register is not modified); if ECX is not equal to 0 and ZF = 0, branches to label
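
The three branch conditions can be written down as small C predicates. This is only a model for reasoning about the instructions; loop_taken, loope_taken, loopne_taken, and sum_down_from are hypothetical names, and ZF is passed in as a plain bool:

```c
#include <stdbool.h>
#include <stdint.h>

/* Each helper decrements *ecx first and then reports whether the
   branch back to the label would be taken. */
bool loop_taken(uint32_t *ecx)
{
    return --*ecx != 0;
}

bool loope_taken(uint32_t *ecx, bool zf)    /* also loopz */
{
    return --*ecx != 0 && zf;
}

bool loopne_taken(uint32_t *ecx, bool zf)   /* also loopnz */
{
    return --*ecx != 0 && !zf;
}

/* A counted loop built on loop_taken: adds n + (n - 1) + ... + 1. */
uint32_t sum_down_from(uint32_t n)
{
    uint32_t eax = 0;       /* running sum */
    uint32_t ecx = n;       /* loop counter */
    do {
        eax += ecx;
    } while (loop_taken(&ecx));
    return eax;
}
```

sum_down_from(10) returns 55. As with the real instruction, starting with ECX = 0 wraps around and iterates 2^32 times, so the counter should be at least 1 on entry.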

An example is shown below:

sum = 0;
for (i = 10; i > 0; i--)
    sum += i;

This pseudo code translates to:

        mov     eax, 0          ; eax is sum
        mov     ecx, 10         ; ecx is i
loop_start:
        add     eax, ecx
        loop    loop_start

2.4 - Example: Finding Prime Numbers

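#include <stdio.h>

int main(void)
{
    unsigned guess;   /* current guess for prime */
    unsigned factor;  /* possible factor of guess */
    unsigned limit;   /* find primes up to this value */

    printf("Find primes up to: ");
    if (scanf("%u", &limit) != 1)
        return 1;     /* no number was entered */
    printf("2\n");    /* treat the first two primes as a special case */
    printf("3\n");
    guess = 5;
    while (guess <= limit) {
        /* look for a factor of guess */
        factor = 3;
        while (factor * factor < guess && guess % factor != 0)
            factor += 2;

        if (guess % factor != 0)
            printf("%u\n", guess);
        guess += 2;   /* only look at odd numbers */
    }
    return 0;
}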

The C code above translates to the following assembly:

%include "asm_io.inc"

segment .data
Message db      "Find primes up to: ", 0

segment .bss
Limit   resd    1                       ; find primes up to this limit
Guess   resd    1                       ; the current guess for prime

segment .text
        global  asm_main
asm_main:
        enter   0, 0                    ; setup routine
        pusha

        mov     eax, Message
        call    print_string
        call    read_int                ; scanf("%u", &limit);
        mov     [Limit], eax
        mov     eax, 2                  ; printf("2\n");
        call    print_int
        call    print_nl
        mov     eax, 3                  ; printf("3\n");
        call    print_int
        call    print_nl

        mov     dword [Guess], 5        ; guess = 5;
while_limit:                            ; while ( Guess <= Limit )
        mov     eax, [Guess]
        cmp     eax, [Limit]
        jnbe    end_while_limit         ; unsigned: exit if Guess > Limit

        mov     ebx, 3                  ; ebx is factor = 3;
while_factor:
        mov     eax, ebx
        mul     eax                     ; edx:eax = eax * eax
        jo      end_while_factor        ; if the product does not fit in eax alone
        cmp     eax, [Guess]
        jnb     end_while_factor        ; if !(factor*factor < guess)
        mov     eax, [Guess]
        mov     edx, 0
        div     ebx                     ; edx = edx:eax % ebx
        cmp     edx, 0
        je      end_while_factor        ; if !(guess % factor != 0)

        add     ebx, 2                  ; factor += 2;
        jmp     while_factor

end_while_factor:
        je      end_if                  ; if !(guess % factor != 0)
        mov     eax, [Guess]            ; printf("%u\n", guess);
        call    print_int
        call    print_nl
end_if:
        add     dword [Guess], 2        ; guess += 2;
        jmp     while_limit

end_while_limit:

        popa
        mov     eax, 0                  ; return back to C
        leave
        ret

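Note: Guess, Limit, and the factor are unsigned, so the program uses the unsigned branch instructions jnbe and jnb instead of the signed jg and jge. Comparing the two families is a good way to see how the CPU interprets the same register bits as signed or unsigned integers.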
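
The effect of picking a signed or an unsigned branch can be sketched in C by comparing the same bit pattern both ways. The function names taken_jge and taken_jnb are hypothetical, and the cast to int32_t assumes the usual two's complement conversion that mainstream compilers implement:

```c
#include <stdbool.h>
#include <stdint.h>

/* Would `cmp a, b` followed by jge branch? Both operands are
   reinterpreted as signed 32-bit values. */
bool taken_jge(uint32_t a, uint32_t b)
{
    return (int32_t)a >= (int32_t)b;
}

/* Would `cmp a, b` followed by jnb (also spelled jae) branch?
   Both operands are treated as unsigned. */
bool taken_jnb(uint32_t a, uint32_t b)
{
    return a >= b;
}
```

For a = 0xFFFFFFFF and b = 5 the two disagree: jge sees -1 and does not branch, while jnb sees 4294967295 and branches. Using the wrong family on attacker-controlled values is a classic source of bounds-check bugs.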

Chapter 3 - Bit Operations

Shift Operations

Logical shifts

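A logical shift moves the bits of its operand left (shl) or right (shr) and fills the vacated positions with zeros. The number of positions to shift can either be a constant or be stored in the CL register. The last bit shifted out of the data is stored in the carry flag.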

An example is shown below:

mov ax, 0C123H
shl ax, 1 ; shift 1 bit to left, ax = 8246H, CF = 1
shr ax, 1 ; shift 1 bit to right, ax = 4123H, CF = 0
shr ax, 1 ; shift 1 bit to right, ax = 2091H, CF = 1
mov ax, 0C123H
shl ax, 2 ; shift 2 bits to left, ax = 048CH, CF = 1
mov cl, 3
shr ax, cl ; shift 3 bits to right, ax = 0091H, CF = 1
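
The values in the comments above can be reproduced with a small C model of 16-bit shl and shr that also tracks the carry flag. ShiftResult, shl16, and shr16 are hypothetical names for this sketch, which only handles shift counts from 1 to 15:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair: the shifted value plus the carry flag. */
typedef struct { uint16_t value; bool cf; } ShiftResult;

/* Model of `shl ax, n` for 1 <= n <= 15. */
ShiftResult shl16(uint16_t x, unsigned n)
{
    assert(n >= 1 && n <= 15);
    ShiftResult r;
    r.cf    = (x >> (16 - n)) & 1;      /* last bit shifted out on the left */
    r.value = (uint16_t)(x << n);       /* zeros enter on the right */
    return r;
}

/* Model of `shr ax, n` for 1 <= n <= 15. */
ShiftResult shr16(uint16_t x, unsigned n)
{
    assert(n >= 1 && n <= 15);
    ShiftResult r;
    r.cf    = (x >> (n - 1)) & 1;       /* last bit shifted out on the right */
    r.value = (uint16_t)(x >> n);       /* zeros enter on the left */
    return r;
}
```

For instance, shl16(0xC123, 2) gives 0x048C with CF = 1, matching the listing.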

Arithmetic shifts

For left shifts, nothing changes: sal is a synonym for shl. For right shifts, however, the leftmost bit (the sign bit) is replicated in the vacated positions to preserve the sign of the number.

The instructions are:

  • sal: Shift arithmetic left
  • sar: Shift arithmetic right
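
Right-shifting a negative signed value is implementation-defined in C, so the sign replication is spelled out by hand in the sketch below. sar16 is a hypothetical helper that models `sar ax, n` for shift counts from 0 to 15; the final cast back to int16_t assumes the usual two's complement conversion:

```c
#include <assert.h>
#include <stdint.h>

/* Model of `sar ax, n` for 0 <= n <= 15. */
int16_t sar16(int16_t x, unsigned n)
{
    assert(n <= 15);
    uint16_t bits    = (uint16_t)x;
    uint16_t shifted = (uint16_t)(bits >> n);      /* logical shift first */
    if (x < 0 && n > 0) {
        /* fill the n vacated top positions with copies of the sign bit */
        shifted |= (uint16_t)(0xFFFFu << (16 - n));
    }
    return (int16_t)shifted;
}
```

sar16(-8, 1) is -4, whereas a logical shr of the same bits (0xFFF8) would give 0x7FFC, a large positive number. That is why sar, not shr, is the correct way to halve a signed value.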

Rotate shifts

Unlike logical or arithmetic shifts, no bits are lost, and there is no padding with zeros (or sign bit). The bits “wrap around” to the opposite side.
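
C has no rotate operator, so a rotation is usually built from two shifts and an OR. The sketch below models the x86 rol and ror instructions on 16-bit values; rol16 and ror16 are hypothetical helper names, and the carry flag is not modeled:

```c
#include <stdint.h>

/* Rotate left: bits leaving on the left re-enter on the right. */
uint16_t rol16(uint16_t x, unsigned n)
{
    n &= 15;                    /* rotating by 16 is a no-op */
    if (n == 0)
        return x;
    return (uint16_t)((x << n) | (x >> (16 - n)));
}

/* Rotate right: bits leaving on the right re-enter on the left. */
uint16_t ror16(uint16_t x, unsigned n)
{
    n &= 15;
    if (n == 0)
        return x;
    return (uint16_t)((x >> n) | (x << (16 - n)));
}
```

rol16(0xC123, 1) is 0x8247: the top bit that a shl would have dropped comes back as the lowest bit. Rotating right by the same amount restores the original value, which is one reason rotates show up so often in hashing and obfuscation routines.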

Boolean Bitwise Operations

Chapter 4 - Subprograms

Simple Subprogram Example

A subprogram is an independent unit of code that can be used from different parts of a program. In other words, a subprogram is like a function in C.

A jump can be used to invoke the subprogram, but returning presents a problem. If the subprogram is to be used by different parts of the program, it must return to the section of code that invoked it. Thus, the jump back from the subprogram cannot be hard-coded to a label. The example below solves this by passing the return address in ECX and returning with an indirect jump (jmp ecx):

; file: sub1.asm
; Subprogram example program
%include "asm_io.inc"

segment .data
prompt1 db      "Enter a number: ", 0       ; don't forget null terminator
prompt2 db      "Enter another number: ", 0
outmsg1 db      "You entered: ", 0
outmsg2 db      " and ", 0
outmsg3 db      ", the sum of these is ", 0

segment .bss
input1  resd    1
input2  resd    1

segment .text
        global  asm_main
asm_main:
        enter   0, 0                ; setup routine
        pusha

        mov     eax, prompt1        ; print out prompt
        call    print_string

        mov     ebx, input1         ; store address of input1 into ebx
        mov     ecx, ret1           ; store return address into ecx
        jmp     short get_int

ret1:
        mov     eax, prompt2        ; print out prompt
        call    print_string

        mov     ebx, input2
        mov     ecx, $ + 7          ; ecx = this address + 7 (5-byte mov + 2-byte jmp)
        jmp     short get_int

        mov     eax, [input1]       ; eax = dword at input1
        add     eax, [input2]       ; eax += dword at input2
        mov     ebx, eax            ; ebx = eax

        mov     eax, outmsg1
        call    print_string        ; print out first message
        mov     eax, [input1]
        call    print_int           ; print out input1
        mov     eax, outmsg2
        call    print_string        ; print out second message
        mov     eax, [input2]
        call    print_int           ; print out input2
        mov     eax, outmsg3
        call    print_string        ; print out third message
        mov     eax, ebx
        call    print_int           ; print out sum (ebx)
        call    print_nl            ; print new-line

        popa
        mov     eax, 0              ; return back to C
        leave
        ret

; subprogram get_int
; Parameters
;   ebx: address of dword to store integer into
;   ecx: address of instruction to return to
; Notes:
;   value of eax is destroyed
get_int:
        call    read_int
        mov     [ebx], eax          ; store input into memory
        jmp     ecx                 ; jump back to caller