articles » current » assembly » x64 » tutorial » page-2

x86-64 Assembly: Tutorial - A Quick Guide to the Changes in 64-bit Assembly - Page 2

Instructions Compared to x86, Microsoft's x64 Calling Convention

Be familiar with x86 assembly as the differences between x86 and x64 assembly are minor.

This code was compiled and linked using the Microsoft Macro Assembler for Visual Studio 2022.

Instructions Compared to x86

Just like in x86, instruction operands must be the same size and at least one must be a register.


cmp rbx, ecx               ; cannot do this
cmp rdx, rbx               ; okay

mov rbx, ecx               ; cannot do this
mov ebx, rax               ; cannot do this
mov ebx, QWORD PTR [rax]   ; cannot do this
mov rbx, 12h               ; okay
mov rdx, rbx               ; okay
mov rbx, QWORD PTR [rax]   ; okay
mov ebx, DWORD PTR [eax]   ; okay

add rbx, ecx               ; cannot do this
add rbx, rcx               ; okay

xchg rbx, ecx              ; cannot do this
xchg rbx, rcx              ; okay

div 10h                    ; cannot do this
mov [rbx], QWORD PTR [rax] ; cannot do this

Microsoft's x64 Calling Convention

Passing function/procedure parameters in x64 uses the "fastcall" calling convention. This means the first four QWORD parameters are passed in the rcx, rdx, r8 and r9 registers. More than four parameters and the stack is used to pass them to the function. The same follows for DWORD parameters except ecx, edx, r8d and r9d are used. See the following diagram:

ParamQWORDDWORDWORDBYTE
1rcxecxcxcl
2rdxedxdxdl
3r8r8dr8wr8b
4r9r9dr9wr9b
5+uses the stack
*x64 fastcall diagram

Microsoft adds something called shadow space to the stack. This is done by subtracting 32 from the RSP register. This shadow space represents the four 64-bit variables (32 bytes) that would have been passed on the stack if not for passing them in registers. This space is not wasted, it can be used to save ECX, EDX, r8 and r9 in the event a sub-procedure needs to be called, particularly a recursive one. So every time the stack is used, it must be done with 32 (or 20h) added to it. This is done in many of the examples below.

Consider the following C/C++ code:


// delcare the assembly function
extern "C" long long addx64(long long a, long long b, long long c, long long d, long long* num1, long long* num2, int *num3, int num4, int num5);

int main()
{
	long long num1 = 0x1000, num2 = 0x2000;
	int num3 = 0x4000;
	long long x = addx64(0x1, 0x2, 0x3, 0x4, &num1, &num2, &num3, 0x8000, 0x10000);
	return 0;
}

Parameters 1-4 are passed in 64-bit registers; parameters 5-8 are passed on the stack:


_TEXT	SEGMENT
addx64 PROC

	cmp rcx, 1h                ; parameter 1 (a - passed in rcx)
	jne exit
	mov rax, rcx
	cmp rdx, 2h                ; parameter 2 (b - passed in rdx)
	jne exit
	add rax, rdx
	cmp r8, 3h                 ; parameter 3 (c - passed in r8)
	jne exit
	add rax, r8
	cmp r9, 4h                 ; parameter 4 (d - passed in r9)
	jne exit
	add rax, r9
	mov r10, [rsp+(1+0)*8+32]  ; parameter 5 (&num1 - passed on the stack)
	add rax, [r10]
	mov r10, [rsp+(2+0)*8+32]  ; parameter 6 (&num2 - passed on the stack)
	add rax, [r10]
	mov r10, [rsp+(3+0)*8+32]  ; parameter 7 (&num3 - passed on the stack)
	mov r10d, [r10]
	add rax, r10
	mov r10d, [rsp+(4+0)*8+32] ; parameter 7 (num4 - passed on the stack)
	add rax, r10
	mov r10d, [rsp+(5+0)*8+32] ; parameter 8 (num5 - passed on the stack)
	add rax, r10

exit:
	ret
addx64 ENDP
_TEXT	ENDS
END

Consider the following C/C++ code:


// delcare the assembly function using six 32-bit parameters this time and three 64-bit values (pointers)
extern "C" int addx64(int a, int b, int c, int d, int* num1, int* num2, int *num3, int num4, int num5);

int main()
{
	int num1 = 0x1000, num2 = 0x2000;
	int num3 = 0x4000;
	int x = addx64(0x1, 0x2, 0x3, 0x4, &num1, &num2, &num3, 0x8000, 0x10000);
	return 0;
}

Parameters 1-4 are passed in 32-bit registers; parameters 5-8 are passed on the stack:


_TEXT	SEGMENT
addx64 PROC

	cmp ecx, 1h                          ; parameter 1 (a - passed in ecx)
	jne exit
	mov eax, ecx
	cmp edx, 2h                          ; parameter 2 (b - passed in edx)
	jne exit
	add eax, edx
	cmp r8d, 3h                          ; parameter 3 (c - passed in r8d)
	jne exit
	add eax, r8d
	cmp r9d, 4h                          ; parameter 4 (d - passed in r9d)
	jne exit
	add eax, r9d

	mov r10, QWORD PTR [rsp+(1+0)*8+32]  ; parameter 5 (&num1 - passed on the stack)
	mov r10d, DWORD PTR [r10]
	add eax, r10d

	mov r10, QWORD PTR [rsp+(2+0)*8+32]  ; parameter 6 (&num2 - passed on the stack)
	mov r10d, DWORD PTR [r10]
	add eax, r10d

	mov r10, QWORD PTR [rsp+(3+0)*8+32]  ; parameter 7 (&num3 - passed on the stack)
	mov r10d, DWORD PTR [r10]
	add eax, r10d

	mov r10d, DWORD PTR [rsp+(4+0)*8+32] ; parameter 7 (num4 - passed on the stack)
	add eax, r10d

	mov r10d, DWORD PTR [rsp+(5+0)*8+32] ; parameter 8 (num5 - passed on the stack)
	add eax, r10d

exit:
	ret
addx64 ENDP
_TEXT	ENDS
END

*Notice that the code is more easily understood using QWORD PTR and DWORD PTR but, as with x86 assembly, the x64 assembler knows which value is being copied based on the first operand.

The same holds true for passing WORD and BYTE values; that is the first four parameters are passed in 16- and 8-bit registers, respectively, and then the rest are passed on the stack.

Consider the following C/C++ code which passes four different sized parameters:


extern "C" long long addx64(char a, short b, int c, long long d);

int main()
{
	long long val = addx64(1, 2, 4, 8);
	return 0;
}

It is as simple as this to handle an 8-, 16-, 32- and 64-bit values.


_TEXT	SEGMENT
addx64 PROC

	xor rax, rax ; zero-out rax
	add al, cl   ; add bytes
	add ax, dx   ; add words
	add eax, r8d ; add dwords
	add rax, r9  ; add qwords

	ret

addx64 ENDP
_TEXT	ENDS
END
<<<[Page 2 of 3]>>>

This site uses cookies. Cookies are simple text files stored on the user's computer. They are used for adding features and security to this site. Read the privacy policy.
CLOSE