CMPT 295 - Unit - Machine-Level Programming
Lecture 22:
- Buffer Overflow + Floating-point data & operations
Last lecture
- Manipulation of 2D arrays – in x86-64
- From x86-64’s perspective, a 2D array is a contiguously allocated region of R * C * L bytes in memory where L = sizeof( T ) and T -> data type of elements stored in array
- 2D Array layout in memory: Row-Major ordering
- Memory address of each row A[i]: A + (i * C * L)
- Memory address of each element A[i][j]: A + (i * C * L) + (j * L) => A + (i * C + j) * L
Today’s Menu
- Introduction
- C program -> assembly code -> machine level code
- Assembly language basics: data, move operation
- Memory addressing modes
- Operation leaq and Arithmetic & logical operations
- Conditional Statement – Condition Code + cmovX
- Loops
- Function call – Stack
- Overview of Function Call
- Memory Layout and Stack - x86-64 instructions and registers
- Passing control
- Passing data – Calling Conventions
- Managing local data
- Recursion
- Array
- (highlighted) Buffer Overflow
- (highlighted) Floating-point data & operations
Buffer Overflow
C and Stack … so far
- C does not perform any bound checks on arrays
- stored on the stack
- Local variables in C programs
- Callee and caller saved registers
- Return addresses
- As we saw in Lab 2 and Lab 4, this may lead to trouble
What kind of trouble? -> buffer overflow (overrun)
- If function does not perform bound-check when writing to a local array …
- Here is a an example of a bound-check:
if input size <= array size write input into array
… then it may write more data that the allocated space (to array) can hold, hence overflowing the array -> buffer overflow
- Here is a an example of a bound-check:
- Effect: the function may end up writing over, i.e., %rsp corrupting, data kept on the stack such as:
- Value of local variables and registers
- Return address
- Stack smashing
M[] Stack |
---|
... |
return address |
Unused stack space |
local var |
buf[ ] |
%rsp -> Top |
Demo the trouble -> buffer overflow
(Transcriber’s note: no content on slide)
Why is buffer overflow a problem
- Corrupted data
- Corrupted return address
- Which may lead to segmentation fault
- How?
- Which also makes a system vulnerable to attacks
- How?
- Which may lead to segmentation fault
Code injection attack
void func1(){
func2();
// C statement
at return
address A
...
}
int func2() {
char buf[64];
gets(buf);
...
return ...;
}
M[] Stack | Stack Frame/Note |
---|---|
… | |
func1 stack frame | |
return address | func1 stack frame |
… | func2 stack frame |
buf[64] | func2 stack frame |
same buf[64] section labled B | func2 stack frame |
same buf[64] | func2 stack frame |
%rsp | top |
- An “attacker” could overflow the
buffer … array of char’s
- … by inputting a string that contains byte representation of malicious executable code (exploit code) instead of legitimate characters
- The string is written to array buf on stack and overwrites return address A with a return address that points to exploit code
- When func2 executes ret instruction, it pops this erroneous return address onto PC (%rip) and jumps to exploit code
- Microprocessor starts executing the exploit code at this location
How to protection against such attack
- Avoid creating overflow vulnerabilities in the code that we write by always checking bounds
- For example, by calling library functions that limit string lengths
- “Unsafe” : gets(), strcpy(), strcat(), sprintf(), …
- These functions can generate a byte sequence without being given any indication of the size of the destination buffer (see next slide) * “Safe”: fgets()
From our Lab 4
void proc1(char *s, int *a, int *b) {
int y;
int t;
t = *a;
v = proc2(*a, *b);
sprintf(s, "The result of proc2(%d,%d) is %d.", *a, *b, v);
*a = *b - 2;
*b = t;
return;
Suggestion from developer.apple.com
char destination[5];
char * source = “LARGER”;
- `strcpy(destination, source);
-
Color Value White L White A White R White G White E Brown R Brown \0 Brown - Copies the string pointed to by source (including the null character) to the destination and returns it.
-
strncpy(destination, source, sizeof(destination))
-
Color Value White L White A White R White G White E Brown Brown Brown - Copies up to sizeof(destination) -> n characters from the string pointed to by source to destination. In a case where the length of source is less than n, the remainder of destination will be padded with null bytes. In a case where the length of source is greater than n, the destination will contain a truncated version of source.
-
strlcpy(destination, source, sizeof(destination))
-
Color Value White L White A White R White G White \0 Brown Brown Brown - Copies up to sizeof(destination) - 1 -> n - 1 characters from null-terminated source to destination, it then “null” terminates destination and returns the length of source.
-
https://linux.die.net/man/3/strlcpy
How to protection against such attack
2) Employ system-level protections -> Randomized stack offsets
- At start of program, system allocates random amount of space on stack
- Effect: Shifts stack addresses (%rsp) for
entire program
- Shifts the memory address of all the stack frames allocated to program’s functions when they are called
- Hence, makes it difficult for hackers to predict start of each stack frame (hence where exploit code may have been inserted) since stack is repositioned each time program executes
M[] | Note |
---|---|
(crossed off) %rsp | |
brown shaded box, no value | |
top | %rsp |
How to protection against such attack
2) Employ system-level protections -> Non-executable code segments
- In the old days of x86, memory
segments marked as either read-only
or writeable (both implied readable)
=> 2 types of permissions
- Could execute anything readable
- x86-64 has added an explicit executable permission
- Stack segment now marked as nonexecutable M[] Stack|Note —|— …| |func1 stack frame “return address A” (crossed out) B|func1 stack frame padding (crossed out)|func2 stack frame exploit code|func2 stack frame B|func2 stack framme Top|%rsp
Any attempt to execute the bottom “B” set of code, will fail.
How to protection against such attack
3) Compiler (like gcc) uses a stack canary value
- History: Starting early 1900’s, canaries used in the coal mines to detect gas leaks
- Push a randomized canary value between an array and return address on stack (remember our Lab 4)
- Before executing a ret instruction,
canary value is checked to see if it
has been corrupted
- If so, failure reported
main: # main.c from our Lab 4
endbr64
pushq %rbp
...
subq $64, %rsp
movq %fs:40, %rax
movq %rax, 56(%rsp)
...
leaq 16(%rsp), %rbp
...
movq 56(%rsp), %rax
xorq %fs:40, %rax
jne .L5
addq $64, %rsp
popq %rbp
ret
.L5:
call __stack_chk_fail@PLT
How to protection against such attack
3) Newest version of our gcc compiler (version 8 and up) uses Control-Flow Enforcement Technology (CET) From stackoverflow
- Instruction endbr64 (End Branch 64 bit) -> Terminate Indirect Branch in 64 bit
- Microprocessor tracks indirect branching
and ensures that all indirect calls lead to
(legal) functions starting with endbr64
- If function does -> microprocessor infers that function is safe to execute
- If function does not -> microprocessor infers that control flow may have been manipulated by some exploit code, i.e., function is unsafe to execute and aborts!
Brief overview of floating-point data and operations
(Transcriber’s node: no content on slide)
Background
- Once upon a time in the ’90’s …
- Use of computer graphics and image processing (multimedia)
applications were on the rise
- Microprocessors (i.e., machine instruction sets) designed to support such applications
- Idea: speed up microprocessors by executing single instruction on multiple data -> SIMD
- Use of computer graphics and image processing (multimedia)
applications were on the rise
- Since then, microprocessors and their machine instruction sets
have evolved …
- SSE (Streaming SIMD Extensions)
- AVX (Advanced Vector EXtensions) -> textbook
XMM Registers
x86-64 registers and instructions seen so far are referred to as integer registers and integer instructions Now we introduce a new set of registers for floating point numbers:
- 16 in total, each 16-byte wide (128 bits), named: %xmm0, %xmm1, …, %xmm15
- Scalar mode:
- 1 single-precision float (32 bits). Diagram of memory showing utilization
- 1 double-precision double (64 bits) 63. Diagram of memory showing utilization
- Vector mode (packed data)
- 16 single-byte integers
- 8 16-bit integers
- 4 32-bit integers
- 4 single-precision float’s
- 2 double-precision double’s
Scalar versus Vector (SIMD) instructions
Assembly Instruction | Operation Type | Percision | Note |
---|---|---|---|
addss %xmm0,%xmm1 |
scalar | single | Add single precision at the last 32 bits of %xmm0 to the last 32 bit of %xmm1 . Save in the last 32-bits of %xmm1 . |
addps %xmm0,%xmm1 |
SMID (packed) | single | Add 4 sets of single percision numbers. Each 32 bit section of %xmm0 is added to each 32 bit section of %xmm1 . |
addsd %xmm0,%xmm1 |
scalar | double | Add two double-precision numbers. Add the last 64 bits of %xmm0 to the last 64 bits of %xmm1 . |
addpd %xmm0,%xmm1 |
SMID (packed) | double | Add a pair of double-precision numbers. Add each 64 bit sections of %xmm0 to each 64 bit section of %xmm1 . Store results in %xmm1 . |
Data movement instructions
Assembly:
float_mov:
# --------# float float_mov(float f1,
#
float *src,
#
float *dst) {
# float f2 = *src;
# *dst = f1;
# return f2;
# }
# --------# f1 in %xmm0, src in %rdi, dst in %rsi
movss (%rdi), %xmm1 # f2 = *src
movss %xmm0, (%rsi) # *dst = f1
movaps %xmm1, %xmm0
# return value = f2
ret
- The instructions we shall look at in this lecture are different than the ones presented in section 3.11 of our textbook – we shall focus on the scalar version of these instructions
- movss – move single precision
- Mem (32 bits) <–> %xmm
- movsd – move double precision
- Mem (64 bits) <–> %xmm
- First 2 instructions of program: Memory referencing operands (i.e., memory addressing mode operands) specified in the same way as for the integer mov* instructions
- movaps/movapd – move %xmm <–> %xmm
- ap -> aligned packed
Function call and register saving conventions
- Function call convention
- Integer (and pointer i.e., memory address) arguments passed in integer registers
- Floating point values passed in XMM registers
- Argument 1 to argument 8 passed in %xmm0, %xmm1, …, %xmm7
- Result returned in %xmm0
- Register saving convention
- All XMM registers caller-saved
- Can use register
%xmm8
to%xmm15
for managing local data
Data conversion instructions
Converting between data types: (“t” is for “truncate”)
from | int | float | long | double |
---|---|---|---|---|
int | N/A | cvtsi2ss |
N/A | cvtsi2sd |
float | cvttss2si |
N/A | cvttss2siq |
cvtss2sd |
long | N/A | cvtsi2ssq |
N/A | cvtsi2sdq |
double | cvtsi2sd |
cvtsd2ss |
cvttsd2siq |
N/A |
Data manipulation instructions
Arithmetic
addss
/addsd
- floating point addsubss
/subsd
- … subtractmulss
/mulsd
- … muldivss
/divsd
- … div
Logical
andps
/andpd
orps/d
xorps/d
xorpd %xmm0, %xmm0
: effect%xmm0 <- 0
Comparison: ucomiss/d
- Affects only condition codes:
CF
,ZF
- use unsigned branches
- If NaN, set all of condition codes:
CF
,ZF
andPF
- Use
jp
/jnp
to branch onPF
- Use
Others
maxss
/maxsd
- … max- For example:
maxss %xmm3, %xmm5
Effect:xmm5 <- max(xmm5, xmm3)
- For example:
minss
/minsd
- … minsqrtss
/sqrtsd
- … square root
Example
fadd:
# --------
# float fadd(float x, float y){
# return x + y;
# }
# --------
# x in %xmm0, y in %xmm1
addss
%xmm1, %xmm0
ret
dadd:
# --------
# double dadd(double x, double y){
# return x + y;
# }
# --------
# x in %xmm0, y in %xmm1
addsd
%xmm1, %xmm0
ret
Storing Data in Various Segments of Memory - Optional
(Transcriber’s note: no content on slide)
Storing Data in Memory
This material is optional –> It is for your learning pleasure!
We already know about data on stack and on heap.
- Data on stack memory (on stack frame of function)
- Temporarily use and recycle
- Lasts through life of function call
- Data on heap
- Temporarily use and recycle
- Lasts until memory is “free’ed”
- Data in fixed memory, i.e., Data segment. What does this type of data look like?
- Statically allocated data
- e.g., global variables, static variables, string constants
- Lasts while program executes
- Statically allocated data
Data stored in Data Segment
This material is optional –> It is for your learning pleasure!
- Declared using a label & a directive for size
- label is a memory address
- size: .byte (1), .word (2), .long (4), .quad (8)
- initial value
Example 1:
C:
long x = 6;
long y = 9;
void main {
...
}
x86-64:
x: .quad 6 # 0x0000000000000006
y: .quad 9 # 0x0000000000000009
label | Stack | |||||||
---|---|---|---|---|---|---|---|---|
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
y | 09 (LSB; remember little endian) | 00 | 00 | 00 | 00 | 00 | 00 | 00 |
x | 06 (LSB) | 00 | 00 | 00 | 00 | 00 | 00 | 00 |
Data stored in Data Segment
This material is optional –> It is for your learning pleasure!
C:
#define N 6
int A[N] = {12,34,56,78,-90,1};
void main(){
printf("The total is %d.\n", sum_arrau(A,N));
return;
}
Assembly:
main:
.LFB38:
.cfi_startproc
subq $0,%rsp
.cfi_def_cfa_offset 16
movl $6,%esi
movl $A,%edi
call sum_array
movl %.LC0,%esi
movl %eax,%eax
movl $1,%edi
xorl %eax,%eax
addq $8,%rsp
.cfi_def_cfa_offset 8
jmp __printf_chk
...
A:
.long 12 # or .long 12,34,56,78,-90,1
.long 34
.long 56
.long 78
.long -90
.long 1
.ident "GCC: (Ubuntu 7.3.0-21ubuntu1-16.04) 7.3.0"
.section .note.GNU-stack,"",@progbits
Data stored on Stack – Example 1
This material is optional –> It is for your learning pleasure!
C:
void main(int argc, char* argv){
int A[] = {12,34,56,78,-90,1}; // 12 and 34 are highlighted.
printf("The total is %d.\n", sum_array(A,N));
return;
}
Assembly:
main:
.LFB38:
.cfi_startproc
subq $40,%rsp
.cfi_def_cfa_offset 48
movl $6,%esi
movq %fs:40,%rax
movq %rax,24(%rsp)
xorl %eax,%eax
movabsq $146028888076,%rax # highloghted
movq %rsp,%rdi
movq %rax,(%rsp)
movabsq $335007449144,%rax # highlighted
movq %rax,8(%rsp)
movabsq $8589934502,%rax # highlighted
movq %rax,16(%rsp)
call sum_array
movl $.LC0,%esi
movl %eax,%edx
movl $1,%edi
xorl %eax,%eax
call __printf_chk
movq 24(%rsp), %rax
xorq %fs:40,%rax
jne .L5
addq $40,%rsp
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
How does this large # end up representing 12 and 34:
- Express $146028888076 in binary
- Transform binary to hex => 0x000000220000000c
- Read hex’s LSB (32 bits) (0000000c) as a decimal => 12
- Read hex’s MSB (32 bits) (00000022) as a decimal => 34
- Repeat for other 2 operands of movabsq instructions
Summary - 1
- What is a buffer overflow
- When function writes more data in array than array can hold on stack
- Effect: data kept on the stack (value of other local variables and registers, return address) may be corrupted -> Stack smashing
- Why buffer overflow spells trouble -> it creates vulnerability
- Allowing hacker attacks
- How to protect system against such attacks
- Avoid creating overflow vulnerabilities in the code that we write
- By always checking bounds and calling “safe” library functions that consider size of array
- Employ system-level protections
- Randomized initial stack pointer and non-executable code segments
- Use compiler (like gcc) security features:
- Stack “canary” value and endbr64 instruction
- Avoid creating overflow vulnerabilities in the code that we write
Summary - 2
- Floating point data and operations
- Data held and manipulated in XMM registers
- Assembly language instructions similar to integer assembly language instructions we have seen so far
Next Lecture
Start a new unit …
- Instruction Set Architecture (ISA)