<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>CMPT 295 - Unit - Machine-Level Programming | tait.tech</title>
<link rel="stylesheet" href="/assets/css/style.css" id="main-stylesheet">
<meta name="viewport" content="width=device-width, initial-scale=1.0"><link rel="stylesheet" href="/assets/css/katex.css" id="math-stylesheet">
</head>
<body>
<main>
<div id="wrapper">
<h1 id="cmpt-295---unit---machine-level-programming">CMPT 295 - Unit - Machine-Level Programming</h1>
<p>Lecture 22:</p>
<ul>
<li>Buffer Overflow + Floating-point data &amp; operations</li>
</ul>
<h2 id="last-lecture">Last lecture</h2>
<ul>
<li>Manipulation of 2D arrays in x86-64
<ul>
<li>From x86-64's perspective, a 2D array is a contiguously
allocated region of R * C * L bytes in memory where
L = sizeof( T ) and T -&gt; data type of elements stored
in array</li>
<li>2D Array layout in memory: Row-Major ordering</li>
<li>Memory address of each row A[i]: A + (i * C * L)</li>
<li>Memory address of each element A[i][j]: A + (i * C * L) + (j * L) =&gt; A + (i * C + j) * L</li>
</ul>
</li>
</ul>
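<p>As a small check of the row-major formula above, the following C sketch computes the byte offset of A[i][j] by hand and compares it with the address the compiler produces. The names <code class="language-plaintext highlighter-rouge">element_offset</code> and <code class="language-plaintext highlighter-rouge">offset_matches</code> and the 3 x 4 int array are assumptions made for illustration; they do not come from the lecture.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stddef.h&gt;

/* Byte offset of A[i][j] from the start of a row-major array with C
   columns whose elements are L bytes each: (i * C + j) * L. */
size_t element_offset(size_t i, size_t j, size_t C, size_t L) {
    return (i * C + j) * L;
}

/* Returns 1 if the hand-computed address of A[i][j] matches the address
   the compiler computes, for a hypothetical 3 x 4 array of int. */
int offset_matches(size_t i, size_t j) {
    static int A[3][4];
    char *base = (char *)A;
    return base + element_offset(i, j, 4, sizeof(int)) == (char *)&amp;A[i][j];
}
</code></pre></div></div>
<p>For instance, with C = 4 and L = 4, element A[1][2] sits (1 * 4 + 2) * 4 = 24 bytes past the start of the array.</p>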
<h2 id="todays-menu">Today's Menu</h2>
<ul>
<li>Introduction
<ul>
<li>C program -&gt; assembly code -&gt; machine level code</li>
</ul>
</li>
<li>Assembly language basics: data, move operation
<ul>
<li>Memory addressing modes</li>
</ul>
</li>
<li>Operation leaq and Arithmetic &amp; logical operations</li>
<li>Conditional Statement - Condition Code + cmovX</li>
<li>Loops</li>
<li>Function call - Stack
<ul>
<li>Overview of Function Call</li>
<li>Memory Layout and Stack - x86-64 instructions and registers</li>
<li>Passing control</li>
<li>Passing data - Calling Conventions</li>
<li>Managing local data</li>
<li>Recursion</li>
</ul>
</li>
<li>Array</li>
<li>(highlighted) Buffer Overflow</li>
<li>(highlighted) Floating-point data &amp; operations</li>
</ul>
<h2 id="buffer-overflow">Buffer Overflow</h2>
<h2 id="c-and-stack--so-far">C and Stack … so far</h2>
<ul>
<li>C does not perform any bound checks on arrays</li>
<li>stored on the stack
<ul>
<li>Local variables in C programs</li>
<li>Callee and caller saved registers</li>
<li>Return addresses</li>
</ul>
</li>
<li>As we saw in Lab 2 and Lab 4, this may lead to trouble</li>
</ul>
<h2 id="what-kind-of-trouble---buffer-overflow-overrun">What kind of trouble? -&gt; buffer overflow (overrun)</h2>
<ul>
<li>If a function does not perform a bound check when writing to a local array, then it may write more data than the space allocated to the array can hold, hence overflowing the array -&gt; buffer overflow
<ul>
<li>Here is an example of a bound check: <code class="language-plaintext highlighter-rouge">if input size &lt;= array size
write input into array</code></li>
</ul>
</li>
<li>Effect: the function may end up writing over, i.e., corrupting, data kept on the stack such as:
<ul>
<li>Value of local variables and registers</li>
<li>Return address</li>
</ul>
</li>
<li>Stack smashing</li>
</ul>
<table>
<thead>
<tr>
<th>M[]<br />Stack</th>
</tr>
</thead>
<tbody>
<tr>
<td>...</td>
</tr>
<tr>
<td>return address</td>
</tr>
<tr>
<td>Unused stack space</td>
</tr>
<tr>
<td>local var</td>
</tr>
<tr>
<td>buf[ ]</td>
</tr>
<tr>
<td>%rsp -&gt; Top</td>
</tr>
</tbody>
</table>
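<p>The bound check described above can be sketched in C as follows. This is only a minimal illustration: the function name <code class="language-plaintext highlighter-rouge">checked_write</code> and the 8-byte buffer size are assumptions, not code from the lecture or the labs.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string.h&gt;

#define BUF_SIZE 8   /* hypothetical size of the local array */

/* Writes len bytes of input into a local array only if they fit.
   Returns the number of bytes written, or -1 if the input is too
   large and writing it would overflow the array. */
int checked_write(const char *input, size_t len) {
    char buf[BUF_SIZE];
    if (len &gt; sizeof(buf)) {
        return -1;               /* bound check failed: refuse to write */
    }
    memcpy(buf, input, len);     /* safe: len &lt;= sizeof(buf) */
    return (int)len;
}
</code></pre></div></div>
<p>Without the <code class="language-plaintext highlighter-rouge">if</code>, a 10-byte input would be written past the end of <code class="language-plaintext highlighter-rouge">buf</code> and into whatever the stack holds next.</p>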
<h2 id="demo-the-trouble---buffer-overflow">Demo the trouble -&gt; buffer overflow</h2>
<p>(Transcriber's note: no content on slide)</p>
<h2 id="why-is-buffer-overflow-a-problem">Why is buffer overflow a problem</h2>
<ul>
<li>Corrupted data</li>
<li>Corrupted return address
<ul>
<li>Which may lead to segmentation fault
<ul>
<li>How?</li>
</ul>
</li>
<li>Which also makes a system vulnerable to attacks
<ul>
<li>How?</li>
</ul>
</li>
</ul>
</li>
</ul>
<h2 id="code-injection-attack">Code injection attack</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void func1(){
    func2();
    // C statement at return address A
    ...
}
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int func2() {
    char buf[64];
    gets(buf);
    ...
    return ...;
}
</code></pre></div></div>
<table>
<thead>
<tr>
<th>M[] Stack</th>
<th>Stack Frame/Note</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td> </td>
</tr>
<tr>
<td> </td>
<td>func1 stack frame</td>
</tr>
<tr>
<td>return address</td>
<td>func1 stack frame</td>
</tr>
<tr>
<td></td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>buf[64]</td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>same buf[64] section labeled B</td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>same buf[64]</td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>%rsp</td>
<td>top</td>
</tr>
</tbody>
</table>
<ul>
<li>An “attacker” could overflow the
buffer … array of chars
<ul>
<li>… by inputting a string that contains byte representation of malicious executable code (exploit code) instead of legitimate characters</li>
<li>The string is written to array buf on stack and overwrites return address A with a return address that points to exploit code</li>
<li>When func2 executes its ret instruction, it pops this erroneous return address into the PC (%rip) and jumps to the exploit code</li>
<li>Microprocessor starts executing the exploit code at this location</li>
</ul>
</li>
</ul>
<h2 id="how-to-protection-against-such-attack">How to protect against such attacks</h2>
<ol>
<li>Avoid creating overflow vulnerabilities in the code that we write by always checking bounds
<ul>
<li>For example, by calling library functions that limit string lengths
<ul>
<li>“Unsafe”: gets(), strcpy(), strcat(), sprintf(), …
<ul>
<li>These functions can generate a byte sequence without being given any indication of the size of the destination buffer (see next slide)</li>
</ul>
</li>
<li>“Safe”: fgets()</li>
</ul>
</li>
</ul>
</li>
</ol>
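<p>To illustrate why fgets() is listed as “safe”, here is a short C sketch that reads one line into a fixed-size buffer. The helper name <code class="language-plaintext highlighter-rouge">read_line</code> is hypothetical; the only library calls used are the standard fgets() and strcspn().</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

/* Reads at most size - 1 characters of one line from stream into buf
   and strips the trailing newline, if any. Returns 0 on success and
   -1 on end of file or error. Unlike gets(), fgets() is told how big
   the destination buffer is, so it cannot write past its end. */
int read_line(char *buf, size_t size, FILE *stream) {
    if (size == 0 || fgets(buf, (int)size, stream) == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
</code></pre></div></div>
<p>A typical call would be <code class="language-plaintext highlighter-rouge">read_line(buf, sizeof(buf), stdin)</code>; input longer than the buffer is cut off rather than written over the rest of the stack frame.</p>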
<h2 id="from-our-lab-4">From our Lab 4</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void proc1(char *s, int *a, int *b) {
    int v;
    int t;
    t = *a;
    v = proc2(*a, *b);
    sprintf(s, "The result of proc2(%d,%d) is %d.", *a, *b, v);
    *a = *b - 2;
    *b = t;
    return;
}
</code></pre></div></div>
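<p>One possible way to bound the sprintf() call above is the standard snprintf(), which takes the size of the destination. The sketch below is an assumption about how that could look, not the Lab 4 solution; the wrapper name <code class="language-plaintext highlighter-rouge">format_result</code> and its extra <code class="language-plaintext highlighter-rouge">size</code> parameter are hypothetical.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdio.h&gt;

/* Formats the message into s without ever writing more than size bytes
   (including the terminating null character). Returns 1 if the whole
   message fit, 0 if it was truncated or an error occurred. */
int format_result(char *s, size_t size, int a, int b, int v) {
    int n = snprintf(s, size, "The result of proc2(%d,%d) is %d.", a, b, v);
    return n &gt;= 0 &amp;&amp; (size_t)n &lt; size;
}
</code></pre></div></div>
<p>The caller has to pass the real size of the array behind <code class="language-plaintext highlighter-rouge">s</code>, which is exactly the information that sprintf() never receives.</p>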
<h2 id="suggestion-from-developerapplecom">Suggestion from developer.apple.com</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>char destination[5];
char *source = "LARGER";
</code></pre></div></div>
<ol>
<li><code class="language-plaintext highlighter-rouge">strcpy(destination, source)</code>
<ul>
<li>
<table>
<thead>
<tr>
<th>Color</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>White</td>
<td>L</td>
</tr>
<tr>
<td>White</td>
<td>A</td>
</tr>
<tr>
<td>White</td>
<td>R</td>
</tr>
<tr>
<td>White</td>
<td>G</td>
</tr>
<tr>
<td>White</td>
<td>E</td>
</tr>
<tr>
<td>Brown</td>
<td>R</td>
</tr>
<tr>
<td>Brown</td>
<td>\0</td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
</tbody>
</table>
</li>
<li>Copies the string pointed
to by source (including
the null character) to the
destination and returns it.</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">strncpy(destination, source, sizeof(destination))</code>
<ul>
<li>
<table>
<thead>
<tr>
<th>Color</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>White</td>
<td>L</td>
</tr>
<tr>
<td>White</td>
<td>A</td>
</tr>
<tr>
<td>White</td>
<td>R</td>
</tr>
<tr>
<td>White</td>
<td>G</td>
</tr>
<tr>
<td>White</td>
<td>E</td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
</tbody>
</table>
</li>
<li>Copies up to
sizeof(destination) -&gt; n
characters from the
string pointed to by
source to destination. In
a case where the length
of source is less than n,
the remainder of
destination will be
padded with null bytes.
In a case where the
length of source is
greater than n, the
destination will contain
a truncated version of
source that is not null-terminated.</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">strlcpy(destination, source, sizeof(destination))</code>
<ul>
<li>
<table>
<thead>
<tr>
<th>Color</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>White</td>
<td>L</td>
</tr>
<tr>
<td>White</td>
<td>A</td>
</tr>
<tr>
<td>White</td>
<td>R</td>
</tr>
<tr>
<td>White</td>
<td>G</td>
</tr>
<tr>
<td>White</td>
<td>\0</td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
<tr>
<td>Brown</td>
<td> </td>
</tr>
</tbody>
</table>
</li>
<li>Copies up to
sizeof(destination) - 1
-&gt; n - 1 characters
from null-terminated
source to destination,
it then “null” terminates
destination and returns
the length of source.</li>
</ul>
</li>
</ol>
<p><a href="https://linux.die.net/man/3/strlcpy">https://linux.die.net/man/3/strlcpy</a></p>
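<p>strlcpy() comes from the BSD C library and may not be available on every system, so the following sketch imitates the behaviour described above using only standard C. The helper name <code class="language-plaintext highlighter-rouge">copy_bounded</code> is hypothetical and is meant as an illustration, not as a replacement for the library function.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string.h&gt;

/* strlcpy-like copy: copies at most size - 1 characters from src,
   always null-terminates dst (when size &gt; 0), and returns the length
   of src so the caller can detect truncation. */
size_t copy_bounded(char *dst, const char *src, size_t size) {
    size_t len = strlen(src);
    if (size &gt; 0) {
        size_t n = len &lt; size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
</code></pre></div></div>
<p>With the 5-byte <code class="language-plaintext highlighter-rouge">destination</code> and the source “LARGER” from the slide, this stores “LARG” plus a null character and returns 6; a return value greater than or equal to the buffer size signals truncation.</p>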
<h2 id="how-to-protection-against-such-attack-1">How to protect against such attacks</h2>
<p>2) Employ system-level protections -&gt; Randomized stack offsets</p>
<ul>
<li>At start of program, system allocates
random amount of space on stack</li>
<li>Effect: Shifts stack addresses (%rsp) for
entire program
<ul>
<li>Shifts the memory address of all the stack
frames allocated to the program's functions
when they are called</li>
</ul>
</li>
<li>Hence, makes it difficult for hackers to
predict start of each stack frame (hence
where exploit code may have been
inserted) since stack is repositioned each
time program executes</li>
</ul>
<table>
<thead>
<tr>
<th>M[]</th>
<th>Note</th>
</tr>
</thead>
<tbody>
<tr>
<td> </td>
<td>(crossed off) %rsp</td>
</tr>
<tr>
<td>brown shaded box, no value</td>
<td> </td>
</tr>
<tr>
<td>top</td>
<td>%rsp</td>
</tr>
</tbody>
</table>
<h2 id="how-to-protection-against-such-attack-2">How to protect against such attacks</h2>
<p>2) Employ system-level protections -&gt; Non-executable code segments</p>
<ul>
<li>In the old days of x86, memory
segments marked as either read-only
or writeable (both implied readable)
=&gt; 2 types of permissions
<ul>
<li>Could execute anything readable</li>
</ul>
</li>
<li>x86-64 has added an explicit
executable permission</li>
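<li>Stack segment now marked as non-executable
<table>
<thead>
<tr>
<th>M[] Stack</th>
<th>Note</th>
</tr>
</thead>
<tbody>
<tr>
<td>…</td>
<td> </td>
</tr>
<tr>
<td> </td>
<td>func1 stack frame</td>
</tr>
<tr>
<td>“return address A” (crossed out) B</td>
<td>func1 stack frame</td>
</tr>
<tr>
<td>padding (crossed out)</td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>exploit code</td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>B</td>
<td>func2 stack frame</td>
</tr>
<tr>
<td>Top</td>
<td>%rsp</td>
</tr>
</tbody>
</table>
</li>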
</ul>
<p>Any attempt to execute the exploit code at the bottom “B” will fail.</p>
<h2 id="how-to-protection-against-such-attack-3">How to protect against such attacks</h2>
<p>3) Compiler (like gcc) uses a stack canary value</p>
<ul>
<li>History: Starting early 1900s,
canaries used in the coal mines to
detect gas leaks</li>
<li>Push a randomized canary value
between an array and return
address on stack (remember our
Lab 4)</li>
<li>Before executing a ret instruction,
canary value is checked to see if it
has been corrupted
<ul>
<li>If so, failure reported</li>
</ul>
</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main: # main.c from our Lab 4
endbr64
pushq %rbp
...
subq $64, %rsp
movq %fs:40, %rax
movq %rax, 56(%rsp)
...
leaq 16(%rsp), %rbp
...
movq 56(%rsp), %rax
xorq %fs:40, %rax
jne .L5
addq $64, %rsp
popq %rbp
ret
.L5:
call __stack_chk_fail@PLT
</code></pre></div></div>
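<p>The canary idea can be imitated in plain C to see why an overflow gets detected. The sketch below is only a model: a struct stands in for the stack frame, the canary is a fixed demo constant rather than the randomized value the compiler reads from %fs:40, and all names are hypothetical.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string.h&gt;

/* Simplified model of a stack frame: a buffer followed by a canary. */
struct frame_model {
    char buf[8];
    unsigned long canary;
};

/* Copies len bytes into the model frame starting at buf, then checks
   the canary. Returns 1 if the canary is intact, 0 if it was
   overwritten. len is clamped to the size of the model so that the
   demo itself never writes outside the struct. */
int canary_intact_after_write(const char *input, size_t len) {
    const unsigned long secret = 0x5ca1ab1eUL;  /* arbitrary demo value */
    struct frame_model frame;
    memset(&amp;frame, 0, sizeof frame);
    frame.canary = secret;
    if (len &gt; sizeof frame) {
        len = sizeof frame;
    }
    memcpy(&amp;frame, input, len);  /* may run past buf into the canary */
    return frame.canary == secret;
}
</code></pre></div></div>
<p>Writing 8 bytes leaves the canary untouched, while writing 16 bytes spills over it, which is the situation in which the real check would call <code class="language-plaintext highlighter-rouge">__stack_chk_fail</code>.</p>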
<h2 id="how-to-protection-against-such-attack-4">How to protect against such attacks</h2>
<p>3) Newest version of our gcc compiler
(version 8 and up) uses Control-Flow
Enforcement Technology (CET) <a href="#sfo">From stackoverflow</a></p>
<ul>
<li>Instruction endbr64 (End Branch 64 bit) -&gt; Terminate Indirect Branch in 64 bit</li>
<li>Microprocessor tracks indirect branching
and ensures that all indirect calls lead to
(legal) functions starting with endbr64
<ul>
<li>If function does -&gt; microprocessor infers
that function is safe to execute</li>
<li>If function does not -&gt; microprocessor
infers that control flow may have been
manipulated by some exploit code, i.e.,
function is unsafe to execute and aborts!</li>
</ul>
</li>
</ul>
<div id="sfo">Source: <a href="https://stackoverflow.com/questions/56905811/what-does-the-endbr64-instruction-actually-do">https://stackoverflow.com/questions/56905811/what-does-the-endbr64-instruction-actually-do</a></div>
<h2 id="brief-overview-of-floating-point-data-and-operations">Brief overview of floating-point data and operations</h2>
<p>(Transcriber's note: no content on slide)</p>
<h2 id="background">Background</h2>
<ul>
<li>Once upon a time in the 90s …
<ul>
<li>Use of computer graphics and image processing (multimedia)
applications were on the rise
<ul>
<li>Microprocessors (i.e., machine instruction sets) designed to
support such applications</li>
<li>Idea: speed up microprocessors by executing single
instruction on multiple data -&gt; SIMD</li>
</ul>
</li>
</ul>
</li>
<li>Since then, microprocessors and their machine instruction sets
have evolved …
<ul>
<li>SSE (Streaming SIMD Extensions)</li>
<li>AVX (Advanced Vector EXtensions) -&gt; textbook</li>
</ul>
</li>
</ul>
<h2 id="xmm-registers">XMM Registers</h2>
<p>The x86-64 registers and instructions seen so far are referred to as integer registers and integer instructions.
Now we introduce a new set of registers for floating-point numbers:</p>
<ul>
<li>16 in total, each 16-byte wide (128 bits), named: %xmm0, %xmm1, …, %xmm15</li>
<li>Scalar mode:
<ul>
<li>1 single-precision float (32 bits). Diagram of memory showing <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>32</mn><mn>128</mn></mfrac><mo>=</mo><mfrac><mn>1</mn><mn>4</mn></mfrac></mrow><annotation encoding="application/x-tex">\frac{32}{128} = \frac{1}{4}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.190108em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.845108em;"><span style="top:-2.6550000000000002em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">128</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">32</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1.190108em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.845108em;"><span style="top:-2.6550000000000002em;"><span class="pstrut" 
style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">4</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> utilization</li>
<li>1 double-precision double (64 bits). Diagram of memory showing <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mn>64</mn><mn>128</mn></mfrac><mo>=</mo><mfrac><mn>1</mn><mn>2</mn></mfrac></mrow><annotation encoding="application/x-tex">\frac{64}{128} = \frac{1}{2}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.190108em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.845108em;"><span style="top:-2.6550000000000002em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">128</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">64</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:1.190108em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.845108em;"><span style="top:-2.6550000000000002em;"><span class="pstrut" 
style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> utilization</li>
</ul>
</li>
<li>Vector mode (packed data)
<ul>
<li>16 single-byte integers</li>
<li>8 16-bit integers</li>
<li>4 32-bit integers</li>
<li>4 single-precision floats</li>
<li>2 double-precision doubles</li>
</ul>
</li>
</ul>
<h2 id="scalar-versus-vector-simd-instructions">Scalar versus Vector (SIMD) instructions</h2>
<table>
<thead>
<tr>
<th>Assembly Instruction</th>
<th>Operation Type</th>
<th>Precision</th>
<th>Note</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="language-plaintext highlighter-rouge">addss %xmm0,%xmm1</code></td>
<td>scalar</td>
<td>single</td>
<td>Add the single-precision value in the low-order 32 bits of <code class="language-plaintext highlighter-rouge">%xmm0</code> to the low-order 32 bits of <code class="language-plaintext highlighter-rouge">%xmm1</code>. Save the result in the low-order 32 bits of <code class="language-plaintext highlighter-rouge">%xmm1</code>.</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">addps %xmm0,%xmm1</code></td>
<td>SIMD (packed)</td>
<td>single</td>
<td>Add 4 pairs of single-precision numbers. Each 32-bit section of <code class="language-plaintext highlighter-rouge">%xmm0</code> is added to the corresponding 32-bit section of <code class="language-plaintext highlighter-rouge">%xmm1</code>.</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">addsd %xmm0,%xmm1</code></td>
<td>scalar</td>
<td>double</td>
<td>Add two double-precision numbers. Add the low-order 64 bits of <code class="language-plaintext highlighter-rouge">%xmm0</code> to the low-order 64 bits of <code class="language-plaintext highlighter-rouge">%xmm1</code>.</td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">addpd %xmm0,%xmm1</code></td>
<td>SIMD (packed)</td>
<td>double</td>
<td>Add two pairs of double-precision numbers. Add each 64-bit section of <code class="language-plaintext highlighter-rouge">%xmm0</code> to the corresponding 64-bit section of <code class="language-plaintext highlighter-rouge">%xmm1</code>. Store results in <code class="language-plaintext highlighter-rouge">%xmm1</code>.</td>
</tr>
</tbody>
</table>
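<p>The difference between the scalar and packed forms in the table can be modelled in plain C. The sketch below does not use real SSE instructions or intrinsics; it treats an XMM register as an array of four floats, with <code class="language-plaintext highlighter-rouge">lane[0]</code> standing for the low-order 32 bits. The type and function names are assumptions made for illustration.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/* Plain-C model of a 128-bit XMM register viewed as four floats. */
typedef struct { float lane[4]; } xmm_model;

/* Models addss src, dst: only the low lane is added; the other three
   lanes of dst are left unchanged. */
xmm_model model_addss(xmm_model src, xmm_model dst) {
    dst.lane[0] += src.lane[0];
    return dst;
}

/* Models addps src, dst: all four lanes are added pairwise. */
xmm_model model_addps(xmm_model src, xmm_model dst) {
    for (int i = 0; i &lt; 4; i++) {
        dst.lane[i] += src.lane[i];
    }
    return dst;
}
</code></pre></div></div>
<p>The packed version does four additions with one instruction, which is the SIMD speed-up the lecture refers to; the scalar version uses the same register but only one quarter of it.</p>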
<h2 id="data-movement-instructions">Data movement instructions</h2>
<p>Assembly:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>float_mov:
# --------
# float float_mov(float f1, float *src, float *dst) {
#     float f2 = *src;
#     *dst = f1;
#     return f2;
# }
# --------
# f1 in %xmm0, src in %rdi, dst in %rsi
    movss (%rdi), %xmm1     # f2 = *src
    movss %xmm0, (%rsi)     # *dst = f1
    movaps %xmm1, %xmm0     # return value = f2
    ret
</code></pre></div></div>
<ul>
<li>The instructions we shall look at in this
lecture are different than the ones
presented in section 3.11 of our
textbook: we shall focus on the scalar
version of these instructions</li>
<li>movss - move single precision
<ul>
<li>Mem (32 bits) &lt;-&gt; %xmm</li>
</ul>
</li>
<li>movsd - move double precision
<ul>
<li>Mem (64 bits) &lt;-&gt; %xmm</li>
</ul>
</li>
<li>First 2 instructions of program: Memory
referencing operands (i.e., memory
addressing mode operands) specified
in the same way as for the integer mov*
instructions</li>
<li>movaps/movapd - move %xmm &lt;-&gt; %xmm
<ul>
<li>ap -&gt; aligned packed</li>
</ul>
</li>
</ul>
<h2 id="function-call-and-register-saving-conventions">Function call and register saving conventions</h2>
<ul>
<li>Function call convention
<ul>
<li>Integer (and pointer i.e., memory address) arguments passed in
integer registers</li>
<li>Floating point values passed in XMM registers</li>
<li>Argument 1 to argument 8 passed in %xmm0, %xmm1, …, %xmm7</li>
<li>Result returned in %xmm0</li>
</ul>
</li>
<li>Register saving convention
<ul>
<li>All XMM registers caller-saved</li>
<li>Can use register <code class="language-plaintext highlighter-rouge">%xmm8</code> to <code class="language-plaintext highlighter-rouge">%xmm15</code> for managing local data</li>
</ul>
</li>
</ul>
<h2 id="data-conversion-instructions">Data conversion instructions</h2>
<p>Converting between data types: (“t” is for “truncate”)</p>
<table>
<thead>
<tr>
<th>from \ to</th>
<th>int</th>
<th>float</th>
<th>long</th>
<th>double</th>
</tr>
</thead>
<tbody>
<tr>
<th>int</th>
<td>N/A</td>
<td><code>cvtsi2ss</code></td>
<td>N/A</td>
<td><code>cvtsi2sd</code></td>
</tr>
<tr>
<th>float</th>
<td><code>cvttss2si</code></td>
<td>N/A</td>
<td><code>cvttss2siq</code></td>
<td><code>cvtss2sd</code></td>
</tr>
<tr>
<th>long</th>
<td>N/A</td>
<td><code>cvtsi2ssq</code></td>
<td>N/A</td>
<td><code>cvtsi2sdq</code></td>
</tr>
<tr>
<th>double</th>
<td><code>cvttsd2si</code></td>
<td><code>cvtsd2ss</code></td>
<td><code>cvttsd2siq</code></td>
<td>N/A</td>
</tr>
</tbody>
</table>
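<p>In C, these conversions correspond to ordinary casts. The sketch below shows the truncating behaviour that the “t” stands for; the function names are hypothetical, and which instruction a compiler actually emits for each cast depends on the compiler and its options.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/* Floating point to integer: C truncates toward zero, matching the
   cvttss2si / cvttsd2si family of instructions. */
int float_to_int(float f)     { return (int)f; }
long double_to_long(double d) { return (long)d; }

/* Integer to floating point: the cvtsi2ss / cvtsi2sd family. */
float int_to_float(int i)     { return (float)i; }
double long_to_double(long l) { return (double)l; }
</code></pre></div></div>
<p>Note that truncation is not rounding: 3.9 becomes 3 and -3.9 becomes -3.</p>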
<h2 id="data-manipulation-instructions">Data manipulation instructions</h2>
<p>Arithmetic</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">addss</code>/<code class="language-plaintext highlighter-rouge">addsd</code> - floating point add</li>
<li><code class="language-plaintext highlighter-rouge">subss</code>/<code class="language-plaintext highlighter-rouge">subsd</code> - … subtract</li>
<li><code class="language-plaintext highlighter-rouge">mulss</code>/<code class="language-plaintext highlighter-rouge">mulsd</code> - … mul</li>
<li><code class="language-plaintext highlighter-rouge">divss</code>/<code class="language-plaintext highlighter-rouge">divsd</code> - … div</li>
</ul>
<p>Logical</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">andps</code>/<code class="language-plaintext highlighter-rouge">andpd</code></li>
<li><code class="language-plaintext highlighter-rouge">orps/d</code></li>
<li><code class="language-plaintext highlighter-rouge">xorps/d</code></li>
<li><code class="language-plaintext highlighter-rouge">xorpd %xmm0, %xmm0</code>: effect <code class="language-plaintext highlighter-rouge">%xmm0 &lt;- 0</code></li>
</ul>
<p>Comparison: <code class="language-plaintext highlighter-rouge">ucomiss/d</code></p>
<ul>
<li>Affects only condition codes: <code class="language-plaintext highlighter-rouge">CF</code>, <code class="language-plaintext highlighter-rouge">ZF</code>
<ul>
<li>use unsigned branches</li>
</ul>
</li>
<li>If NaN, set all of condition codes:
<code class="language-plaintext highlighter-rouge">CF</code>, <code class="language-plaintext highlighter-rouge">ZF</code> and <code class="language-plaintext highlighter-rouge">PF</code>
<ul>
<li>Use <code class="language-plaintext highlighter-rouge">jp</code>/<code class="language-plaintext highlighter-rouge">jnp</code> to branch on <code class="language-plaintext highlighter-rouge">PF</code></li>
</ul>
</li>
</ul>
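<p>The NaN case matters in C as well: any ordinary comparison involving a NaN is false, which is why compiled code needs the extra parity-flag check. A minimal sketch, with hypothetical function names:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/* Returns 1 if x and y are unordered, i.e. at least one of them is NaN.
   A NaN is the only floating-point value that is not equal to itself. */
int is_unordered(double x, double y) {
    return x != x || y != y;
}

/* Returns 1 only if x is less than y and neither of them is NaN. */
int less_than(double x, double y) {
    return x &lt; y;
}
</code></pre></div></div>
<p>If either argument is NaN, both <code class="language-plaintext highlighter-rouge">less_than(x, y)</code> and <code class="language-plaintext highlighter-rouge">less_than(y, x)</code> return 0, so “not less than” does not imply “greater than or equal to” for floating-point data.</p>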
<p>Others</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">maxss</code>/<code class="language-plaintext highlighter-rouge">maxsd</code> - … max
<ul>
<li>For example: <code class="language-plaintext highlighter-rouge">maxss %xmm3, %xmm5</code>
Effect: <code class="language-plaintext highlighter-rouge">xmm5 &lt;- max(xmm5, xmm3)</code></li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">minss</code>/<code class="language-plaintext highlighter-rouge">minsd</code> - … min</li>
<li><code class="language-plaintext highlighter-rouge">sqrtss</code>/<code class="language-plaintext highlighter-rouge">sqrtsd</code> - … square root</li>
</ul>
<h2 id="example">Example</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fadd:
# --------
# float fadd(float x, float y){
#     return x + y;
# }
# --------
# x in %xmm0, y in %xmm1
    addss %xmm1, %xmm0
    ret
dadd:
# --------
# double dadd(double x, double y){
#     return x + y;
# }
# --------
# x in %xmm0, y in %xmm1
    addsd %xmm1, %xmm0
    ret
</code></pre></div></div>
<h2 id="storing-data-in-various-segments-of-memory---optional">Storing Data in Various Segments of Memory - Optional</h2>
<p>(Transcriber's note: no content on slide)</p>
<h2 id="storing-data-in-memory">Storing Data in Memory</h2>
<p>This material is optional &gt; It is for your learning pleasure!</p>
<p>We already
know about
data on stack
and on heap.</p>
<ul>
<li>Data on stack memory (on stack frame of function)
<ul>
<li>Temporarily use and recycle</li>
<li>Lasts through life of function call</li>
</ul>
</li>
<li>Data on heap
<ul>
<li>Temporarily use and recycle</li>
<li>Lasts until memory is “freed”</li>
</ul>
</li>
<li>Data in fixed memory, i.e., Data segment. What does this type of data look like?
<ul>
<li>Statically allocated data
<ul>
<li>e.g., global variables, static variables, string constants</li>
</ul>
</li>
<li>Lasts while program executes</li>
</ul>
</li>
</ul>
<h2 id="data-stored-in-data-segment">Data stored in Data Segment</h2>
<ul>
<li>Declared using a label &amp; a directive for size
<ul>
<li>label is a memory address</li>
<li>size: .byte (1), .word (2), .long (4), .quad (8)</li>
<li>initial value</li>
</ul>
</li>
</ul>
<h3 id="example-1">Example 1:</h3>
<p>C:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>long x = 6;
long y = 9;
void main() {
...
}
</code></pre></div></div>
<p>x86-64:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x: .quad 6 # 0x0000000000000006
y: .quad 9 # 0x0000000000000009
</code></pre></div></div>
<table>
<thead>
<tr>
<th>label</th>
<th colspan="8">Data segment (byte offset)</th>
</tr>
<tr>
<th></th>
<th>0</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
</tr>
</thead>
<tbody>
<tr>
<td>y</td>
<td>09 (LSB; remember little endian)</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
</tr>
<tr>
<td>x</td>
<td>06 (LSB)</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
<td>00</td>
</tr>
</tbody>
</table>
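<p>The byte order shown in the table can be observed from C by copying a value's bytes out of memory. This is a small sketch with hypothetical helper names; it checks the machine's byte order at run time rather than assuming x86-64.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Copies the 8 bytes of a 64-bit value, in memory order, into out. */
void bytes_in_memory(uint64_t value, unsigned char out[8]) {
    memcpy(out, &amp;value, 8);
}

/* Returns 1 on a little-endian machine (such as x86-64), 0 otherwise. */
int is_little_endian(void) {
    unsigned int one = 1;
    unsigned char first;
    memcpy(&amp;first, &amp;one, 1);
    return first == 1;
}
</code></pre></div></div>
<p>On a little-endian machine, <code class="language-plaintext highlighter-rouge">bytes_in_memory(6, out)</code> leaves 06 in <code class="language-plaintext highlighter-rouge">out[0]</code> and 00 in the remaining seven bytes, matching the row for x.</p>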
<h2 id="data-stored-in-data-segment-1">Data stored in Data Segment</h2>
<p>C:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define N 6
int A[N] = {12,34,56,78,-90,1};
void main(){
printf("The total is %d.\n", sum_array(A,N));
return;
}
</code></pre></div></div>
<p>Assembly:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main:
.LFB38:
.cfi_startproc
subq $8,%rsp
.cfi_def_cfa_offset 16
movl $6,%esi
movl $A,%edi
call sum_array
movl $.LC0,%esi
movl %eax,%edx
movl $1,%edi
xorl %eax,%eax
addq $8,%rsp
.cfi_def_cfa_offset 8
jmp __printf_chk
...
A:
.long 12 # or .long 12,34,56,78,-90,1
.long 34
.long 56
.long 78
.long -90
.long 1
.ident "GCC: (Ubuntu 7.3.0-21ubuntu1-16.04) 7.3.0"
.section .note.GNU-stack,"",@progbits
</code></pre></div></div>
<h2 id="data-stored-on-stack--example-1">Data stored on Stack - Example 1</h2>
<p>C:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void main(int argc, char* argv){
int A[] = {12,34,56,78,-90,1}; // 12 and 34 are highlighted.
printf("The total is %d.\n", sum_array(A,N));
return;
}
</code></pre></div></div>
<p>Assembly:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main:
.LFB38:
.cfi_startproc
subq $40,%rsp
.cfi_def_cfa_offset 48
movl $6,%esi
movq %fs:40,%rax
movq %rax,24(%rsp)
xorl %eax,%eax
movabsq $146028888076,%rax # highlighted
movq %rsp,%rdi
movq %rax,(%rsp)
movabsq $335007449144,%rax # highlighted
movq %rax,8(%rsp)
movabsq $8589934502,%rax # highlighted
movq %rax,16(%rsp)
call sum_array
movl $.LC0,%esi
movl %eax,%edx
movl $1,%edi
xorl %eax,%eax
call __printf_chk
movq 24(%rsp), %rax
xorq %fs:40,%rax
jne .L5
addq $40,%rsp
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
</code></pre></div></div>
<p>How does this large number end up representing 12 and 34?</p>
<ul>
<li>Express 146028888076 in binary</li>
<li>Transform binary to hex =&gt; 0x000000220000000c</li>
<li>Read the low-order 32 bits of the hex value (0000000c) as a decimal
=&gt; 12, which is stored first in memory as A[0] (little endian)</li>
<li>Read the high-order 32 bits of the hex value (00000022) as a decimal
=&gt; 34, which is stored next as A[1]</li>
<li>Repeat for the operands of the other 2 movabsq
instructions</li>
</ul>
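<p>The same packing can be reproduced in C to confirm the three movabsq immediates. The function name <code class="language-plaintext highlighter-rouge">pack_pair</code> is hypothetical; it only mirrors the arithmetic described above.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;

/* Packs two 32-bit ints into one 64-bit value: lo goes into the
   low-order 32 bits and hi into the high-order 32 bits. Each int is
   first converted to its 32-bit two's complement bit pattern. */
uint64_t pack_pair(int32_t lo, int32_t hi) {
    return ((uint64_t)(uint32_t)hi &lt;&lt; 32) | (uint32_t)lo;
}
</code></pre></div></div>
<p>With this helper, the pairs (12, 34), (56, 78) and (-90, 1) give 146028888076, 335007449144 and 8589934502, the three immediates in the assembly listing.</p>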
<h2 id="summary---1">Summary - 1</h2>
<ul>
<li>What is a buffer overflow
<ul>
<li>When function writes more data in array than array can hold on stack</li>
<li>Effect: data kept on the stack (value of other local variables and registers,
return address) may be corrupted
-&gt; Stack smashing</li>
</ul>
</li>
<li>Why buffer overflow spells trouble -&gt; it creates vulnerability
<ul>
<li>Allowing hacker attacks</li>
</ul>
</li>
<li>How to protect system against such attacks
<ol>
<li>Avoid creating overflow vulnerabilities in the code that we write
<ul>
<li>By always checking bounds and calling “safe” library functions that
consider size of array</li>
</ul>
</li>
<li>Employ system-level protections
<ul>
<li>Randomized initial stack pointer and non-executable code segments</li>
</ul>
</li>
<li>Use compiler (like gcc) security features:
<ul>
<li>Stack “canary” value and endbr64 instruction</li>
</ul>
</li>
</ol>
</li>
</ul>
<h2 id="summary---2">Summary - 2</h2>
<ul>
<li>Floating point data and operations
<ul>
<li>Data held and manipulated in XMM registers</li>
<li>Assembly language instructions similar to integer
assembly language instructions we have seen so far</li>
</ul>
</li>
</ul>
<h2 id="next-lecture">Next Lecture</h2>
<p>Start a new unit …</p>
<ul>
<li>Instruction Set Architecture (ISA)</li>
</ul>
<footer>
</footer>
</div>
</main>
</body>
</html>