mirror of
https://github.com/The-Art-of-Hacking/h4cker.git
synced 2026-01-06 19:15:27 -05:00
# Memory and the Stack

## Understanding Computer Memory

To understand buffer overflows, you need to understand how programs use memory. When a program runs, the operating system allocates memory to it, which is divided into several regions.

## Memory Layout of a Process

A typical process memory layout (shown with high addresses at the top and low addresses at the bottom):

```
High Memory Address
┌─────────────────┐
│  Kernel Space   │ ← Operating system memory (off-limits)
├─────────────────┤
│      Stack      │ ← Local variables, function calls (grows downward ⬇)
│        ⬇        │
│                 │
│  [free space]   │
│                 │
│        ⬆        │
│      Heap       │ ← Dynamic memory allocation (grows upward ⬆)
├─────────────────┤
│   BSS Segment   │ ← Uninitialized global/static variables
├─────────────────┤
│  Data Segment   │ ← Initialized global/static variables
├─────────────────┤
│  Text Segment   │ ← Program code (instructions)
└─────────────────┘
Low Memory Address
```

### Memory Segments Explained

| Segment | Purpose | Characteristics |
|---------|---------|-----------------|
| **Text** | Program code (machine instructions) | Read-only, executable, shared |
| **Data** | Initialized global/static variables | Read-write, fixed size |
| **BSS** | Uninitialized global/static variables | Read-write, zeroed at start |
| **Heap** | Dynamic memory (`malloc`, `new`) | Grows upward, managed manually |
| **Stack** | Local variables, function calls | Grows downward, automatic management |

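
On Linux, the regions described above can be observed for a live process. The following is a minimal sketch that reads `/proc/self/maps` for the Python interpreter itself and collects the bracketed regions such as `[heap]` and `[stack]`. The helper name `read_named_regions` is made up for this example, and the sketch assumes a Linux-style `/proc` filesystem; on other systems it simply returns an empty dictionary.

```python
import os

def read_named_regions(maps_path="/proc/self/maps"):
    """Return {name: (start, end)} for bracketed regions such as [heap] and [stack]."""
    regions = {}
    if not os.path.exists(maps_path):      # no /proc maps file on this system
        return regions
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split()
            # Line format: start-end perms offset dev inode [pathname]
            if len(fields) >= 6 and fields[5].startswith("["):
                start, end = (int(part, 16) for part in fields[0].split("-"))
                regions.setdefault(fields[5], (start, end))
    return regions

if __name__ == "__main__":
    # Print regions from low to high addresses.
    for name, (start, end) in sorted(read_named_regions().items(), key=lambda kv: kv[1]):
        print(f"{start:#018x}-{end:#018x} {name}")
```

Where both regions are reported, the `[heap]` entry is expected to start at a lower address than `[stack]`, matching the layout described above. Exact addresses change between runs when ASLR is enabled.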
## The Stack: Where Buffer Overflows Usually Happen

The **stack** is a Last-In-First-Out (LIFO) data structure used for:
- Storing local variables
- Managing function calls and returns
- Passing function arguments
- Saving CPU register states

### Stack Growth Direction

**Important:** The stack grows from high memory addresses to low memory addresses (downward), but data written into a buffer on the stack fills it from low to high addresses (upward), toward the saved frame pointer and return address.

```
High Address
┌──────────────┐
│   Old Data   │ ⬅ Stack starts here
├──────────────┤
│  Function 1  │
├──────────────┤
│  Function 2  │ ⬅ Stack grows down
├──────────────┤
│  Function 3  │ ⬅ Most recent function
└──────────────┘
Low Address
```

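
As a rough illustration of the two directions, the following Python sketch models a 32-bit stack as a byte array. The `SimStack` class, its starting address `0xBFFFF700`, and its 64-byte size are hypothetical choices made for this example; it is a toy model, not how a real CPU or OS is implemented.

```python
import struct

class SimStack:
    """Toy model of a 32-bit, downward-growing stack backed by a bytearray."""

    def __init__(self, size=64, top=0xBFFFF700):
        self.mem = bytearray(size)
        self.base = top - size          # lowest simulated address
        self.esp = top                  # empty stack: ESP at the highest address

    def push(self, value):
        self.esp -= 4                   # the stack grows toward lower addresses
        off = self.esp - self.base
        self.mem[off:off + 4] = struct.pack("<I", value)

    def pop(self):
        off = self.esp - self.base
        (value,) = struct.unpack("<I", self.mem[off:off + 4])
        self.esp += 4                   # popping moves ESP back up
        return value

    def write(self, addr, data):
        """Copy bytes starting at addr; each byte lands at the next higher address."""
        off = addr - self.base
        self.mem[off:off + len(data)] = data
```

Pushing two values and then reserving an 8-byte buffer below them puts the buffer at the lowest address. Writing 12 bytes into that buffer runs upward and replaces the value that was pushed last.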
## Stack Frame Anatomy

Each function call creates a **stack frame** (also called an activation record):

```
High Memory
┌─────────────────────┐
│ Function Arguments  │ ⬅ Pushed by caller
├─────────────────────┤
│   Return Address    │ ⬅ Where to jump back after function completes
├─────────────────────┤
│   Saved Frame Ptr   │ ⬅ Previous function's base pointer (EBP/RBP)
├─────────────────────┤
│  Local Variable 1   │
├─────────────────────┤
│  Local Variable 2   │
├─────────────────────┤
│    Buffer[0..N]     │ ⬅ Local arrays/buffers
├─────────────────────┤
│  ...more locals...  │
└─────────────────────┘
Low Memory
```

### Key Stack Pointers

Two CPU registers track the stack:

**ESP/RSP (Stack Pointer)**
- Points to the current top of the stack (the lowest address in use)
- Moves as data is pushed/popped
- Changes frequently during execution

**EBP/RBP (Base/Frame Pointer)**
- Points to the base of the current stack frame
- Used as a reference point for accessing local variables and parameters
- Remains stable during function execution

## How Function Calls Work

Let's trace what happens when `main()` calls `vulnerable()`:

### Before the Call (in main)

```
Stack:
┌─────────────────┐
│   main's vars   │ ⬅ EBP, ESP here
└─────────────────┘
```

### Step 1: Push Arguments (if any)

```c
vulnerable("Hello");  // Push "Hello" pointer
```

```
Stack:
┌─────────────────┐
│   main's vars   │
├─────────────────┤
│    argument     │ ⬅ "Hello" pointer
└─────────────────┘
```

### Step 2: Execute CALL Instruction

The `call` instruction:
1. Pushes the **return address** (next instruction in `main`)
2. Jumps to `vulnerable()` function

```
Stack:
┌─────────────────┐
│   main's vars   │
├─────────────────┤
│    argument     │
├─────────────────┤
│ Return Address  │ ⬅ Where to return after vulnerable()
└─────────────────┘
```

### Step 3: Function Prologue

At the start of `vulnerable()`:

```assembly
push ebp          ; Save old base pointer
mov ebp, esp      ; Set new base pointer
sub esp, N        ; Allocate space for local variables
```

```
Stack:
┌─────────────────┐
│   main's vars   │
├─────────────────┤
│    argument     │
├─────────────────┤
│ Return Address  │ ⬅ CRITICAL: Controls where program returns
├─────────────────┤
│    Saved EBP    │ ⬅ Previous frame pointer (EBP now points here)
├─────────────────┤
│   Local Var 1   │
├─────────────────┤
│   buffer[20]    │ ⬅ ESP now points here
└─────────────────┘
```

### Step 4: Function Epilogue (Normal Return)

At the end of `vulnerable()`:

```assembly
mov esp, ebp      ; Restore stack pointer
pop ebp           ; Restore base pointer
ret               ; Pop return address and jump to it
```

The program returns to `main()` and continues normally.

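
The call, prologue, and epilogue steps can be traced with a small Python sketch. The function `trace_call` and its register values are hypothetical and exist only for this illustration. It assumes a 32-bit target where the caller pushes one argument, and it keeps simulated memory in a dictionary keyed by address.

```python
def trace_call(esp, ebp, arg, ret_addr, locals_size):
    """Simulate push-argument, call, prologue, and epilogue on a 32-bit stack."""
    mem = {}

    def push(value):
        nonlocal esp
        esp -= 4                  # the stack grows toward lower addresses
        mem[esp] = value

    def pop():
        nonlocal esp
        value = mem[esp]
        esp += 4
        return value

    push(arg)                     # Step 1: caller pushes the argument
    push(ret_addr)                # Step 2: `call` pushes the return address
    push(ebp)                     # Step 3: push ebp
    ebp = esp                     #         mov ebp, esp
    esp -= locals_size            #         sub esp, N
    inside = {"esp": esp, "ebp": ebp, "ret_slot": ebp + 4, "arg_slot": ebp + 8}

    esp = ebp                     # Step 4: mov esp, ebp
    ebp = pop()                   #         pop ebp
    eip = pop()                   #         ret
    return inside, {"esp": esp, "ebp": ebp, "eip": eip}

inside, after = trace_call(esp=0xBFFFF700, ebp=0xBFFFF720,
                           arg=0x0804A000, ret_addr=0x08048123, locals_size=24)
print({name: hex(value) for name, value in after.items()})
```

After the simulated `ret`, `eip` holds the pushed return address and `ebp` is restored. `esp` ends four bytes below its starting value because, in this model, the caller has not yet removed the argument it pushed.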
## Buffer Overflow Visualization

Now let's see what happens with a buffer overflow:

### Normal Case

```c
#include <string.h>

void vulnerable() {
    char buffer[8];
    strcpy(buffer, "Hello");  // 5 bytes + null terminator = 6 bytes (OK)
}
```

```
Stack:
┌──────────────────┐
│  Return Address  │ ⬅ 0x08048123 (unchanged)
├──────────────────┤
│    Saved EBP     │ ⬅ 0xbffff678 (unchanged)
├──────────────────┤
│   buffer[4-7]    │ ⬅ "o\0" followed by 2 unused bytes
├──────────────────┤
│   buffer[0-3]    │ ⬅ "Hell"
└──────────────────┘
        ⬆ ESP
```

### Overflow Case

```c
#include <string.h>

void vulnerable() {
    char buffer[8];
    strcpy(buffer, "ThisStringIsMuchLongerThan8Bytes");  // OVERFLOW!
}
```

```
Stack Before:
┌──────────────────┐
│  Return Address  │ ⬅ 0x08048123
├──────────────────┤
│    Saved EBP     │ ⬅ 0xbffff678
├──────────────────┤
│    buffer[8]     │
└──────────────────┘

Stack After Overflow:
┌──────────────────┐
│  caller's data   │ ⬅ "LongerThan8Bytes\0" (OVERWRITTEN!)
├──────────────────┤
│  Return Address  │ ⬅ 0x6863754d (OVERWRITTEN! Actually "Much" from string)
├──────────────────┤
│    Saved EBP     │ ⬅ 0x7349676e (OVERWRITTEN! Actually "ngIs" from string)
├──────────────────┤
│   buffer[4-7]    │ ⬅ "Stri"
├──────────────────┤
│   buffer[0-3]    │ ⬅ "This"
└──────────────────┘
        ⬆ ESP
```

This diagram assumes the compiler places `buffer` directly below the saved EBP with no padding. Real compilers often add padding or reorder locals, so the exact offsets vary.

**What happens next:**
1. Function tries to return
2. Pops corrupted return address (0x6863754d)
3. Tries to jump to that address
4. **CRASH!** - Typically a segmentation fault, because the address is not mapped or not executable

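
The effect of an unchecked copy on the saved frame pointer and return address can be reproduced without touching real memory. The sketch below is a simulation under an idealized assumption: the buffer sits directly below the saved EBP with no padding or canary. The name `simulate_strcpy` is hypothetical, and the initial values `0xbffff678` and `0x08048123` are the example values used in this section.

```python
import struct

def simulate_strcpy(data, buf_size=8, saved_ebp=0xBFFFF678, ret_addr=0x08048123):
    """Copy data plus a NUL byte into a simulated frame, with no bounds check.

    Returns the (saved_ebp, return_address) values found in the frame afterwards.
    """
    frame = bytearray(buf_size) + struct.pack("<II", saved_ebp, ret_addr)
    mem = frame + bytearray(64)          # room above the frame for the overflow
    src = data + b"\x00"                 # strcpy also copies the terminator
    if len(src) > len(mem):
        raise ValueError("input larger than the simulated memory")
    mem[0:len(src)] = src                # bytes land at increasing addresses
    return struct.unpack_from("<II", mem, buf_size)

for text in (b"Hello", b"ThisStringIsMuchLongerThan8Bytes"):
    ebp, ret = simulate_strcpy(text)
    print(f"{text!r}: saved EBP = {ebp:#010x}, return address = {ret:#010x}")
```

A 6-byte copy leaves both control values untouched, while any input longer than the buffer replaces them with bytes taken from the input, interpreted in little-endian order.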
## Exploiting Buffer Overflows

An attacker can carefully craft input to:

### 1. Control the Return Address

```
Stack Layout:
┌──────────────────┐
│  Return Address  │ ⬅ Overwrite with 0xbffff7d0 (address of shellcode)
├──────────────────┤
│    Saved EBP     │ ⬅ Can be junk (not critical)
├──────────────────┤
│ buffer + padding │ ⬅ Fill with NOPs + shellcode
└──────────────────┘
```

### 2. Inject Malicious Code

```
Payload Structure:
[ NOP Sled ][ Shellcode ][ Junk ][ Return Address ]
  (safety)    (exploit)   (fill)   (points to NOPs)
```

### 3. Redirect Execution

When the function returns:
1. Pops attacker-controlled return address
2. Jumps to NOP sled
3. Slides down to shellcode
4. Executes arbitrary code!

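
The payload layout described in this section can be assembled in a few lines of Python. This is a sketch for lab use under stated assumptions: a 32-bit little-endian target, a hypothetical offset of 32 bytes to the return address, and the example address `0xbffff7d0`. The function name `build_payload` is made up here, and the "shellcode" is a placeholder of `0xCC` (INT3 breakpoint) bytes rather than working code.

```python
import struct

def build_payload(offset, ret_addr, shellcode, sled_len):
    """Lay out [NOP sled][shellcode][junk][return address] for a 32-bit target."""
    body = b"\x90" * sled_len + shellcode        # 0x90 is the x86 NOP opcode
    if len(body) > offset:
        raise ValueError("sled + shellcode do not fit before the return address")
    junk = b"A" * (offset - len(body))           # filler up to the return address
    return body + junk + struct.pack("<I", ret_addr)

# Placeholder bytes only: 0xCC is the x86 INT3 breakpoint, not real shellcode.
payload = build_payload(offset=32, ret_addr=0xBFFFF7D0,
                        shellcode=b"\xcc" * 4, sled_len=16)
print(payload.hex())
```

Keeping the offset as a parameter makes the length check explicit: if the sled and shellcode do not fit below the return address, the function raises an error instead of silently shifting the address out of position.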
## Little Endian vs Big Endian

When overwriting addresses, byte order matters:

**Little Endian** (x86, x64):
- Least significant byte first
- Address 0x12345678 stored as: `\x78\x56\x34\x12`

**Big Endian** (some ARM, network protocols):
- Most significant byte first
- Address 0x12345678 stored as: `\x12\x34\x56\x78`

Example:
```python
# To overwrite return address with 0xdeadbeef on x86:
# (32 is an example offset from the start of the buffer to the return address)
payload = b"A" * 32 + b"\xef\xbe\xad\xde"
```

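
Writing the bytes by hand is error-prone, so exploit scripts usually let a library do the conversion. The following sketch uses Python's standard `struct` module; the helper name `pack_addr` is an assumption made for this example.

```python
import struct
import sys

def pack_addr(addr, little_endian=True):
    """Encode a 32-bit address as 4 bytes in the requested byte order."""
    return struct.pack("<I" if little_endian else ">I", addr)

print(pack_addr(0x12345678).hex())                       # little-endian bytes
print(pack_addr(0x12345678, little_endian=False).hex())  # big-endian bytes
print("host byte order:", sys.byteorder)
```

The `<` and `>` prefixes select the byte order explicitly, so the result does not depend on the machine running the script.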
## Stack vs Heap Overflows

### Stack Overflow Characteristics
- **Target**: Local variables, return addresses
- **Easier to exploit**: Predictable structure
- **Impact**: Code execution via return address overwrite

### Heap Overflow Characteristics
- **Target**: Dynamically allocated memory
- **Harder to exploit**: Less predictable layout
- **Impact**: Data corruption, function pointer overwrite, metadata manipulation

## Key Takeaways

1. **The stack grows downward** (high to low addresses), but **buffers are filled upward** (low to high)
2. **Return addresses are stored on the stack** and can be overwritten
3. **A buffer overflow happens** when data written to a buffer exceeds its boundaries
4. **Careful memory layout understanding** is critical for both exploitation and defense
5. **Stack frames contain critical control data** that attackers want to modify

## Practical Implications

### For Attackers (Ethical Hackers)
- Need to calculate exact offset to return address
- Must understand stack layout of target function
- Payload must account for stack alignment and protections

### For Defenders (Developers)
- Use stack canaries to detect corruption
- Enable DEP/NX to prevent code execution on stack
- Use ASLR to randomize stack addresses
- Validate all input sizes
- Use bounded string functions (such as `snprintf`, or `strlcpy` where available) instead of unbounded ones like `strcpy` and `gets`

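
One common way to find the exact offset to the return address is to send a non-repeating pattern and look up the bytes that end up in the instruction pointer after the crash. The sketch below is a simplified, hypothetical version of that idea, assuming a 32-bit little-endian target. The names `cyclic_pattern` and `find_offset` are chosen for this example and do not refer to any particular tool's API.

```python
import string
import struct

def cyclic_pattern(length):
    """Build an 'Aa0Aa1Aa2...' pattern of the requested length."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode()
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("length exceeds the pattern period")

def find_offset(crash_value, length=1024):
    """Return the pattern offset whose 4 bytes equal crash_value, or -1 if absent."""
    needle = struct.pack("<I", crash_value)   # value as it was laid out in memory
    return cyclic_pattern(length).find(needle)

pattern = cyclic_pattern(100)
# Pretend the program crashed with the bytes at offset 44 in the instruction pointer.
(crashed_eip,) = struct.unpack("<I", pattern[44:48])
print(find_offset(crashed_eip, 100))
```

The offset returned is the number of filler bytes to place before the 4-byte return address in a payload.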
## Next Steps

1. Learn about [CPU Registers](registers.md) used in stack operations
2. Study [Assembly Basics](assembly-basics.md) to understand low-level stack manipulation
3. Practice with [Simple Buffer Overflow Example](../examples/01-simple-overflow/)
4. Read about [Modern Mitigations](../defenses/mitigations.md)

## Further Reading

- [Smashing the Stack for Fun and Profit](http://phrack.org/issues/49/14.html) - The classic paper
- [Intel Software Developer Manual](https://software.intel.com/en-us/articles/intel-sdm) - Architecture details
- [Stack Frame Layout](https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/)
- [ASLR Explained](https://en.wikipedia.org/wiki/Address_space_layout_randomization)

---

**Remember:** Understanding the stack is fundamental to both exploiting and defending against buffer overflows. Master these concepts before moving to exploitation techniques.