3.6
Stack Pointer
The stack is mainly used for storing temporary data, for storing local variables and for storing return addresses after
interrupts and subroutine calls. The stack pointer register always points to the top of the stack. Note that the stack is
implemented as growing from higher memory locations to lower memory locations. This implies that a Stack PUSH
command decreases the stack pointer.
The stack pointer points to the data SRAM stack area where the subroutine and interrupt stacks are located. This stack
space in the data SRAM must be defined by the program before any subroutine calls are executed or interrupts are enabled.
The stack pointer must be set to point above 0x100. The stack pointer is decremented by one when data is pushed onto the
stack with the PUSH instruction, and it is decremented by two when the return address is pushed onto the stack with
subroutine call or interrupt. The stack pointer is incremented by one when data is popped from the stack with the POP
instruction, and it is incremented by two when data is popped from the stack with return from subroutine RET or return from
interrupt RETI.
The AVR® stack pointer is implemented as two 8-bit registers in the I/O space. The number of bits actually used is
implementation dependent. Note that the data space in some implementations of the AVR architecture is so small that only
SPL is needed. In this case, the SPH Register will not be present.
Bit
15
SP15
SP7
7
14
SP14
SP6
6
13
SP13
SP5
5
12
SP12
SP4
4
11
SP11
SP3
3
10
SP10
SP2
2
9
8
SP9
SP1
1
SP8
SP0
0
SPH
SPL
Read/Write
Initial Value
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
R/W
Top address of the SRAM (0x04FF/0x08FF/0x10FF)
3.7
Instruction Execution Timing
This section describes the general access timing concepts for instruction execution. The AVR CPU is driven by the CPU
clock clkCPU, directly generated from the selected clock source for the chip. No internal clock division is used.
Figure 3-4 shows the parallel instruction fetches and instruction executions enabled by the Harvard architecture and the fast-
access Register File concept. This is the basic pipelining concept to obtain up to 1 MIPS per MHz with the corresponding
unique results for functions per cost, functions per clocks, and functions per power-unit.
Figure 3-4. The Parallel Instruction Fetches and Instruction Executions
T1
T2
T3
T4
clkCPU
1st Instruction Fetch
1st Instruction Execute
2nd Instruction Fetch
2nd Instruction Execute
3rd Instruction Fetch
3rd Instruction Execute
4th Instruction Fetch
ATmega16/32/64/M1/C1 [DATASHEET]
15
7647O–AVR–01/15