Branch Prediction

Branch Prediction

Overview

Branch prediction is a technique that improves pipeline performance by predicting the outcome of conditional branch instructions in advance. Modern processors achieve prediction accuracy of over 90%.


Table of Contents

  1. Control Hazards and Branch Problem
  2. Static Branch Prediction
  3. Dynamic Branch Prediction
  4. Branch Target Buffer
  5. Speculative Execution
  6. Practice Problems

1. Control Hazards and Branch Problem

Pipeline Bubbles from Branches

beq $t0, $t1, target
add $s0, $s1, $s2     # Should this execute?
sub $s3, $s4, $s5     # Should this execute?
...
target:
or  $s6, $s7, $s8

Time:    1    2    3    4    5    6    7
beq:    IF   ID  [EX]  MEM  WB
                  ↑
             Branch decision
add:         IF   ID   ← May need flush
sub:              IF   ← May need flush

Branch Penalty

Penalty based on decision timing:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Decision Stage β”‚    Penalty     β”‚   Note    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚   ID stage       β”‚   1 cycle      β”‚ Early MIPSβ”‚
β”‚   EX stage       β”‚   2 cycles     β”‚ Typical   β”‚
β”‚   MEM stage      β”‚   3 cycles     β”‚ Complex   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Modern processors: 10-20 cycle pipelines
β†’ Branch prediction essential!

Necessity of Branch Prediction

Branch frequency: ~20% (1 in every 5 instructions)

Always stall without prediction:
CPI = 1 + 0.2 Γ— 3 = 1.6  (assuming 3 cycle penalty)

90% accurate prediction:
CPI = 1 + 0.2 Γ— 0.1 Γ— 3 = 1.06

Performance improvement: 1.6/1.06 = 1.5x

2. Static Branch Prediction

2.1 Always Not Taken

Strategy: Predict all branches are not taken

Pros: Simple implementation
Cons: Poor performance on loops

for (i = 0; i < 100; i++)  // 99 times taken, 1 time not taken
                           // Prediction accuracy: 1%

2.2 Always Taken

Strategy: Predict all branches are taken

Pros: Good for loops
Cons: Requires branch target address calculation

for (i = 0; i < 100; i++)  // Prediction accuracy: 99%

2.3 BTFN (Backward Taken, Forward Not Taken)

Strategy:
- Backward branches (loops): Predict Taken
- Forward branches (if statements): Predict Not Taken

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚     Prediction by Branch Direction       β”‚
β”‚                                         β”‚
β”‚  PC=100:  beq label    (label=80)       β”‚
β”‚           ↑                             β”‚
β”‚           Backward branch β†’ Predict Takenβ”‚
β”‚                                         β”‚
β”‚  PC=100:  beq label    (label=120)      β”‚
β”‚           ↑                             β”‚
β”‚           Forward branch β†’ Predict Not Takenβ”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Loop end jumping back to start: Backward branch
if statement jumping to else: Forward branch

3. Dynamic Branch Prediction

3.1 1-bit Predictor

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           1-bit Predictor               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  State: Taken(1) or Not Taken(0)        β”‚
β”‚                                         β”‚
β”‚  Operation:                             β”‚
β”‚  - Predict based on current state       β”‚
β”‚  - Flip state if wrong                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

State transition:
     Wrong        Wrong
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚        β”‚   β”‚        β”‚
  β–Ό        β”‚   β–Ό        β”‚
β”Œβ”€β”€β”€β”€β”     β”‚ β”Œβ”€β”€β”€β”€β”     β”‚
β”‚ NT │◀────┴─│ T  β”‚β—€β”€β”€β”€β”€β”˜
β””β”€β”€β”€β”€β”˜  Right β””β”€β”€β”€β”€β”˜  Right

Problem: Always wrong twice at loop start and end

Limitations of 1-bit Predictor

100 loop iterations:
T T T ... T T N  (99 T, 1 N)
                 ↑ Loop exit

Prediction:   T T T ... T T T N
Actual:       T T T ... T T N ←
                      ↑ Wrong (predicted T, actual N)

Next loop start:
Prediction:   N ← Wrong (predicted N, actual T)

2 mispredictions per 100 iterations β†’ 98% accuracy

3.2 2-bit Predictor

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              2-bit Saturating Counter           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  4 states:                                      β”‚
β”‚  - 00: Strongly Not Taken                       β”‚
β”‚  - 01: Weakly Not Taken                         β”‚
β”‚  - 10: Weakly Taken                             β”‚
β”‚  - 11: Strongly Taken                           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

State transition diagram:
        N              N              N
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚        β–Ό     β”‚        β–Ό     β”‚        β–Ό
β”Œβ”€β”€β”΄β”€β”€β”  β”Œβ”€β”€β”΄β”€β”€β”  β”Œβ”€β”€β”΄β”€β”€β”  β”Œβ”€β”€β”΄β”€β”€β”
β”‚ SNT β”‚  β”‚ WNT β”‚  β”‚ WT  β”‚  β”‚ ST  β”‚
β”‚ 00  β”‚  β”‚ 01  β”‚  β”‚ 10  β”‚  β”‚ 11  β”‚
β””β”€β”€β”¬β”€β”€β”˜  β””β”€β”€β”¬β”€β”€β”˜  β””β”€β”€β”¬β”€β”€β”˜  β””β”€β”€β”¬β”€β”€β”˜
   β”‚        β–²     β”‚        β–²     β”‚        β–²
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        T              T              T

Prediction: Taken if upper bit is 1, Not Taken if 0

2-bit Predictor Operation Example

100 loop iterations:
Actual:   T T T ... T T N T T T ... (next loop)

State:  ST ST ST ... ST WT ST ST ST ...
Pred:    T  T  T ...  T  T  T  T  T ...
Correct: βœ“  βœ“  βœ“ ... βœ“  βœ—  βœ“  βœ“  βœ“ ...

Only 1 misprediction! 1 failure per loop
99%+ accuracy

3.3 Branch History Table (BHT)

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         Branch History Table (BHT)              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                 β”‚
β”‚   PC[9:2] ───▢ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”                     β”‚
β”‚                β”‚ Index   β”‚                     β”‚
β”‚                β”‚  256    β”‚                     β”‚
β”‚                β”‚ entries β”‚                     β”‚
β”‚                β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜                     β”‚
β”‚                     β”‚                          β”‚
β”‚                     β–Ό                          β”‚
β”‚              2-bit counter                     β”‚
β”‚                                                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Uses part of PC as index
- Each entry stores a 2-bit counter
- Multiple branches can map to same index (aliasing)

3.4 Correlating Predictor

Idea: Other branch outcomes affect current branch

Example code:
if (a == 2)     // B1
    a = 0;
if (b == 2)     // B2
    b = 0;
if (a != b)     // B3: Correlated with B1, B2 outcomes!
    ...

(m, n) predictor:
- m: Record last m branch outcomes (Global History)
- n: n-bit counter

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚         (2, 2) Correlating Predictor     β”‚
β”‚                                          β”‚
β”‚  Global History: 2 bits (00, 01, 10, 11) β”‚
β”‚  Per history: 2-bit counter              β”‚
β”‚                                          β”‚
β”‚  Total of 4 2-bit counters               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

3.5 Tournament Predictor

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           Tournament Predictor                  β”‚
β”‚                                                 β”‚
β”‚         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                        β”‚
β”‚  PC ───▢│  Selector   β”‚ (2-bit counter)        β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜                        β”‚
β”‚                β”‚                               β”‚
β”‚         β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”                        β”‚
β”‚         β–Ό             β–Ό                        β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                 β”‚
β”‚  β”‚   Local    β”‚ β”‚   Global   β”‚                 β”‚
β”‚  β”‚ Predictor  β”‚ β”‚ Predictor  β”‚                 β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜                 β”‚
β”‚         β”‚             β”‚                        β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜                        β”‚
β”‚                β–Ό                               β”‚
β”‚           MUX (select)                         β”‚
β”‚                β”‚                               β”‚
β”‚                β–Ό                               β”‚
β”‚            Prediction                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

- Local: Uses individual branch history
- Global: Uses overall branch history
- Selector: Learns which predictor is more accurate

4. Branch Target Buffer

BTB (Branch Target Buffer)

Stores whether instruction is a branch + target address

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Branch Target Buffer               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚    Tag    β”‚  Target  β”‚  Prediction (optional)  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 0x1000... β”‚  0x2000  β”‚        ST               β”‚
β”‚ 0x1100... β”‚  0x1500  β”‚        WT               β”‚
β”‚ 0x2000... β”‚  0x1800  β”‚        SNT              β”‚
β”‚    ...    β”‚   ...    β”‚        ...              β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

BTB lookup in IF stage:
- Hit + Taken prediction: Jump to target immediately
- Miss: Normal execution, update BTB later

BTB Operation Flow

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                IF Stage                          β”‚
β”‚                                                 β”‚
β”‚   PC ───▢ BTB Lookup                            β”‚
β”‚              β”‚                                  β”‚
β”‚              β–Ό                                  β”‚
β”‚         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”                             β”‚
β”‚         β”‚  Hit?  β”‚                             β”‚
β”‚         β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜                             β”‚
β”‚              β”‚                                  β”‚
β”‚      No      β”‚      Yes                        β”‚
β”‚      β–Ό       β”‚       β–Ό                         β”‚
β”‚   PC + 4     β”‚   Taken prediction?             β”‚
β”‚              β”‚       β”‚                          β”‚
β”‚              β”‚   Yes β”‚  No                     β”‚
β”‚              β”‚    β–Ό  β”‚   β–Ό                     β”‚
β”‚              β”‚  Target  PC + 4                 β”‚
β”‚              β”‚   addr                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

5. Speculative Execution

Concept

Execute instructions on predicted path before branch result confirmed

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           Speculative Execution                 β”‚
β”‚                                                 β”‚
β”‚   beq prediction: Taken                         β”‚
β”‚      β”‚                                          β”‚
β”‚      β–Ό                                          β”‚
β”‚   Execute target instructions speculatively     β”‚
β”‚      β”‚                                          β”‚
β”‚      β–Ό                                          β”‚
β”‚   Branch result confirmed                       β”‚
β”‚      β”‚                                          β”‚
β”‚   β”Œβ”€β”€β”΄β”€β”€β”                                      β”‚
β”‚   β”‚     β”‚                                      β”‚
β”‚ Correct Wrong                                   β”‚
β”‚ prediction prediction                           β”‚
β”‚   β”‚     β”‚                                      β”‚
β”‚   β–Ό     β–Ό                                      β”‚
β”‚ Commit Discard                                  β”‚
β”‚ result (flush)                                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Misprediction Recovery

On misprediction:
1. Pipeline flush (remove speculative instructions)
2. Restore register state
3. Re-execute on correct path

Recovery cost:
- Proportional to pipeline depth
- Modern processors: 10-20 cycle penalty

Impact of Misprediction Rate

CPI = CPI_base + (branch ratio) Γ— (misprediction rate) Γ— (penalty)

Example:
- CPI_base = 1
- Branch ratio = 20%
- Penalty = 15 cycles

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Misprediction  β”‚      CPI         β”‚
β”‚    Rate        β”‚                  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚     10%        β”‚ 1 + 0.2Γ—0.1Γ—15   β”‚
β”‚                β”‚ = 1.30           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      5%        β”‚ 1 + 0.2Γ—0.05Γ—15  β”‚
β”‚                β”‚ = 1.15           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      2%        β”‚ 1 + 0.2Γ—0.02Γ—15  β”‚
β”‚                β”‚ = 1.06           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      1%        β”‚ 1 + 0.2Γ—0.01Γ—15  β”‚
β”‚                β”‚ = 1.03           β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

6. Practice Problems

Basic Problems

  1. What are the 3 static branch prediction methods?

  2. What are the 4 states of a 2-bit predictor?

  3. For the sequence below, what are the predictions of a 1-bit predictor (initial state T)? T T N T T N T T

Analysis Problems

  1. How does a 2-bit predictor (initial ST) behave for this loop?
for (int i = 0; i < 4; i++) {
    // loop body
}
  1. With 25% branch ratio, 10 cycle penalty, and 95% prediction accuracy, what is the CPI?

Advanced Problems

  1. Explain why tournament predictors are better than single predictors.

  2. Compare branch handling with and without BTB.

Answers 1. Always Not Taken, Always Taken, BTFN (Backward Taken, Forward Not Taken) 2. Strongly Not Taken (00), Weakly Not Taken (01), Weakly Taken (10), Strongly Taken (11) 3.
Actual: T T N T T N T T
Pred:   T T T N T T N T
State:  T T N T T N T T
Correct: βœ“ βœ“ βœ— βœ— βœ“ βœ— βœ— βœ“  (4/8 = 50%)
4.
Iterations: T T T N (4 iterations)
State: ST ST ST WT WT
Pred:  T  T  T  T  T
Correct: βœ“  βœ“  βœ“  βœ—
1 failure in 4 = 75% accuracy 5. CPI = 1 + 0.25 Γ— 0.05 Γ— 10 = 1.125 6. Tournament predictors use both Local and Global predictors, selectively choosing the more accurate one for each branch. Some branches have important local patterns while others have important global correlations. 7. - Without BTB: Branch type and target discovered only at ID or EX stage, penalty incurred - With BTB: On BTB hit in IF stage, can jump to target immediately, minimizing penalty

Next Steps


References

to navigate between lessons