Topological Sort

Topological Sort

Overview

Topological sort is an algorithm for linearly ordering vertices of a Directed Acyclic Graph (DAG). For every edge (u, v), u appears before v in the ordering.


Table of Contents

  1. Topological Sort Concepts
  2. Kahn's Algorithm (BFS)
  3. DFS-based Topological Sort
  4. Cycle Detection
  5. Application Problems
  6. Practice Problems

1. Topological Sort Concepts

1.1 DAG (Directed Acyclic Graph)

DAG: Directed Acyclic Graph
- Has directed edges
- Has no cycles

Topological sort is possible: Graph must be a DAG

Example: Course registration

CourseA  CourseB  CourseD
           
CourseC  CourseE

Valid course orders:
A  B  C  D  E  or
A  C  B  D  E  or
A  C  B  E  D

1.2 In-degree and Out-degree

In-degree: Number of edges coming into a vertex
Out-degree: Number of edges going out from a vertex

Example:
    A → B
    ↓   ↓
    C → D

Vertex | In  | Out
-------|-----|-----
  A    |  0  |  2
  B    |  1  |  1
  C    |  1  |  1
  D    |  2  |  0

Vertex with in-degree 0: Starting point (no dependencies)

1.3 Properties of Topological Order

- Multiple topological orders may exist
- Topological sort is only possible on DAGs
- Impossible when cycle exists

Topological sort algorithms:
1. Kahn's Algorithm (BFS) - O(V + E)
2. DFS-based - O(V + E)

2. Kahn's Algorithm (BFS)

2.1 Algorithm Principle

1. Calculate in-degree for all vertices
2. Insert vertices with in-degree 0 into queue
3. Remove vertex from queue and add to result
4. Remove outgoing edges (decrease in-degree of adjacent vertices)
5. Insert newly zero in-degree vertices into queue
6. Repeat until queue is empty

Visualization:
Initial: in_degree = [0, 1, 1, 2]
      A(0)  B(1)  D(2)
              
       C(1) ←──┘

Step 1: Select A (in-degree 0)
        Queue: [A], Result: [A]
        Decrease B, C in-degree  B(0), C(0)

Step 2: Select B (or C)
        Queue: [B, C], Result: [A, B]
        Decrease D in-degree  D(1)

Step 3: Select C
        Queue: [C], Result: [A, B, C]
        Decrease D in-degree  D(0)

Step 4: Select D
        Queue: [D], Result: [A, B, C, D]

Final: [A, B, C, D]

2.2 Implementation

from collections import deque

def topological_sort_kahn(n, edges):
    """
    Kahn's Algorithm (BFS-based topological sort)
    n: Number of vertices (0 to n-1)
    edges: [(u, v), ...] u → v edges
    Time: O(V + E)
    """
    # Build graph and calculate in-degrees
    graph = [[] for _ in range(n)]
    in_degree = [0] * n

    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    # Start with vertices having in-degree 0
    queue = deque()
    for i in range(n):
        if in_degree[i] == 0:
            queue.append(i)

    result = []

    while queue:
        node = queue.popleft()
        result.append(node)

        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    # Check for cycle
    if len(result) != n:
        return []  # Cycle exists

    return result


# Example
n = 6
edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
print(topological_sort_kahn(n, edges))  # [4, 5, 2, 0, 3, 1] or other valid order

2.3 C++ Implementation

#include <vector>
#include <queue>
using namespace std;

vector<int> topologicalSortKahn(int n, vector<pair<int, int>>& edges) {
    vector<vector<int>> graph(n);
    vector<int> inDegree(n, 0);

    for (auto& [u, v] : edges) {
        graph[u].push_back(v);
        inDegree[v]++;
    }

    queue<int> q;
    for (int i = 0; i < n; i++) {
        if (inDegree[i] == 0) {
            q.push(i);
        }
    }

    vector<int> result;

    while (!q.empty()) {
        int node = q.front();
        q.pop();
        result.push_back(node);

        for (int neighbor : graph[node]) {
            if (--inDegree[neighbor] == 0) {
                q.push(neighbor);
            }
        }
    }

    if (result.size() != n) {
        return {};  // Cycle exists
    }

    return result;
}

3. DFS-based Topological Sort

3.1 Algorithm Principle

1. Perform DFS on all vertices
2. Push to stack when DFS finishes
3. Output stack in reverse order

Idea:
- When DFS on vertex v finishes, all reachable vertices are already processed
- Therefore, vertices pushed later to stack come first in topological order

Visualization:
    A  B  D
    
    C

DFS visit order: A  B  D(done)  B(done)  C(done)  A(done)
Stack (finish order): [D, B, C, A]
Result (reverse): [A, C, B, D] or [A, B, D, C]

3.2 Implementation

def topological_sort_dfs(n, edges):
    """
    DFS-based topological sort
    Time: O(V + E)
    """
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    visited = [0] * n  # 0: unvisited, 1: visiting, 2: done
    result = []
    has_cycle = False

    def dfs(node):
        nonlocal has_cycle
        if has_cycle:
            return

        visited[node] = 1  # Visiting

        for neighbor in graph[node]:
            if visited[neighbor] == 1:
                # Met a visiting node → cycle
                has_cycle = True
                return
            if visited[neighbor] == 0:
                dfs(neighbor)

        visited[node] = 2  # Done
        result.append(node)

    for i in range(n):
        if visited[i] == 0:
            dfs(i)

    if has_cycle:
        return []

    return result[::-1]  # Reverse


# Example
n = 6
edges = [(5, 2), (5, 0), (4, 0), (4, 1), (2, 3), (3, 1)]
print(topological_sort_dfs(n, edges))  # [5, 4, 2, 3, 1, 0] or other valid order

3.3 Iterative with Stack

def topological_sort_iterative(n, edges):
    """Iterative DFS topological sort"""
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    visited = [False] * n
    result = []

    for start in range(n):
        if visited[start]:
            continue

        stack = [(start, False)]  # (node, processed)

        while stack:
            node, processed = stack.pop()

            if processed:
                result.append(node)
                continue

            if visited[node]:
                continue

            visited[node] = True
            stack.append((node, True))  # Re-insert for completion processing

            for neighbor in graph[node]:
                if not visited[neighbor]:
                    stack.append((neighbor, False))

    return result[::-1]

4. Cycle Detection

4.1 Cycle Detection with Kahn's Algorithm

def has_cycle_kahn(n, edges):
    """
    If topological sort result length is less than n, cycle exists
    """
    result = topological_sort_kahn(n, edges)
    return len(result) != n

# Case with cycle
edges_with_cycle = [(0, 1), (1, 2), (2, 0)]  # 0 → 1 → 2 → 0
print(has_cycle_kahn(3, edges_with_cycle))  # True

4.2 DFS Color-based Cycle Detection

def has_cycle_dfs(n, edges):
    """
    Cycle detection using DFS coloring
    WHITE(0): Unvisited
    GRAY(1): Visiting (in recursion stack)
    BLACK(2): Done
    """
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n

    def dfs(node):
        color[node] = GRAY

        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                return True  # Cycle found
            if color[neighbor] == WHITE:
                if dfs(neighbor):
                    return True

        color[node] = BLACK
        return False

    for i in range(n):
        if color[i] == WHITE:
            if dfs(i):
                return True

    return False

4.3 Finding Cycle Path

def find_cycle(n, edges):
    """Return cycle path if exists"""
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n
    parent = [-1] * n

    def dfs(node):
        color[node] = GRAY

        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                # Cycle found! Restore path
                cycle = [neighbor]
                curr = node
                while curr != neighbor:
                    cycle.append(curr)
                    curr = parent[curr]
                cycle.append(neighbor)
                return cycle[::-1]

            if color[neighbor] == WHITE:
                parent[neighbor] = node
                result = dfs(neighbor)
                if result:
                    return result

        color[node] = BLACK
        return None

    for i in range(n):
        if color[i] == WHITE:
            result = dfs(i)
            if result:
                return result

    return None

# Example
edges = [(0, 1), (1, 2), (2, 3), (3, 1)]  # 1 → 2 → 3 → 1 cycle
print(find_cycle(4, edges))  # [1, 2, 3, 1]

5. Application Problems

5.1 Course Schedule (Basic)

def can_finish(num_courses, prerequisites):
    """
    Check if all courses can be taken
    prerequisites[i] = [a, b] : must take b before a
    """
    result = topological_sort_kahn(num_courses,
                                    [(b, a) for a, b in prerequisites])
    return len(result) == num_courses

# LeetCode 207. Course Schedule
print(can_finish(2, [[1, 0]]))  # True: 0 → 1
print(can_finish(2, [[1, 0], [0, 1]]))  # False: cycle

5.2 Find Course Order

def find_order(num_courses, prerequisites):
    """
    Return possible course order
    Return empty list if impossible
    """
    return topological_sort_kahn(num_courses,
                                  [(b, a) for a, b in prerequisites])

# LeetCode 210. Course Schedule II
print(find_order(4, [[1, 0], [2, 0], [3, 1], [3, 2]]))
# [0, 1, 2, 3] or [0, 2, 1, 3]

5.3 Build Order

def build_order(projects, dependencies):
    """
    Determine project build order
    dependencies: [(a, b), ...] - build b after a
    """
    # Project name → index mapping
    proj_to_idx = {p: i for i, p in enumerate(projects)}
    n = len(projects)

    edges = [(proj_to_idx[b], proj_to_idx[a]) for a, b in dependencies]
    order = topological_sort_kahn(n, edges)

    if not order:
        return None  # Circular dependency

    return [projects[i] for i in order]

# Example
projects = ['a', 'b', 'c', 'd', 'e', 'f']
dependencies = [('a', 'd'), ('f', 'b'), ('b', 'd'), ('f', 'a'), ('d', 'c')]
print(build_order(projects, dependencies))  # ['e', 'f', 'c', 'b', 'd', 'a'] etc.

5.4 Task Scheduling (Earliest Completion Time)

def earliest_completion(n, edges, times):
    """
    Calculate earliest completion time for each task
    edges: [(u, v), ...] v can start after u completes
    times[i]: Duration of task i
    """
    from collections import deque

    graph = [[] for _ in range(n)]
    in_degree = [0] * n

    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    # earliest[i] = earliest start time for task i
    earliest = [0] * n

    queue = deque()
    for i in range(n):
        if in_degree[i] == 0:
            queue.append(i)

    while queue:
        node = queue.popleft()

        for neighbor in graph[node]:
            # Update neighbor's start time
            earliest[neighbor] = max(earliest[neighbor],
                                      earliest[node] + times[node])
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    # Calculate completion times
    completion = [earliest[i] + times[i] for i in range(n)]
    return completion, max(completion)

# Example
n = 4
edges = [(0, 2), (1, 2), (2, 3)]
times = [3, 2, 5, 4]
completion, total = earliest_completion(n, edges, times)
print(f"Completion times: {completion}")  # [3, 2, 8, 12]
print(f"Total duration: {total}")  # 12

5.5 Lexicographically Smallest Topological Order

import heapq

def lexicographic_topological_sort(n, edges):
    """
    Lexicographically smallest topological order
    Using min heap
    """
    graph = [[] for _ in range(n)]
    in_degree = [0] * n

    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    # Use min heap
    heap = []
    for i in range(n):
        if in_degree[i] == 0:
            heapq.heappush(heap, i)

    result = []

    while heap:
        node = heapq.heappop(heap)
        result.append(node)

        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                heapq.heappush(heap, neighbor)

    return result if len(result) == n else []

# Example
n = 4
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(lexicographic_topological_sort(n, edges))  # [0, 1, 2, 3]

5.6 Count Topological Sort Orderings

def count_topological_sorts(n, edges):
    """
    Count possible topological orderings (backtracking)
    Warning: Exponential time complexity
    """
    graph = [[] for _ in range(n)]
    in_degree = [0] * n

    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    count = 0

    def backtrack(path):
        nonlocal count

        if len(path) == n:
            count += 1
            return

        for i in range(n):
            if i not in path and in_degree[i] == 0:
                # Choose
                for neighbor in graph[i]:
                    in_degree[neighbor] -= 1
                path.add(i)

                backtrack(path)

                # Restore
                path.remove(i)
                for neighbor in graph[i]:
                    in_degree[neighbor] += 1

    backtrack(set())
    return count

# Example (use only for small n)
n = 4
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(count_topological_sorts(n, edges))  # 2: [0,1,2,3], [0,2,1,3]

5.7 Alien Dictionary

def alien_order(words):
    """
    Infer alien alphabet order
    words: List of alien words sorted in dictionary order
    """
    from collections import defaultdict

    # Collect all characters
    chars = set()
    for word in words:
        chars.update(word)

    graph = defaultdict(set)
    in_degree = {c: 0 for c in chars}

    # Compare adjacent words to infer order
    for i in range(len(words) - 1):
        w1, w2 = words[i], words[i + 1]

        # Exception: w1 is prefix of w2 but longer is impossible
        if len(w1) > len(w2) and w1.startswith(w2):
            return ""

        for c1, c2 in zip(w1, w2):
            if c1 != c2:
                if c2 not in graph[c1]:
                    graph[c1].add(c2)
                    in_degree[c2] += 1
                break

    # Topological sort
    from collections import deque
    queue = deque([c for c in chars if in_degree[c] == 0])
    result = []

    while queue:
        c = queue.popleft()
        result.append(c)

        for neighbor in graph[c]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(result) != len(chars):
        return ""  # Cycle (invalid input)

    return "".join(result)

# LeetCode 269. Alien Dictionary
words = ["wrt", "wrf", "er", "ett", "rftt"]
print(alien_order(words))  # "wertf"

6. Practice Problems

Difficulty Problem Platform Type
⭐⭐ Line Up BOJ Basic topo sort
⭐⭐ Course Schedule LeetCode Cycle detection
⭐⭐ Course Schedule II LeetCode Topo sort order
⭐⭐⭐ Task BOJ Minimum time
⭐⭐⭐ Game Development BOJ Task scheduling
⭐⭐⭐ Problem Set BOJ Lexicographic topo
⭐⭐⭐⭐ Alien Dictionary LeetCode Order inference

Algorithm Comparison

┌─────────────────┬─────────────┬─────────────┬────────────────────┐
│ Algorithm       │ Time        │ Space       │ Characteristics    │
├─────────────────┼─────────────┼─────────────┼────────────────────┤
│ Kahn (BFS)      │ O(V + E)    │ O(V)        │ In-degree based    │
│ DFS-based       │ O(V + E)    │ O(V)        │ Finish reverse     │
├─────────────────┼─────────────┼─────────────┼────────────────────┤
│ Lexicographic   │ O((V+E)logV)│ O(V)        │ Uses min heap      │
└─────────────────┴─────────────┴─────────────┴────────────────────┘

V = number of vertices, E = number of edges

Next Steps


References

to navigate between lessons