Core Algorithms Cheat Sheet: Sorting, Graphs, Greedy, DP, Strings, NP

This is an overview of core algorithms and analysis techniques likely seen in CS: sorting, graphs, unionfind and MST, greedy vs dynamic programming, string matching, and an intro to NP.

Asymptotic Analysis & Techniques

Big O, Big Theta, Big Omega
Describe how runtime or memory grows with input size n. For algorithms we care about the worst case O(…).
Common runtimes
- O(1) -> constant
- O(log n) -> binary search
- O(n) -> linear scan
- O(n log n) -> good comparison sorts, many divide & conquer algorithms
- O(n²) -> simple nested loops (naive DP, insertion sort worst case)
- O(2ⁿ), O(n!) -> exponential / factorial (brute force subset / permutation search)
Recurrences
Divide & conquer algorithms are often expressed as recurrences T(n) = 2T(n/2) + O(n) for mergesort, solved using the Master theorem.
Amortized analysis
Average cost per operation over a sequence, even if single operations may be expensive (example: dynamic array resize, unionfind path compression).

Sorting Algorithms

Insertion Sort

Idea
Build the sorted array one element at a time by inserting each element into the already sorted prefix.
Complexity
- Best: O(n) (already sorted)
- Average/Worst: O(n²)
- In place, stable
C++ example

void insertion_sort(std::vector<int>& v) {
    int n = (int)v.size();
    for (int i = 1; i < n; ++i) {
        int key = v[i];
        int j = i - 1;
        while (j >= 0 && v[j] > key) {
            v[j + 1] = v[j];
            --j;
        }
        v[j + 1] = key;
    }
}

Mergesort

Idea
Divide array into halves, sort each half recursively, then merge two sorted halves.
Complexity
- Always O(n log n)
- Stable, not in place (needs O(n) extra space)

void merge(std::vector<int>& a, int l, int m, int r) {
    std::vector<int> tmp;
    int i = l, j = m + 1;
    while (i <= m && j <= r) {
        if (a[i] <= a[j]) tmp.push_back(a[i++]);
        else              tmp.push_back(a[j++]);
    }
    while (i <= m) tmp.push_back(a[i++]);
    while (j <= r) tmp.push_back(a[j++]);
    for (int k = 0; k < (int)tmp.size(); ++k) {
        a[l + k] = tmp[k];
    }
}

void mergesort(std::vector<int>& a, int l, int r) {
    if (l >= r) return;
    int m = l + (r - l) / 2;
    mergesort(a, l, m);
    mergesort(a, m + 1, r);
    merge(a, l, m, r);
}

Quicksort

Idea
Choose a pivot, partition array into < pivot and >= pivot, recursively sort partitions.
Complexity
- Average: O(n log n)
- Worst: O(n²) (bad pivot choices, e.g. sorted input with naive pivot)
- In place, not stable; great constants -> used in practice

int partition(std::vector<int>& a, int l, int r) {
    int pivot = a[r];
    int i = l;
    for (int j = l; j < r; ++j) {
        if (a[j] < pivot) {
            std::swap(a[i], a[j]);
            ++i;
        }
    }
    std::swap(a[i], a[r]);
    return i;
}

void quicksort(std::vector<int>& a, int l, int r) {
    if (l >= r) return;
    int p = partition(a, l, r);
    quicksort(a, l, p - 1);
    quicksort(a, p + 1, r);
}

Heapsort

Idea
Build a max heap from the array, repeatedly extract the maximum to the end.
Complexity
- O(n log n) worst case
- In place, not stable

void heapify(std::vector<int>& a, int n, int i) {
    int largest = i;
    int L = 2 * i + 1, R = 2 * i + 2;
    if (L < n && a[L] > a[largest]) largest = L;
    if (R < n && a[R] > a[largest]) largest = R;
    if (largest != i) {
        std::swap(a[i], a[largest]);
        heapify(a, n, largest);
    }
}

void heapsort(std::vector<int>& a) {
    int n = (int)a.size();
    for (int i = n / 2 - 1; i >= 0; --i)
        heapify(a, n, i);
    for (int i = n - 1; i > 0; --i) {
        std::swap(a[0], a[i]);
        heapify(a, i, 0);
    }
}

Graphs & Traversals

Representation
- Adjacency list: vector<vector<int>> -> good for sparse graphs.
- Adjacency matrix: vector<vector<bool>> -> simple, but O(n²) memory.
BFS (Breadth First Search)
Level by level traversal using a queue. Finds shortest path in unweighted graphs.
DFS (Depth First Search)
Goes as deep as possible along each branch using recursion or a stack. Used for cycle detection, topological sort, connected components.

BFS Example (Shortest Path in Unweighted Graph)

std::vector<int> bfs_shortest_paths(
    const std::vector<std::vector<int>>& adj, int start) {

    int n = (int)adj.size();
    std::vector<int> dist(n, -1);
    std::queue<int> q;
    dist[start] = 0;
    q.push(start);

    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : adj[u]) {
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return dist;
}

Dijkstra's Algorithm (Shortest Path)

Use case
Single source shortest paths on a weighted graph with non negative edge weights.
Complexity
- Using binary heap (priority queue): O((V + E) log V)
Key idea
Greedily expand the node with the smallest current distance, once popped from the priority queue, its distance is final.

using Edge = std::pair<int,int>; // (neighbor, weight)

std::vector<int> dijkstra(
    const std::vector<std::vector<Edge>>& adj, int src) {

    const int INF = 1e9;
    int n = (int)adj.size();
    std::vector<int> dist(n, INF);
    dist[src] = 0;

    using State = std::pair<int,int>; // (dist, node)
    std::priority_queue<State, std::vector<State>,
                        std::greater<State>> pq;
    pq.push({0, src});

    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d != dist[u]) continue; // stale
        for (auto [v, w] : adj[u]) {
            if (dist[v] > dist[u] + w) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}

unionfind (Disjoint Set Union, DSU)

Purpose
Maintain a collection of disjoint sets with two main operations: find(x) (which set?) and union(x,y) (merge sets). Used heavily in Kruskal's MST algorithm.
Optimizations
- Path compression in find()
- Union by rank/size
- Amortized almost O(1) per operation

struct DSU {
    std::vector<int> parent, sz;

    DSU(int n) : parent(n), sz(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        if (parent[x] != x)
            parent[x] = find(parent[x]); // path compression
        return parent[x];
    }

    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) std::swap(a, b);
        parent[b] = a;
        sz[a] += sz[b];
        return true;
    }
};

Minimum Spanning Trees (MST)

For a connected, weighted, undirected graph, an MST is a subset of edges connecting all vertices with minimum total weight and no cycles.

Kruskal's Algorithm (with unionfind)

Idea
1. Sort all edges by weight.
2. Process edges from smallest to largest.
3. Add an edge if it connects two different components (no cycle), using unionfind.
Complexity
O(E log E) due to sorting, unionfind operations are almost O(1).

struct Edge {
    int u, v, w;
};

int kruskal(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) {
        return a.w < b.w;
    });

    DSU dsu(n);
    int total = 0;
    for (auto &e : edges) {
        if (dsu.unite(e.u, e.v)) {
            total += e.w;
        }
    }
    return total;
}

Prim's Algorithm (Alternative MST)

Idea
Start from any node, always add the cheapest edge that connects the current tree to a new vertex (greedy, like Dijkstra but with edge weights only).
Complexity
O((V + E) log V) with a heap.

Topological Sort

Use case
Ordering of vertices in a directed acyclic graph (DAG) so every edge u->v goes from earlier to later in the order.
Applications
Task scheduling with dependencies, resolving build order, etc.
Algorithms
- DFS based: push node to list after exploring all neighbors, then reverse list.
- Kahn’s algorithm: repeatedly remove nodes with in degree 0.

// Kahn's algorithm
std::vector<int> topo_sort(const std::vector<std::vector<int>>& adj) {
    int n = (int)adj.size();
    std::vector<int> indeg(n, 0);
    for (int u = 0; u < n; ++u)
        for (int v : adj[u])
            ++indeg[v];

    std::queue<int> q;
    for (int i = 0; i < n; ++i)
        if (indeg[i] == 0) q.push(i);

    std::vector<int> order;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        order.push_back(u);
        for (int v : adj[u]) {
            if (--indeg[v] == 0)
                q.push(v);
        }
    }
    // If order.size() < n, graph had a cycle
    return order;
}

Greedy Algorithms

Key idea
At each step, pick the locally optimal choice and hope it leads to a global optimum. Correctness usually proven using "greedy choice" and "optimal substructure".
Examples
- Interval scheduling (select maximum number of non overlapping intervals).
- Activity selection, Huffman coding, MST (Kruskal, Prim), Dijkstra.

Greedy Example: Interval Scheduling

Choose maximum number of non overlapping intervals by sorting by end time.

struct Interval {
    int start, end;
};

int max_non_overlapping(std::vector<Interval> intervals) {
    std::sort(intervals.begin(), intervals.end(),
              [](const Interval& a, const Interval& b) {
        return a.end < b.end;
    });

    int count = 0;
    int last_end = std::numeric_limits<int>::min();
    for (auto &iv : intervals) {
        if (iv.start >= last_end) {
            ++count;
            last_end = iv.end;
        }
    }
    return count;
}

Dynamic Programming (DP)

Key ideas
- Optimal substructure: optimal solution is composed of optimal sub solutions.
- Overlapping subproblems: same subproblems appear many times.
Patterns
- 1D DP (e.g., Fibonacci, LIS length).
- 2D DP (e.g., edit distance, knapsack, DP on grid).
Approaches
- Top down (memoization)
- Bottom up (tabulation)

DP Example: 0/1 Knapsack (Bottom Up)

int knapsack(const std::vector<int>& w,
             const std::vector<int>& v,
             int W) {
    int n = (int)w.size();
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(W + 1, 0));

    for (int i = 1; i <= n; ++i) {
        for (int cap = 0; cap <= W; ++cap) {
            dp[i][cap] = dp[i - 1][cap]; // skip item
            if (cap - w[i - 1] >= 0) {
                dp[i][cap] = std::max(dp[i][cap],
                                      dp[i - 1][cap - w[i - 1]] + v[i - 1]);
            }
        }
    }
    return dp[n][W];
}

String Matching

Problem
Given text T and pattern P, find all occurrences of P in T.
Naive algorithm
Try aligning P at each position in T -> O(nm) worst case.
Better algorithms
- KMP (Knuth Morris Pratt): builds prefix function, O(n + m).
- Boyer Moore / Boyer Moore Horspool: skips ahead using character heuristics, good in practice.
- Aho Corasick: multi pattern matching, builds automaton (trie + failure links).

Naive String Matching

std::vector<int> find_naive(
    const std::string& text, const std::string& pat) {

    std::vector<int> pos;
    int n = (int)text.size();
    int m = (int)pat.size();
    for (int i = 0; i + m <= n; ++i) {
        int j = 0;
        while (j < m && text[i + j] == pat[j]) {
            ++j;
        }
        if (j == m) pos.push_back(i);
    }
    return pos;
}

KMP Prefix Function

std::vector<int> build_lps(const std::string& pat) {
    int m = (int)pat.size();
    std::vector<int> lps(m, 0);
    int len = 0;
    for (int i = 1; i < m; ) {
        if (pat[i] == pat[len]) {
            lps[i++] = ++len;
        } else if (len > 0) {
            len = lps[len - 1];
        } else {
            lps[i++] = 0;
        }
    }
    return lps;
}

std::vector<int> kmp_search(
    const std::string& text, const std::string& pat) {

    std::vector<int> pos;
    if (pat.empty()) return pos;

    auto lps = build_lps(pat);
    int i = 0, j = 0;
    while (i < (int)text.size()) {
        if (text[i] == pat[j]) {
            ++i; ++j;
            if (j == (int)pat.size()) {
                pos.push_back(i - j);
                j = lps[j - 1];
            }
        } else if (j > 0) {
            j = lps[j - 1];
        } else {
            ++i;
        }
    }
    return pos;
}

Intro to NP, NP Complete, NP Hard

P
Class of decision problems solvable in polynomial time by a deterministic machine.
NP
Class of decision problems whose “yes” solutions can be verified in polynomial time given a certificate (or solvable in poly time by a nondeterministic machine).
NP Complete
Hardest problems in NP: every NP problem can be reduced to them in polynomial time, and they are themselves in NP (SAT, 3-SAT, Clique, Hamiltonian path).
NP Hard
At least as hard as NP complete, but not necessarily in NP (may not even have efficiently verifiable certificates).
P vs NP
Open problem: is P = NP or P ≠ NP? We don’t know.
Reductions
Show problem A is at least as hard as B by transforming instances of B into instances of A in polynomial time. Used to prove NP completeness.

Takeaways

Sorting: Know insertion sort (concept), mergesort, quicksort, heapsort and their time/space trade offs.
Graphs: Comfort with BFS/DFS, Dijkstra, topological sort, and basic representations.
MST + unionfind: Kruskal uses unionfind, Prim is an alternative based on greedy expansion.
Greedy vs DP: Greedy = local choices, DP = explicit subproblem states & transitions.
String Matching: Naive is simple, KMP and BM/BMH/AC matter when performance matters.
Complexity & NP: Understand Big O, recurrences, and the high level meaning of P, NP, NP complete.

← Back to blog