Assumes worst case unless otherwise stated:

Data Structure | Access | Search | Insert | Delete |
---|---|---|---|---|
Array | $O(1)$ | $O(n)$ | $O(n)$ | $O(n)$ |
Stack (LIFO) | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
Queue (FIFO) | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
Singly-Linked List | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
Doubly-Linked List | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
Hash Table | NA | $O(n)$ | $O(n)$ | $O(n)$ |
Heap | NA | $O(n)$ | $O(\log n)$ | $O(\log n)$ |
BST (Worst) | $O(n)$ | $O(n)$ | $O(n)$ | $O(n)$ |
AVL Tree | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
Red-Black Tree | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
B-Tree | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ | $O(\log n)$ |
Algorithm | In-Place | Stable | Best | Average | Worst | Space (Worst) |
---|---|---|---|---|---|---|
Quicksort | 1 | 0 | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(\log n)$ |
Mergesort | 0 | 1 | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ |
Heapsort | 1 | 0 | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(1)$ |
Bubble Sort | 1 | 1 | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ |
Insertion Sort | 1 | 1 | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ |
Selection Sort | 1 | 0 | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ |
Graph Problem | Algorithm | Time | Space |
---|---|---|---|
SSSP, no negative cycles | Bellman-Ford | $O(VE)$ | $O(V)$ |
SSSP, no negative weight edges | Dijkstra's | $O((V + E) \log V)$ | $O(V)$ |
SSSP, unweighted/equal weight edges | BFS (not DFS) | $O(V + E)$ | $O(V)$ |
SSSP, directed acyclic (DAG) | Topological Sort | $O(V + E)$ | $O(V)$ |
SSSP, tree | BFS/DFS | $O(V)$ | $O(V)$ |
Path with fewest edges | BFS | $O(V + E)$ | $O(V)$ |
APSP | Floyd-Warshall | $O(V^3)$ | $O(V^2)$ |
SCC | Kosaraju's | $O(V + E)$ | $O(V)$ |
MST | Kruskal's | $O(E \log V)$ | $O(V)$ |
MST | Prim's | $O(E \log V)$ | $O(V)$ |
Condition | Upper Bound |
---|---|
Can only be used on a sorted list | $O(\log n)$ |
Sorting first (if needed) | $O(n \log n)$ |
```
# Given sorted array A, want to find index of value V
L = 0
R = A.length() - 1
while L <= R
    m = floor((L + R) / 2)
    if A[m] < V
        L = m + 1
    elif A[m] > V
        R = m - 1
    else
        return m
return notFound
```
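The pseudocode above can be sketched in Python; the function name and the `-1` sentinel (in place of `notFound`) are illustrative choices:

```python
def binary_search(a, v):
    """Return the index of v in sorted list a, or -1 if not found."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        m = (lo + hi) // 2      # middle index
        if a[m] < v:
            lo = m + 1          # discard left half
        elif a[m] > v:
            hi = m - 1          # discard right half
        else:
            return m
    return -1
```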
Continually swap the current element with the element at index + 1 if it is larger.

```
for i = A.length() - 1 to 1
    for j = 0 to i - 1
        if A[j] > A[j + 1]
            swap(A[j], A[j + 1])
```

- In-place
- Stable (only swaps if `>`, not `>=`)

Best case: $O(n)$ (already sorted; requires an early-exit check that stops when a pass makes no swaps)
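A Python sketch of bubble sort; the early-exit flag (an addition to the pseudocode above) is what gives the $O(n)$ best case on already-sorted input:

```python
def bubble_sort(a):
    """Sort list a in place; returns a for convenience."""
    n = len(a)
    for i in range(n - 1, 0, -1):
        swapped = False
        for j in range(i):
            if a[j] > a[j + 1]:          # only > (not >=), so the sort is stable
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:                  # no swaps this pass: already sorted
            break
    return a
```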
Assume the first element is sorted. Iterate through the unsorted section, inserting each element at the correct position in the sorted section.

```
# A[0] is trivially sorted
for i = 1 to A.length() - 1
    j = i
    while j > 0 && A[j] < A[j - 1]
        swap(A[j], A[j - 1])
        j--
```

- In-place
- Stable

Best case: $O(n)$ (already sorted; the inner while loop never runs)
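The insertion-sort pseudocode translates directly to Python:

```python
def insertion_sort(a):
    """Sort list a in place by growing a sorted prefix."""
    for i in range(1, len(a)):
        j = i
        # slide a[i] left until it sits in its correct position
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a
```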
Continually select the smallest element from the unsorted section and swap it with the starting element of the unsorted section (thereby growing a sorted section element by element).

```
for i = 0 to A.length() - 2
    minIndex = i
    for j = i + 1 to A.length() - 1
        if A[j] < A[minIndex]
            minIndex = j
    swap(A[i], A[minIndex])
```

- In-place
- Non-stable: for example, given `[2a, 2b, 1]`, `1` and `2a` will swap to become `[1, 2b, 2a]` after the first iteration.

Best case: $O(n^2)$ (the inner scan always runs in full)
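A Python sketch of selection sort following the pseudocode above:

```python
def selection_sort(a):
    """Sort list a in place by repeatedly selecting the minimum."""
    n = len(a)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):        # scan the unsorted section
            if a[j] < a[min_index]:
                min_index = j
        a[i], a[min_index] = a[min_index], a[i]   # this long swap breaks stability
    return a
```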
Procedure:

1. Heapify the array: $O(n)$
2. Poll the heap for the min/max key: $O(\log n)$
3. Repeat step 2 till the heap is empty: $n$ times

Time: $O(n \log n)$

- Useful to find the $k$ largest elements in sorted order (where $k < n$). Time: $O(n + k \log n)$
- In-place
- Non-stable
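The procedure above (heapify, then repeatedly extract the max into the shrinking tail) can be sketched in Python; this sketch uses 0-indexing, unlike the 1-indexed heap layout described later:

```python
def heapsort(a):
    """In-place heapsort using a max-heap rooted at index 0."""
    def sink(i, size):
        while 2 * i + 1 < size:
            child = 2 * i + 1
            if child + 1 < size and a[child + 1] > a[child]:
                child += 1                   # pick the larger child (max-heap)
            if a[i] >= a[child]:
                break
            a[i], a[child] = a[child], a[i]
            i = child

    n = len(a)
    for i in range(n // 2 - 1, -1, -1):      # heapify: O(n)
        sink(i, n)
    for end in range(n - 1, 0, -1):          # extract max n times: O(n log n)
        a[0], a[end] = a[end], a[0]          # move max to the sorted tail
        sink(0, end)
    return a
```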
```
function merge(int low, int mid, int high)
    for i = low to high
        buffer[i] = A[i]
    i = low
    j = mid + 1
    k = low
    while i <= mid and j <= high
        if buffer[i] <= buffer[j]
            A[k] = buffer[i]
            i++
        else
            A[k] = buffer[j]
            j++
        k++
    # leftover right-half elements are already in their final positions
    while i <= mid
        A[k] = buffer[i]
        k++
        i++
```

Time complexity of `merge` is $O(n)$, as the `while` loop condition is checked $O(n)$ times in the worst case (where $n$ = high - low + 1).

Time complexity of merge sort is $O(n \log n)$.
```
function mergeSort(A, low, high)
    if low < high
        mid = (high + low) / 2
        mergeSort(A, low, mid)
        mergeSort(A, mid + 1, high)
        merge(low, mid, high)
```

If `mergeSort` takes $T(n)$ time on $n$ elements:

- line 2 = $O(1)$
- line 3 = $O(1)$
- line 4 = $T(n/2)$
- line 5 = $T(n/2)$
- line 6 = $O(n)$

So $T(n) = 2T(n/2) + O(n) = O(n \log n)$.

- Not in-place (requires auxiliary buffer array)
- Stable
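Merging the two recursive steps into one function, the merge sort above can be sketched in Python (the buffer indexing is shifted to start at 0, which is a cosmetic difference from the pseudocode):

```python
def merge_sort(a, low=0, high=None):
    """Sort a[low..high] in place; stable because merge uses <=."""
    if high is None:
        high = len(a) - 1
    if low < high:
        mid = (low + high) // 2
        merge_sort(a, low, mid)
        merge_sort(a, mid + 1, high)
        buffer = a[low:high + 1]               # copy the range being merged
        i, j, k = 0, mid - low + 1, low
        while i <= mid - low and j <= high - low:
            if buffer[i] <= buffer[j]:         # <= keeps equal keys in order
                a[k] = buffer[i]; i += 1
            else:
                a[k] = buffer[j]; j += 1
            k += 1
        while i <= mid - low:                  # leftover left-half elements
            a[k] = buffer[i]; i += 1; k += 1
    return a
```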
```
function quicksort(A, low, high)
    if low < high
        p = partition(A, low, high)
        quicksort(A, low, p - 1)
        quicksort(A, p + 1, high)
```

```
# Lomuto's Algorithm
function partition(A, low, high)
    pivot = A[low]
    m = low                  # A[low+1..m] holds elements < pivot
    for i = (low + 1) to high
        if A[i] < pivot
            m++
            swap(A[i], A[m])
    swap(A[m], A[low])       # place pivot in its final position
    return m
```

```
# Hoare's Algorithm
function partition(A, low, high)
    pivot = A[low]
    i = low + 1  # Initialize left index
    j = high     # Initialize right index
    while i < j
        while (i <= high) and (A[i] < pivot)
            i++
        while (j >= low) and (A[j] > pivot)
            j--
        if (i < j)
            swap(A[i], A[j])
    swap(A[j], A[low])       # pivot ends up at index j
    return j
```

- In-place
- Non-stable
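A Python sketch of quicksort using the Lomuto partition above:

```python
def partition(a, low, high):
    """Lomuto partition with a[low] as the pivot; returns the pivot's index."""
    pivot = a[low]
    m = low                          # a[low+1..m] holds elements < pivot
    for i in range(low + 1, high + 1):
        if a[i] < pivot:
            m += 1
            a[i], a[m] = a[m], a[i]
    a[m], a[low] = a[low], a[m]      # place pivot in its final position
    return m

def quicksort(a, low=0, high=None):
    """Sort a[low..high] in place."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quicksort(a, low, p - 1)
        quicksort(a, p + 1, high)
    return a
```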
- Represented by a contiguous array where the root starts at index $1$. For a given node with index $x$:
  - Left child has index $2x$
  - Right child has index $2x + 1$
- Sink/Swim: choose the larger/smaller of the children to swap (corresponding to max/min heap) to keep the heap property
- Assuming a Max Heap, a parent's key will always be $\geq$ its 2 direct children's keys

Insert:

- Add the node at the end (first available leaf)
- Swim it up
- Requires $O(\log n)$ time

ExtractMax:

- Remove root
- Exchange last element with root
- Sink it down
- Requires $O(\log n)$ time

Delete (arbitrary node):

- Make the node infinitely large
- Swim it up to the root
- Run ExtractMax

Heapify:

```
function create(A)
    for i from A.length/2 to 1
        sink(i)
```

Given an unsorted array A of elements, heapify requires $O(n)$ time.

Time: $O(n)$
Space: $O(1)$
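The operations above can be sketched as a small Python class; the 1-indexed array layout matches the description (index 0 is left unused), and the class/method names are illustrative:

```python
class MaxHeap:
    """1-indexed array max-heap; children of node x are 2x and 2x+1."""
    def __init__(self, items=()):
        self.a = [None] + list(items)                # index 0 unused
        for i in range(len(self.a) // 2, 0, -1):     # heapify: O(n)
            self._sink(i)

    def _swim(self, i):
        while i > 1 and self.a[i] > self.a[i // 2]:
            self.a[i], self.a[i // 2] = self.a[i // 2], self.a[i]
            i //= 2

    def _sink(self, i):
        n = len(self.a) - 1
        while 2 * i <= n:
            c = 2 * i
            if c < n and self.a[c + 1] > self.a[c]:
                c += 1                               # larger child for a max-heap
            if self.a[i] >= self.a[c]:
                break
            self.a[i], self.a[c] = self.a[c], self.a[i]
            i = c

    def insert(self, key):                           # O(log n)
        self.a.append(key)
        self._swim(len(self.a) - 1)

    def extract_max(self):                           # O(log n)
        top = self.a[1]
        self.a[1] = self.a[-1]                       # exchange last with root
        self.a.pop()
        if len(self.a) > 1:
            self._sink(1)
        return top
```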
- Left subtree only contains nodes with keys strictly less than the node's key
- Right subtree only contains nodes with keys strictly more than the node's key
- No duplicate keys

For a BST with $n$ nodes, operations take $O(h)$ time, where the height $h$ is $O(n)$ in the worst case.

- Successor (next key immediately larger): go to the right child once, then go left all the way (furthest left child of the right child)
- Predecessor (next key immediately smaller): go to the left child once, then go right all the way (furthest right child of the left child)

```
function search(Key k)
    if k < m_key
        # search left subtree
        return m_leftTree.search(k)
    else if k > m_key
        # search right subtree
        return m_rightTree.search(k)
    return this
```

Time complexity of search: $O(h)$

Insert uses the same algorithm as search. Time complexity of insert: $O(h)$

Delete cases:

- Node not found: do nothing
- Node is a leaf: just remove it
- Node is an internal node with 1 child: remove the node and connect its child to its parent
- Node is an internal node with 2 children: swap the node with its successor, and recursively call delete on that node

Time complexity of delete: $O(h)$
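A minimal Python sketch of these BST operations; `insert` mirrors `search`, and `successor` follows the right-once-then-left-all-the-way rule (the class structure is illustrative):

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

    def insert(self, key):                    # same shape as search: O(h)
        if key < self.key:
            if self.left:  self.left.insert(key)
            else:          self.left = BSTNode(key)
        elif key > self.key:
            if self.right: self.right.insert(key)
            else:          self.right = BSTNode(key)
        # equal keys ignored (no duplicate keys)

    def search(self, key):                    # O(h); None if not found
        if key < self.key:
            return self.left.search(key) if self.left else None
        if key > self.key:
            return self.right.search(key) if self.right else None
        return self

    def successor(self):
        """Next key immediately larger: right child once, then left all the way."""
        node = self.right
        while node and node.left:
            node = node.left
        return node
```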
- Height balanced: the heights of the left and right subtrees differ by at most 1
- For $n$ nodes, height $h < 2 \log n$. Thus, $h = O(\log n)$
- Balance Factor: $b(o) = h(o.left) - h(o.right)$
  - By convention, $h(\text{empty tree}) = -1$

Insertion:

- After inserting a new node $n_{new}$, recurse up from $n_{new}$ and find the first node $n_0$ with $|b(n_0)| > 1$
- If $n_0$ is imbalanced without an "elbow" shape:
  - If $n_0$ is right heavy, perform a left rotation
  - Else, perform a right rotation
- If $n_0$ is imbalanced with an "elbow" shape:
  - If $n_0$ is right heavy, do a right rotation on the right subtree of $n_0$, then a left rotation on $n_0$
  - If $n_0$ is left heavy, do a left rotation on the left subtree of $n_0$, then a right rotation on $n_0$

Walk up the tree to the root from the inserted/deleted node and update balance factors:

```
if tree is right heavy
    if tree's right subtree is left heavy
        right rotate, left rotate
    else
        left rotate
else if tree is left heavy
    if tree's left subtree is right heavy
        left rotate, right rotate
    else
        right rotate
```

- Insertion: requires $O(1)$ rotations
- Deletion: requires $O(\log n)$ rotations
- BST property is maintained for all nodes
- Nodes are either red or black
- The root and leaves are always black (leaves meaning the sentinel leaves)
- The parent of a red node must be black
- For every node, the number of black nodes on the path from it to any of its leaves must be the same (red nodes are ignored). This is called the black height

Insertion fix-up cases:

- If the uncle is red, recolour the parent and uncle to black
- If the node is a left child and the uncle is black, right rotate on the grandparent, and colour the parent black and the grandparent red
- If the node is a right child and the uncle is black, left rotate on the parent (to get the same tree as case 2) and perform case 2
The B-tree is a generalisation of a BST in that a node can have more than two children.

- BST ordering property is maintained for all nodes in the B-Tree
- All leaf nodes are at the same depth
- All non-root nodes have between $b - 1$ and $2b - 1$ keys inclusive, where $b$ is the branching factor
- The root has at most $2b - 1$ keys (no minimum)
- An internal node with $k$ keys must have $k + 1$ subtrees/children
Under SUHA, in a hash table where collisions are resolved by chaining, searching takes on average $\Theta(1 + \alpha)$ time, where $\alpha = n/m$ is the load factor.

Properties of a good hash function:

- Fast to compute, preferably $O(1)$
- Uniform: the probability of hitting a bucket is the same for all buckets
- Deterministic: hashing the same key should always give the same answer

Division method:

- $h(k) = k \bmod m$, given a key $k$ and $m$ slots to map to
- $m$ should not be a power of $2$
  - If $m = 2^p$, then $h(k)$ is just the $p$ LSB of $k$
  - Unless we know all $p$-LSB patterns of $k$ are equally likely, using another $m$ is better
- $m$ should be a prime not near any power of $2$
- $k$ and $m$ should be relatively prime

Multiplication method:

- $h(k) = \lfloor m(kA \bmod 1) \rfloor$, given a key $k$, $m$ slots to map to, and $0 < A < 1$
- $kA \bmod 1 = kA - \lfloor kA \rfloor$, i.e. the fractional part of $kA$
- $m$ is typically a power of $2$
- Knuth recommends $A \approx \frac{\sqrt{5} - 1}{2}$

Linear probing:

- Index $i = (h(k) + step) \bmod m$
- Suffers from primary clustering
- The initial probe determines the entire sequence, so only $m$ distinct probe sequences are used

Quadratic probing:

- If 2 keys have the same initial probe position, then their probe sequences are the same
- Suffers from secondary clustering
- The initial probe determines the entire sequence, so only $m$ distinct probe sequences are used

Double hashing:

- $\Theta(m^2)$ probe sequences are used

Chaining:

- Different keys with the same hashes are stored in a linked list
- Java's implementation
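The division and multiplication methods above can be sketched in Python (the function names are illustrative):

```python
import math

def hash_division(k, m):
    """Division method: h(k) = k mod m (m ideally a prime not near 2^p)."""
    return k % m

def hash_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplication method: h(k) = floor(m * (k*A mod 1)).
    Default A is Knuth's recommendation (golden ratio conjugate)."""
    frac = (k * A) % 1.0          # fractional part of k*A
    return int(m * frac)
```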
An operation has amortized cost $T(n)$ if, for every sequence of operations, the average cost per operation is at most $T(n)$.

Note: every sequence of operations must satisfy the above, not just some sequences.
Graph definitions:

- Diameter - maximum distance between any two nodes
- Distance - shortest path between two nodes
- Degree - number of edges incident upon it (self loops are counted twice)
- Connected - $\exists$ path between any two nodes
- Component - a subgraph in which any two vertices are connected to each other by paths
- Simple Path - a sequence of edges that do not contain cycles
- No self-loops - $e = (v, w)$ for $v \ne w$
Clique: all nodes are connected to one another

- Diameter = $1$

Star: one central node; all edges connect the center to the other nodes

- Diameter = $2$

Path:

- Diameter = $n - 1$
- Max Degree = $2$

Cycle:

- Diameter = $\lfloor n/2 \rfloor$
- Max Degree = $2$
- Adjacency Matrix: $O(V^2)$ space
  - Array of size $V \times V$
- Adjacency List: $O(V + E)$ space
  - Array of size $V$
  - Linked lists of total size $2E$ (each undirected edge appears in 2 lists)
- Generally: if the graph is dense $[E = \Theta(V^2)]$, use an adjacency matrix. Else, use an adjacency list.
Query | Adjacency Matrix | Adjacency List | Edge List |
---|---|---|---|
Check if $(v, w)$ is an edge | $O(1)$ | $O(\deg(v))$ | $O(E)$ |
Find any neighbour of $v$ | $O(V)$ | $O(1)$ | $O(E)$ |
Enumerate all neighbours of $v$ | $O(V)$ | $O(\deg(v))$ | $O(E)$ |
Find all edges | $O(V^2)$ | $O(V + E)$ | $O(E)$ |

- Convert adjacency list to planar graph: $O(V)$
```
BFS(G, s, f)
    visit(s)
    Queue.enqueue(s)  # FIFO
    while not Queue.empty()
        curr = Queue.dequeue()
        if curr == f
            return curr
        else
            for each neighbor u of curr
                if u is not visited
                    visit(u)
                    Queue.enqueue(u)
    return null
```

- See if the current node is the correct element
- If not, add all neighbours (if not visited yet) to a queue
- Repeat this for each element in the queue

Can loop forever on graphs with cycles if the visited check is omitted.

Each vertex is enqueued and dequeued at most once: $O(V)$
For every vertex, each of its edges is examined at most once: $O(E)$ in total

Time: $O(V + E)$
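The BFS above can be sketched in Python; the graph is assumed to be an adjacency list as a dict mapping each node to a list of neighbours:

```python
from collections import deque

def bfs(graph, s, f):
    """Return f if reachable from s, else None. graph: node -> neighbour list."""
    visited = {s}
    queue = deque([s])               # FIFO
    while queue:
        curr = queue.popleft()
        if curr == f:
            return curr
        for u in graph[curr]:
            if u not in visited:     # visited check prevents looping on cycles
                visited.add(u)
                queue.append(u)
    return None
```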
```
DFS(G, s, f)
    visit(s)
    Stack.push(s)  # LIFO
    while not Stack.empty()
        curr = Stack.pop()
        if curr == f
            return curr
        else
            for each neighbor u of curr
                if u is not visited
                    # store path:
                    edgeTo[u] = curr
                    visit(u)
                    Stack.push(u)
    return null
```

Time: $O(V + E)$
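An iterative DFS sketch in Python, identical in shape to the BFS but with a LIFO stack; `edge_to` records predecessors for path recovery:

```python
def dfs(graph, s, f):
    """Return f if reachable from s, else None. graph: node -> neighbour list."""
    visited = {s}
    edge_to = {}                     # store path: edge_to[u] = predecessor of u
    stack = [s]                      # LIFO
    while stack:
        curr = stack.pop()
        if curr == f:
            return curr
        for u in graph[curr]:
            if u not in visited:
                edge_to[u] = curr
                visited.add(u)
                stack.append(u)
    return None
```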
- To find the shortest path with unweighted edges: only BFS works
- To find the shortest path with weighted edges: neither works
- Both visit every node and every edge, but not every path
Topological Ordering:

- Sequential total ordering of all nodes
- Edges only point forward
- Only possible $\iff$ graph is a DAG

Kahn's algorithm:

- Start at a node $v$ with no incoming edges, and add $v$ to a list
- Remove all outgoing edges from $v$, and go back to step 1

```
L = list()
S = list()
add all nodes with no incoming edge to S
while S is not empty:
    remove node v from S
    add v to tail of L
    for each of v's neighbors u
        remove edge e where source is v
        if u has no other incoming edges
            add u to S
```

Time: $O(V + E)$

Topological orderings are not unique.
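Kahn's algorithm above can be sketched in Python; instead of deleting edges, this sketch decrements an indegree counter, which has the same effect:

```python
from collections import deque

def topological_sort(graph):
    """graph: dict node -> list of successors.
    Returns a topological order, or None if the graph has a cycle."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for u in graph[v]:
            indegree[u] += 1
    S = deque(v for v in graph if indegree[v] == 0)
    L = []
    while S:
        v = S.popleft()
        L.append(v)
        for u in graph[v]:
            indegree[u] -= 1                 # "remove" edge v -> u
            if indegree[u] == 0:
                S.append(u)
    return L if len(L) == len(graph) else None   # leftover edges => cycle
```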
Shortest path on a DAG (relax edges in topological order):

- Let $d$ be an array of the same length as $V$; this will hold the shortest-path distances from $s$. Set $d[s] = 0$, all other $d[u] = \infty$
- Let $p$ be an array of the same length as $V$, with all elements initialised to `null`. Each $p[u]$ will hold the predecessor of $u$ in the shortest path from $s$ to $u$
- Loop over the vertices $u$ in topological order, starting from $s$:
  - For each vertex $v$ directly following $u$ (ie. there exists an edge from $u$ to $v$):
    - Let $w$ be the weight of the edge from $u$ to $v$
    - Relax the edge: if $d[v] > d[u] + w$, set $d[v] = d[u] + w$ and $p[v] = u$
Let $\delta[s, v]$ denote the weight of the shortest path from $s$ to $v$, and $d[s, v]$ the current distance estimate.

- If $G = (V, E)$ contains only positive weights, then the shortest path $p$ from source vertex $s$ to a vertex $v$ must be a simple path
- If $G = (V, E)$ contains no negative weight cycles, then the shortest path $p$ from source vertex $s$ to a vertex $v$ must be a simple path
- Triangle inequality: $\delta[s, v] \leq \delta[s, u] + w[u, v]$
  - The shortest path from $s$ to $v$ must be $\leq$ the sum of (shortest path from $s$ to $u$) $+$ (going from $u$ to $v$)
- Upper-bound property: $\forall v \in V$, $d[s, v] \geq \delta[s, v]$
  - Once $d[s, v] = \delta[s, v]$, it never changes
- Convergence property: if $d[s, u] = \delta[s, u]$ and edge $[u, v]$ lies on a shortest path to $v$, then relaxing edge $[u, v]$ will yield $\delta[s, v]$
  - Holds true regardless of any other relaxation steps that occur before relaxing edge $[u, v]$ (due to the `if` part in the relaxation code)
- Once $d[s, v] = \delta[s, v]$, $\forall v \in V$, the predecessor subgraph is a shortest-path tree rooted at $s$
A shortest path is a tree with at most $|V| - 1$ edges. Thus, to get shortest paths, relax all $|E|$ edges $|V| - 1$ times.

```
relax(int u, int v)
    if (dist[v] > dist[u] + weight(u, v))
        dist[v] = dist[u] + weight(u, v)
        prev[v] = u  # store predecessor node
```

```
BellmanFord(V, E)
    n = V.length
    for i = 1 to n - 1
        for Edge e in E
            relax(e)
```
- The outer for loop runs $V - 1$ times
- Can be terminated early if an entire sequence of $|E|$ relax operations has no effect
- To find out if $\exists$ negative cycles, run one more round of relaxation (if any estimate still decreases, there is a negative cycle)
- To find out exactly which nodes are affected by a negative cycle, run $V - 1$ more rounds of relaxation

Time: $O(VE)$
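Bellman-Ford, including the extra detection round described above, can be sketched in Python:

```python
import math

def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, weight) triples. Returns (dist, prev), or None
    if a negative cycle is detected by one extra round of relaxation."""
    dist = {v: math.inf for v in vertices}
    prev = {v: None for v in vertices}
    dist[s] = 0
    for _ in range(len(vertices) - 1):    # relax all |E| edges |V|-1 times
        for u, v, w in edges:
            if dist[v] > dist[u] + w:     # relax
                dist[v] = dist[u] + w
                prev[v] = u               # store predecessor node
    for u, v, w in edges:                 # one more round: negative cycle?
        if dist[v] > dist[u] + w:
            return None
    return dist, prev
```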
Only works for graphs with non-negative weight edges.

- Begin with an empty shortest-path tree
- Repeat:
  - Consider the vertex with the minimum distance estimate
  - Add the vertex to the shortest-path tree
  - Relax all its outgoing edges

Step 2.1 (`insert`/`deleteMin`): done $O(V)$ times, $O(\log V)$ each
Step 2.3 (`relax`/`decreaseKey`): done $O(E)$ times, $O(\log V)$ each

Time: $O((V + E) \log V)$
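A Python sketch of Dijkstra's algorithm; since `heapq` has no `decreaseKey`, this version substitutes lazy deletion (push duplicates, skip stale entries), which has the same asymptotic bound:

```python
import heapq, math

def dijkstra(graph, s):
    """graph: dict u -> list of (v, weight) pairs, non-negative weights.
    Returns dict of shortest-path distances from s."""
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    pq = [(0, s)]                     # min-heap of (estimate, vertex)
    while pq:
        d, u = heapq.heappop(pq)      # vertex with minimum estimate
        if d > dist[u]:
            continue                  # stale entry; skip (lazy deletion)
        for v, w in graph[u]:
            if dist[v] > d + w:       # relax outgoing edge
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```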
Condition | Algorithm | Time Complexity |
---|---|---|
No negative weight cycles | Bellman-Ford | $O(VE)$ |
Unweighted/equal weight graphs | BFS | $O(V + E)$ |
No negative edge weights | Dijkstra's | $O((V + E) \log V)$ |
On Tree | BFS/DFS | $O(V)$ |
On DAG | Topological Sort | $O(V + E)$ |
- Subpaths of shortest paths are themselves shortest paths
- Only gives the correct answer if the graph does not have negative cycles
- To detect negative cycles, check if any diagonal of the result is negative (ie. travelling from a node back to the same node has negative cost)

```
function FloydWarshall(G)
    # memoization table S has |V| rows and |V| columns
    S = Array of size |V|x|V|
    # Initialize every pair of nodes
    for v = 0 to |V|-1
        for w = 0 to |V|-1
            S[v,w] = E[v,w]
    # For sets P0, P1, P2, P3, ..., for every pair (v,w)
    for k = 0 to |V|-1
        for v = 0 to |V|-1
            for w = 0 to |V|-1
                S[v,w] = min(S[v,w], S[v,k] + S[k,w])
    return S
```

Time: $O(V^3)$
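The pseudocode above maps to Python directly, assuming the input is a weight matrix with `math.inf` where no edge exists and 0 on the diagonal:

```python
import math

def floyd_warshall(w):
    """w: |V| x |V| weight matrix (math.inf = no edge, 0 on the diagonal).
    Returns the matrix of all-pairs shortest-path distances."""
    n = len(w)
    S = [row[:] for row in w]         # initialize every pair of nodes
    for k in range(n):                # allow k as an intermediate vertex
        for v in range(n):
            for u in range(n):
                S[v][u] = min(S[v][u], S[v][k] + S[k][u])
    return S
```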
Definitions:

- Spanning Tree: an acyclic subset of the edges that connects all nodes
- MST: a spanning tree with minimum total weight
- Cut: a partition of the vertices into 2 disjoint subsets

Assumptions:

- Edge weights are distinct
- Graph is connected

Properties of MSTs:

- No cycles
- If you cut an MST, the two pieces are both MSTs
- $\forall$ cycles, the maximum weight edge is not in the MST
- $\forall$ partitions of the nodes, the minimum weight edge across the cut is in the MST
- An MST has $|E| = |V| - 1$ edges

Warnings:

- MSTs cannot be used to find the shortest path
- $\forall$ cycles, the minimum weight edge may or may not be in the MST
- $\forall$ vertices, the minimum incident/outgoing edge is always part of the MST
- $\forall$ vertices, the maximum incident/outgoing edge may or may not be part of the MST
- Kruskal's and Prim's only work on undirected graphs
Utilises the Cut Property.

Idea:

- Maintain a set of visited nodes
- Greedily grow the set by adding the node connected via the lightest edge
- Use a min heap to order nodes by edge weight

Each vertex is added/removed once from the heap: $O(V \log V)$
Each edge can incur one `decreaseKey`: $O(E \log V)$

Time: $O(E \log V)$

Can only work on undirected graphs.
Any sequence of $m$ union/find operations on $n$ objects takes $O(n + m \alpha(m, n))$ time with weighted union and path compression.

Basic idea: recursively choose the minimum weight edge to connect 2 disconnected trees.
```java
// Sort edges and initialize
Edge[] sortedEdges = sort(G.E());
ArrayList<Edge> mstEdges = new ArrayList<Edge>();
UnionFind uf = new UnionFind(G.V());

// Iterate through all the edges, in order
for (int i = 0; i < sortedEdges.length; i++) {
    Edge e = sortedEdges[i];   // get edge
    Node v = e.one();          // get node endpoints
    Node w = e.two();
    if (!uf.find(v, w)) {      // in the same tree?
        mstEdges.add(e);       // save edge
        uf.union(v, w);        // combine trees
    }
}
```

Sorting: $O(E \log E) = O(E \log V)$
Union/find over all edges: $O(E \alpha(V))$

Time: $O(E \log V)$

Can only work on undirected graphs.
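The `UnionFind` type used above is assumed; a minimal Python sketch of it (with path compression and union by rank), plus a Kruskal's driver mirroring the Java loop:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def root(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def find(self, v, w):                # are v and w in the same tree?
        return self.root(v) == self.root(w)

    def union(self, v, w):
        rv, rw = self.root(v), self.root(w)
        if rv == rw:
            return
        if self.rank[rv] < self.rank[rw]:
            rv, rw = rw, rv
        self.parent[rw] = rv             # union by rank
        if self.rank[rv] == self.rank[rw]:
            self.rank[rv] += 1

def kruskal(n, edges):
    """edges: list of (weight, u, v); nodes are 0..n-1. Returns MST edges."""
    uf = UnionFind(n)
    mst = []
    for w, u, v in sorted(edges):        # sort edges: O(E log E)
        if not uf.find(u, v):            # skip edges that would form a cycle
            mst.append((w, u, v))
            uf.union(u, v)
    return mst
```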