COMP3506 Data Structures: Exam Topics & Practice Strategy

StudyPilot Team

Education Experts

COMP3506 (Algorithms and Data Structures) is one of the courses where the gap between students who pass and students who excel is widest. The material is not difficult to understand the first time you see it. It is difficult to recall and apply under exam pressure, with no IDE to test your code. This guide covers the six topic clusters that have appeared on almost every recent past paper, a worked example of amortised analysis (a topic students consistently lose marks on), and a practice strategy designed for the open-book / closed-book hybrid format the course tends to use.

What the COMP3506 final looks like

The COMP3506 final is typically a 2-hour paper, weighted around 50–60% of the final mark. The format mixes short conceptual questions (Big-O, true/false on data structure properties) with longer questions that ask you to design, modify, or analyse an algorithm. Check your current semester's course profile for the exact format — open-book versus closed-book has changed between offerings.

The skills the exam tests, in rough order of mark allocation:

Big-O analysis of given code (~25%) — worst case, average case, and tight bounds.
Data structure design and operations (~25%) — choose the right structure for a problem and justify the choice.
Algorithm tracing (~20%) — walk through an algorithm step by step on a given input.
Algorithm design (~20%) — write or modify pseudocode to solve a problem with a given complexity target.
Conceptual short answer (~10%) — "what is amortised analysis," "when does quicksort hit its worst case," and similar.

If you only practise coding questions and skip Big-O proofs, you will lose around a quarter of the marks before you start.

6 topic clusters that always appear

Big-O and complexity analysis. Every past paper has 3–5 Big-O questions. They ask you to find the tightest bound (Θ) for a given function, prove it from first principles, or compare two functions and rank them. The traps are subtle: log(n!) is Θ(n log n) by Stirling's approximation, not Θ(n) or Θ(log n). Nested loops where the inner bound depends on the outer index sum to a quadratic, not a linear-times-log. And f(n) = O(g(n)) does not mean f(n) is the same complexity as g(n) — only that g is an upper bound.

When asked to prove a Big-O bound, write the formal definition first ("there exist constants c and n₀ such that…"), pick concrete values for c and n₀, then verify the inequality. Hand-wavy answers lose half the marks.

Hash tables. Know both major collision strategies in detail. Chaining: each slot holds a linked list, average-case operations are O(1) if the load factor stays bounded, worst case is O(n) if every key hashes to the same slot. Open addressing: linear probing, quadratic probing, double hashing — each with different clustering behaviour. Know how the load factor (n/m) affects performance and when rehashing is triggered. Past papers ask you to trace insertions into a hash table with a given hash function and collision strategy.

Trees. Binary search trees in their basic form (O(log n) average, O(n) worst case when unbalanced), AVL trees (self-balancing via rotations, guaranteed O(log n)), and heaps (priority queues, O(log n) insert and extract-min, O(1) peek). For AVL, know the four rotation cases (LL, LR, RL, RR) and which case applies based on the balance factor of the imbalanced node and its child. Heaps appear in graph algorithms (priority queue for Dijkstra) and in sorting (heapsort), so they earn double the practice time.

Graph algorithms. Most-tested cluster on past papers. Know BFS and DFS — what order they visit nodes, when each is appropriate, and how to use them for connectivity, cycle detection, and topological sort. For shortest paths, know Dijkstra with a heap-based priority queue (O((V + E) log V)) and Bellman-Ford (handles negative weights, O(VE)). For minimum spanning trees, know both Prim and Kruskal and when each is faster. Be able to trace any of these on a small graph (4–6 nodes) by hand.

Sorting. Quicksort (O(n log n) average, O(n²) worst when pivot choice is bad), mergesort (O(n log n) guaranteed, stable, O(n) extra space), heapsort (O(n log n) guaranteed, in-place, not stable). Past papers often ask you to compare them on specific inputs (already-sorted, reverse-sorted, all-duplicates) and explain why one beats another. Counting sort and radix sort appear less often but are worth knowing for the conceptual questions.

Dynamic programming and amortised analysis. DP appears as one larger problem most semesters — define subproblems, write the recurrence, identify the order to fill the table. Classic problems: longest common subsequence, knapsack, edit distance. Amortised analysis appears as a short proof — usually about dynamic arrays (the doubling-on-resize trick) or about splay trees. Worked example below.

Worked example: amortised analysis

Here is a past-paper-style question. A dynamic array starts with capacity 1. Each push appends one element. When the array is full, push first doubles the capacity (copying all existing elements to the new array) and then appends. What is the amortised cost of a single push?

The naive answer is "O(n) because resizing is O(n)." That is the worst-case cost of a single push, but not the amortised cost.

Use the aggregate method. Suppose we do n pushes starting from capacity 1. Resizes happen when the array fills up — at sizes 1, 2, 4, 8, …, up to the largest power of 2 not exceeding n. Each resize at size k costs k (copying k elements).

Total cost of all resizes:

1 + 2 + 4 + 8 + ... + 2^k   where 2^k <= n
= 2^(k+1) - 1
< 2n

Total cost of the n pushes themselves (the actual append operation, ignoring resizes): n.

Total cost: less than 3n. So the amortised cost per push is less than 3, which is O(1).

Why this matters: amortised O(1) means that across any sequence of operations, the average cost is constant — even though individual operations can be expensive. This is why Python lists and Java ArrayLists can claim O(1) appends despite occasionally needing to resize.

A common mistake on this question is to use the accounting method (charging extra to each push to pay for future resizes) without justifying the per-push charge. The marker wants to see the formal argument, not just the final answer.

3 mistakes that lose the most marks

Ignoring constants and lower-order terms when the question asks for them. Big-O hides constants, but exam questions sometimes ask for the exact number of comparisons, swaps, or hash function calls. "Approximately n log n" loses marks if the question asks for an exact count.

Forgetting the base case in recursive algorithms. Mergesort and quicksort have an O(1) base case (a single element or empty array). Pseudocode that does not handle the base case will go into infinite recursion. The marker counts this as wrong even if the recursive part is correct.

Confusing space complexity with time complexity. An algorithm can be O(n log n) in time but O(n) in space (mergesort), or O(n log n) in time but O(1) in space (heapsort, in-place quicksort). When the question asks about space, do not give the time complexity by accident. Read the question twice.

How to practise efficiently

COMP3506 rewards mixed practice. Doing 20 Big-O questions in a row will make you good at Big-O for an hour and then plateau. Doing 5 Big-O, then 3 hash table, then 4 graph, then 3 DP, then back to Big-O — for the same total time — produces noticeably better exam performance. This is called interleaving in the cognitive science literature, and it works because it forces your brain to identify the question type each time, which is a skill the exam tests directly.

If you already have the Python fundamentals down (if not, see the CSSE1001 guide), spend 70% of your COMP3506 prep time on past papers and topic-specific question sets. The remaining 30% goes to reviewing lecture notes for topics that broke during practice. Do not flip this ratio — passive reading is the worst use of time you have.

For the underlying technique that turns one past paper into a week of focused study, see How to Efficiently Use Past Exam Papers.

What to do next

COMP3506 looks like a course about memorising data structures. It is really a course about applying the right structure to the right problem and justifying the choice with a complexity analysis. The six topic clusters above are the boundary — students who excel know each one cold; students who struggle have a gap in one of them (usually amortised analysis or graph algorithms).

If you want a structured question bank organised by these six clusters, with worked solutions in the style this guide uses, that is what StudyPilot's COMP3506 question bank is built for. Start with whichever cluster broke during your last past paper, and the 4-step framework will close the gap in 1–2 weeks.