APCSP AP Exam guide · IHHS Study Guide Hub

Learning objectives

By the end of this guide you will be able to:

Recite the five Big Ideas and their weights on the exam
Read AP CSP pseudocode fluently and translate it to Python
Convert between binary, decimal, and hexadecimal without a calculator
Trace any loop, conditional, or procedure call and predict the output
Calculate algorithmic efficiency and distinguish reasonable from unreasonable time
Identify sources of computing bias and the main types of encryption
Walk into the test knowing the pacing, the question format, and the most common traps

TL;DR

The APCSP exam is 70 multiple-choice questions in 2 hours, worth 100% of your AP score. Questions are split across five Big Ideas (with weights below). You get the AP CSP Exam Reference Sheet during the test, so memorizing pseudocode syntax is not the goal. Fluency in reading it is.

Big Idea	Topic	Weight
1, CRD	Creative Development	10-13%
2, DAT	Data	17-22%
3, AAP	Algorithms and Programming	30-35%
4, CSN	Computer Systems and Networks	11-15%
5, IOC	Impact of Computing	21-26%

Big Ideas 3 and 5 together are over half the test. Prioritize them.

The five Big Ideas at a glance

Glossary

The 20 terms you will see everywhere. Hover any underlined word later in the guide to see its definition inline.

Abstraction Hiding complexity behind a simpler interface. A function call hides the steps inside it. , the single most important CS word.
Algorithm A finite, well-defined sequence of steps to solve a problem.
API Application Programming Interface: a set of functions another program can call.
Bandwidth How much data a connection can transmit per second.
Binary Base-2 number system, digits are 0 and 1.
Bit One binary digit, 0 or 1. 8 bits make a byte.
Cipher An algorithm that encrypts or decrypts data.
Citizen science Research that involves the general public collecting or analyzing data.
Crowdsourcing Getting work done by many contributors, often over the internet (Wikipedia, Kickstarter).
Digital divide The gap between people with and without reliable access to computing and the internet.
Encryption Encoding data so only authorized parties can read it.
Heuristic A rule of thumb: a strategy that usually works but is not guaranteed optimal.
Iteration Repeating a block of code, a loop.
Lossy compression Compression that discards some data (JPEG, MP3), irreversible.
Lossless compression Compression that perfectly reconstructs the original (PNG, ZIP, FLAC).
Metadata Data about data, e.g. a photo’s timestamp and GPS location.
Packet A small chunk of a message sent over a network, tagged with routing info.
PII Personally Identifiable Information: data that can identify a specific person (name, SSN, email).
Procedure A reusable named block of code, a function.
Protocol A set of rules that lets computers communicate (TCP, HTTP, DNS).

Big Idea 1: Creative Development (10-13%)

Collaboration and program design

Computing is a team sport. The exam tests whether you can:

Design a program by identifying the problem, defining requirements, and breaking it into smaller pieces (decomposition)
Collaborate using version control, code review, pair programming
Iterate: build a prototype, get feedback, refine

Debugging

The exam loves debugging questions. Three classes of bugs:

Bug type	What it looks like	How you find it
Syntax error	Code refuses to run, misspelled keyword, missing colon	Compiler/interpreter flags the line
Runtime error	Crashes partway, division by zero, list index out of range	Stack trace at the moment of crash
Logic error	Runs cleanly but gives wrong output	Test cases, print statements, step debugger

Test cases matter. An exam question often asks “which test case reveals the bug?” Think about boundary values: empty lists, negative numbers, the first and last element.

Q A program works on the list [1, 2, 3, 4, 5] but gives the wrong answer for [5, 4, 3, 2, 1]. What kind of error is this?

A logic error. The program runs without crashing, but the output is wrong for some inputs. The bug probably assumes the list is already sorted. Syntax errors fail before running; runtime errors crash.

Big Idea 2: Data (17-22%)

Bits, bytes, and binary

Everything in a computer is ultimately bits: 0 or 1. Eight bits make a byte.

Binary to decimal. Each bit position is a power of 2, right-to-left starting at $2^0$ :

1011_2 = (1 \cdot 2^3) + (0 \cdot 2^2) + (1 \cdot 2^1) + (1 \cdot 2^0) = 8 + 0 + 2 + 1 = 11_{10}

Decimal to binary. Repeated division by 2, read remainders bottom-up:

13 \div 2 = 6\ r\,1, \quad 6 \div 2 = 3\ r\,0, \quad 3 \div 2 = 1\ r\,1, \quad 1 \div 2 = 0\ r\,1 \Rightarrow 1101_2

py Try it live Idle

Number of distinct values vs bits

Analog vs digital

Analog signals are continuous (a vinyl record, a thermometer’s mercury column). Digital signals are discrete samples (MP3, JPEG). Converting analog to digital always involves sampling (picking moments in time) and quantization (rounding each sample to a fixed number of levels). Higher sampling rate + more levels = closer to the original, but larger file.

Lossy vs lossless compression

Match each compression concept to its description 0 / 4 matched

Terms

Definitions

Metadata and extracting information

Metadata is data about data: a photo’s timestamp, camera settings, GPS coordinates. Often more revealing than the data itself.

The exam likes questions of the form “given this dataset, which conclusion is supported?” Trap: overreaching. Only claim what the data actually proves. Correlation is not causation. A biased sample does not generalize.

Q A fitness app records every user's step count. You notice users in downtown Chicago walk 30% more than users in rural Montana. What conclusions are and are not supported?

Supported: App users in downtown Chicago walked more steps than app users in rural Montana during the recorded period.

Not supported: People in Chicago are healthier than people in Montana. City dwellers walk more than rural dwellers in general. The difference is not due to chance. The data is only about users of this specific app who kept their phone on them, which is a biased sample.

Big Idea 3: Algorithms and Programming (30-35%)

This is the biggest slice of the exam. Spend the most time here.

Reading AP pseudocode

The exam uses a made-up pseudocode that combines Python-like semantics with block-like syntax. Key things to know:

a ← 5                    # assignment
a ← a + 1                # increment
DISPLAY(a)               # print
REPEAT 3 TIMES { ... }   # bounded loop
REPEAT UNTIL (condition) { ... }  # loop until condition true
FOR EACH item IN aList { ... }    # iterate list
PROCEDURE name(params) { ... }    # define procedure
IF (cond) { ... } ELSE { ... }    # conditional
aList[1]                 # FIRST element, lists are 1-indexed
LENGTH(aList)            # list length
INSERT(aList, i, value)  # insert at position i
APPEND(aList, value)     # append to end
REMOVE(aList, i)         # remove at position i

Variables, expressions, operators

Variables hold values; the assignment arrow ← writes into them. Evaluation goes right-to-left: compute the right side, store in the left.

x ← 10
y ← x * 2 + 1   # y = 21
x ← y           # x = 21

Operators: +, -, *, /, MOD (remainder). Integer MOD is tested all the time.

13 \bmod 5 = 3, \quad 20 \bmod 4 = 0, \quad 7 \bmod 10 = 7

Conditionals and nested logic

IF (age ≥ 16) {
  IF (hasLicense = true) {
    DISPLAY("can drive")
  } ELSE {
    DISPLAY("eligible but no license")
  }
} ELSE {
  DISPLAY("too young")
}

Boolean operators: AND, OR, NOT. Common trap: short-circuit evaluation. (x ≠ 0) AND (10 / x > 1) is safe because if x = 0, the second part never runs.

py Trace this in Python Idle

Iteration (loops)

Three forms the exam uses:

REPEAT 5 TIMES { DISPLAY("hi") }         # bounded, runs exactly 5 times

REPEAT UNTIL (guess = secret) {            # runs until condition is true
  guess ← INPUT()
}

FOR EACH item IN aList {                   # visit each element
  DISPLAY(item)
}

Order the steps to sum the first 5 positive integers

FOR EACH i IN [1, 2, 3, 4, 5]
total ← 0
total ← total + i
DISPLAY(total)

Procedures (functions)

A procedure packages a computation with a name, parameters, and optionally a return value.

PROCEDURE square(n) {
  RETURN (n * n)
}

DISPLAY(square(4))    # 16

Parameters are the inputs named inside the procedure. Arguments are the actual values passed in. Scope: variables defined inside a procedure are not visible outside it.

Lists

aList ← [10, 20, 30, 40]
DISPLAY(aList[1])          # 10 (1-indexed!)
DISPLAY(LENGTH(aList))     # 4
APPEND(aList, 50)          # [10, 20, 30, 40, 50]
INSERT(aList, 2, 15)       # [10, 15, 20, 30, 40, 50]
REMOVE(aList, 1)           # [15, 20, 30, 40, 50]

Common list patterns on the exam:

Linear search: walk the list; best O(n), worst O(n), unsorted OK.
Binary search: jump to middle, halve each step. Must be sorted. O(log n).
Sum / average: accumulator variable, single pass.
Filter: build a second list holding only elements that match a condition.

Algorithmic efficiency

The exam doesn’t use formal Big-O on you, but it asks:

Which algorithm runs in reasonable time (polynomial in input size) vs unreasonable time (exponential, factorial, etc.)?
If we double the input, does the running time double, quadruple, or grow by a constant?

Growth rates at a glance (smaller is faster)

Reasonable time (polynomial: constant, linear, quadratic): problems of any realistic size are solvable. Unreasonable time (exponential 2ⁿ, factorial n!): even tiny inputs blow up. A 30-element brute-force subset search ≈ 1 billion operations. A 60-element one ≈ 10¹⁸. Unsolvable in practice.

Watch sorting algorithms race

Comparisons: 0 Swaps: 0

Binary search: the most-tested algorithm

PROCEDURE binarySearch(aList, target) {
  low ← 1
  high ← LENGTH(aList)
  REPEAT UNTIL (low > high) {
    mid ← (low + high) / 2
    IF (aList[mid] = target) {
      RETURN mid
    } ELSE IF (aList[mid] < target) {
      low ← mid + 1
    } ELSE {
      high ← mid - 1
    }
  }
  RETURN -1
}

Requirement: the list must be sorted. Each step discards half the remaining candidates. For a million items, binary search finishes in ~20 steps.

py Try binary search in Python Idle

Undecidable problems

Some problems cannot be solved by any algorithm, not because we haven’t found the algorithm, but because one provably does not exist. The classic example is the Halting Problem: given any program and input, determine whether the program eventually stops.

You only need to know: such problems exist, they are called undecidable, and the exam will give you an example and ask you to recognize it.

Scratch pad

<> python

Loading editor...

Big Idea 4: Computer Systems and Networks (11-15%)

How the internet actually works

A message from your laptop to a server is broken into packets, each tagged with source IP, destination IP, and a sequence number. Packets travel independently through many routers and reassemble at the destination.

Packet routing (simplified)

Protocols you must know

Layer	Protocol	Job
Application	HTTP / HTTPS	Web page requests. HTTPS encrypts.
Application	DNS	Translates `example.com` into an IP address.
Application	SMTP / IMAP	Sending and receiving email.
Transport	TCP	Reliable, ordered delivery with acknowledgment.
Transport	UDP	Fast, unreliable, no acknowledgment. Good for streaming.
Internet	IP (IPv4, IPv6)	Addressing and routing between networks.

Open protocol. HTTP, TCP, IP are all open: no company owns them. That openness is what let the internet grow globally.

Fault tolerance

A system is fault-tolerant if it keeps working when parts fail. The internet achieves this with redundancy: multiple paths between any two points. If router B is down, packets route through C or D.

Parallel vs sequential vs distributed

Model	What it means	When it helps
Sequential	One operation at a time, in order	Simple, predictable
Parallel	Multiple operations simultaneously on one machine with multiple cores	CPU-bound work that can be split
Distributed	Multiple machines cooperating over a network	Massive problems, fault tolerance, crowdsourcing

Speedup measures the benefit of parallelism:

\text{speedup} = \frac{\text{sequential time}}{\text{parallel time}}

Not all problems parallelize. If one step depends on the previous one, you can’t split them.

Q A task takes 60 seconds sequentially. Split into parallel: 20 seconds of shared setup, then 40 seconds of work split across 4 cores. What's the parallel time and speedup?

Parallel time = 20 seconds setup + (40 / 4) seconds work = 30 seconds.

Speedup = 60 / 30 = 2×. Not 4× because the 20-second setup can’t be parallelized. This is why adding more cores eventually stops helping (Amdahl’s Law, informally).

Big Idea 5: Impact of Computing (21-26%)

Beneficial and harmful effects

Every computing innovation brings both. The exam always presents both sides. Smartphones connect families across the world; they also enable 24/7 tracking and addictive design. Self-driving cars could eliminate most traffic deaths; they could also put millions of drivers out of work.

On exam questions that ask “which is a potential harmful effect”, look for answers about: surveillance, job displacement, bias amplification, security vulnerabilities, mental health impact, and environmental cost (data centers use a lot of power).

The digital divide

The digital divide is the gap between people with reliable, affordable internet + modern devices + digital literacy and those without. Crosses urban/rural, rich/poor, national borders, and generations.

Computing bias

Computing bias is bias that gets baked into software, usually because:

Training data under-represents some group (facial recognition trained mostly on lighter skin)
The proxy variable encodes a historical inequality (zip code as credit-risk proxy, which tracks historical redlining)
The feedback loop reinforces itself (predictive policing sends more officers to an area, generating more arrest records from that area)

Fixes: diverse and balanced training data, audits for disparate impact, human oversight for high-stakes decisions.

Safe computing

Multi-factor authentication (MFA): something you know (password) + something you have (phone) + something you are (fingerprint).
Phishing: social engineering, not a software flaw. Train users.
PII (Personally Identifiable Information): name, address, SSN, account numbers, biometric data. Minimize what you collect and encrypt what you store.

Encryption

Match the encryption concept to its definition 0 / 4 matched

Terms

Definitions

Crowdsourcing and Creative Commons

Crowdsourcing spreads work across many volunteers: Wikipedia, Kickstarter, Foldit, Waze. Works best when individual contributions are small, the platform can validate them, and there’s a clear shared goal.

Creative Commons licenses let creators grant specific rights (share, adapt, commercial use) while keeping others. CC-BY means use it freely as long as you credit. CC-BY-NC adds “not for commercial use”. CC-BY-SA adds “if you modify, you must share under the same license”.

Worked examples

Example 1: Trace this algorithm

PROCEDURE mystery(n) {
  count ← 0
  REPEAT UNTIL (n = 1) {
    IF (n MOD 2 = 0) {
      n ← n / 2
    } ELSE {
      n ← 3 * n + 1
    }
    count ← count + 1
  }
  RETURN count
}

Trace mystery(6):

Step	n	MOD 2	Action	New n	count
0	6	0 (even)	n ← n/2	3	1
1	3	1 (odd)	n ← 3n+1	10	2
2	10	0	n ← n/2	5	3
3	5	1	n ← 3n+1	16	4
4	16	0	n ← n/2	8	5
5	8	0	n ← n/2	4	6
6	4	0	n ← n/2	2	7
7	2	0	n ← n/2	1	8

Loop exits because n = 1. Returns 8. (This is the Collatz conjecture.)

Example 2: Convert 45 to binary

Repeatedly divide by 2, track remainders, read bottom-up:

45 ÷ 2	quotient	remainder
	22	1
	11	0
	5	1
	2	1
	1	0
	0	1

Read bottom-up: 101101. Verify: 32 + 8 + 4 + 1 = 45. ✓

Example 3: Is this algorithm reasonable?

“For each pair of users in the system, check if they share any friends. The system has 100,000 users.”

Pairs of users = $\binom{100{,}000}{2} \approx 5 \cdot 10^9$ . Five billion pair-checks. At a million checks per second, that’s 5,000 seconds = 83 minutes. Technically reasonable, but expensive. If the system grew to 10 million users, pairs = $5 \cdot 10^{13}$ , and now it would take over a year. Quadratic growth: when input size doubles, time quadruples.

Example 4: Spotting bias

“A hiring tool is trained on resumes from the last 10 years at a tech company where 90% of historical hires were men. The tool now predicts which applicants to interview.”

The model will learn patterns associated with male applicants and reproduce the 90/10 split, even if the company now wants to hire more women. This is historical bias: the training data embeds a past inequality. Fix: balance the training data, remove gender-correlated proxy variables (certain sports, schools, hobbies), audit outputs for disparate impact.

Practice problems

Q What is 0b1101 in decimal?

$1 \cdot 8 + 1 \cdot 4 + 0 \cdot 2 + 1 \cdot 1 = \mathbf{13}$ .

Q How many distinct colors can a 24-bit color system display?

$2^{24} = 16{,}777{,}216$ colors. (Hence “16 million colors” marketing for True Color.)

Q A REPEAT UNTIL loop with condition `(x > 10)` starts with x = 3. Inside the loop, x increases by 2 each time. How many times does the loop body run?

Loop body runs until x > 10, so it runs while x ≤ 10. x values: 3 (run), 5 (run), 7 (run), 9 (run), 11 (condition true, stop). 4 iterations. Then the final body, after which x = 11 and we exit. Actually let me recount: after iteration 1 x = 5, after 2 x = 7, after 3 x = 9, after 4 x = 11 → stop. 4 iterations.

Q A linear search on an unsorted list of 1,000,000 items. What is the worst-case number of comparisons?

1,000,000. Linear search is O(n), and in the worst case the target is the last element or absent.

Q Why can binary search only be used on a sorted list?

Binary search works by halving the search range based on whether the middle element is less than or greater than the target. Without sorted order, ‘less than’ and ‘greater than’ give no information about where the target lives.

Q A medical records system encrypts files at rest but emails them as plaintext. Is this secure?

No. Data at rest is only one surface. Data in transit (email) is wide open. A complete scheme encrypts at rest AND in transit (HTTPS/TLS), controls who has keys, and logs access.

Self-quiz

15 AP-style questions. Some have one correct answer, some have two. Read carefully.

Self-quiz

0 of 15 answered

01
Which is the largest portion of the exam by weight?

Why: Big Idea 3 is 30-35% of the exam, the single biggest slice. Big Idea 5 is second at 21-26%.
02
What is 11001 in binary converted to decimal?

Why: 16 + 8 + 0 + 0 + 1 = 25. Positions are 2^4=16, 2^3=8, 2^2=4, 2^1=2, 2^0=1.
03
Given aList ← [5, 10, 15, 20, 25] in AP pseudocode, what is aList[3]?

Why: AP pseudocode is 1-indexed. aList[1] = 5, aList[2] = 10, aList[3] = 15. Python would answer 20.
04
Which type of compression can always perfectly reconstruct the original data?

Why: Lossless compression (PNG, ZIP, FLAC) is reversible. Lossy compression (JPEG, MP3) discards data.
05
A loop iterates through a sorted list of 1,048,576 items using binary search. What is the approximate maximum number of comparisons?

Why: log₂(1,048,576) = 20. Each comparison halves the search range, so it takes at most ~20 steps.
06
Which is NOT a source of computing bias?

Why: The implementation language has no bearing on bias. The other three are textbook sources of computing bias.
07
Select all that are OPEN protocols used on the Internet. (multi-select)

Why: HTTP, TCP, and IP are all open standards (anyone can implement them). Proprietary protocols owned by a single company are not. Closest single answer: HTTP is the canonical open protocol; the full set is HTTP, TCP, and IP.
08
What is the primary purpose of a DNS server?

Why: DNS (Domain Name System) is the internet's phone book: it maps example.com to an IP address like 93.184.216.34.
09
Which describes an algorithm that runs in REASONABLE time?

Why: Polynomial-time algorithms (linear, quadratic, cubic, ...) are considered reasonable. Exponential and factorial time are unreasonable.
10
A program uses the algorithm: for each of n items, compare against every other item. What is its efficiency class?

Why: n items, each compared against all others, gives n * (n-1) ≈ n² comparisons. Quadratic.
11
What does HTTPS provide that HTTP does not?

Why: HTTPS adds TLS, which encrypts the HTTP traffic. It's slightly slower than HTTP (not faster) but protects against eavesdropping and tampering.
12
Which approach best reduces the risk of phishing attacks on a school's users?

Why: Phishing is social engineering. Users need to recognize it, AND MFA ensures that a stolen password alone doesn't grant access.
13
A task runs in 100 seconds sequentially. With 4 cores and no overhead, what's the theoretical minimum parallel time?

Why: If fully parallelizable, 100 / 4 = 25 seconds. In practice there's always some overhead, but the theoretical minimum is 25.
14
Select all examples of PII. (multi-select intent)

Why: PII is anything that can identify a specific person, especially in combination. Name+DOB is classic PII; biometric data like fingerprints is also PII. The other two are not identifiable.
15
What is the Halting Problem an example of?

Why: The Halting Problem (determine whether any program on any input will eventually halt) was proven by Turing to be undecidable. No algorithm can solve it in general.

Flashcards

APCSP vocab deck (30 terms)

1 / 30 · browse mode

Test-taking strategies

Mnemonics

“Coders Don’t Always Consider Impact” = order of Big Ideas: Creative development, Data, Algorithms, Computer systems/networks, Impact.

“SLICE” for things to check in an algorithm question: Scope (is the variable visible here?), Loop bounds (off by one?), Index base (0 or 1?), Conditionals (is the Boolean right?), Edge cases (empty list, first/last element).

“HTTPS = HTTP Secure”: HTTPS is just HTTP plus TLS encryption. S for Secure.

“DRY, not WET”: Don’t Repeat Yourself. Write Everything Twice is a bug factory. Use procedures.

Common pitfalls

Confusing bit with byte. 8 bits = 1 byte. “8-bit color” means 256 colors; “8-byte color” is 2⁶⁴ and doesn’t exist.
Mixing 0-indexed and 1-indexed. AP pseudocode is 1-indexed; Python is 0-indexed. The exam uses AP pseudocode.
Assuming linear search works on unsorted lists, binary search doesn’t. Linear search works on any list (O(n)). Binary search requires a sorted list (O(log n)).
Correlation vs causation. “A and B are related in this dataset” does not mean A causes B.
Forgetting that lossy compression is irreversible. Once a JPEG is compressed, the original pixels are gone. Editing and resaving lossy formats compounds the loss.
Treating encryption as a silver bullet. Encryption protects confidentiality, not availability or integrity. A ransomware attack that encrypts YOUR files uses the same tool against you.
Overreaching with data conclusions. Only claim what the sample supports. “Users of this app” is not “all people”.
Short-circuit boolean errors. (x ≠ 0) AND (10 / x > 1) is safe; (10 / x > 1) AND (x ≠ 0) crashes when x = 0.

Cheat sheet

AP pseudocode syntax at a glance

Concept	AP pseudocode
Assignment	`a ← 5`
Display / print	`DISPLAY(x)`
Input	`INPUT()`
Bounded loop	`REPEAT n TIMES { ... }`
Conditional loop	`REPEAT UNTIL (cond) { ... }`
Iterate list	`FOR EACH item IN aList { ... }`
If / else	`IF (cond) { ... } ELSE { ... }`
Define procedure	`PROCEDURE name(params) { ... }`
Call procedure	`name(args)`
Return	`RETURN value`
Math	`+`, `-`, `*`, `/`, `MOD`
Booleans	`AND`, `OR`, `NOT`, `=`, `≠`, `<`, `>`, `≤`, `≥`
First element	`aList[1]`, 1-indexed
List length	`LENGTH(aList)`
Insert at i	`INSERT(aList, i, value)`
Append	`APPEND(aList, value)`
Remove	`REMOVE(aList, i)`

Binary / decimal / hex conversions

Decimal	Binary	Hex
0	0000	0
1	0001	1
2	0010	2
4	0100	4
8	1000	8
9	1001	9
10	1010	A
15	1111	F
16	1 0000	10
255	1111 1111	FF

Efficiency classes (slow to fast)

Class	Example	1000 items
O(n!)	Trying every permutation	Impossible
O(2ⁿ)	Trying every subset	Impossible
O(n²)	Nested loops, bubble sort	1,000,000 ops
O(n log n)	Merge sort, quick sort	~10,000
O(n)	Linear search	1,000
O(log n)	Binary search	10
O(1)	Hash table lookup	1

Encryption at a glance

Type	Keys	Speed	Use case
Symmetric	One shared key	Fast	Bulk encryption of data
Asymmetric	Public + private	Slow	Key exchange, digital signatures
Hash	None (one-way)	Fast	Passwords, integrity checks

Internet protocols

Protocol	Purpose
HTTP / HTTPS	Web page requests (S adds TLS encryption)
DNS	Name to IP address lookup
TCP	Reliable ordered delivery
UDP	Fast unordered delivery
IP	Addressing and routing

Study session

Use the floating timer in the bottom-left corner of the page. Four 25-minute blocks with breaks cover 2 hours of focused review, which is exactly one exam’s worth of stamina.

Good luck. Read the question twice, trust the reference sheet, flag the hard ones, and bring a snack for the break.

← Previous

Welcome to the IHHS Study Guide Hub

Chapter 16 WW2 Full guide