Watch — a 2-minute tour of this branch, narrated in my own AI-cloned voice.

1. Theoretical foundations

1.1 Computability theory

Modern computer science begins with Alan Turing’s 1936 paper on computable numbers, which formalised the concept of an algorithm and proved the existence of undecidable problems[1]. The Turing Machine — an abstract device with a tape, head, and state table — remains the canonical model of general-purpose computation. Alonzo Church developed an equivalent formalism through the lambda calculus, leading to the Church–Turing thesis: any effectively computable function can be computed by a Turing machine.

Gödel’s incompleteness theorems (1931), predating Turing, established that no sufficiently powerful formal system can be both complete and consistent, a result with deep implications for automated theorem proving and formal verification.

1.2 Information theory

Claude Shannon’s 1948 paper, A Mathematical Theory of Communication[2], introduced the bit as the fundamental unit of information and defined entropy as a measure of uncertainty:

H(X) = -∑ P(xᵢ) · log₂ P(xᵢ)

Shannon also proved the channel capacity theorem: every noisy channel has a maximum error-free transmission rate (Shannon limit) that no code can exceed. These results underpin data compression (ZIP, PNG, MP3), error-correcting codes (used in every storage device and wireless protocol), and, by extension, the entire telecommunications industry.

1.3 Complexity theory

Complexity theory classifies problems by the resources (time, space) required to solve them. The most famous open question in all of mathematics and computer science is whether P = NP — that is, whether every problem whose solution can be verified quickly can also be solved quickly[6]. The conjecture that P ≠ NP undergirds modern public-key cryptography: factoring large integers is assumed to be in NP but not in P.

ClassInformal definitionExample problem
PSolvable in polynomial timeSorting, shortest path
NPSolution verifiable in polynomial timeBoolean satisfiability (SAT)
NP-completeIn NP; every NP problem reduces to itTravelling salesman (decision)
PSPACESolvable using polynomial spaceQuantified Boolean formula
EXPSolvable in exponential timeGeneralised chess

2. Algorithms & data structures

Knuth’s multi-volume The Art of Computer Programming[7] and Cormen et al.’s Introduction to Algorithms[8] (CLRS) define the canonical curriculum. Algorithm analysis uses asymptotic notation (Big-O, Θ, Ω) to characterise performance independent of hardware.

2.1 Fundamental data structures

  • Arrays & linked lists — O(1) random access vs. O(1) insertion
  • Hash tables — average O(1) lookup; collision resolution via chaining or open addressing
  • Binary search trees / balanced trees (AVL, red-black) — O(log n) operations guaranteed
  • Heaps — O(1) find-min, O(log n) insert; backbone of priority queues
  • Graphs — adjacency lists/matrices; vertices and edges model networks, dependencies, state machines

2.2 Landmark algorithms

“An algorithm must be seen to be believed.”
— Donald E. Knuth, The Art of Computer Programming[7]
AlgorithmYearComplexityApplication
Merge sort1945 (von Neumann)O(n log n)External/internal sorting
Dijkstra’s shortest path1956 / pub. 1959O((V+E) log V)Routing protocols (OSPF)
Fast Fourier Transform1965 (Cooley–Tukey)O(n log n)Signal processing, polynomial multiplication
RSA1977O(k³) key genPublic-key encryption
PageRank1998 (Page et al.)O(n · iter)Web search ranking

2.3 Dynamic programming & divide-and-conquer

Richard Bellman formalised dynamic programming in the 1950s: decompose a problem into overlapping subproblems and cache results (memoisation). Classic applications include the Bellman–Ford shortest-path algorithm, sequence alignment in bioinformatics (Smith–Waterman), and the Viterbi algorithm used in speech recognition and error-correcting codes.

3. Programming language theory

3.1 Paradigms

Programming paradigms are fundamental styles of computation. No real-world language is purely one paradigm; most modern languages are multi-paradigm.

  • Imperative — explicit sequences of statements mutating state (C, Assembly)
  • Object-oriented — encapsulation, inheritance, polymorphism (Java, C++, Python)
  • Functional — first-class functions, immutability, referential transparency (Haskell, Lisp, Erlang)
  • Logic / declarative — state what is true, not how to compute it (Prolog, SQL)
  • Concurrent / reactive — model computation as communicating processes (Go, Erlang, Rust async)

3.2 Type systems

A type system is a formal method for classifying values and preventing certain categories of errors. The Hindley–Milner type inference algorithm (used in Haskell and ML) derives types automatically without explicit annotations. Rust’s type system extends this with lifetime annotations, enabling compile-time memory safety proofs (see the Rust page).

3.3 Compilation pipeline

Aho, Lam, Sethi, and Ullman’s Compilers: Principles, Techniques, and Tools[9] (the “Dragon Book”) is the definitive text. A compiler pipeline typically includes:

  1. Lexical analysis — tokenisation via finite automata
  2. Parsing — syntax tree construction via context-free grammars
  3. Semantic analysis — type checking, scope resolution
  4. Intermediate representation — e.g., LLVM IR, SSA form
  5. Optimisation — dead-code elimination, inlining, vectorisation
  6. Code generation — target-specific machine code

LLVM (Low Level Virtual Machine)[10] has become the dominant compiler infrastructure, used as the backend for Clang (C/C++), rustc (Rust), and Swift, among others.

4. Computer architecture

Patterson and Hennessy’s Computer Organisation and Design[11] and Computer Architecture: A Quantitative Approach[12] remain the standard texts. The von Neumann architecture (1945) — shared memory for instructions and data, a single bus — underpins nearly every processor manufactured to this day, with its primary bottleneck being the von Neumann bottleneck (limited memory bandwidth).

4.1 The memory hierarchy

Performance in modern systems is dominated by memory latency, not raw computation. The hierarchy from fastest/smallest to slowest/largest:

LevelTypical latencyTypical size
CPU registers< 1 ns~1 KB
L1 cache~1–3 ns32–64 KB
L2 cache~5–10 ns256 KB – 1 MB
L3 cache~20–40 ns4–64 MB
Main RAM (DRAM)~60–100 ns8–512 GB
NVMe SSD~100 µs1–8 TB
HDD~5–10 ms1–20 TB

4.2 Instruction set architectures

The two dominant ISA philosophies are CISC (Complex Instruction Set Computing, exemplified by x86-64) and RISC (Reduced Instruction Set Computing, exemplified by ARM and RISC-V). The post-2020 era saw ARM ascend to dominance in mobile and, with Apple Silicon, in high-performance laptops and servers. RISC-V, an open royalty-free ISA, is gaining traction in embedded systems and research processors.

4.3 Parallelism

Flynn’s taxonomy classifies parallel architectures: SISD, SIMD (vector units in modern CPUs and GPUs), MISD, and MIMD (multi-core CPUs, distributed systems). GPU architecture — thousands of simple cores executing the same instruction over different data — is the hardware foundation of modern deep learning[3].

5. Operating systems

Tanenbaum’s Modern Operating Systems[13] and Silberschatz et al.’s Operating System Concepts[14] cover the field comprehensively. An OS provides three core abstractions:

  • Process — an instance of a running program with isolated address space
  • File system — a hierarchical namespace mapping names to persistent byte sequences
  • Network socket — an endpoint for inter-process communication, local or remote

5.1 The Linux kernel

Linus Torvalds released Linux 0.01 in 1991. Today the Linux kernel powers the vast majority of servers, cloud instances, Android devices, and supercomputers. Its monolithic architecture (with loadable modules) contrasts with Tanenbaum’s MINIX microkernel design, a debate that played out in a famous public exchange on the comp.os.minix newsgroup in 1992. The kernel’s scheduler, memory manager, virtual file system (VFS), and networking stack are each complex subsystems studied in graduate-level OS courses.

5.2 Concurrency & synchronisation

Dijkstra introduced the semaphore in the early 1960s (ca. 1962–63, during work on the THE multiprogramming system) as a primitive for mutual exclusion, and in 1965 formulated the dining philosophers problem as a canonical illustration of deadlock. Modern kernels use futexes (fast userspace mutexes) to implement synchronisation with minimal system-call overhead. Lock-free data structures based on compare-and-swap (CAS) hardware primitives are essential for high-performance concurrent systems.

6. Computer networking

The Internet’s foundational protocol, TCP/IP, was designed by Cerf and Kahn in 1974[15]. The OSI reference model (1984) provides a conceptual seven-layer framework for understanding protocol composition, though the practical Internet maps more directly to a four-layer TCP/IP model.

6.1 The Internet protocol stack

LayerProtocol examplesKey concept
ApplicationHTTP/3, DNS, SMTP, TLSSemantics for user-facing services
TransportTCP, UDP, QUICReliable / unreliable end-to-end delivery
NetworkIPv4, IPv6, ICMP, BGPGlobal routing and addressing
LinkEthernet, Wi-Fi (802.11), ARPHop-by-hop delivery on a single link

6.2 The World Wide Web

Tim Berners-Lee invented the World Wide Web at CERN in 1989, publishing a proposal for a hypertext information system. His 1994 paper[16] describes the three technologies that underpin the Web to this day: HTML (HyperText Markup Language), URI (Uniform Resource Identifier), and HTTP (HyperText Transfer Protocol). The Web transitioned from a document-retrieval system to an application platform through JavaScript, AJAX (2005), and modern frameworks like React and WebAssembly.

7. Databases & storage

Codd’s 1970 paper[17] introduced the relational model, formalising data as tables (relations) with a declarative query language that became SQL. The ACID properties — Atomicity, Consistency, Isolation, Durability — define the correctness guarantees of relational transactions.

The NoSQL movement (2009 onwards) sacrificed some of these guarantees for horizontal scalability, giving rise to the CAP theorem (Brewer, 2000): a distributed system can guarantee at most two of Consistency, Availability, and Partition tolerance. This theorem motivated the design of systems like Amazon Dynamo (AP), Google Bigtable (CP), and Apache Kafka (distributed log).

8. Security & cryptography

Rivest, Shamir, and Adleman introduced the RSA public-key cryptosystem in 1977, based on the difficulty of factoring large composites. Diffie and Hellman’s 1976 paper on public-key exchange laid the conceptual groundwork. Modern secure communication relies on a combination of:

  • Asymmetric cryptography (RSA, ECC) — key exchange and digital signatures
  • Symmetric cryptography (AES-256) — bulk data encryption
  • Hash functions (SHA-256, SHA-3) — integrity verification
  • TLS 1.3 — the protocol binding all of the above for the modern Web