AI Foundation Β· Domain 01

Foundations of Artificial Intelligence

What AI actually is, where it came from, why symbolic AI failed, the philosophical debates that matter today, the rational agent framework, competing paradigms, and the state of the field.

Chapter 1.1 Β· Definitions & Scope
What Is Artificial Intelligence?

Most AI discourse fails at the first step: no crisp definition. AI is not magic, not a single algorithm, and not science fiction. It is the science and engineering of building systems that exhibit behaviour we would call intelligent if a human exhibited it β€” and that definition is deliberately provisional, because intelligence itself resists definition.

🧬

Biological Intelligence

  • Embodied β€” tied to a body with survival pressures
  • Developed through evolution over millions of years
  • Energy-efficient: 20 watts powers the human brain
  • Generalises from very few examples (few-shot by default)
  • Handles open-ended, ambiguous, novel situations naturally
πŸ€–

Machine Intelligence

  • Disembodied β€” no physical survival pressure
  • Built through training on human-generated data
  • Energy-hungry: GPT-4 training cost ~$100M in compute
  • Requires millions–billions of examples to learn patterns
  • Brittle outside training distribution; exceptional at defined tasks
What makes a system "intelligent"? There is no consensus. The four most-cited criteria:
  • Ability to learn from experience
  • Ability to solve novel problems
  • Ability to understand and generate language
  • Ability to reason under uncertainty
No current AI system fully satisfies all four in the general sense humans do.
The AI Field β€” Nested Subfields & Overlaps
[Diagram: nested subfields. AI (chess, expert systems, robotics, planning, LLMs) βŠƒ Machine Learning (SVMs, random forests, XGBoost, clustering: learns from data, no explicit rules) βŠƒ Deep Learning (CNNs, RNNs, transformers, LLMs: multi-layer neural networks) βŠƒ Generative AI (GPT, DALL-E, Stable Diffusion, Sora: generates novel content). Data Science (statistics, SQL, visualisation, BI) overlaps AI/ML but is not a true subset.]
| Term | Definition | Subset of | Example |
|---|---|---|---|
| Artificial Intelligence | Any technique enabling machines to mimic aspects of human cognition | β€” (broadest) | Chess engines, expert systems, LLMs, robotics |
| Machine Learning | AI that improves by learning from data rather than explicit rules | AI | Spam filter, fraud detection, recommendation systems |
| Deep Learning | ML using multi-layer neural networks to learn hierarchical representations | ML | GPT-4, image classifiers, speech recognition |
| Generative AI | Models that generate new content (text, images, audio, code) plausibly like training data | Deep Learning | ChatGPT, DALL-E, Stable Diffusion, Sora |
| Data Science | Interdisciplinary field combining statistics, ML, and domain expertise to extract insights from data | Overlaps AI & Statistics | Business dashboards, A/B testing, churn analysis |

Common misuse: "We use AI" usually means "we use ML" and often, more specifically, "we use a trained model." Precision matters β€” especially when evaluating vendor claims. Ask: Is it rule-based? Trained? On what data? Measured by what metric?

The AI Capability Spectrum β€” ANI β†’ AGI β†’ ASI
[Diagram: ANI (GPT-4, AlphaGo, FaceID: "we are here", 2026) β†’ AGI (human-level across all domains: research goal, 5–50+ yr timeline) β†’ ASI (surpasses human cognition entirely: theoretical, motivates alignment). Specialised to generalised; increasing capability, autonomy and risk.]
🎯

ANI β€” Narrow AI

Excels at one specific task. All commercially deployed AI today is ANI β€” including GPT-4. Examples: chess engines, image classifiers, recommendation algorithms, language models.

STATUS: Present reality (2026)

🧠

AGI β€” General AI

Human-level cognitive ability across any intellectual domain. Can learn any task a human can, without domain-specific training. No AGI exists. Timelines debated: 5–50+ years by leading researchers.

STATUS: Research goal

⚑

ASI β€” Super AI

Surpasses human intelligence in every domain β€” creativity, scientific discovery, social intelligence. Purely theoretical. Motivates alignment research and existential risk discourse (Domain 10).

STATUS: Theoretical

| Myth | Reality |
|---|---|
| AI "understands" things | AI processes statistical patterns in training data. Whether this constitutes understanding is a genuine philosophical debate (Ch 1.4) β€” but current systems don't understand in the human sense. |
| AI is deterministic | Most modern AI uses probabilistic sampling. The same prompt produces different outputs. Temperature and sampling parameters control randomness. |
| AI develops goals on its own | AI optimises for its training objective. It doesn't spontaneously form desires. Alignment failures happen when the objective doesn't match human intent β€” not through "waking up." |
| More data always = better AI | Data quality, labelling accuracy, and relevance matter more than volume. 1B clean samples often outperform 10B noisy ones. |
| AI replaces human intelligence wholesale | Current AI replaces specific tasks within jobs, not entire jobs at once. Displacement patterns are complex and highly uneven across domains. |
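The "deterministic" myth above comes down to sampling. A minimal sketch of temperature-scaled sampling, assuming hypothetical token scores (not any real model's API):

```python
import numpy as np

def sample_token(logits, temperature=1.0, rng=None):
    """Sample an index from temperature-scaled softmax probabilities."""
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
# Low temperature pushes towards the argmax; high temperature towards uniform
print([sample_token(logits, temperature=0.1) for _ in range(5)])
print([sample_token(logits, temperature=2.0) for _ in range(5)])
```

Running the same prompt (here, the same logits) twice at a non-trivial temperature yields different outputs, which is exactly why LLM responses vary run to run.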
Chapter 1.2 Β· Narrative History
History of AI β€” Origin to Present

AI history is a story of alternating euphoria and collapse. Understanding why each era gave way to the next β€” not just when β€” is the best inoculation against misreading hype today. The pattern: overpromise β†’ underfund β†’ winter β†’ unexpected breakthrough β†’ repeat.

AI Progress & Hype β€” 1950 to 2026
[Chart: progress/hype curve with markers for Dartmouth (1956), the Golden Age, Winter 1 (from ~1974), Winter 2 (from ~1987), Deep Blue (1997), AlexNet (2012), the Transformer (2017), and now (2026).]
πŸ“œ

Leibniz & Boole (1600s–1800s)

Leibniz dreamed of a calculus ratiocinator β€” a machine that could calculate truth from symbols. Boole formalised logic as algebra (1854), creating the mathematical foundation for all computation that followed.

✍️

Ada Lovelace (1843)

Writing notes on Babbage's Analytical Engine, Lovelace described how the machine could be programmed to compose music β€” arguably the first vision of general-purpose computing. She also articulated its limits: it can only do what we tell it. The tension between capability and instruction persists today.

βš™οΈ

Early Mechanical Calculators

Pascal's Pascaline (1642), Babbage's Difference Engine (1822) β€” mechanical attempts to automate calculation. They mechanised arithmetic but not reasoning. The distinction matters: calculation β‰  intelligence.

The Three Founding Documents
01
McCulloch & Pitts (1943) β€” "A Logical Calculus of Ideas Immanent in Nervous Activity"
Proposed the first mathematical model of a neuron β€” a binary threshold unit that fires when inputs exceed a threshold. Showed that networks of these units could compute any logical proposition. Direct ancestor of all modern neural networks.
02
Alan Turing (1950) β€” "Computing Machinery and Intelligence"
Proposed the Imitation Game as a pragmatic test for machine intelligence. Anticipated and answered nine objections to machine thought. Introduced the "child machine" β€” a machine that learns rather than being pre-programmed. The first articulation of what we now call machine learning.
03
Dartmouth Conference (1956) β€” McCarthy, Minsky, Shannon, Simon
The proposal: "Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." McCarthy coined "Artificial Intelligence." Optimism proved premature β€” but the research agenda set here dominated the field for 20 years.
🌟

Early Successes

  • Logic Theorist (1955): proved 38 of 52 theorems from Principia Mathematica
  • GPS β€” General Problem Solver (1957): domain-agnostic search; separated problem description from problem-solving method
  • ELIZA (1966): Weizenbaum's chatbot simulated a Rogerian therapist. Users formed genuine emotional connections β€” a warning about anthropomorphism that still applies to LLMs today.
  • SHRDLU (1970): natural language system that manipulated blocks in a simulated world. Impressive in its toy domain; completely brittle outside it.
⚠️

Seeds of the First Winter

  • Minsky & Papert (1969) proved perceptrons can't learn XOR β€” stalling neural network research for a decade
  • ALPAC report (1966): machine translation was "twice as expensive and twice as slow" as human translation
  • Combinatorial explosion: real-world problems have exponentially more states than toy problems
  • The "Potemkin village" problem: early demos were cherry-picked; the gap between demos and general capability was enormous
Why study the winters? The pattern of hype β†’ disappointment β†’ funding collapse has repeated twice in AI history. Understanding the mechanism prevents misreading the current cycle. The winters were not caused by bad research β€” they were caused by the gap between what was promised and what was delivered, plus a fundamental limitation (compute + data) that took decades to overcome.
| Winter | Trigger | What Collapsed | Root Cause | What Survived |
|---|---|---|---|---|
| First (1974–80) | Lighthill Report (UK, 1973); DARPA cuts | General-purpose AI, machine translation, symbolic reasoning | Combinatorial explosion; overpromised timelines to funders | Specialised systems; early expert system research; Prolog |
| Second (1987–93) | Expert system maintenance failures; Lisp machine market collapse | Commercial expert systems; DARPA Strategic Computing Program | Expert systems too brittle and expensive to maintain at real-world scale | Backpropagation (1986 rediscovery); statistical ML; RL foundations |
❄️

Why Expert Systems Failed

  • Knowledge acquisition bottleneck: experts couldn't articulate their own tacit knowledge
  • Brittleness: outside their narrow domain, systems failed completely β€” no graceful degradation
  • Maintenance cost: real-world rule sets grew to thousands of rules that became unmaintainable
  • No learning: every change required manual update by knowledge engineers
πŸ“–

The Lighthill Report (1973)

Sir James Lighthill's report to the UK Science Research Council concluded that AI had failed to live up to its "grandiose objectives" due to the combinatorial explosion problem. The UK effectively abandoned AI funding for a decade. The US followed.

Lesson: Overpromising specific timelines to funders is more damaging than the research being wrong.

πŸ”„

Backprop Rediscovered (1986)

Rumelhart, Hinton & Williams published the backpropagation algorithm. Earlier work by Werbos (1974) had gone unnoticed. Training multi-layer networks became practical. The door to deep learning opened β€” but GPUs weren't ready yet.

πŸ“Š

Statistical ML Rise (1990s)

SVMs (Vapnik, 1995) provided strong theoretical guarantees. Hidden Markov Models dominated speech recognition. Probabilistic graphical models matured. ML became rigorous and mathematical, focused on generalization theory rather than AI folklore.

πŸ†

Key Milestones

  • Deep Blue beats Kasparov (1997)
  • LSTM networks (Hochreiter & Schmidhuber, 1997)
  • LeNet-5 for digit recognition (LeCun, 1998)
  • Netflix Prize β€” collaborative filtering (2006–09)
  • ImageNet dataset created (Fei-Fei Li, 2009)

Three factors converged in 2012: massive datasets (ImageNet: 1.2M labelled images), GPU computing (CUDA parallelism), and algorithmic improvements (ReLU, dropout, batch normalisation). None alone was sufficient; all three together were transformative.

Three Inflection Points That Changed Everything
β–Ά
2012 β€” AlexNet (Krizhevsky, Sutskever, Hinton)
Won ImageNet with 15.3% top-5 error vs. 26.2% for the runner-up β€” an 11-point gap that shocked the computer vision community. Trained on 2 GTX 580 GPUs in 5 days. Every major research lab pivoted to neural networks within 18 months.
β–Ά
2017 β€” "Attention Is All You Need" (Vaswani et al., Google Brain)
The Transformer architecture replaces recurrent networks for sequence modelling. Self-attention enables full context access without sequential bottlenecks. Direct ancestor of BERT, GPT, T5, LLaMA, and every modern LLM.
β–Ά
2022 β€” ChatGPT (OpenAI)
100 million users in 60 days β€” fastest consumer technology adoption in history. Demonstrated that RLHF (Reinforcement Learning from Human Feedback) could align large language models to be genuinely helpful and safe. Moved AI from research into mainstream public discourse permanently.
| Year | Event | Why It Matters |
|---|---|---|
| 2012 | AlexNet wins ImageNet | Deep learning proven at scale; GPU compute validated as the path forward |
| 2014 | GANs (Goodfellow) | Generative models can produce realistic images; generative AI era begins |
| 2016 | AlphaGo defeats Lee Sedol | RL + deep learning = superhuman Go; 10 years ahead of schedule |
| 2017 | Transformer architecture | Replaces RNNs; enables parallelism; foundation of all modern LLMs |
| 2018 | BERT & GPT-1 | Pre-train β†’ fine-tune paradigm established for NLP |
| 2020 | GPT-3 (175B params) | Few-shot learning at scale; foundation model era begins |
| 2021 | DALL-E, Codex, AlphaFold 2 | Multimodal generation; protein structure solved after a 50-year open problem |
| 2022 | ChatGPT; Stable Diffusion | AI goes mainstream; open-source generative models democratise access |
| 2023–24 | GPT-4, Claude 3, Gemini, LLaMA | Multimodal capability; open weights; commercial proliferation |
| 2025–26 | Agentic AI; reasoning models (o1, o3, DeepSeek-R1, Claude 3.7) | Agents handle full multi-step workflows |
Chapter 1.3 Β· Knowledge & Reasoning
Symbolic AI & Knowledge Representation

Symbolic AI is commonly omitted from modern AI curricula. This is a mistake. Understanding why symbolic AI dominated for 30 years β€” and exactly why it failed to scale β€” is what explains why neural networks won. More importantly, it reveals what neural networks still cannot do.

Symbolic AI vs. Neural AI β€” How Knowledge Is Represented
[Diagram: Symbolic AI (GOFAI): human-written rules and logic, e.g. IF fever AND cough β†’ flu [0.8]; IF flu β†’ recommend tamiflu. Interpretable and auditable, but brittle, can't learn, doesn't scale. Dominant 1956–1990s. Neural AI (connectionist): weights learned from data. Learns from data, scales, generalises, but black box, data-hungry, fragile out-of-distribution. Dominant 2012–present.]
| Dimension | Symbolic (GOFAI) | Sub-Symbolic (Connectionist) |
|---|---|---|
| Knowledge representation | Explicit symbols, rules, logic β€” human-readable | Distributed across billions of numerical weights β€” not human-readable |
| Reasoning | Formal inference (deduction, induction, abduction) | Pattern interpolation from training data |
| Interpretability | Fully interpretable β€” you can read and audit the rules | Black box β€” weights are not meaningfully inspectable |
| Learning | Difficult to learn from raw data; requires manual knowledge encoding | Learns directly from raw data; scales with data and compute |
| Generalisation | Brittle outside defined rule coverage | Generalises within the training distribution; fails on distribution shift |
| Current form | Knowledge graphs, ontologies, formal methods, SAT solvers | LLMs, CNNs, diffusion models, transformers |
Semantic Network β€” Knowledge as a Graph
[Diagram: semantic network with nodes Socrates, Human, Mortal, Philosopher, Mammal, Knowledge connected by is-a and pursues edges, e.g. Socrates β†’is-aβ†’ Human β†’is-aβ†’ Mortal.]

Nodes = concepts Β· Edges = relationships Β· Inheritance flows along is-a links

πŸ•ΈοΈ

Semantic Networks

Graph-based: nodes are concepts, edges are relationships. Socrates β†’ is-a β†’ Human β†’ is-a β†’ Mortal. Allows inheritance β€” Socrates automatically inherits all human properties. Precursor to modern knowledge graphs (Wikidata, Google Knowledge Graph).
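The inheritance mechanism can be sketched in a few lines, assuming a toy is-a graph (all names illustrative):

```python
# Minimal semantic network: is-a links plus local properties (toy data)
is_a = {"Socrates": "Human", "Human": "Mammal", "Mammal": "Animal"}
properties = {"Human": {"mortal"}, "Mammal": {"warm-blooded"}}

def inherited_properties(node):
    """Collect properties by walking is-a links up to the root."""
    props = set()
    while node is not None:
        props |= properties.get(node, set())
        node = is_a.get(node)
    return props

print(inherited_properties("Socrates"))  # inherits 'mortal' and 'warm-blooded'
```

Socrates never has "mortal" stated directly; it flows down the is-a chain, which is the core idea behind inheritance in modern knowledge graphs.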

πŸ“‹

Frames (Minsky, 1974)

Structured representations with named slots and default values. A "Restaurant" frame has slots: name, type, price_range, menu, hours. When visiting a new restaurant, you fill known slots and use defaults for unknown ones. Directly influenced object-oriented programming.

πŸ—οΈ

Ontologies

Formal specification of concepts and relationships within a domain. The Cyc project (Lenat, 1984–present) attempted to encode all human common-sense knowledge β€” 25M+ rules over 40 years. Still operational but not competitive with neural approaches on most tasks.

πŸ”’

Description Logics

Formal languages for ontologies with decidable reasoning. Basis of OWL (Web Ontology Language) β€” standard for semantic web and knowledge graphs. Allows automated consistency checking and inference over structured knowledge.

βš•οΈ

MYCIN (Stanford, 1972)

Diagnosed bacterial infections and recommended antibiotics using 600 production rules and certainty factors. Outperformed medical students and matched specialists in controlled tests. Never deployed clinically β€” not due to technical failure, but to liability concerns.

IF organism-stain = gram-negative
AND organism-morphology = rod
AND patient-compromised-host = true
THEN organism = pseudomonas [CF: 0.6]
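MYCIN combined evidence from parallel rules supporting the same conclusion; for two positive certainty factors the standard combination is CF = CF1 + CF2(1 βˆ’ CF1). A minimal sketch with hypothetical values:

```python
def combine_cf(cf1, cf2):
    """MYCIN-style combination of two positive certainty factors
    that independently support the same conclusion."""
    return cf1 + cf2 * (1 - cf1)

# Hypothetical: two rules suggest pseudomonas with CF 0.6 and CF 0.4
print(combine_cf(0.6, 0.4))  # 0.76
```

Note that the combined CF rises towards 1 but never exceeds it: agreeing evidence strengthens confidence without ever becoming certainty.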
πŸ”¬

DENDRAL (Stanford, 1965)

Identified organic molecules from mass spectrometry data. First expert system with real scientific impact β€” published findings human chemists hadn't noticed. Proved that narrow domain expertise could be encoded computationally and productively.

Forward vs. Backward Chaining
Forward chaining (data-driven): start from known facts, apply rules to derive new facts until goal is reached. Used in production rule systems. Backward chaining (goal-driven): start from the goal, work backwards to find rules that could prove it. Used in Prolog and MYCIN. Backward chaining is more efficient when the goal is specific and the search space is large.
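Forward chaining can be sketched in a few lines (toy medical rules; certainty factors omitted):

```python
def forward_chain(facts, rules):
    """Data-driven inference: apply rules (premises -> conclusion)
    until no new facts can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and premises <= facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [({"fever", "cough"}, "flu"), ({"flu"}, "recommend-tamiflu")]
print(forward_chain({"fever", "cough"}, rules))
```

Backward chaining would instead start from "recommend-tamiflu" and search backwards for rules whose premises can be established.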
πŸ”£

Propositional Logic

Atomic propositions connected by ∧ (AND), ∨ (OR), Β¬ (NOT), β†’ (implies). Truth-functional β€” truth of compound depends only on truth of components. Decidable but lacks expressive power: no variables, quantifiers, or relations between objects.

P = "It rains" Q = "Ground is wet"
P β†’ Q // if it rains, ground is wet
P // it rains (given)
∴ Q // modus ponens: ground is wet
βˆ€

First-Order Logic (FOL)

Extends propositional logic with objects, predicates, and quantifiers (βˆ€ for all, βˆƒ there exists). Can represent most mathematical knowledge. Semi-decidable: you can prove theorems but not always disprove them in finite time.

βˆ€x: Human(x) β†’ Mortal(x) // all humans are mortal
Human(Socrates) // Socrates is human
∴ Mortal(Socrates) // universal instantiation
Non-Monotonic Reasoning β€” Classical logic is monotonic: adding new knowledge never invalidates prior conclusions. But the real world isn't like that. "Birds fly" is true β€” until you learn the bird is a penguin. Non-monotonic reasoning (default logic, circumscription) handles exceptions and retractions. Neural networks handle this implicitly through pattern matching β€” but without explicit auditable reasoning trails.
| Algorithm | Strategy | Complete? | Optimal? | Time | Space |
|---|---|---|---|---|---|
| BFS | Expand shallowest nodes first | βœ“ Yes | βœ“ Yes (unit cost) | O(b^d) | O(b^d) |
| DFS | Expand deepest nodes first | βœ— No | βœ— No | O(b^m) | O(bm) |
| IDDFS | DFS with increasing depth limit | βœ“ Yes | βœ“ Yes (unit cost) | O(b^d) | O(bd) |
| Uniform Cost | Expand lowest-cost node first | βœ“ Yes | βœ“ Yes | O(b^(1+⌊C*/Ρ⌋)) | O(b^(1+⌊C*/Ρ⌋)) |
| Greedy Best-First | Expand node closest to goal (h only) | βœ— No | βœ— No | O(b^m) | O(b^m) |
| A* | f(n) = g(n) + h(n) β€” cost + heuristic | βœ“ Yes | βœ“ Yes (admissible h) | O(b^d) | O(b^d) |

(b = branching factor, d = depth of the shallowest goal, m = maximum depth, C* = optimal solution cost, Ξ΅ = minimum step cost.)
A* Worked Example: g(n) = actual cost from start to node n. h(n) = heuristic estimate to goal (e.g., straight-line distance for routing). f(n) = g(n) + h(n).

Admissibility condition: h(n) must never overestimate actual cost. If admissible, A* is guaranteed to find the optimal path. A better heuristic means fewer nodes expanded β€” the difference between seconds and hours on large graphs.
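A minimal A* sketch over a hypothetical weighted graph, with heuristic values chosen to be admissible (they never exceed the true remaining cost):

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search. graph maps node -> [(neighbour, edge_cost)]; h(node) is the heuristic."""
    frontier = [(h(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nbr, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):  # found a cheaper route to nbr
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h(nbr), g2, nbr, path + [nbr]))
    return None, float("inf")

# Hypothetical route graph with straight-line-distance style heuristic values
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)], "B": [("G", 1)]}
h_values = {"S": 3, "A": 2, "B": 1, "G": 0}
path, cost = a_star(graph, h_values.get, "S", "G")
print(path, cost)  # S -> A -> B -> G with total cost 4
```

Here A* rejects both the direct S β†’ B edge (cost 4) and the A β†’ G shortcut (total 6) because f(n) = g(n) + h(n) steers it to the cheaper S β†’ A β†’ B β†’ G route.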

Minimax & Alpha-Beta: Two-player zero-sum games use Minimax (maximise your score, opponent minimises). Alpha-Beta pruning eliminates branches that cannot affect the final decision, reducing the effective branching factor from b to √b β€” roughly doubling searchable depth for the same compute.
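A generic minimax with alpha-beta pruning, sketched over a toy two-level game tree (lists are internal nodes, integers are leaf scores):

```python
def alphabeta(state, depth, alpha, beta, maximising, children, evaluate):
    """Minimax with alpha-beta pruning over a generic game tree."""
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if maximising:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the minimiser will never allow this branch
        return value
    value = float("inf")
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # alpha cutoff: the maximiser already has a better option
    return value

tree = [[3, 5], [2, 9]]  # toy tree: two min-nodes under a max root
children = lambda s: s if isinstance(s, list) else []
evaluate = lambda s: s
print(alphabeta(tree, 4, float("-inf"), float("inf"), True, children, evaluate))  # 3
```

In this tiny tree the leaf 9 is never evaluated: once the second min-node finds 2, it can't beat the 3 already guaranteed on the left, so the branch is pruned.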
Chapter 1.4 Β· Philosophy of Mind
The Turing Test & Philosophy of Mind

These aren't abstract puzzles. The Turing Test, Chinese Room, and symbol grounding problem are actively debated in the context of LLMs right now. They directly motivate alignment research, interpretability, and legal questions around AI. Don't treat this chapter as optional theory.

The Imitation Game β€” Turing Test Setup
πŸ§‘β€βš–οΈ INTERROGATOR (C) Can only see text β€”β€”β€” TEXT ONLY BARRIER β€”β€”β€” πŸ‘€ HUMAN (A) πŸ€– MACHINE (B) Which one is human?
The Original Imitation Game (Turing, 1950)
Three participants: a human interrogator (C), a human (A), and a machine (B). The interrogator communicates only via text. The machine's goal is to convince the interrogator it is human. Turing proposed this not as a definition of intelligence, but as a pragmatic operational test β€” if a machine can consistently fool a human interrogator, we have sufficient reason to call it intelligent, whatever "intelligent" means.
| Critique | Why It Matters Today |
|---|---|
| Tests behaviour, not cognition β€” passes without understanding | GPT-4 informally passes casual Turing tests. The understanding question remains fully open. |
| Anthropocentric β€” measures human-likeness, not intelligence per se | A genuinely intelligent alien system might fail. The test conflates intelligence with human-mimicry. |
| Humans can fail it too β€” adversarial settings trip humans up | CAPTCHA systems exploit this. Humans fail certain adversarial Turing variants more often than LLMs do. |
| Tests only conversational fluency, not reasoning or knowledge | LLMs excel at fluency yet can simultaneously fail structured reasoning and factual-accuracy tests. |
Modern variations: CAPTCHA (reverse Turing test β€” machines trying to verify humans), Winograd Schema Challenge (pronoun disambiguation requiring commonsense reasoning β€” harder for machines than conversational mimicry), and the Loebner Prize (formal Turing Test competition β€” no system has convincingly passed under rigorous adversarial evaluation).
Searle's Chinese Room β€” Syntax Without Semantics
[Diagram: Chinese characters enter the room through an input slot; an English speaker who understands no Chinese looks up replies in a rulebook and returns correct Chinese through an output slot; the Chinese speaker outside concludes "the room knows Chinese". Correct output β‰  understanding; syntax β‰  semantics.]
Searle's Thought Experiment (1980)
Imagine a person locked in a room receiving Chinese characters through a slot. They have a rulebook (in English) telling them how to manipulate the symbols and which characters to return. From outside, the room appears to understand Chinese β€” it passes any Turing test for Chinese comprehension. But the person inside understands nothing. They're just manipulating symbols according to rules.

Searle's conclusion: Programs manipulate syntax (formal symbol structures). Understanding requires semantics (meaning). Syntax alone is not sufficient for semantics. Therefore no program β€” regardless of performance β€” can have genuine understanding.
| Counter-Argument | Searle's Reply | Current Status |
|---|---|---|
| Systems Reply: the room as a whole understands Chinese, even if no individual part does | Imagine the person memorises the entire rulebook β€” they still understand nothing. Systems don't have understanding any more than their parts. | Debated β€” many find the systems reply compelling |
| Brain Simulator Reply: if the program simulated individual neurons, would it understand? | Simulating neurons is not the same as having neurons. A simulated storm doesn't make you wet. | Debated β€” depends on the theory of consciousness |
| Robot Reply: connect the system to sensors and actuators β€” grounded meaning would emerge | Adding I/O is just more symbol manipulation. The grounding problem remains. | Partial concession β€” embodiment may matter |
πŸ’­

The Hard Problem of Consciousness

Chalmers (1995) distinguishes "easy problems" (explaining cognitive functions: attention, memory, behaviour β€” all tractable in principle) from the "hard problem": why is there subjective experience at all? Why does information processing feel like something from the inside?

Even a complete functional account of the brain wouldn't explain why there's a "what it's like to be" that brain. This is the central unsolved problem in philosophy of mind β€” and applies directly to AI sentience claims.

🌐

Integrated Information Theory (IIT)

Tononi (2004): consciousness corresponds to integrated information (Ξ¦, "phi"). A system is conscious to the degree its parts share information in an irreducibly integrated way. Higher Ξ¦ = more conscious.

Implication: feedforward networks (including transformers, which have no recurrent loops) have Ξ¦ β‰ˆ 0. If IIT is correct, current LLMs are not conscious β€” not even slightly.

🧩

Multiple Intelligences (Gardner)

Eight distinct intelligences: linguistic, logical-mathematical, spatial, musical, bodily-kinaesthetic, interpersonal, intrapersonal, naturalistic. AI today dominates the first two; is largely absent from the last five.

🀸

Embodied Cognition

Intelligence is shaped by having a body that interacts with the world. Physical AI (robotics + LLMs) is the active frontier precisely because disembodied language models lack grounding in physical causality.

πŸ›οΈ

Cognitive Architectures

ACT-R (Anderson) and SOAR (Laird, Newell) are computational models of human cognition with procedural memory, declarative memory, and attention modules β€” making testable predictions verified against human reaction time data.

Harnad's Argument (1990)
How do symbols get their meaning? In a dictionary, words are defined by other words β€” circular. A child grounds symbols in perceptual experience: "red" is grounded in actually seeing red things. Harnad argued that purely symbolic AI can never escape this circularity β€” symbols need to be grounded in the world, not just in other symbols.
πŸ”—

How Neural Nets Partially Address This

Multimodal models (CLIP, GPT-4V, Gemini) ground language in images β€” "red" is statistically associated with red pixel patterns. This is partial grounding. But the model never sees red in the physical sense β€” it sees statistical co-occurrence in training data. Whether this constitutes genuine grounding is disputed.

πŸ€”

Implications for LLMs

Text-only LLMs have no sensory grounding. Their concept of "pain" is the statistical distribution of the word "pain" in training data. This may explain why LLMs discuss concepts fluently but fail tasks requiring genuine understanding of physical causality, embodied experience, or common-sense spatial reasoning.

Chapter 1.5 Β· AIMA Framework
Problem-Solving & Rational Agents

Russell & Norvig's Artificial Intelligence: A Modern Approach (AIMA) defines AI as the study of agents that perceive their environment and act to maximise their performance measure. This framework is the conceptual backbone of Domain 8 (Agentic AI) β€” understanding it now is essential.

The Rational Agent β€” Perceive β†’ Decide β†’ Act Loop
[Diagram: perceive β†’ decide β†’ act loop. The environment feeds percepts to sensors; the agent (state, model, goals, utility, learning) selects actions executed through actuators; a feedback loop learns from outcomes. PEAS: Performance measure, Environment, Actuators, Sensors.]
Definition: A rational agent selects actions expected to maximise its performance measure, given the evidence in its percept sequence and built-in knowledge.

Rationality β‰  omniscience (knowing all outcomes). Rationality β‰  perfection (always choosing optimally). Rationality = expected utility maximisation given available information.
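Expected-utility action selection can be sketched directly, with hypothetical actions and payoffs:

```python
def best_action(actions):
    """Pick the action with the highest expected utility.
    actions: {name: [(probability, utility), ...]}"""
    def expected_utility(outcomes):
        return sum(p * u for p, u in outcomes)
    return max(actions, key=lambda a: expected_utility(actions[a]))

# Hypothetical choice between a guaranteed payoff and a gamble
actions = {
    "safe":  [(1.0, 5.0)],                 # EU = 5.0
    "risky": [(0.5, 12.0), (0.5, -4.0)],   # EU = 4.0
}
print(best_action(actions))  # safe
```

The agent is rational here without being omniscient: it cannot know how the gamble would resolve, only that its expected utility is lower.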
| PEAS | Definition | Self-Driving Car | LLM Agent |
|---|---|---|---|
| Performance Measure | What "doing well" means β€” the objective | Safety, speed, comfort, law compliance | Task completion, accuracy, user satisfaction |
| Environment | Everything the agent interacts with | Roads, vehicles, pedestrians, weather | Web pages, APIs, files, databases, other agents |
| Actuators | Mechanisms for taking action | Steering, brakes, accelerator | Code execution, web search, API calls, text output |
| Sensors | Mechanisms for perceiving the environment | Cameras, LiDAR, GPS, radar | Context window, tool outputs, memory retrieval |
| Property | Type A | Type B | Example (A) | Example (B) |
|---|---|---|---|---|
| Observability | Fully observable | Partially observable | Chess (full board visible) | Poker (opponent's cards hidden) |
| Determinism | Deterministic | Stochastic | Chess (outcomes fully determined) | Autonomous driving (weather, pedestrians) |
| Episodicity | Episodic | Sequential | Image classification (each independent) | Chess (moves affect future states) |
| Dynamics | Static | Dynamic | Crossword puzzle | Stock trading, real-time robotics |
| Continuity | Discrete | Continuous | Chess (finite legal moves) | Robot arm control (infinite positions) |
| Agents | Single-agent | Multi-agent | Sudoku solver | Multiplayer games, multi-agent AI systems |
Why environment type matters for design: A fully observable, deterministic, episodic, discrete environment (like chess) can in principle be solved with a lookup table. A partially observable, stochastic, sequential, continuous, multi-agent environment (like autonomous driving) requires probabilistic reasoning, memory, planning, and robustness to uncertainty β€” all simultaneously. The gap in engineering complexity is enormous.
Agent Sophistication Spectrum β€” From Reflex to Learning
[Diagram: spectrum from simple reflex (IF β†’ THEN rules: thermostat), through model-based (+ internal state: robot vacuum), goal-based (+ search and planning: route planner), utility-based (+ preferences: RL agent), to learning agent (+ self-improvement: GPT, Claude, LLM agents). Increasing autonomy, flexibility and complexity.]
πŸ”¦

Simple Reflex Agent

IF-THEN rules that map current percepts directly to actions. No memory. No history. Example: thermostat (IF temp < setpoint THEN heat on). Fast and auditable but completely brittle β€” fails whenever current percept doesn't capture all relevant state.
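The thermostat rule as code, a minimal sketch:

```python
def thermostat(percept, setpoint=20.0):
    """Simple reflex agent: a condition-action rule on the current percept only.
    No memory, no model of the world."""
    return "heat_on" if percept < setpoint else "heat_off"

print([thermostat(t) for t in (18.5, 21.0)])  # ['heat_on', 'heat_off']
```

The brittleness is visible in the signature: the agent sees only the current temperature, so any relevant state it cannot perceive (a window left open, a sensor fault) is simply invisible to it.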

πŸ—ΊοΈ

Model-Based Reflex Agent

Maintains an internal state that tracks the world β€” knows what it can't currently perceive. Can handle partial observability. Example: a robot vacuum that tracks which areas have been cleaned even when not currently there.

🎯

Goal-Based Agent

Has an explicit goal and searches for action sequences to achieve it. Uses search algorithms (A*, BFS) to plan. More flexible than reflex agents β€” multiple paths to the same goal. Requires a model of how actions change the world.

πŸ“Š

Utility-Based Agent

Uses a utility function β€” a graded preference ordering over states, not just goal/no-goal. Chooses actions maximising expected utility. Handles stochastic outcomes naturally. Foundation of decision theory. Modern RL agents and LLM agents with reward models are utility-based.

πŸŽ“

Learning Agent

Any agent type augmented with a learning component that modifies behaviour based on experience. Composed of: learning element (improves performance), performance element (selects actions), critic (evaluates against a standard), problem generator (suggests exploratory actions). All modern AI systems are learning agents.

| Element | Definition | Example: 8-Puzzle |
|---|---|---|
| Initial state | Starting configuration | Random tile arrangement |
| Actions | Set of possible moves from each state | Slide a tile left, right, up, or down |
| Transition model | Result of taking each action in each state | New tile arrangement after sliding |
| Goal test | Determines if the current state is the goal | Tiles in order 1–8, blank last |
| Path cost | Numeric cost of a path | Number of moves taken |
Why search-based AI was replaced: The 8-puzzle has ~181,000 reachable states. The 15-puzzle has ~10^13. Chess: ~10^46. Go: ~10^170. Real-world AI problems are combinatorially intractable for exhaustive search β€” which is precisely why machine learning (pattern recognition from data) displaced search as the dominant paradigm for perception, language, and unstructured reasoning tasks.
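The 8-puzzle itself is still small enough to solve exactly. A minimal BFS sketch over the standard formulation (states as 9-tuples, 0 for the blank):

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # tiles in order, blank last

def neighbours(state):
    """Transition model: slide the blank up/down/left/right."""
    i = state.index(0)
    row, col = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            j = 3 * r + c
            s = list(state)
            s[i], s[j] = s[j], s[i]
            yield tuple(s)

def bfs(start):
    """Breadth-first search: optimal when every move costs the same."""
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        state, moves = frontier.popleft()
        if state == GOAL:  # goal test
            return moves   # path cost = number of moves
        for nxt in neighbours(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, moves + 1))
    return None

print(bfs((1, 2, 3, 4, 5, 6, 7, 0, 8)))  # one move from the goal -> 1
```

The same code applied to the 15-puzzle already becomes impractical without a heuristic, which is the combinatorial explosion in miniature.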
Chapter 1.6 Β· Schools of Thought
Key Paradigms & Schools of Thought

AI is not one field β€” it is a collection of competing intellectual traditions with different assumptions about what intelligence is and how to build it. Understanding the camps explains why researchers from different traditions argue past each other, and why hybrid approaches are gaining traction.

| Dimension | Symbolicism | Connectionism | Current Status |
|---|---|---|---|
| Core claim | Intelligence = symbol manipulation under logical rules | Intelligence emerges from densely connected simple units | Connectionism dominant (2012–present) |
| Key figures | McCarthy, Minsky, Newell, Simon | Rosenblatt, Rumelhart, Hinton, LeCun, Bengio | Hinton, LeCun, Bengio won the 2018 Turing Award |
| Strengths | Interpretable, logically consistent, handles explicit structured knowledge | Learns from data, generalises to new inputs, handles noise | Both needed; neither sufficient alone |
| Weaknesses | Brittle, doesn't scale, knowledge acquisition bottleneck | Black box, data-hungry, fails on distribution shift | Interpretability and robustness remain unsolved |
| Modern form | Knowledge graphs, formal verification, constraint solvers | Large language models, diffusion models, transformers | Hybrid (neuro-symbolic) is the active research frontier |
Neuro-Symbolic AI β€” The Hybrid Frontier
Systems combining neural pattern recognition with symbolic reasoning. AlphaGeometry (2024): uses a neural model to generate proof steps and a symbolic geometry solver to verify them β€” solves IMO-level geometry problems. Program synthesis: neural nets generate code candidates; symbolic testing verifies correctness. This is the current research direction for systematic generalisation beyond interpolation.
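The generate-and-verify loop common to these systems can be sketched abstractly. Both functions below are stand-ins (a real system would pair a neural model with a formal checker such as a geometry solver or a test suite):

```python
# Neuro-symbolic generate-and-verify loop (all functions are placeholders).
def neural_propose(problem):
    # Stand-in for a neural model emitting candidate solutions.
    return [problem + 1, problem * 2, problem ** 2]

def symbolic_verify(problem, candidate):
    # Stand-in for a symbolic checker; here the "spec" is candidate == 2 * problem.
    return candidate == problem * 2

def solve(problem):
    for cand in neural_propose(problem):
        if symbolic_verify(problem, cand):  # only formally verified answers survive
            return cand
    return None

print(solve(21))  # β†’ 42
```

The division of labour is the point: the neural component supplies creative candidates it cannot guarantee, and the symbolic component supplies guarantees it could not have generated.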
🎲

Bayesian AI

Models uncertainty explicitly using probability distributions. Bayes' theorem updates beliefs when new evidence arrives: P(H|E) ∝ P(E|H) Γ— P(H). Principled framework for reasoning under uncertainty β€” but computing exact posteriors is often intractable, requiring approximations (MCMC, variational inference).

Applications: spam filtering, medical diagnosis, sensor fusion in robotics, weather forecasting
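A single Bayes update is easy to compute by hand. This toy diagnosis example uses invented numbers purely to illustrate P(H|E) = P(E|H)Β·P(H)/P(E):

```python
# Toy Bayesian update: positive test for a rare condition (numbers invented).
prior = 0.01           # P(H): base rate of the condition
p_pos_given_h = 0.95   # P(E|H): test sensitivity
p_pos_given_not_h = 0.05  # P(E|Β¬H): false-positive rate

# P(E) by the law of total probability
evidence = p_pos_given_h * prior + p_pos_given_not_h * (1 - prior)
posterior = p_pos_given_h * prior / evidence  # P(H|E), Bayes' theorem

print(round(posterior, 3))  # β†’ 0.161
```

Despite a 95%-sensitive test, the posterior is only ~16% because the prior is so low β€” the base-rate effect that makes explicit probabilistic reasoning valuable.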

πŸ”—

Probabilistic Graphical Models

Bayesian networks: directed acyclic graphs where nodes are random variables and edges encode conditional dependencies. Efficient inference in structured domains. Hidden Markov Models (HMMs): sequences of hidden states with observable outputs; they dominated speech recognition from the 1970s to the 2010s, until deep learning displaced them.

Still widely used in robotics, bioinformatics, and probabilistic programming (Stan, Pyro)
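To make the HMM idea concrete, here is the forward algorithm on a tiny two-state model (the classic toy weather example; all probabilities are illustrative, not from any real system):

```python
import numpy as np

# Hidden states: Rainy, Sunny. Observations: walk, shop, clean.
pi = np.array([0.6, 0.4])          # initial state distribution
A = np.array([[0.7, 0.3],          # transition probabilities P(s' | s)
              [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],     # emission probabilities P(obs | s)
              [0.6, 0.3, 0.1]])

obs = [0, 1, 2]                    # observed sequence: walk, shop, clean

# Forward algorithm: alpha[s] = P(obs so far, current state = s)
alpha = pi * B[:, obs[0]]          # initialisation
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]  # recursion: propagate, then weight by emission

print(float(alpha.sum()))          # P(observation sequence) under the model
```

The recursion runs in O(TΒ·SΒ²) time rather than summing over all S^T hidden-state paths β€” the kind of structured-inference efficiency graphical models provide.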

Evolutionary Computation

Inspired by biological evolution. Genetic algorithms maintain a population of candidate solutions, select for fitness, recombine (crossover), and mutate. Used for optimisation where the landscape is rugged and gradient-based methods fail. Active in neural architecture search (NAS), hyperparameter optimisation, and hardware co-design.
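The select/crossover/mutate loop fits in a short script. This minimal sketch solves the classic "OneMax" toy problem (maximise the number of 1s in a bit string); population size, mutation rate, and generation count are arbitrary choices:

```python
import random

random.seed(0)  # reproducible run

BITS, POP, GENS, MUT = 20, 30, 60, 0.05  # illustrative hyperparameters

def fitness(bits):
    return sum(bits)  # OneMax: count of 1s

def crossover(a, b):
    cut = random.randrange(1, BITS)       # single-point crossover
    return a[:cut] + b[cut:]

def mutate(bits):
    return [b ^ (random.random() < MUT) for b in bits]  # flip each bit w.p. MUT

pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]              # truncation selection (elitist)
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children

print(max(fitness(ind) for ind in pop))   # best fitness found
```

No gradient is ever computed β€” only fitness comparisons β€” which is why the same loop works on rugged or non-differentiable objectives.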

Embodied & Behaviour-Based AI

Rodney Brooks (MIT) argued in the 1980s that traditional AI was fundamentally misguided β€” intelligence doesn't require symbolic internal representations. His subsumption architecture layered simple reactive behaviours to produce surprisingly complex robot behaviour ("intelligence without representation"). Now re-emerging: physical AI labs (Figure, Boston Dynamics, 1X) combine embodied robotics with LLM reasoning.
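The layering idea can be sketched in a few lines: behaviours are ordered by priority, and a higher layer subsumes (overrides) lower ones whenever it fires. The sensor dictionary and behaviours here are hypothetical, not Brooks's actual robot code:

```python
# Subsumption-style controller sketch: no world model, no planner β€”
# just prioritised reactive layers mapping sensor readings to actions.
def avoid(sensors):
    """Layer 0 (highest priority): react to obstacles."""
    if sensors["obstacle"]:
        return "turn_left"
    return None  # doesn't fire; defer to lower layers

def wander(sensors):
    """Layer 1: default exploratory behaviour."""
    return "move_forward"

LAYERS = [avoid, wander]  # ordered highest priority first

def act(sensors):
    for layer in LAYERS:
        action = layer(sensors)
        if action is not None:  # first layer that fires subsumes the rest
            return action

print(act({"obstacle": True}))   # β†’ turn_left
print(act({"obstacle": False}))  # β†’ move_forward
```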

Cognitive Architectures

ACT-R (Anderson): models human cognition with procedural memory, declarative memory, and attention modules β€” it makes testable predictions verified against reaction-time data. SOAR (Newell, Laird, Rosenbloom): a unified theory of cognition with problem spaces, operators, and impasse-driven learning. Both influenced the architecture of modern AI agents in Domain 8. OpenCog: an open-source AGI architecture combining probabilistic logic networks with deep learning.

1.7
Chapter 1.7 Β· Current Landscape
The AI Landscape Today & Tomorrow

This chapter connects Domain 1's history and theory to the present, and gives you a map of where each subsequent domain fits. AI in 2026 is defined by foundation models as the new default paradigm β€” but with hard limitations that motivate everything that follows in this curriculum.

πŸ›οΈ

Foundation Models β€” The New Paradigm

Large models pre-trained on internet-scale data, adaptable to almost any task. The shift: from task-specific models trained from scratch β†’ general-purpose models fine-tuned for tasks. One model (GPT-4o, Claude 3.7, Gemini 1.5) outperforms hundreds of specialised predecessors.

  • Open weights: LLaMA 3, Mistral, Qwen, DeepSeek-V3, Phi-3
  • Closed API: GPT-4o, Claude 3.7, Gemini 1.5 Pro, Command R+
  • Specialised: Codex/StarCoder (code), Whisper (audio), SAM (segmentation)
βš–οΈ

Open vs. Closed Source

Closed: better safety filtering, proprietary advantage, pay-per-use pricing, rate limits
Open weights: full control, privacy, fine-tuning, no per-token API fees (you supply the compute), higher misuse risk

Compute scaling trends: training cost for frontier models grows ~4Γ— per year. GPT-4: ~$100M. Next-generation frontier: estimated $1B+.
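The ~4Γ— figure compounds quickly. Taking the ~$100M estimate above as a starting point, a two-line projection shows why "$1B+" follows within roughly two years (purely illustrative arithmetic, not a forecast of any specific model's budget):

```python
cost = 100e6  # assumed frontier training cost today, in dollars
for year in range(1, 4):
    cost *= 4  # ~4x annual growth, per the trend cited above
    print(f"year {year}: ${cost / 1e9:.1f}B")
```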

| Capability | What It Means | State of the Art (2026) | Domain Coverage |
| --- | --- | --- | --- |
| Perception | Interpreting sensory data (vision, audio, text) | Superhuman on ImageNet; near-human speech; strong OCR | Domain 6 (CV), Domain 5 (NLP) |
| Reasoning | Drawing valid inferences from information | Good at formal logic; poor at common-sense causality | Domain 5 (LLMs), Domain 1 (logic) |
| Planning | Generating action sequences toward goals | Improving with o1/o3 reasoning models; still brittle at scale | Domain 8 (Agents), Domain 7 (RL) |
| Generation | Creating novel text, images, audio, code, video | Human-competitive in text/image; superhuman in code assist | Domain 5 (LLMs), Domain 6 (CV) |
| Action | Taking actions in physical or digital environments | Early stage; computer-use agents emerging; physical robots | Domain 8 (Agents), Domain 7 (RL) |
Goodhart's Law applied to AI benchmarks: "When a measure becomes a target, it ceases to be a good measure." AI benchmarks are routinely gamed β€” models are fine-tuned on test distributions, inflating scores without genuine capability improvement. Benchmark saturation happens faster than the underlying capability warrants. Always ask: what does this benchmark actually measure, and what can't it measure?
| Benchmark | Domain | Current Status | Key Concern |
| --- | --- | --- | --- |
| ImageNet (ILSVRC) | Computer Vision | Models exceed human level (97%+) | Real-world robustness far lower; adversarial examples break all models |
| GLUE / SuperGLUE | NLP | Saturated β€” models exceed human baseline | Saturated within 2 years of creation; replaced by harder benchmarks |
| MMLU | Knowledge (57 subjects) | GPT-4 ~87%; Claude 90%+ | Multiple-choice is gameable; doesn't test applied or procedural knowledge |
| BIG-Bench Hard | Reasoning | Frontier models pass most tasks | Being saturated; successor benchmarks in development |
| HumanEval / SWE-bench | Code | ~90% HumanEval; ~50% SWE-bench | SWE-bench is more realistic but still a limited test suite of GitHub issues |
| ARC-AGI | Abstract Reasoning | ~75–80% best models (2025) | Designed to resist pattern matching; remains the hardest general reasoning eval |
🧩

Common-Sense Reasoning

Humans know ice cream melts in the sun, dropped objects fall, and elephants don't fit in cars β€” without being told. LLMs absorbed much of this from text but fail unpredictably on novel physical scenarios. The Winogrande benchmark reveals systematic errors on problems humans find trivial. Common sense is the gap between pattern matching and genuine world understanding.

πŸ”€

Causal Inference

Current AI is fundamentally correlational. Correlation β‰  causation. "Countries with more hospitals have more disease" β€” a correlational model would conclude hospitals cause disease. Judea Pearl's causal hierarchy (association β†’ intervention β†’ counterfactual) identifies exactly what's missing. No current neural system reliably reasons causally from observational data.
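The hospitals example can be reproduced with a few lines of synthetic data: a confounder (population size) drives both hospital count and disease count, so the two correlate strongly β€” until we intervene and set hospital count independently, at which point the correlation vanishes. All data here is simulated for illustration:

```python
import random

random.seed(1)
n = 10_000

# Confounder Z (population size) causes both X (hospitals) and Y (disease).
z = [random.random() for _ in range(n)]
x = [zi + random.gauss(0, 0.1) for zi in z]   # hospitals ← population
y = [zi + random.gauss(0, 0.1) for zi in z]   # disease   ← population

def corr(a, b):
    """Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

print(round(corr(x, y), 2))      # strong correlation, yet no causal link

# Intervention do(X): assign hospital counts independently of Z.
x_do = [random.random() for _ in range(n)]
print(round(corr(x_do, y), 2))   # correlation vanishes under intervention
```

A purely correlational learner sees only the first number; distinguishing it from the second is exactly the association-vs-intervention gap in Pearl's hierarchy.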

πŸ—“οΈ

Long-Horizon Planning

Current agents reliably handle ~10–50 step tasks. Real-world projects span hundreds of interdependent steps over days or weeks. Errors compound: one wrong action early invalidates downstream planning. Agents also suffer context window limitations (memory) and lack persistent state across sessions.

🎯

Sample Efficiency & Robustness

Humans learn from a handful of examples; LLMs require billions. Sample efficiency β€” learning more from less data β€” is largely unsolved. Robustness β€” consistent performance under distribution shift β€” is equally unsolved. Both are critical for safety-critical deployment (medicine, autonomous vehicles, infrastructure).

πŸ“‹ Domain 01 β€” Key Takeaways
  • AI = systems that perceive, reason, learn, and act β€” not a single technology; a collection of competing approaches
  • All current AI is ANI (Narrow). AGI timelines are genuinely debated; ASI is theoretical. Precision about which level you mean matters enormously.
  • AI history is a cycle of overpromise β†’ collapse β†’ breakthrough. The current wave is real β€” but understanding the winters prevents misreading today's hype.
  • AI winters were caused by combinatorial explosion + knowledge acquisition bottleneck + overpromising timelines β€” not bad science
  • Symbolic AI failed to scale but wasn't wrong β€” it lives on in knowledge graphs, formal methods, and neuro-symbolic hybrids
  • The Turing Test, Chinese Room, and symbol grounding problem are actively debated in the context of LLMs β€” they motivate alignment, interpretability, and AI rights discourse
  • The PEAS framework (Performance, Environment, Actuators, Sensors) formally describes any agent β€” including modern LLM-based agents in Domain 8
  • Connectionism dominates today; Bayesian, evolutionary, and embodied approaches remain active and complementary
  • Benchmarks are routinely gamed β€” benchmark saturation does not mean a capability is solved. Always interrogate what the benchmark actually measures.
  • Hard open problems: common-sense reasoning, causal inference, long-horizon planning, sample efficiency, robustness