AI Foundation · Domain 12

Emerging Technologies

What’s next in AI research — from foundation models and world models to neuromorphic computing, quantum AI, embodied intelligence, AGI concepts, and the open frontiers that will define the next decade.


Chapter 12.1

Foundation Models — Evolution & Scaling

Foundation models — large pre-trained models adapted to downstream tasks — have become the dominant paradigm in AI. A single model trained once on internet-scale data now powers chatbots, code assistants, image generators, and scientific tools simultaneously.

Evolution & Scaling Laws (Core)

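The term “foundation model” was coined in 2021 by Stanford’s Center for Research on Foundation Models (part of Stanford HAI) to describe models like GPT-3, BERT, and CLIP that serve as a base for many applications. The key insight: scaling compute, data, and parameters together yields predictable reductions in loss. Kaplan et al. (2020) established the power-law form, and the Chinchilla study (Hoffmann et al., 2022) showed that parameters and training tokens should grow in roughly equal proportion.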

Foundation Model Timeline — from BERT to Frontier Models

Chinchilla Scaling Law

$$L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}$$

where L = loss, N = parameters, D = training tokens, and E = irreducible entropy. Compute-optimal rule of thumb: tokens ≈ 20 × parameters.
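The scaling law can be turned into a small calculator. The sketch below is illustrative only: the constants are the approximate fits reported by Hoffmann et al. (2022), the C ≈ 6·N·D FLOP estimate is a common rule of thumb that this chapter does not state, and the function names are hypothetical.

```python
import math

# Approximate fitted constants from Hoffmann et al. (2022); treat as rough values.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Parametric loss L(N, D) = E + A / N^alpha + B / D^beta."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

def compute_optimal(flops: float, tokens_per_param: float = 20.0):
    """Split a FLOP budget using C ~= 6*N*D and D ~= 20*N (both rules of thumb)."""
    n_params = math.sqrt(flops / (6.0 * tokens_per_param))
    return n_params, tokens_per_param * n_params

# Assumed budget, roughly that of the original Chinchilla run.
n, d = compute_optimal(5.88e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}, predicted loss ~ {chinchilla_loss(n, d):.3f}")
```

With this budget the split lands near 70B parameters and 1.4T tokens, which is the 20:1 ratio quoted above.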

Model	Params	Training Tokens	Modalities	Open?	Key Innovation
GPT-4o	Undisclosed	Undisclosed	Text, Image, Audio	Closed	Native multimodal I/O
Claude 3.5 Sonnet	Undisclosed	Undisclosed	Text, Image	Closed	200K context, tool use
Gemini 1.5 Pro	Undisclosed (MoE)	Undisclosed	Text, Image, Video, Audio	Closed	1M token context
Llama 3.1 405B	405B	15T	Text	Open weights	Competitive with closed
Mistral Large 2	123B (dense)	Undisclosed	Text	Open weights (research licence)	128K context at moderate size
DeepSeek-V3	671B (MoE, 37B active)	14.8T	Text	Open weights	~$5.5M reported cost for the final training run

Multimodal & Post-Training (In-depth)

The frontier has shifted from text-only to natively multimodal models that process text, images, audio, and video in a single architecture. Equally important: post-training techniques (RLHF, DPO, constitutional AI) now matter as much as pre-training scale.

🌐

Multimodal Fusion

Early: separate encoders (CLIP text + ViT image). Now: unified tokenisation across modalities. GPT-4o processes audio natively — no speech-to-text pipeline. Gemini handles video as first-class input.

🧠

Post-Training Revolution

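RLHF (InstructGPT): fit a reward model to human preferences, then optimise the policy against it. DPO: optimises directly on preference pairs, with no separate reward model. Constitutional AI (Anthropic): the model critiques and revises its own outputs against written principles. RLAIF: AI-generated feedback replaces human labels. These techniques add instruction-following, safety behaviour, and reasoning on top of raw pre-training.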
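To make "no reward model" concrete, here is a minimal sketch of the DPO loss for a single preference pair, following the formula in Rafailov et al. (2023). The argument names and the default `beta=0.1` are illustrative assumptions; a real implementation would operate on batched sequence log-probabilities from a policy and a frozen reference model.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of responses."""
    # Implicit reward of a response = beta * (policy log-prob - reference log-prob).
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)); guard against overflow.
    if margin < -30.0:
        return -margin
    return math.log1p(math.exp(-margin))

# Policy identical to the reference: loss is log(2).
print(dpo_loss(-2.0, -2.0, -2.0, -2.0))
# Policy has shifted probability toward the chosen response: loss drops.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The preference signal enters only through log-probability differences, which is why no separately trained reward model is needed.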

💡

Test-Time Compute

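o1/o3 (OpenAI): spend more compute at inference on harder problems by generating long chains of “thinking” tokens before answering. DeepSeek-R1: an open-weights reasoning model trained with large-scale reinforcement learning. Test-time compute adds inference as a second scaling axis alongside training.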

Key Trend — 2025

The debate has shifted from “how big?” to “how smart per dollar?”. DeepSeek-V3 reported GPT-4-class benchmark results from a final training run costing about $5.5M, roughly 1/20th of the $100M+ estimated for GPT-4. Efficiency, not raw scale, is the new frontier.

∑ Chapter 12.1 — Key Takeaways

Foundation models: single pre-trained model → many downstream tasks
Chinchilla scaling laws: optimal tokens ≈ 20× parameters
Multimodal is the new default — text, image, audio, video in one model
Post-training (RLHF, DPO) and test-time compute now matter as much as pre-training scale
Open models (Llama, DeepSeek) are closing the gap with closed frontier models


Chapter 12.2

World Models & Simulation

Current LLMs learn mostly from statistical patterns in text. World models aim to build AI that models the physical world itself: predicting what happens next, simulating physics, and reasoning about 3D space and causality.

What Are World Models? (Core)

A world model is an internal representation that allows an AI to predict the consequences of actions without executing them. Humans do this constantly — you can imagine what happens if you push a glass off a table without actually doing it. AI world models aim to learn this predictive capability from data.

World Model Architecture — observe, predict, plan
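The observe, predict, plan loop can be sketched with a toy example. Everything here is hypothetical: `world_model` is a hand-written stand-in for a learned dynamics model, the state is a single integer position, and the planner simply enumerates short action sequences in imagination before committing to one real action.

```python
from itertools import product

def world_model(state: int, action: int) -> int:
    """Stand-in for a learned dynamics model: predicts the next state."""
    return state + action

def plan(state, goal, horizon=3, actions=(-1, 0, 1)):
    """Imagine every action sequence with the model; return the best first action."""
    def imagined_cost(seq):
        s, cost = state, 0
        for a in seq:
            s = world_model(s, a)      # roll forward in imagination only
            cost += abs(goal - s)      # penalise distance to goal at every step
        return cost
    best = min(product(actions, repeat=horizon), key=imagined_cost)
    return best[0]

state, goal = 0, 5
for _ in range(6):
    action = plan(state, goal)           # plan using predictions
    state = world_model(state, action)   # act (toy: the model doubles as the environment)
print(state)
```

Only the first action of the best imagined sequence is executed, and planning is repeated from the newly observed state. Real systems replace the exhaustive search with sampling or learned policies, since enumeration grows exponentially with the horizon.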

🎬

Sora (OpenAI, 2024)

Text-to-video diffusion model generating coherent videos up to 60 seconds long. Shows emergent 3D consistency and approximate physics. OpenAI described it as a “world simulator”, but it still makes physics errors (objects passing through each other).

🧠

JEPA (Yann LeCun)

Joint Embedding Predictive Architecture. Predicts in latent space, not pixel space, so the model need not reconstruct every unpredictable pixel detail. LeCun’s proposed path to human-level AI: learn world models through self-supervised prediction.

🎮

Genie 2 (DeepMind)

Interactive world model for 3D environments. Given a single image, generates a playable 3D world. Used for training embodied AI agents. Generates consistent physics, lighting, and object interactions from imagination.

∑ Chapter 12.2 — Key Takeaways

World models predict consequences of actions without executing them — “imagination” for AI
Sora: impressive video generation but still fails at physics — not a true world simulator yet
JEPA (LeCun): predict in latent space, not pixel space, to avoid spending capacity on unpredictable pixel detail
Genie 2: interactive 3D world generation from a single image, a promising source of training environments for embodied AI


Chapter 12.3

Neuromorphic Computing

The human brain runs on about 20 watts. Training GPT-4 is estimated to have drawn tens of megawatts for months (roughly 50 GWh in total). Neuromorphic computing aims to build hardware that processes information the way biological brains do: event-driven, massively parallel, and extraordinarily energy-efficient.

Brain-Inspired Hardware (Core)

Conventional chips (GPUs, TPUs) use the von Neumann architecture: separate memory and compute connected by a data bus. The brain has no such separation — neurons are both memory and compute. Neuromorphic chips replicate this: artificial neurons and synapses co-located on silicon, communicating via spikes (binary events) rather than continuous values.
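Spike-based communication is easiest to see in a leaky integrate-and-fire (LIF) neuron, a standard textbook model. The sketch below is a plain software simulation with illustrative parameter values (`threshold`, `leak`, `reset` are assumptions); it is not the programming model of any particular chip.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron; returns a 0/1 spike train."""
    v, spikes = 0.0, []
    for i in input_current:
        v = leak * v + i            # membrane potential decays, then integrates input
        if v >= threshold:
            spikes.append(1)        # emit a binary event
            v = reset               # potential resets after firing
        else:
            spikes.append(0)
    return spikes

print(simulate_lif([0.0] * 5))      # no input: no spikes at all
print(simulate_lif([0.4] * 10))     # steady input: periodic spikes
```

The neuron holds its own state (`v`) and produces output only when the threshold is crossed. On event-driven hardware, silent neurons cost almost nothing, which is where the energy advantage on sparse workloads comes from.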

🧠

Intel Loihi 2

1 million neurons, 120 million synapses per chip. Event-driven: neurons only fire when needed, giving reported energy savings of up to ~100× over GPUs on sparse workloads. On-chip learning rules. Used for robotics, anomaly detection, optimisation.

⚡

IBM NorthPole (2023)

256 cores, 22 billion transistors. Not a spiking chip but brain-inspired: integrates memory and compute on every core. IBM reports about 25× better energy efficiency than comparable GPUs on image-inference benchmarks. Designed for edge deployment where power matters.

🔌

SynSense / BrainChip Akida

Commercial neuromorphic processors for edge AI. Always-on sensing at microwatts. Target: smart sensors, wearables, IoT. Keyword spotting, gesture recognition, vibration monitoring without cloud connectivity.

Chip	Approach	Neurons	Power	Best For	Status
Intel Loihi 2	Spiking neural network	1M per chip	~1W	Robotics, optimisation	Research
IBM NorthPole	Compute-in-memory	N/A (digital cores)	~12W (inference)	Edge inference	Research
BrainChip Akida	Spiking + event-driven	1.2M	<1W	IoT, wearables	Commercial
SpiNNaker 2	Digital ARM-based	10M+	~10W	Brain simulation	Research

Reality Check

Neuromorphic computing excels at sparse, event-driven tasks (sensor processing, robotics). It is not competitive with GPUs for dense matrix operations that dominate current LLM training/inference. The ecosystems (tools, frameworks, libraries) are years behind CUDA/PyTorch.

∑ Chapter 12.3 — Key Takeaways

Neuromorphic chips: brain-inspired, event-driven, co-located memory and compute
Intel Loihi 2: up to ~100× more energy efficient than a GPU on sparse workloads (reported)
IBM NorthPole: ~25× better energy efficiency on inference benchmarks (reported)
Best suited for edge AI and robotics, not a GPU replacement for LLM training
Software ecosystem is the bottleneck: frameworks exist (e.g. Intel’s Lava), but nothing yet matches the maturity of CUDA/PyTorch


Chapter 12.4

Quantum AI

Quantum computing promises exponential speedups for certain problems. The question for AI is whether any of those problems overlap with machine learning. The honest answer: not yet in practice, but the theoretical potential is enormous.

Quantum Computing Meets ML (In-depth)

Classical computers use bits (0 or 1). Quantum computers use qubits, which can be in a superposition of 0 and 1 and can be entangled with each other. Describing n qubits takes 2^n complex amplitudes, and quantum algorithms use interference to concentrate probability on correct answers. Measurement returns only one outcome, though, so a speedup is available only for problems with the right mathematical structure.
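A single qubit is small enough to simulate by hand. The following sketch uses plain Python lists of complex amplitudes (no quantum library is assumed) to show superposition, the measurement rule, and interference; the function names are hypothetical.

```python
import math

def hadamard(state):
    """Apply a Hadamard gate to a single-qubit state [amp0, amp1]."""
    a, b = state
    s = 1 / math.sqrt(2)
    return [s * (a + b), s * (a - b)]

def probabilities(state):
    """Born rule: outcome probability is the squared magnitude of each amplitude."""
    return [abs(amp) ** 2 for amp in state]

zero = [1 + 0j, 0 + 0j]
plus = hadamard(zero)     # equal superposition of |0> and |1>
back = hadamard(plus)     # amplitudes interfere and the qubit returns to |0>
print(probabilities(plus))
print(probabilities(back))
print(2 ** 50)            # number of amplitudes needed to describe 50 qubits
```

The second Hadamard shows why "trying all answers at once" is a misleading picture: the useful work comes from amplitudes cancelling and reinforcing, and a measurement still yields a single outcome.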

⚛️

Quantum ML Algorithms

Quantum SVM: proposed exponential speedup for kernel methods, under strong data-access assumptions (theoretical). Quantum PCA: faster dimensionality reduction under similar assumptions. Variational Quantum Eigensolver (VQE): hybrid quantum-classical algorithm for chemistry. Quantum Boltzmann machines.

💻

Current Hardware

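IBM: 1,121 qubits (Condor, 2023). Google Willow (2024): 105 qubits, and the first chip shown to operate below the surface-code error-correction threshold, meaning logical error rates fall as the code grows. IonQ: trapped-ion approach. Quantinuum: among the highest reported gate fidelities. All are still in the NISQ (Noisy Intermediate-Scale Quantum) era.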

🔬

Where Quantum Helps AI

Drug discovery: simulating molecular interactions. Optimisation: portfolio optimisation, logistics. Sampling: generative models. Feature maps: quantum kernels for hard-to-separate data. Not general LLM training.

Quantum AI Landscape — what works vs. what’s hype

∑ Chapter 12.4 — Key Takeaways

Quantum computing: qubits in superposition + entanglement = exponential state space
NISQ era: noisy, 1000+ qubits but not error-corrected — practical ML advantage not yet demonstrated
Most promising for: chemistry simulation, optimisation, sampling — not LLM training
Timeline: 5–10 years for first practical quantum ML applications, if error correction is solved


Chapter 12.5

Embodied AI & Physical Intelligence

Intelligence without a body is like learning to swim by reading a book. Embodied AI places agents in physical or simulated environments where they must perceive, act, and learn from the consequences — the way humans and animals do.

Physical Intelligence & Foundation Models for Robots (Core)

The convergence of foundation models + robotics is the defining emerging trend. Previously, every robot task required task-specific training. Now, vision-language-action (VLA) models enable robots to understand natural language commands and generalise across environments.

🤖

RT-2 (Google, 2023)

Vision-Language-Action model: takes camera input + text instruction, outputs robot actions directly. Trained on web data + robot demonstrations. Can follow novel instructions never seen in robot training (“move the banana to the plate”).

🏭

Sim-to-Real Transfer

Train in simulation (Isaac Sim, MuJoCo), deploy on real robots. Domain randomisation: vary physics, textures, and lighting so that policies become robust to the gap between simulator and reality. NVIDIA Isaac: GPU-accelerated simulation of thousands of environments in parallel.
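Domain randomisation amounts to sampling a fresh set of simulator parameters for every episode. The sketch below shows only that sampling step; the parameter names and ranges are illustrative assumptions and do not correspond to the API of Isaac Sim or MuJoCo.

```python
import random
from dataclasses import dataclass

@dataclass
class SimParams:
    friction: float
    mass_kg: float
    light_intensity: float

def sample_params(rng: random.Random) -> SimParams:
    """Draw one randomised environment; the ranges are illustrative, not tuned."""
    return SimParams(
        friction=rng.uniform(0.5, 1.5),
        mass_kg=rng.uniform(0.8, 1.2),
        light_intensity=rng.uniform(0.3, 1.0),
    )

rng = random.Random(0)  # fixed seed so runs are reproducible
episodes = [sample_params(rng) for _ in range(1000)]
# A policy trained across all of these must cope with the whole range,
# in the hope that the unknown real-world values fall somewhere inside it.
print(episodes[0])
```

In practice each sampled configuration would be passed to the simulator before an episode starts.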

🧍

Humanoid Race

Tesla Optimus, Figure 02, Boston Dynamics Atlas (electric). LLM-powered task planning + learned locomotion. Target: a general-purpose humanoid that can do any physical task humans do. Timeline: prototype demos now; vendors target production in 2026–2028.

Embodied AI Stack — from language to physical action

∑ Chapter 12.5 — Key Takeaways

VLA models (RT-2): language instruction → robot action — generalise to novel tasks
Sim-to-real: train millions of episodes in simulation, deploy on real robot
Humanoid race: Tesla, Figure, Boston Dynamics — LLM planning + learned locomotion
Key challenge: the physical world is non-differentiable — can’t backprop through reality


Chapter 12.6

AGI — Concepts, Paths & Debate

Artificial General Intelligence — AI that can perform any intellectual task a human can — is the most debated concept in the field. Some say we are 3 years away. Others say 30. Others say the concept itself is incoherent. Understanding the debate requires understanding what AGI actually means.

Definitions & Capability Levels (Core)

There is no agreed definition of AGI. Different labs use different definitions, which makes timeline predictions almost meaningless without specifying what you mean.

Framework	Definition of AGI	Levels	Where Are We? (2025)
Google DeepMind	Performance + generality matrix	L0 (no AI) → L5 (superhuman, general)	L1–L2 (emerging to competent)
OpenAI	AI that outperforms humans at most economically valuable work	L1 Chatbot → L5 Organisation	L2 (Reasoner) — o1/o3
Anthropic	Not a single threshold but a spectrum of capabilities	No formal levels	Narrow superhuman in some tasks
Turing Test	Can fool a human judge	Binary pass/fail	Arguably passed (but test is flawed)
Chollet (ARC)	Novel problem solving (skill-acquisition efficiency)	ARC benchmark	Base LLMs score poorly; o3 scored highly on ARC-AGI-1 (Dec 2024) at very high compute cost; ARC-AGI-2 remains hard

Paths to AGI & Timeline Debate (In-depth)

📈

Scale Is All You Need

Proponents: OpenAI, some at Google. Argument: keep scaling transformers + data + compute and emergent capabilities will continue. Evidence: GPT-2 → GPT-4 capabilities were largely unpredicted. Counter: scaling returns may be diminishing.

🧩

New Architectures Needed

Proponents: Yann LeCun, Gary Marcus. Argument: transformers lack causal reasoning, planning, persistent memory. Need: world models (JEPA), neuro-symbolic integration, new learning paradigms beyond next-token prediction.

🧬

Hybrid / Embodied Path

Proponents: Embodied cognition researchers. Argument: intelligence requires grounding in physical world. Need: embodied agents that learn from physical interaction. Inspired by: developmental psychology, enactivism.

Optimistic Timeline

Sam Altman: “AGI is closer than people think” (2024)

Dario Amodei: “Powerful AI systems in 2–3 years”

Ray Kurzweil: Human-level AI by 2029

Evidence: rapid capability gains 2022–2025

Skeptical Timeline

Yann LeCun: “We are far from human-level AI”

Gary Marcus: “LLMs will never reach AGI”

Rodney Brooks: “Decades away, minimum”

Evidence: LLMs fail at novel reasoning (ARC)

Critical Thinking Required

Most AGI timeline predictions come from people with financial incentives to hype or downplay. Lab CEOs predict nearness (attracts investment). Academics predict distance (justifies research funding). Judge the arguments, not the authority.

∑ Chapter 12.6 — Key Takeaways

No agreed definition of AGI — timelines are meaningless without specifying what you mean
DeepMind levels: we are at L1–L2 (emerging to competent); L5 is superhuman + general
Three paths debated: scale alone, new architectures, or embodied hybrid
Optimists: 2–5 years. Skeptics: decades. Both sides have financial incentives
ARC benchmark: Chollet’s test of skill-acquisition efficiency; base LLMs score poorly, though reasoning models have narrowed the gap on ARC-AGI-1


Chapter 12.7

Mixture of Experts

How do you build a model with a trillion parameters but use only about 10% of them for any given input? Mixture of Experts (MoE) is the answer, and it has become a leading architecture for frontier models.

Sparse Models & Routing (Core)

In a dense model (like the original GPT-3), every parameter is used for every token. In an MoE model, the feed-forward block of each MoE layer is replaced by multiple expert sub-networks, and a learned router selects a small top-k subset of experts per token (often 1–2; DeepSeek-V3 routes each token to 8 of 256). Result: total parameters are huge (model capacity), but compute per token is small (efficiency).

Mixture of Experts — sparse activation per token
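Top-k routing can be sketched in a few lines. This is a toy under stated assumptions: the experts are scalar functions, and the router logits are passed in directly, whereas in a real model they come from a learned linear layer applied to the token's hidden state. All names are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_logits, k=2):
    """Pick the top-k experts for one token and renormalise their gate weights."""
    order = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    gates = softmax([router_logits[i] for i in top])
    return list(zip(top, gates))

def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; the rest are never evaluated."""
    return sum(gate * experts[i](x) for i, gate in route(router_logits, k))

# Four toy "experts" that just scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]
print(route(logits, k=2))
print(moe_layer(1.0, experts, logits, k=2))
```

Only two of the four experts run for this token, so compute scales with k rather than with the number of experts. Production routers also add a load-balancing term so that tokens do not pile onto a few experts.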

Model	Total Params	Active Params	Experts	Top-K	Key Result
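Switch Transformer	1.6T	Small fraction (one expert per MoE layer)	2,048	1	First trillion-parameter model (Google; 2021 preprint, published 2022)
Mixtral 8x7B	46.7B	12.9B	8	2	Matched or beat Llama 2 70B on most benchmarks with ~5× fewer active parameters
GPT-4	~1.8T (rumoured)	~220B (rumoured)	~16 (rumoured)	2 (rumoured)	Frontier performance; architecture never officially confirmed
DeepSeek-V3	671B	37B	256 routed + 1 shared	8	GPT-4-class benchmarks at ~$5.5M reported training cost
Gemini 1.5	Undisclosed (MoE)	Undisclosed	Undisclosed	Undisclosed	1M token context window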

Why MoE Won

MoE addresses the core scaling dilemma: more capacity without proportionally more compute. A 671B MoE model that activates 37B parameters per token needs roughly the per-token compute of a 37B dense model while holding about 18× more parameters. The catch is memory: all 671B parameters must still be stored and served.

∑ Chapter 12.7 — Key Takeaways

MoE: many experts, sparse activation; only a few experts (top-k) fire per token
Mixtral 8x7B: matched or beat Llama 2 70B with ~5× fewer active parameters per token
DeepSeek-V3 (671B MoE, 37B active): GPT-4-class benchmarks at ~$5.5M reported training cost
MoE is now a leading architecture for frontier models (DeepSeek, Gemini 1.5, reportedly GPT-4)
Challenge: load balancing across experts — some experts get overloaded, others idle


Chapter 12.8

Neuro-Symbolic AI

Neural networks excel at pattern recognition. Symbolic AI excels at logical reasoning. Neuro-symbolic AI combines both — giving neural networks the ability to reason, and symbolic systems the ability to learn from data.

Combining Neural & Symbolic Reasoning (Core)

Pure neural approaches (LLMs) hallucinate, can’t guarantee logical consistency, and struggle with multi-step reasoning. Pure symbolic approaches are brittle, require hand-crafted rules, and don’t generalise. The hybrid combines neural perception + symbolic reasoning.

🧩

AlphaGeometry (DeepMind)

Solved IMO-level geometry problems (2024). Architecture: neural language model proposes construction steps + symbolic deduction engine verifies proofs. Neither component alone could solve the problems — the combination is key.

🔍

LLM + Code Execution

Simplest neuro-symbolic pattern: LLM generates code, interpreter executes it. Chain of Code (Google): interleave reasoning and computation. Execution makes arithmetic and logic exact where pure LLM reasoning slips, although the result is only as correct as the generated code.
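A minimal sketch of the pattern: a model translates a question into an expression, and a small symbolic evaluator computes the exact answer. `fake_llm` is a hypothetical stand-in that returns a hard-coded string, and the evaluator deliberately accepts only integer arithmetic rather than running arbitrary generated code.

```python
import ast
import operator
from fractions import Fraction

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(expr: str) -> Fraction:
    """Exactly evaluate an integer arithmetic expression; reject anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return Fraction(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def fake_llm(question: str) -> str:
    """Hypothetical stand-in for a model call that emits an expression."""
    return "(123456789 * 987654321) / 3"

question = "What is 123456789 times 987654321, divided by 3?"
print(evaluate(fake_llm(question)))
```

The neural side handles language and the symbolic side handles exactness. If the model emits the wrong expression, the evaluator will faithfully compute the wrong answer, which is the limitation noted above.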

📚

Knowledge Graphs + LLMs

LLMs generate natural language; knowledge graphs provide structured facts. GraphRAG (Microsoft): LLM uses graph-structured knowledge for grounded reasoning. Reduces hallucination by anchoring claims to verifiable facts.

Neural Strengths

• Pattern recognition in noisy data

• Generalisation from examples

• Natural language understanding

• Perception (vision, audio)

Symbolic Strengths

• Logical consistency & guarantees

• Compositionality & abstraction

• Explainability & auditability

• Exact arithmetic & verification

∑ Chapter 12.8 — Key Takeaways

Neuro-symbolic: neural perception + symbolic reasoning — best of both worlds
AlphaGeometry: IMO-level proofs via neural proposals + symbolic verification
LLM + code execution: simplest and most practical neuro-symbolic pattern today
Knowledge graphs + LLMs (GraphRAG): reduces hallucination with structured grounding


Chapter 12.9

Federated Learning & Edge AI

Not all AI can live in the cloud. Edge AI runs models on devices — phones, cars, sensors, drones — where latency, privacy, and connectivity matter. Federated learning trains models across devices without centralising data.

On-Device ML & Federated Learning (Core)

Edge AI processes data locally on the device rather than sending it to cloud servers. Benefits: lower latency (<10ms vs 100ms+ cloud round-trip), works offline, preserves privacy. Federated learning trains a global model across distributed devices — each device trains locally, sends only model updates (not data) to a central server.

📱

On-Device LLMs

Apple Intelligence: on-device 3B param model. Google Gemini Nano: runs on Pixel phones. Qualcomm: LLM inference on Snapdragon. Quantisation (4-bit, GGUF) makes 7B models fit in 4GB RAM. Privacy: data never leaves device.
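The "7B in 4GB" claim is simple arithmetic on weight storage. The helper below is a hypothetical back-of-the-envelope calculator; it counts weights only and ignores the KV cache, activations, and per-block quantisation metadata that real formats such as GGUF add.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

At 4 bits the weights take about 3.5 GB, which is why a 7B model sits just inside a 4 GB budget and why 3B models are the more comfortable fit on phones.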

🔒

Federated Learning

Pioneered by Google for keyboard prediction (2017). Each phone trains locally on user data. Server aggregates model updates (FedAvg). Data never transmitted. Used by: Apple (Siri), Google (Gboard), hospitals (medical imaging across sites).
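The server-side aggregation step of FedAvg is a weighted average of client parameters. This sketch assumes each client has already trained locally and returned a flat list of weights; the client values and dataset sizes are made up for illustration.

```python
def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]

# Three hypothetical clients with 2-parameter models and different data volumes.
updates = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
sizes = [100, 100, 200]
print(fed_avg(updates, sizes))
```

Only the parameter vectors and example counts reach the server. Deployed systems typically add secure aggregation or differential privacy on top, because raw updates can still leak information about local data.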

🎯

Model Compression

Quantisation: FP16 → INT4 (4× smaller). Pruning: remove redundant weights. Distillation: train small model to mimic large one. TensorRT, ONNX Runtime, Core ML for optimised inference. Key: minimal accuracy loss at 4–8× compression.

Technique	Compression	Accuracy Loss	Speed Gain	Best For
INT8 Quantisation	2×	<1%	2×	Server inference
INT4 Quantisation	4×	1–3%	3–4×	Mobile / edge
Pruning (structured)	2–5×	1–5%	2–3×	Inference-only
Distillation	10–100×	3–10%	10–50×	Deploy anywhere
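The accuracy cost of lower bit widths in the table can be seen in a toy example. The following is a minimal sketch of symmetric per-tensor quantisation on a handful of made-up weights; production schemes usually quantise per channel or per block and calibrate the scale more carefully.

```python
def quantize(weights, bits=8):
    """Symmetric per-tensor quantisation to signed integers."""
    qmax = 2 ** (bits - 1) - 1                          # 127 for INT8, 7 for INT4
    scale = (max(abs(w) for w in weights) / qmax) or 1.0  # avoid a zero scale
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map integers back to approximate floating-point weights."""
    return [v * scale for v in q]

weights = [0.50, -1.27, 0.03, 0.99]
for bits in (8, 4):
    q, scale = quantize(weights, bits)
    err = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale)))
    print(bits, q, f"max error {err:.4f}")
```

Halving the bit width halves storage but leaves far fewer representable levels, so the reconstruction error grows. That trade-off is what the "Accuracy Loss" column summarises.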

∑ Chapter 12.9 — Key Takeaways

Edge AI: <10ms latency, works offline, preserves privacy
On-device LLMs: 3–7B models on phones via quantisation (INT4, GGUF)
Federated learning: train across devices without centralising data
Model compression (quantisation + pruning + distillation): 4–100× smaller with minimal accuracy loss


Chapter 12.10

AI Hardware & Compute

AI progress is inseparable from hardware progress. Every frontier model is ultimately a bet on silicon — and the supply chain, geopolitics, and economics of AI chips are as important as the algorithms running on them.

GPUs, TPUs & Custom Silicon (Core)

Chip	Company	FP16 TFLOPS	Memory	Training Use	Price
H100	NVIDIA	990	80GB HBM3	Llama 3 (GPT-4 reportedly trained on A100s)	~$30K
B200	NVIDIA	2,250	192GB HBM3e	Next-gen frontier	~$35K+
TPU v5p	Google	~460	95GB HBM2e	Gemini	Cloud only
Trainium 2	AWS	TBD	96GB HBM	AWS models	Cloud only
Gaudi 3	Intel	~1,835	128GB HBM2e	Limited adoption	~$15K
Groq LPU	Groq	Inference-only	SRAM-based	Fastest inference	Cloud API

🌍

Geopolitics of AI Chips

TSMC (Taiwan): fabricates 90%+ of advanced AI chips. US export controls: barred A100/H100-class exports to China (October 2022, tightened in 2023). China is developing domestic alternatives (Huawei Ascend). AI chip supply is now a national security issue.

💰

Economics of Training

Estimated training costs: GPT-4 $100M+; Llama 3.1 405B ~$30M; DeepSeek-V3 $5.5M (reported for the final training run only, excluding research and ablations). Trend: cost per capability is dropping fast via efficiency gains (MoE, better data, longer training). But the frontier is still $100M+ as ambitions grow.
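Headline training costs are usually GPU-hours multiplied by an assumed rental price. The sketch below reproduces the DeepSeek-V3 figure using the numbers from its technical report (about 2.788M H800 GPU-hours at an assumed $2 per GPU-hour); the function name is hypothetical.

```python
def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental-style cost estimate: GPU-hours times an assumed hourly price."""
    return gpu_hours * usd_per_gpu_hour

# DeepSeek-V3 report: ~2.788M H800 GPU-hours, priced at an assumed $2 per GPU-hour.
cost = training_cost_usd(2.788e6, 2.0)
print(f"${cost / 1e6:.2f}M")
```

The result depends entirely on the assumed hourly price and covers one training run. It excludes hardware purchase, failed runs, data work, and salaries, so it is not directly comparable to all-in budget estimates for other models.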

∑ Chapter 12.10 — Key Takeaways

NVIDIA dominates: H100/B200 power most frontier training outside Google’s TPU-trained models
Alternatives emerging: TPU v5p (Google), Trainium (AWS), Groq LPU (inference)
TSMC fabricates 90%+ of advanced AI chips — geopolitical concentration risk
Training costs: $100M+ for frontier, but efficiency gains are lowering cost per capability
DeepSeek-V3 at $5.5M shows algorithmic efficiency can substitute for raw compute


Chapter 12.11

AI Consciousness & Philosophy

When an LLM says “I feel curious about this topic,” is it conscious? Almost certainly not. But the question of whether AI could become conscious is one of the deepest open problems at the intersection of AI, neuroscience, and philosophy.

The Hard Problem & Sentience Debate (In-depth)

The Hard Problem of Consciousness (David Chalmers, 1995): why does subjective experience exist at all? We can explain which neurons fire when you see red, but not why it feels like something to see red. This problem applies to AI: even a perfect brain simulation might be a “zombie” — functionally identical but with no inner experience.

🧐

Functionalism

Consciousness arises from what a system does, not what it’s made of. If this is true, a sufficiently complex AI could be conscious. Most AI researchers who believe in AI consciousness hold some form of functionalism.

💠

Integrated Information Theory

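IIT (Giulio Tononi): consciousness = integrated information (Φ). Strictly feedforward networks have Φ = 0. A transformer forward pass is feedforward, and IIT ties Φ to the causal structure of the physical hardware, so on this view current LLMs running on digital computers are not conscious regardless of behaviour.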

🤔

Chinese Room Argument

John Searle (1980): a person who follows a rulebook for manipulating Chinese symbols can produce fluent replies without understanding any Chinese, so symbol manipulation alone is not understanding. Applied to AI: an LLM manipulates tokens without grasping meaning. Counter (the “systems reply”): the system as a whole might understand even if individual components don’t.

Why This Matters Practically

If future AI systems could be conscious, we face unprecedented moral obligations. If they cannot, we risk anthropomorphising tools. Either error is dangerous: moral patients deserve protection; moral panics about tools waste resources. The honest answer: we do not know how to test for consciousness, and no current AI system provides evidence of it.

∑ Chapter 12.11 — Key Takeaways

Hard Problem: why does subjective experience exist? — unsolved for brains and AI
Functionalism: consciousness from function → AI could be conscious
IIT: consciousness from integrated information → current LLMs are not conscious
Chinese Room: symbol manipulation ≠ understanding; debated since 1980
Practical: we have no test for AI consciousness — extraordinary claims require extraordinary evidence


Chapter 12.12

The Road Ahead — Open Problems & Predictions

AI moves fast enough that any prediction risks being outdated before the ink dries. But certain open problems are deep enough to outlast any single model generation. These are the problems that will define AI for the next decade.

Open Problems in AI (Core)

🔍

Reasoning & Planning

LLMs can simulate reasoning via chain-of-thought, but whether they truly plan is disputed. ARC-style benchmarks probe this gap. True reasoning requires: causal models, counterfactual thinking, and systematic generalisation beyond the training distribution.

🌍

Grounding & Embodiment

Text-only models lack physical grounding. “Heavy” is a word to an LLM, not a felt experience. Solving this may require embodied learning — or it may require better simulation. Open question: is grounding necessary for intelligence?

🔒

Alignment at Scale

Current alignment: RLHF, constitutional AI. Works for today’s models. But as models become more capable, how do you align something smarter than you? Scalable oversight, interpretability, and formal verification are active research areas.

⚡

Energy & Sustainability

AI training and inference are energy-intensive. GPT-4 training: an estimated ~50 GWh. Data centres consume roughly 1–2% of global electricity (the IEA estimates about 1.5% in 2024), and demand is growing fast. Need: more efficient architectures, renewable-powered data centres, or fundamentally different compute paradigms.

⚖️

Governance Gap

Technology moves faster than regulation. The EU AI Act (2024) is the first comprehensive framework. The US has executive orders but no comprehensive federal legislation. China has its own rules. There is no global coordination on frontier AI risks.

💼

Economic Disruption

McKinsey Global Institute (2023): up to 30% of hours currently worked in the US could be automated by 2030. The expected effect is task transformation more than job elimination. New jobs will be created, but the transition is uneven: knowledge workers are affected first (the inverse of previous automation waves).

Confident Predictions vs. Uncertain Bets (In-depth)

High Confidence (2–3 years)

• AI agents handling multi-step workflows autonomously

• On-device LLMs on every smartphone

• AI-generated video indistinguishable from real

• Coding AI writing 50%+ of production code

• Multimodal models as default (text-only obsolete)

Uncertain (5–10+ years)

• AGI by any meaningful definition

• Practical quantum advantage for ML

• Neuromorphic chips competing with GPUs

• Solving alignment for superhuman AI

• Autonomous scientific discovery at scale

∑ Chapter 12.12 — Key Takeaways

Open problems: reasoning, grounding, alignment at scale, energy, governance
Confident near-term: AI agents, on-device LLMs, AI-generated video, code AI
Uncertain: AGI, quantum ML, neuromorphic at scale, superhuman alignment
Economic disruption: 30% of work hours automatable by 2030 — knowledge workers first
Governance gap: technology moves faster than regulation — no global coordination

🎓 Domain 12 Complete — Emerging Technologies

Ch 12.1: Foundation models: scaling laws, multimodal fusion, post-training and test-time compute are the new frontiers.
Ch 12.2: World models: AI that predicts consequences — Sora, JEPA, Genie 2. Latent-space prediction is key.
Ch 12.3: Neuromorphic: brain-inspired chips, with reported energy savings of up to ~100× on sparse tasks. Not a GPU replacement.
Ch 12.4: Quantum AI: theoretically promising, practically premature. Error correction is the bottleneck.
Ch 12.5: Embodied AI: VLA models (RT-2) + sim-to-real + humanoid race. Physical intelligence is the next frontier.
Ch 12.6: AGI: no agreed definition. Three competing paths. Timeline predictions are shaped by financial incentives.
Ch 12.7: MoE: a leading frontier architecture. DeepSeek-V3 reported GPT-4-class results at ~$5.5M training cost.
Ch 12.8: Neuro-symbolic: neural perception + symbolic reasoning. AlphaGeometry proved the concept.
Ch 12.9: Edge AI: on-device LLMs, federated learning, model compression. Privacy + latency advantages.
Ch 12.10: AI hardware: NVIDIA dominance, TSMC concentration risk, geopolitics of chips.
Ch 12.11: Consciousness: hard problem unsolved. No evidence current AI is conscious. Multiple competing theories.
Ch 12.12: Road ahead: reasoning, grounding, alignment, energy, governance — the problems that define the next decade.

Domain 12 is where the known meets the unknown. The technologies here range from production-ready (MoE, edge AI) to speculative (quantum ML, consciousness). The honest takeaway: nobody knows which of these will matter most in 10 years — but understanding all of them gives you the vocabulary to evaluate claims, spot hype, and recognise genuine breakthroughs when they arrive.
