
Artificial Intelligence Overview

From the origins of AI to cutting-edge agentic systems — a structured guide through the ideas, mathematics, algorithms, and engineering that define modern AI.

Chapter One · Introduction

What Is Artificial Intelligence?

Artificial Intelligence is the science of building systems that can perceive, reason, learn, and act — performing tasks that traditionally required human intelligence.

The AI Spectrum — From Narrow to General
  • ANI (Narrow AI) — today: GPT, AlphaGo, FaceID, recommenders
  • AGI (General AI) — human-level reasoning; the active research frontier
  • ASI (Super AI) — exceeds human cognition; theoretical, long horizon
The spectrum runs from specialised (ANI) to generalised (ASI). We are in the ANI era; AGI timelines remain debated (5–20+ years among leading researchers).
A Brief History of AI
1950
Turing Test & Birth of AI
Alan Turing asks "Can machines think?" — proposes the Imitation Game. McCarthy coins "Artificial Intelligence" at the 1956 Dartmouth Conference.
1956–1974
The Golden Age
Early optimism — checkers-playing programs, ELIZA chatbot, symbolic reasoning systems. Perceptron invented by Rosenblatt (1957).
1974–1980
First AI Winter
Funding cuts as early AI hit scalability limits. Minsky & Papert's critique of perceptrons stalls neural network research.
1980–1987
Expert Systems Boom
Rule-based systems (MYCIN, XCON) deployed in industry. Backpropagation re-discovered (Rumelhart & Hinton, 1986).
1987–1993
Second AI Winter
Expert systems brittle and expensive to maintain. AI hardware market collapses. Neural network research stalls again.
1997–2011
ML Renaissance
Deep Blue beats Kasparov (1997). SVMs, kernel methods, probabilistic graphical models mature. Web-scale data emerges.
2012
Deep Learning Revolution
AlexNet wins ImageNet by a huge margin using GPU-trained CNNs. Deep learning era begins — LeCun, Bengio, Hinton (Turing Award 2018).
2017
"Attention Is All You Need"
Google Brain introduces the Transformer architecture — it replaces RNNs for sequence modeling and becomes the foundation of modern LLMs.
2020–2023
LLM Era — GPT-3 to ChatGPT
OpenAI GPT-3 (175B params). GitHub Copilot. ChatGPT (100M users in 60 days). GPT-4, Claude, Gemini. DALL-E, Stable Diffusion, Midjourney.
2024–2026
Agentic AI Frontier
LLM agents with tools, memory, and planning. Multi-agent frameworks (LangChain, CrewAI, AutoGen). Reasoning models (o1, o3, DeepSeek-R1). Multimodal foundation models.
The Three Pillars of AI Progress
Why AI accelerated — Data × Compute × Algorithms
  • 📊 Data — fuel for learning: ImageNet, Common Crawl, web-scale text & images, RLHF preference data
  • Compute — engine of scale: GPU/TPU acceleration, NVIDIA H100 clusters, cloud-scale distributed training
  • 🧬 Algorithms — intelligence architecture: backpropagation, attention/Transformers, RLHF · LoRA · MoE
Deep dive → AI Fundamentals: History, Types & Core Concepts
📋 Chapter 1 — Summary
  • AI = systems that perceive, reason, learn, and act
  • We are in the ANI era — narrow AI excels at specific tasks; AGI remains a research horizon
  • AI progress is driven by the data × compute × algorithms flywheel
  • Transformers (2017) and ChatGPT (2022) mark the two biggest inflection points in modern AI
Chapter Two · Mathematical Foundations

Mathematical Foundations of AI

All of AI rests on a mathematical foundation — linear algebra (vectors, matrices), calculus (optimization), and probability & statistics (reasoning under uncertainty).

📐 Linear Algebra

  • Vectors & matrices — the language of data
  • Matrix multiplication — core of neural nets
  • Eigenvalues, SVD — dimensionality reduction
  • Dot products — similarity & attention scores
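As a concrete illustration, the dot product as a similarity score fits in a few lines of NumPy — toy 4-dimensional vectors here, not real embeddings:

```python
import numpy as np

# Toy "embeddings" (real models use hundreds to thousands of dimensions)
cat = np.array([0.9, 0.1, 0.3, 0.0])
kitten = np.array([0.8, 0.2, 0.4, 0.1])
car = np.array([0.1, 0.9, 0.0, 0.7])

def cosine_similarity(a, b):
    """Dot product of length-normalized vectors — the similarity score
    behind embedding search and (unnormalized) attention scores."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(cat, kitten))  # close to 1: similar directions
print(cosine_similarity(cat, car))     # much smaller: dissimilar
```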

Calculus & Optimization

  • Partial derivatives — sensitivity of loss
  • Gradient descent — the engine of learning
  • Chain rule — the heart of backprop
  • Adam, RMSProp — adaptive optimizers
🎲 Probability & Stats

  • Probability distributions — model outputs
  • Bayes' theorem — belief updating
  • MLE & MAP — parameter estimation
  • KL divergence — comparing distributions
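Bayes' theorem in action — the classic diagnostic-test example, with illustrative numbers (not real medical data):

```python
# Bayes' theorem: P(disease | positive test) — belief updating in four lines.
p_disease = 0.01          # prior P(D)
p_pos_given_d = 0.95      # likelihood P(+|D), the test's sensitivity
p_pos_given_not_d = 0.05  # false-positive rate P(+|¬D)

# Evidence P(+) via the law of total probability
p_pos = p_pos_given_d * p_disease + p_pos_given_not_d * (1 - p_disease)
# Posterior P(D|+)
p_d_given_pos = p_pos_given_d * p_disease / p_pos

print(round(p_d_given_pos, 3))  # 0.161 — a positive test lifts the 1% prior to ~16%
```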
Gradient Descent — How Models Learn
The update rule: θ_new = θ − α · ∇L(θ), where θ are the parameters (weights), α is the learning rate (step size), and ∇L(θ) is the gradient of the loss. Training starts from a random initialization and repeatedly steps downhill on the loss surface until it reaches a minimum.
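The update rule can be run as a tiny loop. A minimal sketch on a toy 1-D loss L(θ) = (θ − 3)², chosen here for illustration, whose minimum sits at θ = 3:

```python
# Gradient descent on L(theta) = (theta - 3)**2.
# Update rule: theta_new = theta - alpha * dL/dtheta
theta = -5.0   # random init
alpha = 0.1    # learning rate

for step in range(100):
    grad = 2 * (theta - 3)      # analytic gradient of (theta - 3)**2
    theta = theta - alpha * grad

print(round(theta, 4))  # 3.0 — converged to the minimum
```

Real training is the same loop with millions of parameters and gradients computed by backpropagation instead of by hand.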
Deep dive → Mathematical Foundations of AI
📋 Chapter 2 — Summary
  • Linear algebra — every neural network is matrix multiplications + activations
  • Gradient descent — iteratively moves parameters in the direction that reduces loss
  • Backpropagation — uses the chain rule to compute gradients for every layer
  • Probability — model outputs are distributions; Bayes' theorem underpins many algorithms
Chapter Three · Machine Learning

Machine Learning Essentials

Machine learning is the practice of building systems that learn patterns from data — rather than being explicitly programmed with rules. The model improves as it sees more examples.

The Three ML Paradigms
  • Supervised — learns from input → label pairs (image: 🐱 → "cat"). Classification & regression; decision trees, SVMs, logistic regression. The most common paradigm in practice.
  • Unsupervised — finds structure in raw, unlabeled data. Clustering, PCA, autoencoders; k-means, DBSCAN, t-SNE. Discovers hidden patterns.
  • Reinforcement — an agent learns by interacting with an environment through actions and rewards. Q-learning, policy gradients; AlphaGo, game AI, robotics. Learns from consequences.
Key ML Algorithms at a Glance
Algorithm | Type | When to Use | Limitation
Linear / Logistic Regression | Supervised | Baseline, interpretable, fast | Can't capture non-linear patterns
Decision Tree / Random Forest | Supervised | Tabular data, feature importance | Can overfit; forests less interpretable
Gradient Boosting (XGBoost) | Supervised | Best accuracy on tabular data | Slow to train; many hyperparameters
k-Means Clustering | Unsupervised | Grouping unlabeled data | Requires k upfront; sensitive to init
PCA | Unsupervised | Dimensionality reduction, visualization | Linear only; components not interpretable
Q-Learning / DQN | Reinforcement | Discrete action spaces, games | Doesn't scale well to continuous actions
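As a concrete example of the unsupervised paradigm, here is a minimal NumPy sketch of k-means (Lloyd's algorithm) on synthetic data — illustrative only, not a production implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D blobs of unlabeled points
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])

def kmeans(X, k, n_iters=20):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = X[rng.choice(len(X), k, replace=False)]  # init from data points
    for _ in range(n_iters):
        # distance of every point to every centroid (broadcasting)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        centroids = np.array([X[labels == i].mean(axis=0) for i in range(k)])
    return centroids, labels

centroids, labels = kmeans(X, k=2)
print(np.sort(centroids[:, 0]))  # centroid x-coordinates near 0 and 5
```

Note the table's caveat in action: k = 2 must be supplied upfront, and the result depends on the random initialization.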
Deep dive → ML Essentials: Algorithms, Evaluation & Feature Engineering
📋 Chapter 3 — Summary
  • Supervised — most common; requires labeled data; powers classification & regression
  • Unsupervised — finds hidden structure; clustering, dimensionality reduction
  • Reinforcement — agent learns via reward; powers game AI & robotics
  • Gradient Boosting (XGBoost) dominates tabular data; neural nets dominate unstructured data
Chapter Four · Deep Learning

Deep Learning & Neural Networks

Deep learning uses multi-layer neural networks to automatically learn hierarchical representations from raw data — powering vision, language, speech, and generative AI.

Neural Network — Layers & Forward Pass
Input (x₁, x₂, x₃) → Hidden 1: ReLU(Wx + b) → Hidden 2: ReLU(Wx + b) → Output: Softmax
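The forward pass can be written directly in NumPy. A sketch with random (untrained) weights, assuming a 3 → 4 → 4 → 2 layout purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

# Random weights for a 3 -> 4 -> 4 -> 2 network (untrained; forward pass only)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)
W3, b3 = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.5, -1.2, 0.3])   # input x1, x2, x3
h1 = relu(W1 @ x + b1)           # hidden layer 1
h2 = relu(W2 @ h1 + b2)          # hidden layer 2
y = softmax(W3 @ h2 + b3)        # output: class probabilities summing to 1

print(y)
```

Training would then compute a loss on `y` and run gradient descent on W1…W3 via backpropagation.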
The Transformer — Architecture Behind LLMs
Transformer Encoder Block — Self-Attention + FFN
Input embeddings + positional encoding → Multi-Head Self-Attention (Q, K, V matrices — each token attends to all others) → Add & LayerNorm → Feed-Forward Network (Linear → ReLU → Linear, applied to each token independently) → Add & LayerNorm → output. The block is stacked × N layers (e.g. 96 in GPT-3). Q = Query, K = Key, V = Value.
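The core of the block — scaled dot-product attention, softmax(QKᵀ/√d_k)·V — fits in a few lines of NumPy. A single-head sketch with random toy matrices and no masking:

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # every token scores every other: O(n^2)
    weights = softmax(scores, axis=-1)  # each row is a distribution over tokens
    return weights @ V, weights         # output: weighted mix of value vectors

rng = np.random.default_rng(0)
n_tokens, d_k = 4, 8
Q, K, V = (rng.normal(size=(n_tokens, d_k)) for _ in range(3))
out, weights = attention(Q, K, V)
print(out.shape)  # (4, 8): one mixed representation per token
```

In a real Transformer, Q, K, and V are linear projections of the same token embeddings, and several such heads run in parallel.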
LLM Scale — From GPT-1 to GPT-4
Model | Year | Parameters | Training Data | Key Milestone
GPT-1 | 2018 | 117M | BooksCorpus | First large-scale pre-trained LM
BERT | 2018 | 340M | Wikipedia + books | Bidirectional encoding; NLP SOTA
GPT-2 | 2019 | 1.5B | WebText (40GB) | "Too dangerous to release" — coherent long text
GPT-3 | 2020 | 175B | Common Crawl (570GB) | Few-shot learning emerges at scale
ChatGPT (GPT-3.5) | 2022 | ~175B | + RLHF fine-tuning | 100M users in 60 days
GPT-4 | 2023 | ~1T (est.) | Multimodal + RLHF | Passes bar exam, multimodal reasoning
Claude 3.5 / Gemini 1.5 | 2024 | Undisclosed | Long context (1M+) | Million-token context windows
Deep dive → Deep Learning: Neural Networks, CNNs, Transformers & LLMs
📋 Chapter 4 — Summary
  • Neural networks = stacked linear transformations + nonlinear activations
  • Transformers use self-attention — every token can attend to every other token in O(n²)
  • Scaling laws: more parameters + more data + more compute → reliably better models
  • LLMs are pre-trained on internet text then aligned with RLHF to be helpful & safe
Chapter Five · Agentic AI

Agentic AI & Autonomous Systems

Agentic AI moves beyond question-answering: LLM-powered agents plan, use tools, maintain memory, and act autonomously to complete multi-step goals in the real world.

LLM Agent Architecture — Core Components
  • LLM brain — reasoning, planning, deciding
  • 🔧 Tools — web search, code execution, APIs, file I/O, calculator, database queries
  • 🧠 Memory — short-term (context window), long-term (vector store), episodic (past actions)
  • 📋 Planning — ReAct, Chain-of-Thought, Tree-of-Thoughts, task decomposition
  • 🌐 Environment — browser, OS, cloud, user interaction, other agents
The ReAct Pattern — Reasoning + Acting
ReAct Loop — How Agents Solve Multi-Step Tasks
  1. Thought — reason about what to do next
  2. Action — call a tool or sub-agent
  3. Observation — receive the tool's output / result
  4. Evaluate — goal reached? If not, repeat
The loop runs until the goal is achieved or a maximum step count is reached.
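The loop can be sketched in Python. Everything below is a stand-in — `call_llm` is a stub policy and the tool registry is a toy — but the control flow is the ReAct loop itself:

```python
# Minimal ReAct loop skeleton. In a real agent, call_llm would hit a model API
# and parse its response into a thought plus an action or a final answer.
TOOLS = {
    "calculator": lambda expr: str(eval(expr)),  # toy tool; never eval untrusted input
}

def call_llm(history):
    """Stub policy: requests one calculation, then finishes on the observation."""
    if not any(step.startswith("Observation") for step in history):
        return {"thought": "I need to compute 17 * 24",
                "action": ("calculator", "17 * 24")}
    return {"thought": "I have the answer", "final": history[-1].split(": ")[1]}

def react_agent(goal, max_steps=5):
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):                # loop until done or budget spent
        step = call_llm(history)              # Thought: reason about next move
        history.append(f"Thought: {step['thought']}")
        if "final" in step:                   # Evaluate: goal reached
            return step["final"], history
        tool, arg = step["action"]            # Action: call a tool
        obs = TOOLS[tool](arg)                # Observation: tool result
        history.append(f"Observation: {obs}")
    return None, history

answer, trace = react_agent("What is 17 * 24?")
print(answer)  # 408
```

The `max_steps` budget is the guard the pattern relies on: without it, a confused agent could loop forever.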
Major Agentic AI Frameworks
Framework | Focus | Best For | Key Feature
LangChain | Chains & agents | RAG, tool-use, pipelines | Huge ecosystem; LCEL chains
LangGraph | Stateful agent graphs | Complex multi-step agents | Cyclic graphs, controllable state
AutoGen (Microsoft) | Multi-agent conversations | Code generation & review | Agents debate/collaborate via messages
CrewAI | Role-based agents | Task delegation to specialists | Crew + Roles + Tasks abstraction
LlamaIndex | RAG & data indexing | Knowledge-grounded agents | Advanced retrieval pipelines
OpenAI Assistants API | Managed agents | Production agents with tools | Built-in memory, code interpreter
🧩 Agent Architecture Patterns

  • ReAct — Reason + Act + Observe loop
  • Plan-and-Execute — plan full task first, then execute steps
  • Reflexion — agent critiques its own outputs
  • Tree of Thoughts — explore multiple reasoning branches
🤝 Multi-Agent Patterns

  • Orchestrator + Workers — manager delegates to specialists
  • Peer Review — agents critique each other's outputs
  • Swarm — many simple agents, emergent behaviour
  • Human-in-the-Loop — human approves critical decisions
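A minimal sketch of the Orchestrator + Workers pattern, with plain functions standing in for LLM-backed agents — every worker name and its logic here is hypothetical, purely for illustration:

```python
# Orchestrator + Workers: a manager decomposes a goal and delegates each
# sub-task to a specialist, feeding one worker's output to the next.
WORKERS = {
    "researcher": lambda task: f"notes on '{task}'",
    "writer": lambda task: f"draft based on {task}",
}

def orchestrator(goal):
    """Fixed two-step plan; a real manager agent would produce this with an LLM."""
    plan = [("researcher", goal), ("writer", None)]
    result = None
    for role, task in plan:
        task = task if task is not None else result  # pipe previous output forward
        result = WORKERS[role](task)
    return result

print(orchestrator("AI history"))  # draft based on notes on 'AI history'
```

The same skeleton extends to Peer Review (workers critique each other's `result`) and Human-in-the-Loop (a human approves before the next delegation).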
Deep dive → Agentic AI: Frameworks, Reasoning Patterns & Multi-Agent Systems
📋 Chapter 5 — Summary
  • Agents = LLMs with tools + memory + planning — they act, not just answer
  • ReAct: reason → act → observe → repeat until goal is achieved
  • Multi-agent systems split complex tasks between specialized sub-agents
  • Key challenge: reliability & evaluation — agents can hallucinate actions
Go deeper → AI Foundation — 5 in-depth sections covering fundamentals, mathematics, ML essentials, deep learning, and agentic AI.
Summary — AI at a Glance
01 · Introduction

What Is AI?

  • Perceive, reason, learn, act
  • ANI today — AGI is the horizon
  • Data × Compute × Algorithms
  • Transformers changed everything (2017)
02 · Mathematics

Mathematical Foundations

  • Linear algebra — data representation
  • Gradient descent — how models learn
  • Backpropagation — chain rule at scale
  • Probability — reasoning under uncertainty
03 · ML

Machine Learning

  • Supervised — learn from labeled data
  • Unsupervised — discover structure
  • Reinforcement — learn from consequences
  • XGBoost for tabular; DL for unstructured
04 · Deep Learning

Neural Networks & LLMs

  • Stacked layers → hierarchical features
  • Transformers — attention is all you need
  • Scale laws — bigger = better (reliably)
  • RLHF — aligning LLMs with human intent
05 · Agents

Agentic AI

  • Tools + memory + planning loop
  • ReAct: reason → act → observe
  • Multi-agent: orchestrator + specialists
  • The frontier of AI applications (2026)