LearningTree

Design Tactics & Strategies
for Quality Goals

Principles, patterns, and solution strategies for achieving performance, adaptability, and high availability in software architecture.

01
Chapter One · Quality Goals

Design Strategies for Achieving High Performance

  • Develop high-level solution strategies in parallel to detailed concepts
  • Principles and patterns support the creation of a solution strategy
  • However, there is no single approach that guarantees a suitable solution

Additional Hardware

  • Scale out with load balancers & multiple server instances

Reduce Communication

  • Caching, batching & minimising inter-component calls

Adjust Distribution

  • Reduce or increase sharding based on workload

Reduce Flexibility

  • Trade configurability for raw performance

Load Testing

  • Continuous performance benchmarking & regression checks

Energy Trade-off

  • High-end hardware & parallelism at the cost of energy
Strategy 1: Additional Hardware
Adding More Hardware

Scale out by deploying additional servers behind a load balancer to distribute incoming client requests across multiple application server instances.

[Diagram: Horizontal scaling. Client requests reach a load balancer (round-robin), which distributes them across application server instances (Server 1, Server 2, Server 3, ... Server N), each running a web app instance. Responses return to the client.]
Horizontal scaling (Client → Load Balancer → multiple web application server instances) distributes incoming traffic to handle increased load without upgrading individual machines.
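The round-robin distribution in the diagram can be sketched as follows. This is a minimal illustration rather than a production load balancer: the class name RoundRobinBalancer and the server names are assumptions, and health checks are left out.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin selector; the server names passed in are placeholders. */
final class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    /** Returns the server that should handle the next request. */
    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

With three servers, successive calls to pick() cycle through server 1, 2, 3 and then start again at server 1.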
Strategy 2: Reduce Communication Between Components
Reduce Inter-Component Communication

Minimize the number of calls between system components to reduce latency overhead. Common solutions include caching and batching.

⚠ Problem: N individual calls create high latency overhead
[Diagram: The web app sends N separate calls (call 1, call 2, call 3, ... call N) to backend microservices: Service A (Users), Service B (Products), Service C (Orders), Service D (Inventory). N round-trips cost N × the network latency.]
a. Cache: Solution
  • Add a cache layer between the web app and backend services
  • Reduces repeated calls to downstream microservices
  • Returns pre-computed responses for common queries
Cache layer: a HIT returns without a backend call, a MISS fetches from the backend
[Diagram: The web app sends a single request to the cache. On a hit, the cache returns the stored data. On a miss, it fetches from the backend services (Service A Users, Service B Products, Service C Orders) and stores the result in the cache, subject to a TTL / eviction policy.]
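The hit/miss flow with a TTL can be sketched as a small read-through cache. Everything here is an assumption for illustration: the class name TtlCache, the injected clock, and the backend function that stands in for a downstream service call. Size-based eviction is omitted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical read-through cache with a fixed time-to-live per entry. */
final class TtlCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;      // injected so tests can control time
    private final Function<K, V> backend;  // stands in for the downstream service call

    TtlCache(long ttlMillis, LongSupplier clock, Function<K, V> backend) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.backend = backend;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.value;                 // HIT: no backend call
        }
        V fresh = backend.apply(key);            // MISS or expired: fetch from backend
        entries.put(key, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }
}
```

In production the clock would typically be System::currentTimeMillis. Two concurrent misses for the same key may both call the backend in this sketch.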
b. Batch: Solution
  • Group multiple requests into a single call to the backend
  • Replaces N individual calls with 1 batched call
  • Useful for bulk data operations and report generation
Batch aggregator: N requests bundled into 1 backend call
[Diagram: The web app issues six requests (A to F). The batch aggregator bundles them into one payload and makes a single call to the backend API, which processes them all at once (a single DB transaction or bulk operation) and returns one bulk response (A + B + C + D + E + F). 6 individual calls become 1 batched call, which reduces network round-trips by ~83%.]
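A batch aggregator along these lines might look like the sketch below. The class name BatchAggregator and the bulkBackend function are hypothetical. A real aggregator would usually also flush on a timer so that a partially filled batch does not wait indefinitely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical aggregator: buffers requests and sends them to the backend in one call. */
final class BatchAggregator<Q, R> {
    private final int batchSize;
    private final Function<List<Q>, List<R>> bulkBackend; // one call handles many requests
    private final List<Q> pending = new ArrayList<>();
    private int backendCalls = 0;

    BatchAggregator(int batchSize, Function<List<Q>, List<R>> bulkBackend) {
        this.batchSize = batchSize;
        this.bulkBackend = bulkBackend;
    }

    /** Buffers a request; returns the bulk response once the batch is full, else an empty list. */
    List<R> submit(Q request) {
        pending.add(request);
        if (pending.size() >= batchSize) {
            return flush();
        }
        return Collections.emptyList();
    }

    /** Sends everything buffered so far in a single backend call. */
    List<R> flush() {
        if (pending.isEmpty()) {
            return Collections.emptyList();
        }
        List<Q> batch = new ArrayList<>(pending);
        pending.clear();
        backendCalls++;
        return bulkBackend.apply(batch);
    }

    int backendCalls() {
        return backendCalls;
    }
}
```

With a batch size of 6, submitting requests A to F produces exactly one backend call, matching the 6 → 1 reduction in the diagram.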
Strategy 3: Reduce or Increase Distribution
Distribute via Sharding

Increase distribution by sharding data across multiple storage nodes, enabling parallel fetching and improved throughput.

Sharding by client_id: Product Reviews distributed across parallel shards
[Diagram: Clients A–D (ids 1–1000), E–H (ids 1001–2000), and I–L (ids 2001–3000) call the Product Reviews service. Its shard router uses the shard key client_id to route to Shard 1 (client_id 1–1000), Shard 2 (client_id 1001–2000), Shard 3 (client_id 2001–3000), ... Shard N. Each shard stores reviews, ratings, and metadata, and shards are fetched in parallel.]
Real-world example: sharding a Product Reviews service by client_id. Each shard handles a subset of clients, enabling parallel fetching and improved throughput.
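The range-based routing in the diagram (1000 client ids per shard) can be expressed as a small function. The class name ShardRouter is an assumption, and range sharding is only one option: hash-based routing is a common alternative when ids are unevenly distributed.

```java
/** Hypothetical range-based shard router keyed on client_id. */
final class ShardRouter {
    private final int clientsPerShard;
    private final int shardCount;

    ShardRouter(int clientsPerShard, int shardCount) {
        if (clientsPerShard < 1 || shardCount < 1) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.clientsPerShard = clientsPerShard;
        this.shardCount = shardCount;
    }

    /** Returns the 1-based shard number that stores data for the given client_id. */
    int shardFor(int clientId) {
        long maxClientId = (long) clientsPerShard * shardCount;
        if (clientId < 1 || clientId > maxClientId) {
            throw new IllegalArgumentException("client_id out of range: " + clientId);
        }
        // ids 1..1000 -> shard 1, 1001..2000 -> shard 2, and so on
        return (clientId - 1) / clientsPerShard + 1;
    }
}
```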
Strategy 4: Reduce Flexibility of the System
Trade Flexibility for Speed

Flexible systems that read external configuration or query a database on every request pay an I/O cost at runtime. Where a value rarely changes, replacing the lookup with a hardcoded constant eliminates I/O on the hot path, at the price of configurability.

❌ Flexible (slow): external calls on every transaction
    public boolean processTransaction(String type, double amount, String currency) {
        // Reads the minimum amount from the JSON config: an external lookup on every call
        if (amount < config.getJSONObject("minimumAmount").getDouble(currency)) {
            return false;
        }
        // Database call on every transaction
        double fee = database.getCurrentFee(type, currency);
        ...
    }
✓ Optimized: hardcoded constants, zero I/O
    public class TransactionProcessor {
        private static final double MINIMUM_AMOUNT_USD = 10;
        private static final double WITHDRAWAL_FEE_USD = 2;
        private static final double TRANSFER_FEE_USD = 3;

        public boolean processTransaction(String type, double amount) {
            if (amount < MINIMUM_AMOUNT_USD) return false;
            // Constant-first equals() avoids a NullPointerException when type is null
            double fee = "withdrawal".equals(type) ? WITHDRAWAL_FEE_USD : TRANSFER_FEE_USD;
            ...
        }
    }
Strategy 5: Perform Load Tests
Continuous Load Testing
  • Optimizing one or more components once helps, but systems constantly evolve
  • New changes may negate earlier optimizations
  • Load tests must therefore run regularly, not just once
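A repeatable load test can be as small as the harness sketched below. It runs a task many times on a thread pool and reports latency percentiles that a build could compare against a budget. The class name LoadTest and its methods are hypothetical. Dedicated load-testing tools add warm-up, ramp-up, and reporting that this sketch does not.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical minimal load-test harness. */
final class LoadTest {
    /** Runs the task `requests` times on `threads` threads; returns latencies in ns, ascending. */
    static List<Long> run(Runnable task, int requests, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> timed = () -> {
                long start = System.nanoTime();
                task.run();
                return System.nanoTime() - start;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(timed));
            }
            List<Long> latencies = new ArrayList<>();
            for (Future<Long> future : futures) {
                latencies.add(future.get());
            }
            Collections.sort(latencies);
            return latencies;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("load test aborted", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Nearest-rank percentile over an ascending list, for p in (0, 100]. */
    static long percentile(List<Long> sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }
}
```

A regression check could then fail the build when percentile(latencies, 99) exceeds an agreed threshold.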
Strategy 6: Compromise on Energy Efficiency
Trade Energy for Performance
High-End Hardware
  • Performance increase: +50%
  • Energy consumption increase: +150%
Disable Power Management
  • CPU always runs at maximum speed
  • Cost: higher energy consumption
Increase Parallelism
  • Running on multiple GPUs
  • Distributing computation across multiple computers
📋 Chapter 1 — Summary
  • Six strategies work in concert to achieve high performance, from hardware scaling to code-level optimizations
  • Additional hardware & load balancing: scale horizontally to handle more load
  • Cache & batch to reduce communication overhead
  • Sharding for increased distribution across nodes
  • Reduce flexibility: eliminate I/O on the hot path for speed
  • Regular, continuous load testing: verify performance under realistic conditions
  • Compromise energy for parallelism & speed
02
Chapter Two · Quality Goals

Design Strategies for Achieving Adaptability & Flexibility

Adaptability and flexibility are achieved by keeping changes local, decoupling components, and making conscious decisions about where the system needs to be flexible.


Determine Flexibility

  • Identify where the system needs to adapt: functionality, technology, or environment

Configuration Files

  • Externalise settings so changes don't require recompilation

Keep Changes Local

  • Encapsulate change behind stable interfaces

Decouple Components

  • Facade & Strategy patterns to isolate third-party & algorithm dependencies

Polymorphism

  • Program to interfaces: swap implementations without touching callers

Information Hiding

  • Expose only what's needed: hide internal details behind module boundaries
Determine Where the System Needs to Be Flexible
Functionality
  • Strategy Pattern: swap algorithms/behaviors at runtime
  • Feature Flags: toggle features without redeployment
[Diagram: Strategy pattern. A Context calls execute() on an «interface» that is implemented by Strategy A (CreditCard), Strategy B (PayPal), and Strategy C (BankTransfer).]
    if (config.isEnhancedProfileEnabled()) {
        displayEnhancedUserProfile(user);
    } else {
        displayBasicUserProfile(user);
    }
Data Structures / Data Model
  • Schemaless formats (JSON): evolve without schema migrations
  • NoSQL Databases (MongoDB): flexible document structures
Third-Party Software & External Interfaces
  • Facade Pattern: decouple from complex third-party libraries
  • Swap providers without changing internal code
User Interface & Target Platform
  • Responsive Design: adapt to screen/device
  • Cross-platform languages (Java, Python)
  • Containerization (Docker): platform independence
Flexibility in Functionality: Strategy Pattern
Pattern: Define a family of algorithms (PaymentStrategy), encapsulate each one (CreditCard, PayPal, BankTransfer), and make them interchangeable via a common interface.
Benefit: Add new payment types (e.g., CryptoPayment) without modifying the PaymentProcessor class, in line with the open/closed principle.
    interface PaymentStrategy {
        void pay(double amount);
    }

    class CreditCardPayment implements PaymentStrategy { ... }
    class PayPalPayment implements PaymentStrategy { ... }
    class BankTransferPayment implements PaymentStrategy { ... }

    class PaymentProcessor {
        private final PaymentStrategy strategy;

        // The strategy is injected, so the caller chooses the payment type
        PaymentProcessor(PaymentStrategy strategy) {
            this.strategy = strategy;
        }

        void process(double amount) {
            strategy.pay(amount);
        }
    }
Facade Pattern: Decoupling from Third-Party Libraries
Problem: Your WebApplication is tightly coupled to a complex Third-Party Storage Library (Credentials, ConfigurationManager, DataStorage). Changing the vendor requires massive refactoring.
Solution (StorageFacade): Introduce a Facade between your application and the third-party library. The application only knows the Facade's simple interface. The complex library implementation is hidden and swappable.
[Diagram: The web application calls a simple API on the Storage Facade (save() / load()). The facade wraps the swappable third-party library: Credentials (auth tokens), ConfigManager (endpoints, keys), and DataStorage (read / write).]
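A StorageFacade along these lines could look like the sketch below. The Vendor* classes are hypothetical stand-ins for a third-party library (backed by an in-memory map so the example runs), and the "/app/" path prefix is an arbitrary illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** The only storage API the application sees. */
interface StorageFacade {
    void save(String key, String data);
    String load(String key);
}

// --- Stand-ins for a complex third-party library (all names hypothetical) ---
final class VendorCredentials {
    final String token;

    VendorCredentials(String token) {
        this.token = token;
    }
}

final class VendorDataStorage {
    private final Map<String, String> blobs = new HashMap<>();
    private final VendorCredentials credentials;

    VendorDataStorage(VendorCredentials credentials) {
        this.credentials = credentials;
    }

    void write(String path, String payload) {
        blobs.put(path, payload);
    }

    String read(String path) {
        return blobs.get(path);
    }
}

/** Adapts the vendor API to the facade; swapping vendors means replacing only this class. */
final class VendorStorageFacade implements StorageFacade {
    private final VendorDataStorage storage;

    VendorStorageFacade(String token) {
        this.storage = new VendorDataStorage(new VendorCredentials(token));
    }

    @Override
    public void save(String key, String data) {
        storage.write("/app/" + key, data);
    }

    @Override
    public String load(String key) {
        return storage.read("/app/" + key);
    }
}
```

Application code depends only on StorageFacade, so credentials, paths, and vendor types never leak into callers.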
Use Configuration Files
Environment Configs
  • dev.config: development environment settings
  • qa.config: test/QA environment settings
  • prod.config: production environment settings
  • Load at runtime: no recompilation needed
Application Resources
  • threadPool: corePoolSize, maximumPoolSize
  • databaseConnectionPool: maxPoolSize, timeouts
  • cacheSettings: maxEntries, eviction policy, TTL
  • messageQueue: maxQueueSize, deliveryTimeout
    # external-service.config
    thirdPartyService:
      apiKey: "123456789abcdefg"
      baseUrl: "https://api.example.com"
      authMethod: "Bearer"
      additionalConfig:
        timeout: 30
        retryPolicy: "exponentialBackoff"
        maxRetries: 3
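Loading settings at runtime can be sketched with the JDK's java.util.Properties. This assumes a flat key=value format rather than the nested format shown above, and the class name AppConfig and dotted key names such as threadPool.corePoolSize are illustrative assumptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

/** Hypothetical runtime configuration loaded from key=value text. */
final class AppConfig {
    private final Properties props = new Properties();

    /** Loads settings from any Reader (a file reader in production, a string in tests). */
    AppConfig(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new IllegalStateException("could not read configuration", e);
        }
    }

    /** Returns the integer setting, or the default when the key is absent. */
    int getInt(String key, int defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    /** Returns the string setting, or the default when the key is absent. */
    String get(String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }
}
```

The environment then only decides which file is opened (for example dev.config versus prod.config), and the application code stays the same.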
Keeping Changes Local
Monolith: Tight Coupling
  • Changes to Video Catalog propagate to Payment Processor
  • Currency Conversion Data is a shared dependency
  • A change to streaming video formats requires testing the entire system
[Diagram: A monolith containing Video Catalog, Payment Processor, and Streaming Service on a shared DB. A change in one part propagates to the others.]
Microservices: Changes Stay Local
  • Web App → API Gateway → Microservices A–E
  • Each microservice has its own database
  • Changes to Service A don't affect Services B–E
[Diagram: An API Gateway in front of Service A (DB-A), Service B (DB-B), and Service C (DB-C). Each service owns its database and is isolated, so a change in A does not affect B or C.]
Polymorphism: Flexible by Design
Concept: Allows objects of different classes to be treated as instances of a common supertype (a superclass or an interface), so the same action can be performed on different objects, each with its own implementation.
  • Interface Notification with a send() method
  • Implemented by: EmailNotification, SMSNotification, AppNotification
  • Easily extendable: add WatchNotification without changing callers
    import java.util.List;

    interface Notification {
        void send(String message);
    }

    class NotificationService {
        void sendAnnouncement(List<Notification> notifications, String message) {
            for (Notification n : notifications) {
                n.send(message); // polymorphic call: each implementation decides how to deliver
            }
        }
    }
Information Hiding
BankAccount: Hide Internals, Expose Contract
Private (Can Change): cachedTransactions, dbConnection, readBuffer, bufferSize. These are implementation details that must remain hidden.
Public (Stable Contract): addTransaction(), getBalance(), connectToDbAsync(). This is the interface the consumer depends on.
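A reduced version of this BankAccount might be sketched as follows. Only cachedTransactions, addTransaction(), and getBalance() from the slide are modeled. The database-related members are omitted, and the use of BigDecimal for amounts is an assumption.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Sketch: callers depend on the two public methods only. */
final class BankAccount {
    // Private detail: could be replaced by a database or a running total without breaking callers.
    private final List<BigDecimal> cachedTransactions = new ArrayList<>();

    /** Public contract: records a deposit (positive) or a withdrawal (negative). */
    public void addTransaction(BigDecimal amount) {
        if (amount == null) {
            throw new IllegalArgumentException("amount is required");
        }
        cachedTransactions.add(amount);
    }

    /** Public contract: the current balance, however it is computed internally. */
    public BigDecimal getBalance() {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal transaction : cachedTransactions) {
            total = total.add(transaction);
        }
        return total;
    }
}
```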
Use Understandable & Maintainable Code
❌ Hard to Understand
    public int calc(int n) {
        int s = 0;
        for (int i = 0; i <= n; i++) {
            if (i % 2 == 0) s += i;
        }
        int f = 1, x = 5;
        for (int i = 1; i <= x; ++i) {
            f *= i;
        }
        return s;
    }
✓ Separate methods with meaningful names
    public int sumOfEvenNumbers(int limit) {
        int sum = 0;
        for (int i = 0; i <= limit; i++) {
            if (isEven(i)) {
                sum += i;
            }
        }
        return sum;
    }

    // Helper used above; it was missing from the original slide
    private boolean isEven(int number) {
        return number % 2 == 0;
    }

    public int factorial(int number) {
        int result = 1;
        for (int i = 1; i <= number; i++) {
            result *= i;
        }
        return result;
    }
📋 Chapter 2 — Summary
  • Adaptability is achieved through deliberate design decisions: decoupling, hiding details, and isolating flexibility points
  • Determine where flexibility is needed: functionality, data, third-party, UI, platform
  • Strategy Pattern & Feature Flags: swap behaviors at runtime
  • Facade Pattern: decouple from complex third-party libraries
  • Configuration files for environment variability
  • Keep changes local: microservices, modular boundaries
  • Polymorphism for extensible behavior
  • Information Hiding: stable public contracts, hide internals
  • Understandable & maintainable code: readability enables change
03
Chapter Three · Quality Goals

Design Strategies for Achieving High Availability

High availability is achieved through three pillars: Error Prevention, Error Detection, and Error Handling, each playing a distinct role in keeping a system continuously operational.


Error Prevention

  • Transactions, input validation & bottleneck elimination

Error Detection

  • Monitoring, metrics, alerts & result validation

Error Handling

  • Retry, fallback, rollback & redundant components
The Three Pillars of High Availability
1. Error Prevention
  • Use Transactions
  • Input Validation
  • Eliminate performance bottlenecks
2. Error Detection
  • Monitoring critical metrics
  • Validate accuracy of results across redundant components
3. Error Handling
  • Robust exception handling
  • Rollback mechanisms
  • Redundant system components
  • Auto-replace defective components
Error Prevention: Using Transactions
❌ No transaction: risk of partial failure
    -- If the second UPDATE fails, CompanyX loses 1000 but Bob never receives it!
    UPDATE accounts SET balance = balance - 1000.00 WHERE name = 'CompanyX';
    -- ← SYSTEM CRASH HERE ⚠
    UPDATE accounts SET balance = balance + 1000.00 WHERE name = 'Bob';
    -- ❌ CompanyX lost 1000, Bob never received it!
✓ With transaction: atomic, all-or-nothing (BEGIN ... COMMIT)
    BEGIN;
    UPDATE accounts SET balance = balance - 1000.00 WHERE name = 'CompanyX';
    UPDATE accounts SET balance = balance + 1000.00 WHERE name = 'Bob';
    COMMIT;  -- ✓ Reached only if BOTH updates succeed (atomic guarantee)
    -- If anything fails between BEGIN and COMMIT,
    -- the entire transaction is rolled back automatically.
    -- Neither CompanyX's nor Bob's balance is changed.
Error Prevention: Input Validation
Strictly Define Valid Input
  • Define what is valid input at every boundary
  • Define what is invalid and reject it early
  • Prevents both accidental errors and malicious attacks
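Boundary validation might be sketched as below. The class name TransferValidator, the currency allow-list, the account-name pattern, and the amount limit are all hypothetical values chosen for illustration.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical validator for an incoming transfer request. */
final class TransferValidator {
    private static final Set<String> CURRENCIES = Set.of("USD", "EUR"); // assumed allow-list
    private static final long MAX_CENTS = 100_000_000L;                 // assumed upper bound

    /** Returns an empty Optional when the input is valid, otherwise the reason for rejection. */
    static Optional<String> validate(String accountName, String currency, long amountCents) {
        // Allow-list: only letters, digits, and spaces, 1 to 64 characters
        if (accountName == null || !accountName.matches("[A-Za-z0-9 ]{1,64}")) {
            return Optional.of("invalid account name");
        }
        if (currency == null || !CURRENCIES.contains(currency)) {
            return Optional.of("unsupported currency");
        }
        if (amountCents <= 0 || amountCents > MAX_CENTS) {
            return Optional.of("amount out of range");
        }
        return Optional.empty();
    }
}
```

Defining what is valid (an allow-list) rather than enumerating what is invalid rejects unexpected input, including injection attempts, by default.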
Error Prevention: Performance Bottleneck Prevention
Identify & Eliminate Bottleneck Stages
  • Identify bottleneck stages in processing pipelines
  • A slow stage creates back pressure — upstream stages stall waiting for it to clear
  • Downstream stages starve — they have nothing to process
  • Throughput of the entire pipeline is limited by the slowest stage
⚠ Back pressure: the Color Correction bottleneck stalls the entire image pipeline
[Diagram: Input images flow through Decompress (~20 ms), Resize (~15 ms), Color Correction (~200 ms, the bottleneck), and Export (~10 ms). The queues in front of the bottleneck fill up, so the upstream stages stall (back pressure), while Export starves with nothing to process. Pipeline throughput equals the slowest stage: ~200 ms/image.]
✓ Solution: parallelize the bottleneck stage to raise pipeline throughput
[Diagram: Decompress (~20 ms) and Resize (~15 ms) feed three parallel Color Correction workers (~200 ms each), followed by Export (~10 ms). Effective throughput: ~200 ms / 3 workers ≈ 67 ms/image, a 3× improvement.]
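The arithmetic behind these figures can be checked with a small model. It assumes an idealized steady state in which a stage with k parallel workers completes one item every (service time / k), and it ignores queueing and coordination overhead. The class name PipelineMath is hypothetical.

```java
/** Idealized steady-state pipeline model (assumption: no queueing or coordination overhead). */
final class PipelineMath {
    /** Effective time per item for one stage: service time divided by parallel workers. */
    static double effectiveMillis(double serviceMillis, int workers) {
        return serviceMillis / workers;
    }

    /** Time per item for the whole pipeline: the slowest effective stage dominates. */
    static double pipelineMillisPerItem(double[] serviceMillis, int[] workers) {
        double slowest = 0.0;
        for (int i = 0; i < serviceMillis.length; i++) {
            slowest = Math.max(slowest, effectiveMillis(serviceMillis[i], workers[i]));
        }
        return slowest;
    }
}
```

With stage times of 20, 15, 200, and 10 ms, one worker per stage gives 200 ms/image, and three Color Correction workers give about 67 ms/image. Under this model, ten workers would be needed before Decompress (20 ms) becomes the limiting stage.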
Error Detection: Monitoring
Critical Metrics to Monitor
Uptime
HTTP req/sec
Error Rate
CPU / Memory
Error Status Codes
Latency P99
Disk I/O
Queue Depth

Critical metrics must be published and aggregated for both visual (dashboards) and programmatic (alerting) monitoring.

Error Detection: Validating Accuracy of Results
Cross-Validation for Data Consistency
  • Redundant data sources should produce consistent results when cross-checked
  • Compare the sum of all account balances against the sum of all wire transfers
  • If the net total doesn't match the expected value, data corruption or a bug has occurred
  • Run these checks periodically (scheduled jobs) or after critical operations

Example: A banking system has two tables — accounts (current balances) and wire_transfers (pending transfers). By summing both, the system verifies that money was neither created nor lost. If CompanyX has 10,000 and Bob has 50 in accounts, and there are pending transfers of −300 (CompanyY→CompanyZ) and +250 (Jane→Bob), the net total should be exactly 10,000. Any deviation signals a consistency error.

    -- Cross-validate the accounts table against wire transfers for consistency
    -- Expected: money is neither created nor lost
    SELECT (SELECT SUM(balance) FROM accounts)
         + (SELECT SUM(amount) FROM wire_transfers) AS net_total_balance;

    -- accounts:
    --   CompanyX = 10,000
    --   Bob      = 50
    --   Total    = 10,050
    -- wire_transfers (pending):
    --   CompanyY → CompanyZ = -300
    --   Jane → Bob          = +250
    --   Total               = -50
    -- net_total_balance = 10,050 + (-50) = 10,000 ✓ consistent
    -- If result ≠ 10,000 → ⚠ data corruption detected!
Error Handling: Robust Exception Mechanisms
Try-Catch: Catch & Handle Exceptions
  • Wrap risky operations in a try-catch block
  • Catch specific exceptions — don't swallow errors silently
  • Log the error, return a meaningful response to the caller
  • Prevents unhandled crashes from bringing down the service
[Diagram: The TRY block calls the broker. On success, the result is returned. On an exception, the CATCH block logs and handles it, then routes to one of: retry, fallback, queue, or rethrow.]
Retry with Backoff
  • Server catches exception from External Broker Service
  • Waits and retries — transient failures often self-resolve
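  • Use exponential backoff, ideally with random jitter, so that many clients do not retry in lockstep (thundering herd)
  • ✓ Green path: retry succeeds
[Diagram: The server catches the exception, waits, and calls the external broker service again until the call succeeds. Waits grow exponentially: 1s, 2s, 4s, 8s, ...]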
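Retry with exponential backoff can be sketched as a small helper. The class name Retry and the injected sleeper are assumptions: the sleeper lets tests record the delays instead of really waiting. Jitter is left out to keep the doubling visible.

```java
import java.util.concurrent.Callable;
import java.util.function.LongConsumer;

/** Hypothetical retry helper with exponential backoff. */
final class Retry {
    /**
     * Calls the operation up to maxAttempts times. After each failure it waits,
     * doubling the delay every time: baseDelayMillis, then 2x, 4x, 8x, ...
     */
    static <T> T withBackoff(Callable<T> operation, int maxAttempts,
                             long baseDelayMillis, LongConsumer sleeper) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                if (attempt == maxAttempts) {
                    throw new IllegalStateException("gave up after " + attempt + " attempts", e);
                }
                sleeper.accept(delay); // wait before the next attempt
                delay *= 2;            // exponential growth of the wait
            }
        }
    }
}
```

In production the sleeper would wrap Thread.sleep, and a random offset would normally be added to each delay. Retrying is only safe for operations that are idempotent or otherwise guarded against duplicates.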
Fallback to Alternative
  • If primary broker fails, route to Another Broker Service
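  • The Circuit Breaker pattern stops sending calls to a failing broker, which helps prevent cascading failures
  • The consumer is unaware of the fallback: recovery is transparent
[Diagram: The server catches the exception from Broker A (down). The circuit is OPEN, so requests fall back to Broker B (OK). The consumer is unaware of the switch.]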
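A simplified circuit breaker with fallback is sketched below. The class name CircuitBreaker and the failure threshold are assumptions. This version is not thread-safe and never half-opens to probe the primary again, which a real implementation would do after a cool-down period.

```java
import java.util.function.Supplier;

/** Hypothetical, simplified circuit breaker: opens after N consecutive primary failures. */
final class CircuitBreaker<T> {
    private final int failureThreshold;
    private int consecutiveFailures = 0;

    CircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    /** Tries the primary unless the circuit is open; otherwise, or on failure, uses the fallback. */
    T call(Supplier<T> primary, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get();      // skip the failing primary entirely
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0;    // a success resets the failure count
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            return fallback.get();
        }
    }
}
```

Once the circuit is open, the primary broker receives no further calls, so a struggling service is not loaded with requests that are likely to fail.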
Error Queue for Future Verification
  • Add failed trades to an error queue
  • Queue enables asynchronous retry and audit
  • Prevents data loss — no trade is silently dropped
[Diagram: The server catches the exception and enqueues the failed trade (trade A, trade B, ...) in the error queue, which feeds asynchronous retry and an audit log. ✓ Zero data loss: nothing is silently dropped.]
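An in-memory version of such an error queue might look like this. The class name ErrorQueue is hypothetical, and a production system would use a durable queue so that failed trades survive a restart.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

/** Hypothetical in-memory error queue for failed items (e.g. trades). */
final class ErrorQueue<T> {
    private final Queue<T> failed = new ArrayDeque<>();

    /** Records a failed item instead of dropping it. */
    void enqueue(T item) {
        failed.add(item);
    }

    int size() {
        return failed.size();
    }

    /** Retries each queued item once; items that fail again stay queued. Returns the success count. */
    int retryAll(Predicate<T> processor) {
        int succeeded = 0;
        int toProcess = failed.size();
        for (int i = 0; i < toProcess; i++) {
            T item = failed.poll();
            if (processor.test(item)) {
                succeeded++;
            } else {
                failed.add(item); // keep for a later retry or manual audit
            }
        }
        return succeeded;
    }
}
```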
Error Handling: Transaction Rollback
ROLLBACK: Undo Partial Work on Failure
  • Wrap related operations in a BEGIN ... COMMIT / ROLLBACK block
  • If a business rule fails mid-transaction, ROLLBACK undoes all preceding changes
  • Combined with row locking, this prevents selling out-of-stock items when purchases race
  • Atomicity guarantees data integrity: the database is never left in a half-done state

Example: An e-commerce system processes a purchase. It first inserts a sale record, then checks inventory. If inventory is zero, the ROLLBACK undoes the sale insert — the customer never gets charged for an out-of-stock item, and the database stays consistent.

✓ Transaction with conditional ROLLBACK
    -- Procedural pseudocode: IF ... THEN needs a stored procedure or application code in most SQL dialects
    BEGIN;
    -- Step 1: Optimistically insert the sale
    INSERT INTO sales (product_id, user_id) VALUES (@product_id, @user_id);
    -- Step 2: Lock the inventory row and read the count (@stock is an illustrative variable)
    SELECT count INTO @stock FROM inventory
    WHERE product_id = @product_id
    FOR UPDATE;  -- row-level lock prevents race conditions
    IF @stock > 0 THEN
        -- ✓ In stock: decrement and commit
        UPDATE inventory SET count = count - 1 WHERE product_id = @product_id;
        COMMIT;    -- sale + inventory update both persist
    ELSE
        -- ✗ Out of stock: undo the INSERT, nothing changes
        ROLLBACK;  -- sale record is removed, DB unchanged
    END IF;
Eliminating Single Points of Failure
Redundancy at Every Layer
  • Identify every component that, if it fails, brings the whole system down
  • Replace single nodes with clustered / replicated equivalents
  • Use active-active or active-passive failover depending on the recovery time objective (RTO) and recovery point objective (RPO)
  • Auto-replace defective components — health checks + orchestration (e.g., Kubernetes, ECS)
  • Release Version Rollback — ability to instantly roll back a bad deployment
⚠ Before: single points of failure at every layer
[Diagram: Client → Load Balancer (single) → App Server (single) → Database (single). Every component is a single point of failure: if any node fails, the entire system is down.]
✓ After: redundancy at every layer eliminates single points of failure
[Diagram: Client → LB cluster (LB-1 active, LB-2 standby; active-passive) → app cluster (App-1, App-2, App-3 with health checks; active-active) → DB cluster (Primary R/W, Replica 1 and Replica 2 read-only; primary-replica). A K8s / ECS orchestrator auto-replaces failed nodes. Any node can fail without a system outage, and release version rollback instantly reverts bad deployments.]
📋 Chapter 3 — Summary
  • High availability demands a layered defense: prevent, detect, and handle errors gracefully
  • Transactions for atomic operations: all-or-nothing consistency
  • Input validation at every boundary: reject bad data early
  • Eliminate performance bottlenecks: prevent cascading failures
  • Monitoring: metrics, alerts, dashboards for real-time visibility
  • Validate accuracy across redundant components
  • Retry, fallback, error queues: graceful degradation
  • Rollback: transactions & release version rollback
  • No single point of failure: any node can fail without system outage
Summary: All Three Quality Goals at a Glance
01 · High Performance

Design Strategies for Performance

  • Perform load tests
  • Additional hardware & load balancing
  • Reduce / increase distribution (sharding)
  • Compromise on energy efficiency
  • Reduce communication (cache, batch)
  • Reduce flexibility of the system
02 · Adaptability & Flexibility

Design Strategies for Adaptability

  • Keep changes local
  • Use configuration files
  • Determine where flexibility is needed
  • Use understandable & maintainable code
  • Decouple system components
  • Use Information Hiding
  • Use polymorphism
03 · High Availability

Design Strategies for Availability

  • Error Prevention: transactions, validation
  • Error Detection: monitoring, result validation
  • Error Handling: retry, fallback, rollback
  • Eliminate single points of failure
  • Redundant system components
  • Auto-replace defective components