System Design

Order Management System

As part of this post, we’ll be covering the design of a modern, production-grade Order Management System (OMS) with a focus on multi-fulfillment, cancellations, refunds, inventory synchronization, and multi-region deployment. Let’s first start with the requirements.

Requirements

Functional Requirements

Core order lifecycle: Create order with multiple line items, shipping options, and payment methods.
Order state machine: Support states such as PENDING → CONFIRMED → PARTIALLY_FULFILLED → FULFILLED → CANCELLED → REFUNDED.
Split shipments: Support split shipments and partial fulfillment when items originate from multiple locations or arrive at different times.
Cancellations: Allow customer and system-initiated cancellations in various states (pre-fulfillment, mid-fulfillment) with clear rules.
Refunds: Support refunds (full and partial), including multi-payment or mixed-method scenarios (card, wallet, store credit).
Multi-fulfillment: Route each line item to an optimal fulfillment node (warehouse, store, 3PL, marketplace drop-shipper).
Multiple shipments: Track multiple shipments per order with independent tracking IDs and statuses.
Backorders and preorders: Support delayed fulfillment while the order remains active.
Inventory and payments: Reserve inventory atomically as part of the order creation saga; release on failure or cancellation.
Inventory sync: Prevent overselling across channels with near real-time inventory sync and event-driven updates.
Payment gateways: Integrate with one or more payment gateways for authorization, capture, and refund.
Multi-channel and integrations: Receive orders from internal checkout, marketplaces, and POS; normalize into a canonical order model.
Fulfillment updates: Push fulfillment updates and cancellations back to channels and customer notification systems.
Multi-region deployment: Deploy OMS in multiple regions, each with a full stack of services fronted by a global load balancer.
Data synchronization: Keep critical data (orders, payments, inventory) synchronized across regions using a mix of strong and eventual consistency depending on domain constraints.

Non-Functional Requirements

High Availability and resilience: One failure in a downstream flow should not take down the entire order flow.
Scalability: Capable of handling peak events such as flash sales and promotions.
Consistency: Clear consistency model for orders and inventory (strong vs eventual consistency).
Observability: Comprehensive logging, monitoring, and tracing.
Extensibility: Easy to add new fulfillment types, payment methods, or regions without major rewrites.

High Level Design

Order Lifecycle and Domain Model

Order Lifecycle Stages

A typical e-commerce order lifecycle contains the following high-level stages:

Mobile Wallet Payment System

As part of this post, we’ll be covering the design of a mobile wallet payment system that supports:

Top-ups (add money to wallet from bank/card)
P2P transfers (wallet -> wallet)
Basic fraud detection
Concurrency with clear trade-offs between strong and eventual consistency at scale

Let’s start with a basic design and then we can scale it up.
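To make the top-up and P2P transfer requirements concrete, here is a minimal in-memory sketch of the strongly consistent end of that trade-off. It is not the design from the post: `Wallet`, `top_up`, `transfer`, and `InsufficientFunds` are hypothetical names, balances live in process memory, and a real system would presumably rely on database transactions rather than thread locks.

```python
import threading
from dataclasses import dataclass, field


class InsufficientFunds(Exception):
    """Raised when the sender's balance cannot cover the transfer."""


@dataclass
class Wallet:
    # Hypothetical in-memory wallet; amounts are integer cents to avoid float rounding.
    wallet_id: str
    balance_cents: int = 0
    lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )


def top_up(wallet: Wallet, amount_cents: int) -> None:
    """Add money to a wallet (the bank/card leg is out of scope here)."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    with wallet.lock:
        wallet.balance_cents += amount_cents


def transfer(sender: Wallet, receiver: Wallet, amount_cents: int) -> None:
    """Move money wallet -> wallet so that both balances change together."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    if sender.wallet_id == receiver.wallet_id:
        raise ValueError("cannot transfer to the same wallet")
    # Always acquire locks in wallet_id order so that concurrent
    # A -> B and B -> A transfers cannot deadlock each other.
    first, second = sorted((sender, receiver), key=lambda w: w.wallet_id)
    with first.lock, second.lock:
        if sender.balance_cents < amount_cents:
            raise InsufficientFunds(sender.wallet_id)
        sender.balance_cents -= amount_cents
        receiver.balance_cents += amount_cents
```

Holding both locks for the whole debit-and-credit step means no reader that also takes the locks can observe money that has left one wallet but not yet arrived in the other. The cost is that transfers touching the same wallet are serialized, which is the kind of contention a relaxed, eventually consistent design tries to avoid.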

How to Cache?

Modern APIs frequently access databases or run complex business logic, both of which introduce significant latency and consume CPU and I/O resources. Without caching, every request pays the full cost of database queries, network calls, and computation. This can lead to slow response times and poor scalability as traffic increases.
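As a rough illustration of how a cache avoids paying that full cost on every request, the sketch below wraps an expensive lookup in a small time-to-live cache. `TTLCache`, `get_or_load`, and the `loader` callback are hypothetical names chosen for this example, not an API from the post. The cache is a plain dictionary with no eviction beyond expiry and no thread safety.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical read-through cache: serve a stored value until it expires."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Cache hit: the expensive loader is skipped entirely.
            return entry[1]
        # Cache miss or expired entry: pay the full cost once, then remember it.
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

In use, `loader` would stand in for the database query or computation, for example `cache.get_or_load(user_id, fetch_user_from_db)` where `fetch_user_from_db` is an assumed function. The TTL bounds how stale a served value can be, which is the main knob in this kind of design.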

Rate Limiters

·1921 words·10 mins
Life Without a Rate Limiter

Imagine a public web API that allows clients to fetch user data without any rate limiting. Under normal conditions this might work, but during traffic spikes or abuse (e.g., bots or scrapers) the backend can be overwhelmed, leading to resource exhaustion, cascading failures, and poor availability for legitimate users. Without any form of control, a single noisy neighbor can starve others, increase infrastructure costs, and make it difficult to meet SLAs.
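One common way to contain a noisy neighbor is to give each client its own budget of requests. The sketch below uses a per-client token bucket, which is only one of several possible algorithms and is not necessarily the one the post settles on. `TokenBucketLimiter` and `allow` are hypothetical names, state is kept in a single process, and the class is not thread-safe.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Hypothetical per-client token bucket kept in process memory."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = float(capacity)  # maximum burst size per client
        self._rate = refill_per_second    # sustained requests per second
        self._clock = clock               # injectable so tests can control time
        # client_id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        # A client seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(client_id, (self._capacity, now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self._capacity, tokens + (now - last) * self._rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[client_id] = (tokens, now)
        return allowed
```

Because each `client_id` has a separate bucket, a scraper that exhausts its own tokens gets rejected while other clients keep their full allowance. A handler would typically call `allow(client_id)` before doing any work and reject the request when it returns `False`.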