Introduction

A high-concurrency flash sale system is designed to handle a large number of concurrent users who are all trying to purchase a limited number of items within a short period of time.
Imagine a small app where each client talks directly to a specific server, say https://10.0.0.1:8080. As traffic grows, that one server becomes a bottleneck, and if it crashes, the whole app is effectively down for any client pointing at it.
Modern APIs frequently depend on databases or complex business logic, both of which introduce significant latency and consume CPU and I/O resources. Without caching, every request pays the full cost of database queries, network calls, and computation. This can lead to slow response times and poor scalability as traffic increases.
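To make the cost difference concrete, here is a minimal sketch of an in-memory cache with a time-to-live (TTL) sitting in front of a slow lookup. Everything named here is an assumption for illustration: `TTLCache`, `get_or_load`, and `fetch_product_from_db` are hypothetical, and the "database" is just a function that counts how often it is called. A production setup would more likely use a shared cache and would need to handle invalidation and concurrent access.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive call
        value = loader(key)  # cache miss or expired entry: pay the full cost
        self._store[key] = (now + self.ttl_seconds, value)
        return value


db_calls = 0  # counts how many times the "database" is actually hit


def fetch_product_from_db(product_id: str) -> dict:
    """Hypothetical stand-in for a slow database query."""
    global db_calls
    db_calls += 1
    return {"id": product_id, "name": f"product-{product_id}"}


cache = TTLCache(ttl_seconds=30.0)
for _ in range(5):
    product = cache.get_or_load("42", fetch_product_from_db)
```

In this sketch, five requests for the same product reach the stand-in database only once; the other four are served from memory. The TTL bounds how stale a cached value can get, which is the usual trade-off: a longer TTL saves more work but serves older data.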