Bulkhead
Bulkhead separates resource pools so one overloaded path or dependency cannot consume all shared capacity. Like ship compartments, the goal is not to prevent every local failure, but to stop that failure from flooding the whole service.
[Architecture diagram: dashed lines show the direction of request flow]
If every request shares the same thread pool, connection pool, or worker queue, the slowest dependency or noisiest workload can starve everything else. Then a non-critical feature can degrade the entire service. Shared capacity becomes the channel through which local trouble turns into global outage.
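The failure mode can be seen in a few lines. This is a minimal sketch, assuming a single shared thread pool; the function names and sleep durations are illustrative only:

```python
from concurrent.futures import ThreadPoolExecutor
import time

# One shared pool for every kind of work: 4 workers.
shared = ThreadPoolExecutor(max_workers=4)

def slow_dependency():
    time.sleep(0.5)   # stands in for a hung downstream call

def critical_work():
    return "ok"

# Four slow calls occupy every worker...
for _ in range(4):
    shared.submit(slow_dependency)

# ...so the critical request queues behind them instead of running.
start = time.time()
future = shared.submit(critical_work)
result = future.result()
waited = time.time() - start   # roughly the slow dependency's latency
```

The critical request is healthy, yet it pays the slow dependency's latency purely because it shares the same workers.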
As platforms relied more on external APIs, background jobs, and multi-tenant workloads, it became common for one slow dependency to tie up huge amounts of shared capacity. Simple shared-pool strategies stopped being good enough. Bulkhead became important because distributed systems needed local containment, not just global scaling.
Teams usually implement bulkheads by splitting worker pools, connection pools, queues, or concurrency limits by dependency, tenant class, or business priority. That way one pool can saturate while another still has room for critical traffic. Structuring the partitions means deciding what deserves protection and what should be allowed to degrade first.
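One lightweight way to express such a split is a semaphore-guarded wrapper per dependency class. The sketch below is one possible shape, not a standard API; it assumes a fail-fast policy (reject when the pool is full rather than queue), and the pool names are hypothetical:

```python
import threading

class Bulkhead:
    """Caps concurrent calls for one dependency class; rejects overflow."""

    def __init__(self, name, max_concurrent):
        self.name = name
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, *args, **kwargs):
        # Non-blocking acquire: if this pool is saturated, fail fast
        # instead of tying up a caller waiting for capacity.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError(f"bulkhead '{self.name}' is full")
        try:
            return fn(*args, **kwargs)
        finally:
            self._slots.release()

# One bulkhead per class of work: the reports path can saturate its
# two slots without touching the checkout path's capacity.
checkout = Bulkhead("checkout", max_concurrent=8)
reports = Bulkhead("reports", max_concurrent=2)
```

The sizing numbers encode the priority decision: rejected calls in the `reports` pool are the intended degradation, while `checkout` keeps its headroom.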
Circuit Breaker opens or closes calls based on error and timeout behavior. Bulkhead separates capacity so one class of failure cannot exhaust all resources in the first place. Load Balancer spreads traffic across instances; Bulkhead protects the inside of a service or instance from internal capacity collapse.
It is useful when critical user-facing flows coexist with lower-priority jobs, when dependencies have very different reliability, or when tenant isolation matters operationally. The tradeoff is that too many isolated pools can waste capacity and complicate tuning. Bulkheads work best when guided by real failure paths and saturation data.