SEDA is a new design for highly concurrent Internet services, intended to support massive concurrency demands and simplify the construction of well-conditioned services. It allows services to be well-conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity.
Applications -> a network of event-driven stages connected by explicit queues.
The paper describes several control mechanisms for automatic tuning and load conditioning:
- thread pool sizing
- event batching
- adaptive load shedding
- services are becoming more complex: moving from static content to dynamic content (extensive computation and I/O)
- service logic changes rapidly
- services are increasingly hosted on general-purpose facilities
Replication is a key aspect of service scalability. (CDN, content distribution networks)
- Drawback: provisioning replicas for peak load wastes resources the rest of the time
In SEDA: Application -> a network of stages, each stage -> incoming event queue.
SEDA draws together two important lines of research
- thread-based concurrency models for ease of programming
- event-based models for extensive concurrency
- The service behaves like a simple pipeline, where the depth of the pipeline is determined by the path through the network and the processing stages within the service itself.
- key property: graceful degradation: as offered load exceeds capacity, the service maintains high throughput with a linear response-time penalty that impacts all clients equally, or at least predictably according to some service-specific policy.
Each accepted request consumes a thread to process it.
OS overlaps computation and I/O by transparently switching among threads.
- easy to program
- but at high thread counts: cache and TLB misses, scheduling overhead, and lock contention
Virtualization fundamentally hides the fact that resources are limited and shared
Bound the size of the thread pool associated with a service.
- When the number of requests in the server exceeds some fixed limit, additional connections are not accepted.
Adv: avoids throughput degradation; more robust.
Disadv: introduces a great deal of unfairness to clients (clients can experience arbitrarily large waiting times).
Another problem: the server cannot inspect the internal request stream to find the source of a performance bottleneck -- for example, whether disk I/O or network bandwidth is the limit. All it knows is that the thread pool is saturated, so it must reject work arbitrarily, without knowledge of the source of the bottleneck.
- Resource containers and the concept of paths can help reduce the problem.
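The bounded-pool approach above amounts to admission control. A minimal sketch, assuming a counting semaphore caps in-flight requests and anything beyond the cap is refused outright (the class and names here are illustrative, not from the paper):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class BoundedServer:
    """Reject work once a fixed number of requests are in flight."""
    def __init__(self, max_in_flight):
        self._slots = threading.Semaphore(max_in_flight)
        self._pool = ThreadPoolExecutor(max_workers=max_in_flight)

    def submit(self, fn, *args):
        # Non-blocking acquire: if no slot is free, refuse the request
        # instead of letting queueing delay grow without bound.
        if not self._slots.acquire(blocking=False):
            return None  # rejected; the caller sees an explicit failure
        fut = self._pool.submit(fn, *args)
        fut.add_done_callback(lambda _: self._slots.release())
        return fut
```

This keeps throughput stable past saturation, but -- as the notes say -- rejected clients are treated arbitrarily: the server has no idea which resource is the real bottleneck.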
- generated by the operating system or internally by the application
- generally correspond to network and disk I/O readiness:
- completion notifications, timers, or other application-specific events.
In event-driven systems, the main server process is responsible for continually dispatching events to each of these components, which are implemented as library calls.
- Because certain I/O operations (in this case, filesystem access) do not have asynchronous interfaces, the main server process handles these events by dispatching them to helper processes via IPC.
- Helper processes issue (blocking) I/O requests and return an event to the main process upon completion.
- robust to load, with little degradation in throughput as offered load increases beyond saturation.
- It assumes that event-handling threads do not block, and for this reason nonblocking I/O mechanisms must be employed.
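A minimal sketch of such a dispatch loop, using Python's `selectors` module (the servers in the paper used OS-specific mechanisms such as `/dev/poll`; this is only an illustration of the pattern):

```python
import selectors
import socket

def run_event_loop(sel, until):
    """Dispatch readiness events to per-connection handlers.
    Handlers must not block: each call does a small, bounded
    amount of work and returns to the loop."""
    while not until():
        for key, _ in sel.select(timeout=0.1):
            key.data(key.fileobj)  # key.data holds the handler callback

# Demo: deliver one message between the two ends of a socketpair.
received = []
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

sel = selectors.DefaultSelector()
sel.register(b, selectors.EVENT_READ,
             data=lambda s: received.append(s.recv(1024)))

a.sendall(b"hello")
run_event_loop(sel, until=lambda: bool(received))

a.close(); b.close(); sel.close()
```

If a handler here blocked (say, on filesystem access), the whole loop would stall -- which is exactly why the paper's servers push blocking I/O out to helper processes.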
Event-driven design raises a number of additional challenges:
- Scheduling and ordering of events: need to decide when to process each incoming event and in what order to process the FSMs for multiple flows.
- Modularity is difficult to achieve: each state must be trusted not to block or consume a large number of resources that can stall the event-handling thread.
Possible solution: structure an event-driven application using a set of event queues to improve code modularity and simplify application design.
- Separating applications into stages connected by event queues.
- Dynamic resource controllers to allow applications to adjust dynamically to changing load.
- Support massive concurrency
- Simplify the construction of well-conditioned services
- Enable introspection
- Applications should be able to analyze the request stream to adapt behavior to changing load conditions.
- Support self-tuning resource management
A stage is a self-contained application component consisting of an event handler, an incoming event queue, and a thread pool (see the stage figure in the paper).
- Each stage is managed by a controller that affects scheduling and thread allocation.
- The core logic for each stage is provided by the event handler, the input to which is a batch of multiple events.
- Threads are the basic concurrency mechanism within SEDA.
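A stage as described above can be sketched as a queue plus a small thread pool that drains it in batches (the class and parameter names are illustrative, not SEDA's actual API):

```python
import queue
import threading

class Stage:
    """Event handler + incoming event queue + thread pool,
    following SEDA's stage model."""
    def __init__(self, handler, num_threads=2, batch_size=8):
        self.handler = handler          # called with a batch of events
        self.events = queue.Queue()     # incoming event queue
        self.batch_size = batch_size
        self._threads = [threading.Thread(target=self._run, daemon=True)
                         for _ in range(num_threads)]
        for t in self._threads:
            t.start()

    def enqueue(self, event):
        self.events.put(event)

    def _run(self):
        while True:
            batch = [self.events.get()]            # block for one event...
            while len(batch) < self.batch_size:    # ...then grab up to a batch
                try:
                    batch.append(self.events.get_nowait())
                except queue.Empty:
                    break
            self.handler(batch)
```

In real SEDA the controllers would resize `num_threads` and `batch_size` at runtime; here they are fixed for clarity.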
The event queues in SEDA may be finite:
- an enqueue operation may fail
- may make use of backpressure (by blocking on a full queue) or load shedding (by dropping events) when enqueue operations fail
- the application may wish to take some service-specific action
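The two failure-handling policies can be sketched against a bounded queue: backpressure blocks the producer, load shedding drops the event and reports the failure (function names are illustrative):

```python
import queue

def enqueue_backpressure(q, event, timeout=None):
    """Backpressure: block the producer until space frees up."""
    q.put(event, block=True, timeout=timeout)

def enqueue_load_shed(q, event):
    """Load shedding: drop the event if the queue is full.
    Returns False so the caller can take some service-specific
    action, e.g. send the client an error."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False
```

The key point is that the failure is explicit: the producing stage sees the full queue and can react, instead of the overload being hidden inside the runtime.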
The introduction of a queue between stages decouples their execution
- by introducing an explicit control boundary.
- Introducing a queue between two modules provides isolation, modularity, and independent load management, but may increase latency.
SEDA uses observed runtime characteristics to adjust allocation and scheduling parameters. Two controllers:
- thread pool controller: number of threads
- batching controller: batching factor
- observes a stage's output rate: if throughput drops, it increases the batching factor.
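The thread pool controller can be sketched as a simple feedback rule: sample the stage's queue length, add a thread when it exceeds a threshold, and remove a thread when the pool sits idle. The thresholds below are made-up illustrations; the paper tunes them empirically.

```python
def adjust_pool_size(queue_length, current_threads,
                     high_watermark=100, max_threads=20,
                     idle=False):
    """One step of a SEDA-style thread pool controller.
    Grow while the queue backs up; shrink when threads sit idle."""
    if queue_length > high_watermark and current_threads < max_threads:
        return current_threads + 1
    if idle and current_threads > 1:
        return current_threads - 1
    return current_threads
```

Running this periodically per stage lets each stage find its own thread count instead of the application hard-coding one.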
Applications do not create or manage threads
- this is the responsibility of the runtime system and associated controllers
- asyncClientSocket: initiates outgoing socket connections
- asyncServerSocket: listens for and accepts incoming socket connections
- asyncConnection: returned to the user when a connection is established
Each asyncSocket stage services two separate event queues:
- A. a request queue from the user
- B. an I/O readiness/completion event queue from the operating system.
The application also provides its own event queue (C).
readStage reads from a socket when an I/O readiness event (queue B) indicates that the socket has data available, then enqueues the resulting packet onto the event queue provided by the user (C).
writeStage receives packet write requests from the user (queue A) and enqueues them onto an internal queue associated with the particular socket.
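The readStage flow can be sketched as a handler that reacts to a readiness event from the OS queue (B) and forwards the packet to the user's queue (C). Queue letters follow the notes; everything else is illustrative:

```python
import queue
import socket

def read_stage_step(ready_socket, user_queue):
    """On an I/O readiness event (queue B), read from the socket
    and enqueue the resulting packet onto the user's queue (C)."""
    packet = ready_socket.recv(4096)
    user_queue.put(packet)

# Demo with a socketpair standing in for a network connection.
a, b = socket.socketpair()
user_queue = queue.Queue()        # queue C in the notes
a.sendall(b"packet-1")
read_stage_step(b, user_queue)    # pretend queue B signaled readiness on b
a.close(); b.close()
```

writeStage would be the mirror image: dequeue a write request from A, write to the socket, and retry on a would-block condition.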
An asyncFile object is provided to support asynchronous file I/O.
Interfaces: read, write, seek, stat, and close.
Asynchronous file I/O runs through three stages:
- PageCache stage
- CacheMiss stage
- asyncFile stage
A Retrospective on SEDA
After ten years, the original author thinks back on the design.
Using Java for SEDA was crazy (I guess because Java was too slow at the time).
Then:
- Linux threads suffered scalability problems.
- Multicore machines were rare.
- Nearly all papers about Web server performance focused on bulk throughput for serving static Web pages, without regard for end-to-end request latency.
Now:
- Linux threading is much better.
- Multicores are the norm.
- Request latency matters a lot more.
As a request passes through the stage graph, it suffers context switches and waits in queues
- causing poor cache behavior and long response times.
- decouple stages from queues and thread pools
- put a separate thread pool and queue in front of a group of stages that have long latency or nondeterministic runtime, such as performing disk I/O.
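The retrospective's suggestion can be sketched as a group of stages sharing one queue and one thread pool, so that short stages run back-to-back on a single thread with no queue hop between them (the class is illustrative, not from the retrospective):

```python
from concurrent.futures import ThreadPoolExecutor

class StageGroup:
    """Several short, predictable stages run back-to-back on one
    thread, avoiding a context switch and queue hop at every stage
    boundary. Only the group as a whole sits behind a thread pool;
    a long-latency stage (e.g. disk I/O) would get its own group."""
    def __init__(self, stages, num_threads=4):
        self.stages = stages          # list of event -> event functions
        self.pool = ThreadPoolExecutor(max_workers=num_threads)

    def submit(self, event):
        def run(ev):
            for stage in self.stages:
                ev = stage(ev)        # no queue between these stages
            return ev
        return self.pool.submit(run, event)
```

Queues then appear only at the boundaries that actually need isolation, instead of between every pair of stages.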
There are multiple threads responsible for polling for request completion, incoming sockets, and so forth, and performance is highly sensitive to the timing of these threads. The author spent a lot of time tuning the sockets library, and the tuned parameters did not carry over to other machines or JVMs.
The most important contribution of SEDA: making load and resource bottlenecks explicit in the application programming model
- Requests are never "stalled" somewhere under the covers -- say, blocking on an I/O or waiting for a thread to be scheduled. You can always get at them and see where the bottlenecks are, just by looking at the queues.