Both SSE and StreamableHttp transport require sticky sessions #330
Comments
This also troubles me. I want a distributed deployment, but it is difficult to achieve.
Nobody deploys a single service replica in a cluster, so this affects all real-world remote MCP server implementations. A WebSocket-based transport would work since it opens the socket and keeps it open, and that also makes sense: you want a transport solution that keeps the socket open for the session duration and then closes it to conserve resources.
It seems possible to use this transport in a "stateless" mode according to comments in the code: typescript-sdk/src/server/streamableHttp.ts, lines 68 to 89 in 09e5d5b.
I am not sure what you give up by going stateless, but it seems like this would be preferred for a production server. Do most major clients support this?
It is hard to test until the StreamableHttp transport is published to NPM. Right now, the example implies only stateful mode, but the description of the transport constructor implies that stateless mode is supported.
@gylove1994 @ihrpr I see that you've been working on this. Is stateless mode really usable? @jspahrsummers @jerome3o-anthropic Do you know the timeline for when this stateless code will be published to NPM?
@asprouse, it's available now in version 1.10.1 of the package. Also added a folder with examples -- see the README and an example for stateless mode.
@dglozic some high-level notes on distributed deployment are here
Yeah, this seems like what I was looking for, let me try it.
Actually, there is still room for clarity in README.md:
In order to persist session data in a database (say, a Redis cache), what is the state actually, and how can this state be serialized (ideally into JSON that can be shared between nodes)?
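To make the question concrete, here is a minimal sketch of what such session state could look like as a plain, JSON-serializable object. The field names (initialized, lastEventId, etc.) are my assumptions for illustration, not the SDK's actual internals:

```typescript
// Hypothetical shape of the per-session state a server might persist in Redis.
// These fields are illustrative assumptions, not the SDK's real internals.
interface PersistedSessionState {
  sessionId: string;
  initialized: boolean;
  protocolVersion: string;
  // Monotonic event counter, e.g. for building resumability event IDs.
  lastEventId: number;
}

function serializeState(state: PersistedSessionState): string {
  return JSON.stringify(state);
}

function deserializeState(json: string): PersistedSessionState {
  return JSON.parse(json) as PersistedSessionState;
}

const state: PersistedSessionState = {
  sessionId: "abc123",
  initialized: true,
  protocolVersion: "2025-03-26",
  lastEventId: 42,
};

// The state survives a stringify/parse round trip, so it can live in Redis
// under a key like `mcp:session:${state.sessionId}`.
const roundTripped = deserializeState(serializeState(state));
console.log(roundTripped.lastEventId === state.lastEventId); // true
```

The point of the sketch is only that whatever the SDK considers "session state" would need to survive this round trip to be shareable between replicas.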
@dglozic did you find a way to use/try non-sticky sessions? As you mentioned in the first thread:
I encountered the same problem. We need to find a way to persist transport objects (serialization/deserialization). @ihrpr any insights would be very helpful.
It seems that with the latest release, StreamableHttp can be used statelessly (though it is still recommended to serialize state if possible). I am waiting to see how exactly :-).
@asprouse You give up the ability to have server-sent events. For that, there needs to be a session id.
@buggyhunter why do you need to persist transport objects in a stateless server? If you need a stateless server, the transport (and server object) will be created on every request; here is an example
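For reference, the per-request pattern described above looks roughly like this. This is a sketch: the import paths, constructor options, and `handleRequest` signature are taken from the SDK's published examples as I understand them, so verify them against the current README. Passing `sessionIdGenerator: undefined` is what opts the transport into stateless mode:

```typescript
import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
  // A fresh server and transport per request: no in-memory session map,
  // so any replica behind the load balancer can serve any request.
  const server = new McpServer({ name: "stateless-example", version: "1.0.0" });
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // undefined opts into stateless mode
  });
  res.on("close", () => {
    transport.close();
    server.close();
  });
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.listen(3000);
```

The trade-offs raised elsewhere in this thread apply: without a session id there is no channel for server-initiated SSE messages and no resumability.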
Yeah, and in addition you give up things like resumability, which is useful for long-running tasks.
In what cases do we need to keep a
I am not saying I want to give up on state, only that the examples showing how state is implemented are not useful for real-world implementations, where multiple server replicas are running. It is relatively easy to set up a Redis service to keep state between replicas (using the session id as the key), but Redis requires this state to be serializable into a string, and the examples are caching the entire transport object. What we need is an example where the transport state is extracted from the transport as a JSON-serializable object, and on each request a new transport instance can be created by passing in that state. We cannot cache entire transport objects; that is what is broken in the examples. We can even cache these transports in memory once created, but if a request that started on one replica hits another, there should be a path for re-creating the transport from string-only state in Redis.
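The shape being asked for here might look like the following sketch. Note that RehydratableTransport with its toJSON/fromJSON methods is hypothetical -- the SDK does not currently expose such an API -- and a plain Map stands in for a Redis client:

```typescript
// Hypothetical transport whose logical state is a plain JSON-serializable
// object. This API does not exist in the SDK; it only illustrates the shape.
type TransportState = {
  sessionId: string;
  initialized: boolean;
  lastEventId: number;
};

// Stand-in for a Redis client (e.g. redis.get/redis.set with string values).
const redis = new Map<string, string>();

class RehydratableTransport {
  constructor(public state: TransportState) {}
  static fromJSON(json: string): RehydratableTransport {
    return new RehydratableTransport(JSON.parse(json) as TransportState);
  }
  toJSON(): string {
    return JSON.stringify(this.state);
  }
}

function saveTransport(t: RehydratableTransport): void {
  redis.set(`mcp:session:${t.state.sessionId}`, t.toJSON());
}

function getTransport(sessionId: string): RehydratableTransport | undefined {
  // Any replica can rebuild the transport from the string state in Redis,
  // so the load balancer does not need sticky sessions.
  const json = redis.get(`mcp:session:${sessionId}`);
  return json ? RehydratableTransport.fromJSON(json) : undefined;
}

// Replica A creates the session and persists its state...
saveTransport(
  new RehydratableTransport({ sessionId: "abc", initialized: true, lastEventId: 7 })
);
// ...and replica B rebuilds it from Redis on the next request.
const rebuilt = getTransport("abc");
console.log(rebuilt?.state.lastEventId); // 7
```

As the Gemini exchange below this comment notes, the hard part is that only the logical state can round-trip like this; the live HTTP streams would have to be re-established per request.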
It's not the
The transport is created on each request only for stateless servers. Here is a very high-level example of multi-node deployment with state management. We can try to add a more detailed example in a bit if it would be helpful.
I was toying with the idea of pub/sub, but for HA it is inferior to serializing/deserializing state. With pub/sub, what we are saying is that there is a 'correct' node to handle each request based on the session id, and we pass handling of the request to that node. The problem here:
Essentially, using pub/sub is just re-implementing sticky sessions. Kubernetes will handle this automatically for you if you use sticky session support (the load balancer routes requests to upstream nodes based on varying criteria, and a sticky session id 'rigs' that by forcing the load balancer to always pass requests for that session to the same node). Meanwhile, if state can be serialized into a string and cached in Redis, any node is the 'right' node, because if the transport for that session id is missing, we can recreate it from the state in Redis. Crashing or restarting nodes is not a problem, because the load balancer will re-route new requests to the healthy nodes.
Regarding the serialization, which does seem like a terrific idea, I had a chat with Gemini (because it was handy in my IDE) that identified a bit of the challenge involved. Since it was fairly on point, I will include it here.
Gemini: Okay, that's an interesting approach to achieving statelessness on the server side by persisting the transport's state! However, there's a significant challenge here. Let's break down how you could implement serialization for the non-connection state and discuss the implications:
1. Identify Serializable and Non-Serializable State
1.1. Serializable State: first, identify the properties that represent the logical state of the transport and can be serialized (primitive types, plain objects, arrays, Maps of serializable types).
1.2. Non-Serializable State: the live connection objects (open HTTP request/response streams) cannot be serialized.
Important Note on
That Gemini guy is smart! I think this all boils down to intent. If you want to open a bi-directional connection and keep it open, I fear HTTP is the wrong protocol to use. If THAT is the intent, then WebSockets is the right protocol: it will upgrade from HTTP to WS(S) transport (status code 101), keep the socket open for bidirectional traffic, and reconnect in case of a socket drop. HTTP is really not intended for open connections: you send a request, you get your response, and then the connection closes (unless you use keepAlive for performance reasons). Load balancers sitting in front of multiple upstream nodes assume they can round-robin traffic to any of the upstreams as long as they are alive. Nodes cannot really keep anything in memory; they should be able to process the request from the request itself, plus state that can be marshalled and unmarshalled safely. As the Gemini dude said, open connections cannot be serialized.
Seems like there is a very well-established pattern for this via express-session and its many connectors to things like Redis. We don't have to require Express per se, but it seems like we should match the pattern, given that we are already "express adjacent"?
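For reference, the express-session pattern alluded to looks roughly like this. A sketch only: connect-redis has changed its export shape across major versions, so treat the import (and the Redis URL) as assumptions to check against its README:

```typescript
import express from "express";
import session from "express-session";
import { createClient } from "redis";
// Named export in recent connect-redis versions; older majors used a
// default-export factory taking `session` -- check the version you install.
import { RedisStore } from "connect-redis";

const redisClient = createClient({ url: "redis://localhost:6379" });
redisClient.connect().catch(console.error);

const app = express();
app.use(
  session({
    store: new RedisStore({ client: redisClient }),
    secret: "change-me", // use a real secret in production
    resave: false,
    saveUninitialized: false,
  })
);

// Anything set on req.session is persisted in Redis as JSON, so every
// replica behind the load balancer sees the same session data.
```

This is exactly the "serializable state in a shared store" approach discussed above, applied to HTTP sessions rather than MCP transports.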
That was my point: most people using Express also use Redis-based sessions. They work well with multiple instances of a service, allowing those instances to 'see' the same cached values. But as I said above, the resumable, bi-directional message traffic required here is a bad fit for HTTP. WebSockets seem like a better fit than attempting to cache an entire connection in memory (which requires sticky session support).
StreamableHttp can be stateless; the example server works in kube with multiple pods. SSE seems to be deprecated, so I wouldn't count on it being updated. If anyone needs to implement SSE, it will require manually handling sessions in Redis.
Is your feature request related to a problem? Please describe.
Both SSE and StreamableHttp transport rely on caching the transport in memory in the server pod:
This does not work when there are multiple replicas/pods behind a load balancer (which is most k8s deployments of MCP servers as services). This means that if the call is not the initial one and the cached transport is not found in transports, an error is sent out.
We need a solution (and an example) that does not involve storing transports in memory, because real-world deployments of remote MCP servers will have multiple replicas, and demanding sticky sessions for them is too much of a burden.
If we can ask implementors to cache state that can survive a JSON.stringify/parse round trip, they can cache that state in distributed caches like a Redis session store. But a transport object cannot be stored in Redis.
Describe the solution you'd like
A stateless solution that does not involve caching transports in memory, or a stateful solution that can be serialized and stored as strings in caches like Redis.
Describe alternatives you've considered
It seems that the streamableHttp transport may be able to work without session ids completely, but the example in the code does not show how. I will try to create an example where the sessionId generator is undefined, but if so, we would always need to create a new transport inside the request handlers, and not try to reuse transports from a memory cache. I am not sure this will work.