While investigating intermittent problems in one of our web applications, we observed JSESSIONID cookie behavior that we couldn’t explain, so I looked into the cookie’s mechanics. This article summarizes what I learned.
JSESSIONID Cookie Format
In a clustered environment, the JSESSIONID cookie is composed of the core Session ID and a few other components. Here’s an example:
|Clone ID or Partition ID||v544d0o0|
It’s still unclear to me what the different values mean here, but searching our Proxy logs indicates that for the vast majority of our Sessions, the Cache ID is 0001. For instance, a search of one day’s log around 17:00 found the following number of cookies containing each Cache ID:
|Cache ID||Hit count|
For a particular Session ID, the Cache ID can definitely change mid-session without anything else changing; in particular, without the Clone/Partition ID changing. This does not indicate a switch to another cluster member, but I don’t know exactly what it does indicate. Based on the relative loads of our various systems, it seems possible that Cache IDs change only under heavy load.
A Partition ID is appended to the cookie if memory-to-memory replication in peer-to-peer mode is utilized for Distributed Session management. Otherwise, a Clone ID is appended.
The Clone ID will match one of the CloneID attributes in the Server elements within the web server’s plugin-cfg.xml file. For instance, 138888kcd in:
<Server CloneID="138888kcd" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="2" MaxConnections="-1" Name="server1Node01_App03" ServerIOTimeout="0" WaitForContinue="false"> ... </Server>
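To check which cluster member a Clone ID belongs to without reading the file by hand, something like the following could work. It is only a sketch: the class and method names (`CloneIdLookup`, `cloneIdToServerName`) are my own, and it assumes nothing about plugin-cfg.xml beyond `Server` elements carrying `CloneID` and `Name` attributes as in the excerpt above. `Server` elements without a `CloneID` are skipped.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of WebSphere or the plug-in. */
final class CloneIdLookup {

    /** Returns a CloneID -> Server Name map for every Server element that has a CloneID. */
    static Map<String, String> cloneIdToServerName(String pluginCfgXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pluginCfgXml.getBytes(StandardCharsets.UTF_8)));
            NodeList servers = doc.getElementsByTagName("Server");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < servers.getLength(); i++) {
                Element server = (Element) servers.item(i);
                // getAttribute returns "" when the attribute is absent.
                String cloneId = server.getAttribute("CloneID");
                if (!cloneId.isEmpty()) {
                    result.put(cloneId, server.getAttribute("Name"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse plug-in configuration", e);
        }
    }
}
```

Calling `cloneIdToServerName(xml).get("138888kcd")` on a file containing the element above would return `server1Node01_App03`.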
If you use memory-to-memory Session replication, your cookies will contain Partition IDs rather than Clone IDs.
Partition IDs are similar in function to Clone IDs, but as best I can tell, there is no direct way to determine which values correspond to which cluster members. They are internally mapped by the WebSphere HAManager to specific Clone IDs, and that mapping is exchanged with the web server plug-in so that it can maintain Session affinity and find additional cluster members for failover. (The exchange takes place in private headers on each response from WAS back to the plug-in, and those headers are removed before the response is sent back to the client.)
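As a rough illustration of the cookie components discussed in this section, here is a minimal sketch that splits a cookie value into its parts. It rests on assumptions: a four-character Cache ID at the start, then the Session ID, then one or more colon-separated Clone/Partition IDs. The class name `JsessionIdParts` and the sample Session ID are made up for illustration, and none of this is WebSphere’s own API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of any WebSphere API. */
final class JsessionIdParts {
    // Assumption: the Cache ID is the first four characters of the cookie value.
    private static final int CACHE_ID_LENGTH = 4;

    final String cacheId;
    final String sessionId;
    final List<String> cloneIds; // Clone or Partition IDs, in cookie order

    private JsessionIdParts(String cacheId, String sessionId, List<String> cloneIds) {
        this.cacheId = cacheId;
        this.sessionId = sessionId;
        this.cloneIds = cloneIds;
    }

    static JsessionIdParts parse(String cookieValue) {
        // Assumption: ':' separates the core value from each Clone/Partition ID.
        String[] pieces = cookieValue.split(":");
        String core = pieces[0];
        if (core.length() <= CACHE_ID_LENGTH) {
            throw new IllegalArgumentException("Cookie value too short: " + cookieValue);
        }
        List<String> ids = new ArrayList<>(Arrays.asList(pieces).subList(1, pieces.length));
        return new JsessionIdParts(
                core.substring(0, CACHE_ID_LENGTH),
                core.substring(CACHE_ID_LENGTH),
                Collections.unmodifiableList(ids));
    }
}
```

Under those assumptions, `JsessionIdParts.parse("0001SAMPLESESSIONID:v544d0o0")` yields Cache ID `0001`, Session ID `SAMPLESESSIONID`, and a single Clone ID `v544d0o0`. Splitting log entries this way makes it easier to spot a Cache ID changing while the rest of the cookie stays the same.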
Session Affinity and Failover
The Clone/Partition ID corresponds to whichever cluster member creates the Session, and the plug-in is responsible for routing that Session’s requests to the same cluster member for as long as it is available. From the Scalability Redbook:
Since the Servlet 2.3 specification, as implemented by WebSphere Application Server V5.0 and higher, only a single cluster member may control/access a given Session at a time. After a Session has been created, all following requests need to go to the same application server that created the Session. This Session affinity is provided by the plug-in.
If the specified cluster member is unavailable on a subsequent request, the plug-in will choose another cluster member and attempt to connect to it. If Distributed Sessions are configured, via database persistence or memory-to-memory replication, the Session will be resumed in progress on that new member. If not, a new Session will be created and the user’s progress will be lost.
If a new cluster member is able to resume the existing Session, it will append its own Clone/Partition ID to the existing JSESSIONID cookie. For instance:
Now the plug-in knows that two different cluster members could potentially service this Session. If the original member becomes available again, the Session will switch back to it.
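The routing preference described here can be sketched in a few lines. This is not the plug-in’s actual algorithm, only a model of the behavior: it assumes the original member’s ID comes first in the cookie (because later members append theirs), and the names `AffinitySketch` and `chooseMember` are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Illustrative model only; the real plug-in logic is not shown in this article. */
final class AffinitySketch {

    /**
     * Picks the first member listed in the cookie that is currently available.
     * Assumption: the member that created the Session is listed first,
     * so it is chosen again as soon as it is back up.
     */
    static Optional<String> chooseMember(List<String> cloneIdsInCookieOrder,
                                         Set<String> availableMembers) {
        for (String id : cloneIdsInCookieOrder) {
            if (availableMembers.contains(id)) {
                return Optional.of(id);
            }
        }
        // No known member is up; a caller would fall back to ordinary load balancing.
        return Optional.empty();
    }
}
```

With a cookie listing two members, the second is chosen only while the first is missing from the available set, which matches the switch back to the original member described above.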
Finally, note that according to the System Management Redbook:
WebSphere provides session affinity on a best-effort basis. There are narrow windows where session affinity fails. These windows are:
- When a cluster member is recovering from a crash, a window exists where concurrent requests for the same session could end up in different cluster members…
- A server overload can cause requests belonging to the same session to go to different cluster members…
Referenced articles and Redbooks
- Redbook: WebSphere Application Server V6 Scalability and Performance Handbook - The best reference; it contains most of what I’ve discovered thus far. Sections 6.8.1, 6.8.6, and example 6-19 in particular.
- Redbook: WebSphere Application Server V6.1: System Management and Configuration - Section 10.7 in particular.
- InfoCenter section on HTTP Session problems
- PK83788: INCORRECT HANDLING OF PARTITION TABLES BY PLUGIN
- PK48101: NEWLY SPAWNED WEBSERVER PROCESSES BREAK SESSION AFFINITY WHEN MEMORY-TO-MEMORY PERSISTENCE IS CONFIGURED
Update (Feb 6 2012): We recently noticed a specific case where the Cache ID changed for a Session: when the Session hadn’t been accessed in a long time. In fact, the subsequent access came right before the 30-minute timeout would have triggered.
One interesting behavior we observed in this case was that an object on the Session which we had been comparing for identity-equivalence (== comparison) was no longer equivalent after the Cache ID changed. We needed to update our code to use .equals() instead (which we should have been using in the first place, anyway).
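I have not confirmed why the object identity changed, but the effect is the same as what happens when an object is serialized and read back: the copy is a different instance with the same value. The sketch below reproduces that effect with plain Java serialization. The `Account` class and `roundTrip` method are hypothetical stand-ins, and the assumption that the Session was re-read from a serialized copy is mine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

final class IdentityAfterReload {

    /** Hypothetical session attribute with value-based equality. */
    static final class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final String number;

        Account(String number) {
            this.number = number;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Account && ((Account) o).number.equals(number);
        }

        @Override
        public int hashCode() {
            return Objects.hash(number);
        }
    }

    /** Serializes and deserializes a value, standing in for a Session being reloaded. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For an `Account` passed through `roundTrip`, `original == copy` is false while `original.equals(copy)` is true, which is why the `.equals()` comparison keeps working no matter where the Session object came from.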