Following Part 1 and Part 2 of this series, this entry is more in line with Part 2, as it covers a setting that is particularly useful in lower Quality-of-Service, higher-volume environments. Reusing the diagram from Part 2, consider the basic structure of the average message send():
For a moment, let’s consider what it would look like if the WebLogic JMS server had to wait on a consumer that does not allow a backlog, confirming the receipt of each message individually.
Note that, for both participants, the waiting constitutes a
significant portion (if not the majority) of time. The longer the roundtrip, the
more pronounced this effect is. This results in a low utilization of both the
consumer and the JMS server. We frequently don’t want this behavior – it’s
slower, and the usual manner of compensating for it is to add more consumer
threads.
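To put rough, purely illustrative numbers on it (assumptions, not measurements): if a network round trip costs 2 ms and actually processing a message costs 0.5 ms, a consumer that confirms each message individually tops out around 400 messages per second and spends roughly 80% of its time (2 ms of every 2.5 ms) waiting rather than processing.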
Consider the one-way sends from Part 2: the underlying trick is to remove the waiting, in favor of accepting a lower message quality of service. The message pipelining setting (called “Maximum Messages per Session” in the WebLogic Administration Console) is a JMS configuration on the message consumer’s connection factory that is somewhat similar, in that it may require a tradeoff for added performance.
Turn on the Speed: MessagesMaximum
Maximum Messages per Session (MessagesMaximum for short, and as it appears in the configuration XML file), like many phrases or words with “Max” in them, can deliver a pretty spectacular performance gain in certain scenarios and is always welcome at the best of parties.
You can find this setting in the connection factory “Client”
tab in the WebLogic Administration Console.
One of the interesting things about message pipelining is
that the setting works a little differently based on your JMS client type. The
main purpose of message pipelining is to lower the amount of time the client
spends waiting, and increase the ratio of time that is spent by the JMS server
transmitting messages. The message pipeline (also referred to as a “message
backlog”) is created by sending more than one message to a consumer prior to
receiving an acknowledgement.
Case 1: Asynchronous Client
If you’re using an MDB, or otherwise using a client with an onMessage() method that implements the MessageListener interface, you are using an asynchronous client. For asynchronous clients, the “Maximum Messages per Session” setting applies to the message pipeline on the consumer side.
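For reference, a bare-bones asynchronous consumer might look like the sketch below. This is a generic JMS 1.1 client, not code from this series; the JNDI names are placeholders, and the JNDI environment is assumed to be supplied via a jndi.properties file.

```java
import javax.jms.*;
import javax.naming.InitialContext;

// Minimal asynchronous JMS consumer sketch (placeholder JNDI names).
public class AsyncConsumer implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            // With pipelining, this message may already have been pushed to the
            // client as part of a backlog before it is handed to onMessage().
            if (message instanceof TextMessage) {
                System.out.println(((TextMessage) message).getText());
            }
        } catch (JMSException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/MyConnectionFactory");
        Topic topic = (Topic) ctx.lookup("jms/MyTopic");

        Connection connection = cf.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(topic);
        consumer.setMessageListener(new AsyncConsumer()); // asynchronous delivery
        connection.start();

        Thread.sleep(60_000); // keep the JVM alive while messages arrive
        connection.close();
    }
}
```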
The message pipeline comes into effect when the consumers are unable to take messages off of the JMS server’s destination as fast as the producers put them there. As long as consumption keeps pace with production, individual messages are received by the consumer from the JMS server in a two-way send. When production outpaces consumption, messages begin to be sent in batches to available, asynchronous consumer sessions.
The batch style of message delivery from the messaging server provides two benefits: it lowers the number of two-way sends, and it generally makes messages more immediately available for consumption by the onMessage() method.
A potential downside is that the message pipeline very
clearly affects memory consumption on the JMS consumer side, so getting optimal
performance with this setting may be a balancing act if heap consumption
becomes a concern on the consumer. If the pipeline is too large, you might
wind up with one consumer overwhelmed by a huge backlog of messages when the
other consumers are doing nothing.
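As a rough, purely illustrative calculation (the numbers are assumptions, not measurements from this series): with a pipeline of 300 messages and an average message size of 200 KB, a single session could be holding roughly 300 × 200 KB ≈ 60 MB of pipelined messages in the consumer’s heap, and ten such sessions would be on the order of 600 MB.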
There are a few behaviors to consider prior to implementing message
pipelines for asynchronous consumers:
- Messages in the pipeline will not be in the destination’s configured sort order. This isn’t surprising – if the messages have already left the server, the server isn’t going to be sorting messages that are now on the client. The messages are sorted, however, prior to being sent in batch to the client.
- The message pipeline is sometimes sent as a single T3 message, which makes it easier to go over the MaxT3MessageSize. Generally this is more of a concern with larger messages (> 1MB), but it depends on your pipeline size setting and the average message size.
Case 2: Synchronous Client & Prefetch Mode
If you are receiving messages with receive() (or receive(long
timeout), receiveNoWait()), you’re receiving synchronously. The
consumer makes a two-way call to the JMS server to see if there is a message
available, and retrieves it, if possible – it’s a polling behavior. If there
is no message available, the call’s thread blocks for the specified time, waiting
for the next message on the destination to arrive.
This is the behavior for synchronous clients unless
“Prefetch Mode for Synchronous Consumers” is enabled. You can find this in the
Administration Console, under your Connection Factory settings in the “Client”
tab.
As with the asynchronous client’s pipelining behavior, synchronous clients with Prefetch Mode enabled receive batches of messages when the client invokes the receive() method, and the number specified in “Maximum Messages per Session” applies here as well. Despite the batches
of messages that are sent to your client JVM, the receive() method
returns messages individually. De-batching takes place in the code provided in
the WebLogic client libraries – so no de-batching is needed in the user-written
consumer / subscriber code.
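Because de-batching happens inside the client library, the consumer code looks the same whether Prefetch Mode is on or off. A minimal synchronous consumer sketch (generic JMS 1.1, with placeholder JNDI names) might look like this:

```java
import javax.jms.*;
import javax.naming.InitialContext;

// Minimal synchronous JMS consumer sketch (placeholder JNDI names).
// The same code works with or without Prefetch Mode; with it enabled, many
// receive() calls are satisfied from the local pipeline without a round trip
// to the JMS server.
public class SyncConsumer {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/MyConnectionFactory");
        Topic topic = (Topic) ctx.lookup("jms/MyTopic");

        Connection connection = cf.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(topic);
        connection.start();

        while (true) {
            // Block for up to one second waiting for the next message.
            Message message = consumer.receive(1000L);
            if (message == null) {
                continue; // no message arrived within the timeout
            }
            // ... process the message ...
        }
    }
}
```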
As with the asynchronous client, performance improves if pipelining
results in the consumer spending less time waiting and more time processing.
There is also the added benefit that the consumer generally receives more than one (possibly many more) message per polling attempt, which reduces the amount of polling necessary and, therefore, overall network traffic. Overall, the trend is towards higher consumer utilization.
Pipelining works differently with user transactions (XA). It
also behaves differently when more than one consumer shares the same session. I
invite you to read the docs on this. They state that User Transactions (XA)
will either silently ignore the Prefetch Mode setting, or the consumer will
fail to retrieve the message and generate an exception (the same applies to
multiple consumers on the same session). The docs didn’t clarify this adequately
for my purposes, so I will expound a bit after having experimented with this on
WebLogic. Keep in mind that these are just my findings, and not official aims
or requirements of the product.
Synchronous Clients, User Transactions and Session Sharing
WebLogic implicitly disables Prefetch Mode /
pipelining for the rest of the session when:
- Using an XA-enabled connection factory, the first receive() on a non-transacted session is a part of a User Transaction (XA).
- Multiple consumers are created in the same session prior to calling the first receive().
Otherwise, pipelining for a synchronous client is enabled
for the rest of the session upon the first receive() when there is a
single consumer on the session, presuming Prefetch Mode is enabled in the
connection factory settings.
Knowing when pipelining is enabled or disabled is imperative to understanding which conditions produce exceptions. If pipelining is already enabled for a session, and you then do one of the two things that would have caused it to be disabled (back when the session was newly created), you’ll get an exception.
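To illustrate those rules (based on my own findings described above, not on any official contract), here is a sketch of the pattern that keeps pipelining in effect: one consumer per session, with the first receive() performed outside of a user transaction. The JNDI names are placeholders.

```java
import javax.jms.*;
import javax.naming.InitialContext;

// Sketch of a prefetch-friendly usage pattern (placeholder JNDI names).
// Based on the behavior described above: a single consumer per session, with
// the first receive() made outside of a user (XA) transaction, leaves
// pipelining enabled for the rest of the session.
public class PrefetchFriendlyConsumer {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/MyPrefetchFactory");
        Queue queue = (Queue) ctx.lookup("jms/MyQueue");

        Connection connection = cf.createConnection();
        connection.start();

        // One consumer per session: creating a second consumer on this session
        // before the first receive() would disable pipelining for the session.
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);

        // First receive() outside of any user transaction. Wrapping this call
        // in an XA transaction (on an XA-enabled factory) would instead disable
        // pipelining for the rest of the session.
        Message first = consumer.receive(1000L);

        // ... subsequent receive() calls benefit from the pipeline ...
    }
}
```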
When Do I Use it and How?
I think the question is not so much, “When do I use it?” as
much as it is “When do I turn it off?” The performance advantage, given adequate producer-side performance, is significant. Presuming you don’t have a strict need involving message sorting, there isn’t much downside as long as you are using asynchronous consumers. Even with synchronous consumers, the only real caveats are transactions and consumer session sharing, which *might* (but shouldn’t, if you’ve read this blog) get in the way of your usage.
The primary questions to ask on whether or not to enable pipelining,
in general, are:
- Is the utilization on my consumers currently low? Am I currently creating extra consumers to compensate for consumption rate in the presence of the low client utilization?
- Are the JMS producers getting throttled or otherwise hitting quota because message consumption isn’t happening fast enough?
- Are my messages small? Or are my messages very large? You may gain little to no advantage from enabling message pipelines with larger messages. Grouping large messages in a batch has some pretty negative consequences, and generally makes no sense. Think of it this way: do you think receiving an acknowledgement is the time-consuming part of transmitting a 3 megabyte message? An average message size over 100 kB should be an indicator that this setting may offer less value for you.
- Is throughput less of a consideration than latency? If so, batching messages together may make less sense than immediate sends. You may alternately benefit from simply keeping the number of pipelined messages low, in this case.
Message pipelines are turned on and set to 10 messages per session by default. This can be a conservative setting in some scenarios, and the value can frequently be set to a few hundred or higher if the average message size is sufficiently low. You can explicitly turn pipelining off by setting it to “1”.
There is no generalized, ideal messages-per-session setting to recommend – it has to be judged on a case-by-case basis. The following questions should be answered in order to arrive at an initial setting (a rough worked example follows the list, and you can tune from there):
- What is the expected average message size?
- What is the expected quantity of messages of the expected average size that the consumers can support?
- What is the expected round trip time between the JMS server and the consumer? The smaller the round trip time, the less potential advantage there is in setting the messages per session at a higher number.
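As a rough worked example (my own rule of thumb, not a product recommendation): with 1 KB messages, a consumer that can comfortably hold a few megabytes of backlog, and a short, LAN-class round trip, an initial setting in the low hundreds (as in the case study below) is a reasonable starting point; with 500 KB messages, the default of 10, or even lower, is more appropriate.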
Fortunately, other than these considerations,
MessagesMaximum is a relatively straightforward choice – there isn’t a special
cluster consideration as with one-way
sends.
Case Study
I’m simply going to use the scenario from Part
1 and Part
2, and add on to it. To recap, I started with an out-of-the box
configuration of a WebLogic JMS server (“Base” in the graph), and used the IBM
Performance Harness producer and consumer for simulation. The producer threads
are set to fire out as many non-persistent, non-transactional 1 KB messages at
a JMS topic as the JMS server will take. The single-threaded, synchronous
consumer was set to AUTO_ACKNOWLEDGE.
Adding quotas and quota-blocking sends (“Quotas Only” in the graph) reduced the large standard deviation in message rate caused by the WebLogic JMS server holding onto more messages than could be delivered and paging the messages to disk. It also increased overall performance considerably.
Adding one-way sends (“One-Way Sends Enabled” in the graph)
reduced the number of producer threads necessary to reach the level of
performance seen in the Quotas test run.
I took this configuration, enabled “Prefetch Mode” (because I’m using a synchronous consumer), and set “Maximum Messages per Session” to 300 (which is a guess, based on the size of the message, the presence of only one consumer, and the short round-trip time). As with my other efforts, there’s not a lot of purpose in perfecting these settings on my local machine, so I’m not concerned with ideal settings so much as with illustrating the principle.
The hypothesis
from the last blog entry (that message consumption was the bottleneck)
seems accurate. Altering MessagesMaximum and enabling Prefetch Mode caused a
fairly linear scaling for the producers up to 4 producers. Here, the
scaling stops because the single consumer thread has become saturated. We
can be confident this is true because: 1) UNIX top reports the thread is
utilizing 100% of a CPU core, and 2) Adding a second subscriber to the topic
doesn’t cause the message rate to alter significantly (each subscriber is
getting > 90k messages per second, although this is not displayed in the
graph).
Final Thoughts
There are fewer reasons not to use message pipelines
(MessagesMaximum / Prefetch) than there are with One-Way Sends. Message backlog
is valuable when: 1) Utilization is low in your consumers and producers are
waiting (either due to quotas or throttling), 2) Message size trends towards
smaller messages, and 3) You are willing to accept the transaction and message
ordering caveats. As presented, performance can be dramatically improved (more than 4x in this case), and the setting is quite simple to configure.