I had the pleasure of delivering a short talk at the Business Process Integration & Workflow Conference last week in Redmond. The whole conference was great, especially meeting quite a few folks in person I'd only conversed with via email. Being notified of MVP status for BizTalk on Friday was a great cap to the week!
Although the sample I presented during my talk isn't quite ready to release, the slides (on scatter/gather scenarios in BizTalk) can be downloaded here.
I worked through a problem recently with a client that really took me by surprise - because I would think that many BizTalk shops would be running into this issue regularly. So! Here goes with an explanation and a solution.
We ran into this problem initially using the MSMQ (not MSMQT) adapter with BizTalk 2004. We had roughly 10 MSMQ receive locations, as well as a few send ports that were using the loopback adapter. These were all executing in the same host.
The initial symptom was that the loopback adapter appeared not to work - messages were just not getting through! They sat in the "delivered, not consumed" state for no good reason. But we quickly reproduced the problem with just MSMQ receive locations (i.e. without the loopback adapter.)
On a single processor virtual machine, the repro looked like this: Create four MSMQ receive locations, and one MSMQ send port (with the send port subscribed to one of the receive ports, just to keep things easy.) No messages will flow through the send port at all.
To repro: This download has a binding file with receive ports/locations for local (non-tx) private queues Q1-Q4, plus a send port for local private queue NONTXQ with a filter for the first receive port. There is also a bit of VBScript to put a message into a queue. If you turn off one of the receive locations for Q2-Q4, you'll find things work just fine. If you don't, then (reiterating) no messages will flow through the send port.
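What was the resolution? Well, with BizTalk 2004 Service Pack 1 installed, you can create a "CLR Hosting" key under the registry service definition for the BizTalk host instance, and set the .NET thread pool values there (MinWorkerThreads and MaxWorkerThreads are the two discussed here.)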
In our case, we actually had to increase these values - you should determine the values you need through testing. Consider having min worker threads equal to 7x the number of MSMQ receive locations, and max worker threads equal to 10x the number of MSMQ receive locations. (More on these numbers later...)
Does the documentation address this? Good question. If you look at the topic "Managing Multiple Receive Locations" in the MSMQ adapter documentation, you will find some reference to this. It indicates you should create a "CLR Hosting" key as described above...but no actual values are mentioned (clearly just a documentation mishap.)
But why do these have to be tweaked at all? Good question. The documentation for the MSMQ adapter has some unfortunate quotables, like:
To increase performance, Microsoft BizTalk® 2004 Adapter for MSMQ is multi-threaded. If you have many receive locations, there may not be enough threads available for all the receive locations. This prevents some of the receive locations from picking up messages.
The reality is that you really shouldn't have to starve any particular receive location because of a lack of threads...you should just wind up with increased latency. But, such is not the implementation of the MSMQ adapter (at least for BizTalk 2004.)
Some background: The MSMQ adapter has a "Batch Size" parameter and a "Serial Processing" parameter that can be set per receive location. "Batch Size" determines how many messages the adapter will attempt to read from the queue (and submit to the message box) on each iteration. "Serial Processing" determines whether one thread is engaged in the peek/get/submit activity per receive location (Serial Processing = 'true') or multiple threads (Serial Processing = 'false'). If "Serial Processing" is true, the "Batch Size" is forced to one regardless of the actual setting.
So what is the execution flow for a given receive location? The internal class MsmqReceiverEndpoint is instantiated per receive location, and when it initializes, it calls ThreadPool.QueueUserWorkItem with a reference to itself. If "Serial Processing" is false...it does this exactly seven (7) times.
What does it do with the QueueUserWorkItem callback? Well, when MsmqReceiverEndpoint.ProcessWorkItem is called, it enters into a do/while loop that doesn't exit until the endpoint (receive location) becomes invalid (i.e. the receive location is shut down.) In other words, ProcessWorkItem sits on a .NET thread pool thread - and if Serial Processing is false, it sits on seven of them. The do/while loop executes a peek on the queue (with a hard-coded 10 second timeout), and if there are messages waiting, it receives up to "Batch Size" and submits them to the message box. (It will give up attempting to receive a "Batch Size" worth of messages if the 10 second timeout is reached on any attempt within the batch receive loop - i.e. if you drop a single message on a queue, and the batch size is greater than one, expect to wait 10 seconds before further activity begins...) The behavior of consuming seven threads per queue leads to the recommendation of MinWorkerThreads = 7x MSMQ receive locations provided above.
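To make the starvation concrete, here is a minimal sketch in Python rather than .NET, with a ThreadPoolExecutor standing in for the .NET thread pool. The pool size of 25 is my assumption (the .NET Framework 1.1 default of 25 worker threads per processor, which would explain why four receive locations fail on a single processor while three work); the function name and the "send" work item are hypothetical. It is an illustration of the pattern, not the adapter's code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

POOL_SIZE = 25          # assumption: default worker-thread cap on one processor
LOOPS_PER_ENDPOINT = 7  # from the post: seven work items per receive location


def send_port_gets_a_thread(receive_locations, wait_seconds=0.5):
    """Return True if a short 'send' work item runs while receive loops hold threads."""
    endpoint_shutdown = threading.Event()
    pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
    try:
        # Each work item parks on a pool thread until the endpoint shuts down,
        # standing in for the do/while loop inside ProcessWorkItem.
        for _ in range(receive_locations * LOOPS_PER_ENDPOINT):
            pool.submit(endpoint_shutdown.wait)

        # Any other work (here, a pretend send) has to queue behind those loops.
        send = pool.submit(lambda: "sent")
        try:
            send.result(timeout=wait_seconds)
            return True
        except TimeoutError:
            return False
    finally:
        # Release the loops so the pool can drain and shut down cleanly.
        endpoint_shutdown.set()
        pool.shutdown(wait=True)


if __name__ == "__main__":
    for count in (3, 4):
        print(count, "receive locations ->", send_port_gets_a_thread(count))
```

With three receive locations the loops hold 21 threads and the send item finds a free one; with four, 28 loops compete for 25 threads and the send item never gets scheduled until the endpoints shut down.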
Now, I confess - I'm not a BizTalk adapter expert. But, this design seems to be in conflict with the advice offered in "Writing Effective BizTalk Server Adapters", where it says:
Don't starve the .NET thread pool: ...While starving the .NET thread pool is a risk to all asynchronous programming in .NET, it is particularly important for the BizTalk Server adapter programmer to watch out for this. It has impacted many BizTalk Server adapters: take great care not to starve the .NET thread pool. The .NET thread pool is a limited but widely shared resource. It is very easy to write code that uses one of its threads and holds onto it for ages and in so doing blocks other work items from ever being executed....If you have multiple pieces of work to do (for example copying messages out of MQSeries into BizTalk Server), you should execute one work item (one batch of messages into BizTalk Server) and simply requeue in the thread pool if there is more work to do. What ever you do, don't sit in a while loop on the thread.
Is this fixed in BizTalk 2006? Surely it is... And, in fact, it sure seems to be in Beta 1. The design of the adapter is a bit different...First, "Serial Processing" refers to whether additional messages will be received from the queue prior to the "EndBatchComplete" event being set (downstream of IBTDTCCommitConfirm.Done.) (This part of "Serial Processing" is true for BizTalk 2004 as well, along with forcing the batch size to one.) "Serial Processing" in BizTalk 2006 does not affect how many threads will be reading from your queue - you will have just one (despite what the Beta 1 docs say...), unless you have multiple host instances in play. (That one thread using a large batch size and operating with serial processing set to 'false' - not blocking on the actual message box submission - should keep up with a fairly large message arrival rate, but multiple host instances might be needed for your particular case.)
More importantly, the ProcessWorkItem implementation returns immediately after a single peek/get/submit operation (and simply calls QueueUserWorkItem again, per the advice cited above.) (Side note: There seems to be some room in the design for the idea that you in fact wouldn't return immediately if more than a threshold number of messages were received, but currently this condition is "if # of messages received > BatchSize", which won't ever happen.)
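The "do one batch, then requeue" shape can be sketched as follows. Again this is Python standing in for .NET, and everything here is hypothetical: the RequeueingEndpoint class, the in-memory queues playing the role of the MSMQ queue and the message box, and the short peek timeout. The point is only that four endpoints can share two pool threads without any of them being starved.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class RequeueingEndpoint:
    """Hypothetical stand-in for a receive location that requeues itself."""

    def __init__(self, name, pool, source, sink, batch_size=5, peek_timeout=0.05):
        self.name = name
        self.pool = pool
        self.source = source          # queue.Queue standing in for the MSMQ queue
        self.sink = sink              # queue.Queue standing in for the message box
        self.batch_size = batch_size
        self.peek_timeout = peek_timeout
        self.valid = threading.Event()
        self.valid.set()

    def start(self):
        self.pool.submit(self.process_work_item)

    def stop(self):
        self.valid.clear()

    def process_work_item(self):
        if not self.valid.is_set():
            return
        batch = []
        try:
            # Wait briefly for a first message, then drain up to batch_size.
            batch.append(self.source.get(timeout=self.peek_timeout))
            while len(batch) < self.batch_size:
                batch.append(self.source.get_nowait())
        except queue.Empty:
            pass
        for message in batch:
            self.sink.put((self.name, message))
        # Hand the thread back and get in line again instead of looping here.
        if self.valid.is_set():
            try:
                self.pool.submit(self.process_work_item)
            except RuntimeError:
                pass  # the pool was shut down while this item was running


def run_demo(endpoint_count=4, messages_per_endpoint=3, workers=2):
    """Run more endpoints than pool threads and return what reached the sink."""
    sink = queue.Queue()
    pool = ThreadPoolExecutor(max_workers=workers)
    endpoints = []
    for i in range(endpoint_count):
        source = queue.Queue()
        for n in range(messages_per_endpoint):
            source.put(f"msg{n}")
        endpoints.append(RequeueingEndpoint(f"Q{i + 1}", pool, source, sink))
    for endpoint in endpoints:
        endpoint.start()
    expected = endpoint_count * messages_per_endpoint
    delivered = [sink.get(timeout=5) for _ in range(expected)]
    for endpoint in endpoints:
        endpoint.stop()
    pool.shutdown(wait=True)
    return sorted(delivered)


if __name__ == "__main__":
    print(len(run_demo()), "messages delivered")
```

Because each work item gives its thread back after one pass, the cost of having more endpoints than threads is added latency rather than a receive location that never runs.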
So what should I do for now with BizTalk 2004? For those using the MSMQ Adapter with BizTalk 2004...consider whether you can set "Serial Processing" equal to true. Keep in mind this forces you to a batch size of 1, so this might not work depending on your message arrival rate. If you test this configuration and find an unacceptable performance loss, consider setting the MinWorkerThreads value to 7x the number of MSMQ receive locations you are maintaining, and MaxWorkerThreads to roughly 10x (to provide breathing room.) As an alternative, spread your receive locations among multiple hosts (though avoid an over-proliferation of hosts - that has its own issues.)
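The sizing rule of thumb above is simple arithmetic, written out here as a small Python helper. The function name is hypothetical, and the results are starting points to validate under load, not values the adapter or the documentation prescribes.

```python
THREADS_PER_RECEIVE_LOCATION = 7  # from the post: work items queued per location
MAX_MULTIPLIER = 10               # from the post: roughly 10x for breathing room


def suggested_worker_threads(msmq_receive_locations):
    """Return starting values for the worker-thread settings in one host.

    Applies to receive locations running with Serial Processing = false.
    """
    if msmq_receive_locations < 1:
        raise ValueError("expected at least one MSMQ receive location")
    return {
        "MinWorkerThreads": THREADS_PER_RECEIVE_LOCATION * msmq_receive_locations,
        "MaxWorkerThreads": MAX_MULTIPLIER * msmq_receive_locations,
    }


if __name__ == "__main__":
    # Roughly the client scenario described earlier: 10 receive locations.
    print(suggested_worker_threads(10))
```

For the roughly 10 receive locations in our case, this gives a minimum of 70 and a maximum of 100 worker threads for the host.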
And never draw any conclusions until you have performance tested at load with your final host configuration - that is, your final allocation of send handlers, receive handlers, send ports, and receive locations among your hosts! Other adapters may affect the outcome if they involve polling on the receive side, or polling on the "response" side of a solicit-response send port. (If they use a thread pool thread to do their work, they can be affected by any adapter that consumes threads whether they themselves are written correctly or not!) Finally, I've heard from a gentleman who has done extensive testing that the threading parameters above are useful/necessary when using large numbers of MSMQT receive locations as well.
Never a dull day in BizTalk land...!
Scott Colestock lives, writes, and works as an independent consultant in the Twin Cities (Minneapolis, Minnesota) area.
© Copyright 2008