I recently went through a really nasty bout of troubleshooting with the client I
currently work with, related to MSMQT. Hopefully, my tale can save you
similar pain.
The core issue was this: The BizTalk MSMQT adapter can be configured during
installation to integrate with Active Directory. The default is that it
will not operate in this fashion, but rather in "workgroup" mode. There
are (at least) two reasons why you might want to have MSMQT integrate with
Active Directory: 1) you want to make use of an MSMQ router in your environment
or 2) you want to use certificate-based authentication at a protocol level
(where the public certificate is managed by AD.) (Note: I know this now;
I didn't know it a couple weeks ago...)
We have been installing our servers in "workgroup" mode. To install in
Active Directory mode requires a special permission granted by the domain
administrator.
Now, when you configure a Send Port within BizTalk and select MSMQT as the
transport, the property pages in the BizTalk Explorer offer a checkbox that is
labeled "Use MSMQ Authentication". If you hit the "Help" button on this
dialog, the explanation that is provided is this: "Identify whether BizTalk
Message Queuing uses protocol authentication every time it sends a message on
this port."
As it turns out, although it isn't documented as such, a Send Port with this
option checked can only work if MSMQT has been installed in Active
Directory-integrated mode. If you have the "Use MSMQ Authentication"
option checked on a Send Port and you are not in Active Directory-integrated
mode, then messages will not flow. When we eventually discovered this
discrepancy and fixed our binding files, the problem was resolved.
(Note: there is a similar option when configuring Receive Locations.)
This checkbox had been checked at the point our initial binding files were
exported, and became a part of our scripted deployment. What was worse,
when we encountered this problem a few weeks ago in QA, we began
troubleshooting the BizTalk configuration on the server directly and wound up
"fixing" the problem by creating an additional Send Port (subscribing to the
same traffic as the original) that simply had the MSMQ Auth checkbox off.
But we didn't realize at the time that the checkbox was the difference between
the two ports, so we had to troubleshoot the same problem all over again a few
weeks later. We definitely got ourselves into the wrong troubleshooting
mindset by assuming that BizTalk was flaky in some way.
Key lesson: If you don't get into a given environment (QA, production,
whatever) with your scripted deployment, then you really didn't get there at
all….
A few more notes. As I said above, if you have the "Use MSMQ
Authentication" option checked on a Send Port and you are not in Active
Directory-integrated mode, then messages will not flow. What you will see
is:
- Messages will appear in the HAT "Queries-Messages Sent in Past Day" report,
  but they will not actually have arrived in the destination queue. (Fixed in
  SP1?)
- You will see strange behavior in the HAT "Operations-Messages" view, but
  nothing that indicates an error condition. Retry count will increment on
  the original service instance.
- There will be no error condition reported in the event log. (OK, Premier
  Support indicated in a phone conversation that you might see something after
  5 days have elapsed, when an exponential backoff algorithm has run its
  course.)
IMHO, BizTalk 2004 should be more serviceable in this regard, and should give
better error information. And of course, the documentation for MSMQT Send
Port configuration should have mentioned that MSMQ Authentication only works
in Active Directory-integrated mode.
Microsoft Premier Support became involved, and after around 18 hours of
analysis they said "We see some certificate-related errors in the traces.
Do you use MSMQ authentication? Are you AD-integrated?"
We looked in our binding files (since the decision had long since been
forgotten) and saw this snippet:
…
<TransportType Name="MSMQT" Capabilities="16495"
ConfigurationClsid="9a7b0162-2cd5-4f61-b7eb-c40a3442a5f8"/>
<TransportTypeData>&lt;CustomProps&gt;&lt;Authenticated vt="11"&gt;-1&lt;/Authenticated&gt;&lt;/CustomProps&gt;</TransportTypeData>
<RetryCount>3</RetryCount>
<RetryInterval>5</RetryInterval>
…
See that in the escaped XML? Yup, that is a property called
"Authenticated" that is an old-fashioned Variant of type bool, where "-1" means
"true".
Leaps out at you, right? Determining if you are AD-integrated means
looking at
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc.3.0\MessageQueuing\MsmqtWorkgroupMode.
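If you want to script that registry lookup, here is a rough sketch in Python. It assumes that MsmqtWorkgroupMode is a value under the MessageQueuing key (I am reading the path above that way; it could also be a subkey), and it deliberately does not interpret what comes back. The helper names are made up for illustration.

```python
FULL_PATH = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
             r"\BTSSvc.3.0\MessageQueuing\MsmqtWorkgroupMode")


def split_registry_path(path):
    """Split 'HIVE\\sub\\key\\Value' into (hive, subkey, value_name)."""
    hive, _, rest = path.partition("\\")
    subkey, _, value_name = rest.rpartition("\\")
    return hive, subkey, value_name


def read_workgroup_mode(path=FULL_PATH):
    """Return the raw registry value. Windows only.

    Assumption: the last path segment is a value name, not a subkey.
    """
    import winreg  # imported lazily so the helper above works anywhere

    hive_name, subkey, value_name = split_registry_path(path)
    hive = getattr(winreg, hive_name)
    with winreg.OpenKey(hive, subkey) as key:
        value, _value_type = winreg.QueryValueEx(key, value_name)
    return value
```

On the BizTalk server itself you would call read_workgroup_mode() and compare the result across a machine that works and one that doesn't.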
From all of this I (gently) conclude that the product instrumentation/tracing
should point out this condition more quickly to a support engineer. In
addition, the MSMQT adapter should warn you of a mismatch during configuration
with the BizTalk Explorer and, ideally, when you deploy/import bindings.
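Until the product does that for you, a pre-deployment check on the binding file is easy enough to sketch. The Python below is only an illustration: it assumes the binding file is well-formed XML and that each TransportTypeData element carries an escaped CustomProps document like the one in our snippet. The function name find_authenticated_flags is hypothetical.

```python
import xml.etree.ElementTree as ET


def find_authenticated_flags(binding_xml):
    """Return the text of every Authenticated property found inside
    escaped TransportTypeData payloads (assumed layout)."""
    flags = []
    root = ET.fromstring(binding_xml)
    for data in root.iter("TransportTypeData"):
        payload = (data.text or "").strip()
        if not payload:
            continue
        try:
            # The outer parse has already unescaped &lt; and &gt;,
            # so the payload is itself a small XML document.
            props = ET.fromstring(payload)
        except ET.ParseError:
            continue  # not XML; skip rather than guess
        for auth in props.iter("Authenticated"):
            flags.append((auth.text or "").strip())
    return flags


# Made-up sample wrapping the property from our binding file.
sample = (
    "<SendPort>"
    "<TransportTypeData>&lt;CustomProps&gt;&lt;Authenticated vt=\"11\"&gt;"
    "-1&lt;/Authenticated&gt;&lt;/CustomProps&gt;</TransportTypeData>"
    "</SendPort>"
)
print(find_authenticated_flags(sample))  # ['-1']
```

Any "-1" in the result on a workgroup-mode server is the mismatch that bit us.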
Hindsight being 20/20, the support engineer should have asked to see our
binding file, and should have compared it with one exported from a server that
was indeed sending messages (since we had one). Of course, we should have
made such a comparison, too (and much earlier...). The engineer did look
at the BizTalk Admin Console, but of course that doesn't give any of the
detailed port configuration information; only Visual Studio/BizTalk Explorer
does.
Having said that, the support engineers were great to work with and were
certainly dedicated to getting to the bottom of our issue.
Key lesson: Diffing binding files will prove to be a key troubleshooting
technique with BizTalk...
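Any diff tool will do, but for what it's worth, here is a minimal sketch of the idea in Python using the standard difflib module. The file names and sample lines are made up; in practice you would read the two exported binding files from disk and pass their text in.

```python
import difflib


def diff_bindings(good_text, bad_text,
                  good_name="working.xml", bad_name="failing.xml"):
    """Return unified-diff lines between two binding files' text."""
    return list(difflib.unified_diff(
        good_text.splitlines(),
        bad_text.splitlines(),
        fromfile=good_name,
        tofile=bad_name,
        lineterm="",
    ))


# Hypothetical fragments: the failing file has one extra property.
working = "<RetryCount>3</RetryCount>\n<RetryInterval>5</RetryInterval>\n"
failing = ('<Authenticated vt="11">-1</Authenticated>\n'
           "<RetryCount>3</RetryCount>\n<RetryInterval>5</RetryInterval>\n")

for line in diff_bindings(working, failing):
    print(line)
```

Lines starting with "+" exist only in the failing file, which is exactly where we would have spotted the Authenticated property.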