a trace of thought on...BizTalk Server, Team Foundation Server, AppFabric, etc.
 Thursday, February 26, 2004

I agree wholeheartedly with Ian Griffiths' response to Sam Gentile's recent post.

I went through this exercise with a client last summer - it was fairly long and drawn out.  The organization had always had a physically distinct middle tier responsible for data access, based on the belief that both scalability and security would be improved.

For the application in question at the time, DataSets were being returned from middle-tier objects - marshaled via .NET Remoting (binary/tcp).  Now, DataSets don't have a very compact representation when serialized, as has been described here.  Whether due to the serialization format or other aspects of the serialization process, our performance tests indicated that the time spent serializing/deserializing DataSets imposed a tremendous CPU tax on both the web server and the application server - even after implementing techniques to address it.  Throughput (requests per second) on the web servers and request latency both suffered dramatically.
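The bloat is easy to see for yourself.  Here's a minimal sketch (table and column names are purely illustrative) that serializes a modest DataSet with the BinaryFormatter and inspects the payload size - on .NET 1.x, even the "binary" formatter carries the DataSet's schema and contents as embedded XML text:

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

class DataSetSizeCheck
{
    static void Main()
    {
        // Build a small, illustrative DataSet.
        DataSet ds = new DataSet("Orders");
        DataTable t = ds.Tables.Add("Order");
        t.Columns.Add("Id", typeof(int));
        t.Columns.Add("Sku", typeof(string));
        for (int i = 0; i < 1000; i++)
            t.Rows.Add(new object[] { i, "SKU-" + i });

        // Even with the BinaryFormatter, a 1.x DataSet serializes its
        // schema and contents as embedded XML strings - so the payload
        // is far larger than the raw data it carries.
        MemoryStream ms = new MemoryStream();
        new BinaryFormatter().Serialize(ms, ds);
        Console.WriteLine("Serialized size: {0} bytes", ms.Length);
    }
}
```

Run that against a DataSet the size of the ones your middle tier returns, and the CPU tax we measured starts to look less surprising.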

We conducted extensive tests using ACT (Application Center Test) to drive 40 virtual users with no dwell time (i.e. each driver thread executed another request as soon as the last returned.)  Two web servers were used, and in the remoted middle-tier case, we had a single middle-tier server.  A single Sql Server was used.  All servers were "slightly aged" four-processor machines.  The read operations brought back large DataSets, whereas the write operations were fairly simple by comparison - the workload was intended to simulate real user profiles.

Requests Per Second (RPS) and request latency:

                        Latency
Remoted middle tier     1400 msec
Local middle tier        322 msec
Remoted middle tier     2791 msec
Local middle tier        193 msec

Notice that not only was the local middle tier (the non-remoted case) able to sustain a much higher throughput, but it had far less latency as well.  CPU utilization indicated we would need one physical middle-tier server for every web server.  (Of course, when comparing raw performance of "physical middle tier vs. not", you always need to ask: what would happen if I deployed these middle-tier servers as front-end web servers instead?  In practice, you don't even need to go that far - just getting rid of the middle-tier servers altogether will often improve performance...)

So, after evaluating performance, we decided to push for a local middle tier and allow (gasp) access to the database from the DMZ.  This led us to a long and serious discussion of the security implications, and our reasoning followed Ian's quite closely.  The Threats and Countermeasures text was a very valuable resource.  We avoided dynamic Sql entirely (in favor of stored procedures), accessed Sql Server with a low-privilege Windows account that had rights only to those stored procedures, used strongly-typed (SqlParameter) parameters - which are type/length checked - for all database calls, avoided storing connection strings in the clear via the .NET config encryption mechanism, ran Sql Server on non-standard ports, etc.  The quantity of advice to digest is large indeed - but necessary regardless of whether you deploy a physical middle tier or not...
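To make the data-access posture concrete, here's a minimal sketch of the pattern we standardized on - stored procedure only, strongly-typed parameters, no string concatenation anywhere.  The procedure and parameter names ("usp_GetOrders", "@CustomerId") are hypothetical placeholders:

```csharp
using System.Data;
using System.Data.SqlClient;

class DataAccessSketch
{
    public static DataSet GetOrders(string connectionString, int customerId)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("usp_GetOrders", conn))
        {
            // Stored procedure only - no dynamic Sql is ever concatenated,
            // and the low-privilege account has execute rights on nothing else.
            cmd.CommandType = CommandType.StoredProcedure;

            // Strongly-typed parameter - type-checked before it ever
            // reaches Sql Server.
            cmd.Parameters.Add("@CustomerId", SqlDbType.Int).Value = customerId;

            DataSet ds = new DataSet();
            new SqlDataAdapter(cmd).Fill(ds);
            return ds;
        }
    }
}
```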

Two closing thoughts on this topic.  First, Martin Fowler sums the whole thing up well in his book Patterns of Enterprise Application Architecture (Chapter 7), under the heading “Errant Architectures” - excerpted in an SD magazine article.  After introducing the topic, he says:

"…Hence, we get to my First Law of Distributed Object Design: Don’t distribute your objects! How, then, do you effectively use multiple processors [servers]? In most cases, the way to go is clustering [of more front-end web servers]. Put all the classes into a single process and then run multiple copies of that process on the various nodes. That way, each process uses local calls to get the job done and thus does things faster. You can also use fine-grained interfaces for all the classes within the process and thus get better maintainability with a simpler programming model. …All things being equal, it’s best to run the Web and application servers in a single process—but all things aren’t always equal. "

Second, does anyone remember the nile.com benchmarks that DocuLabs conducted?  I can't find the exact iteration of the benchmark I'm looking for, but they found that ISAPI components calling local COM+ components on a single Compaq 8500 (8-way) could achieve 3000 requests per second, vs. just 500 once the COM+ components were moved to a separate Compaq 8500.  Unreal.  (And by the way, given those numbers, what the heck was wrong with the ASP.NET code above?  Oh well, nile.com WAS a benchmark, after all...)

Thursday, February 26, 2004 5:38:41 PM (Central Standard Time, UTC-06:00)
 Friday, February 20, 2004

Steve Maine's post on "single-parameter service interfaces" - and the assertion that such interfaces are more in keeping with the SOA theme - got me thinking just a bit about the real relationship between [WebMethod] methods and the associated WSDL.

Recall that WSDL port types consist of operations that define (at most) one input message and one output message.  WSDL messages consist of "parts" - and with literal formatting and the "wrapped" parameter style (the default for ASMX), you will have a single "part".  The part in turn refers to an XML schema-defined element.  (Here is a concrete example to look at.)
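In WSDL terms, the wrapped doc/literal shape looks roughly like the fragments below - the element, message, and portType names are illustrative only, with prefixes as a typical ASMX-generated WSDL would declare them:

```xml
<!-- The schema defines a "wrapper" element named after the method;
     its children correspond to the method's parameters. -->
<s:element name="PlaceOrder">
  <s:complexType>
    <s:sequence>
      <s:element name="customerId" type="s:int" />
      <s:element name="sku" type="s:string" />
    </s:sequence>
  </s:complexType>
</s:element>

<!-- The message has exactly one part, referring to that element... -->
<message name="PlaceOrderSoapIn">
  <part name="parameters" element="tns:PlaceOrder" />
</message>

<!-- ...and the operation defines at most one message per direction. -->
<portType name="OrderServiceSoap">
  <operation name="PlaceOrder">
    <input message="tns:PlaceOrderSoapIn" />
    <output message="tns:PlaceOrderSoapOut" />
  </operation>
</portType>
```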

Notice that at this point, we haven't said anything about whether a) we have multiple parameters to our service interface, with a mapping between those parameters and child elements in the WSDL-referenced schema or b) we have a single (xml document) parameter to our service interface that is expected to conform to the WSDL-referenced schema (or for that matter, a single parameter consisting of a serializable class.)

But the WSDL operation definition is quite clear - there is only one message associated with each of the potential directions (input, output, and fault.)  The operation definition doesn't care whether the implementation underneath shreds the associated schema types to and from method parameters!

And in an important way, it doesn’t matter.  From the client's perspective, I can submit an xml document (or serialized object) to an operation defined on a port type, as long as that xml document conforms to the associated schema.  The client isn't forced to take a parameter-oriented view of a web service interface regardless of whether or not the server implementation is "parameterized".  Likewise, from the server's perspective, a web service interface could be implemented with consumption of (compliant) xml documents - without forcing that view on the client (who might very well prefer a parameter-style proxy to be generated from WSDL.)

This point remains true even if I were using "bare" parameter style (i.e. if I had multiple message parts) or RPC formatting (i.e. if I had a parent element for my parameters named after the web service method.)
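Here's a hedged sketch of the two implementation views of what can be the same wire contract - the class and method names are illustrative, and the response XML is a made-up placeholder:

```csharp
using System.Web.Services;
using System.Web.Services.Protocols;
using System.Xml;

public class OrderService : WebService
{
    // Parameter view: ASMX shreds the wrapper element's children
    // into method parameters (and back) for you.
    [WebMethod]
    public string PlaceOrder(int customerId, string sku)
    {
        return "order accepted for " + customerId + "/" + sku;
    }
}

public class OrderServiceDoc : WebService
{
    // Document view: the method consumes the (schema-compliant)
    // xml document directly, with no parameter shredding.
    [WebMethod]
    [SoapDocumentMethod(ParameterStyle = SoapParameterStyle.Bare)]
    public XmlElement PlaceOrder(XmlElement order)
    {
        // Work against the document itself (validate, transform, route...)
        XmlDocument reply = new XmlDocument();
        reply.LoadXml("<PlaceOrderResponse>order accepted</PlaceOrderResponse>");
        return reply.DocumentElement;
    }
}
```

A client generating a proxy from either service's WSDL has no idea (and no reason to care) which style the server chose.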

Of course, your philosophical bent will lead you to either the WSDL-first path (for the document view) or the ASMX path-of-least-resistance (for the parameter view.)

And, handling the open content case that Steve discussed is only possible with a document-oriented approach.  (XmlAnyElementAttribute could assist with the case where you want serialized/deserialized objects to stand in for raw xml documents.)
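For instance, a minimal sketch of XmlAnyElementAttribute in that role - the type and member names are illustrative:

```csharp
using System.Xml;
using System.Xml.Serialization;

public class OrderMessage
{
    // Members declared in the schema deserialize as usual.
    public int CustomerId;
    public string Sku;

    // Any elements beyond those declared above land here instead of
    // being silently dropped during deserialization - preserving the
    // open content a raw-document approach would have kept.
    [XmlAnyElement]
    public XmlElement[] ExtraContent;
}
```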

Note that the parameterized view exhibits some aspects of being a leaky abstraction.  SOAP 1.1 allows for missing values ("Applications MAY process requests with missing parameters but also MAY return a fault.") - and so does the XmlSerializer.  This means you can wind up processing schema-invalid requests and never know it.  (Is your service really going to be ok with treating missing parameters the same as freshly initialized data types?)  Since ASMX offers no schema validation by default, you really need to rely on a schema-validation SoapExtension to solve this problem.
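The heart of such an extension is just validating the request stream against your schema before the XmlSerializer sees it.  A hedged sketch of that validation step, using the .NET 1.x-era XmlValidatingReader (the schema path is a placeholder, and a real SoapExtension would map the exception to a SOAP fault):

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class RequestValidator
{
    public static void Validate(Stream soapBody, string schemaPath)
    {
        XmlValidatingReader reader =
            new XmlValidatingReader(new XmlTextReader(soapBody));
        reader.ValidationType = ValidationType.Schema;
        reader.Schemas.Add(null, schemaPath);
        reader.ValidationEventHandler +=
            new ValidationEventHandler(OnValidationError);

        // Walking the document is what triggers validation.
        while (reader.Read()) { }
    }

    static void OnValidationError(object sender, ValidationEventArgs e)
    {
        // Surface the violation rather than silently accepting the request.
        throw new ApplicationException(
            "Schema validation failed: " + e.Message);
    }
}
```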

Friday, February 20, 2004 7:22:44 AM (Central Standard Time, UTC-06:00)
 Tuesday, February 3, 2004

'Nuff said.

Tuesday, February 3, 2004 4:55:54 PM (Central Standard Time, UTC-06:00)
About the author:

Scott Colestock lives, writes, and works as an independent consultant in the Twin Cities (Minneapolis, Minnesota) area.

© Copyright 2014
Scott Colestock
DasBlog theme 'Business' created by Christoph De Baene (delarou)