On this page

Linking your pivot collections the fun and easy way
Pivoting ASP.NET event log error messages
Parsing ASP.NET event log error messages for fun and profit
Micro optimization or just good coding practice?
Fixing the DynamicControlsPlaceholder control – Making the community better
OpenXML: How to refresh a field when the document is opened
Working with the Bing Translator API
IE9: DOM Traversal
Azure: Why did my role crash?
.NET 2.0 and Access Databases
ASP.NET 2.0: Textbox ReadOnly Property
LINQ: When are my database connections closed?
String Concatenation – A Performance Story
ASP.NET GridView Grouping and Checkbox Lists
One-to-Many: What is the best way to store the values?

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

# Monday, October 11, 2010
Monday, October 11, 2010 12:09:42 AM (Central Daylight Time, UTC-05:00) ( ASP.NET | Development | Pivot )

In my previous post, I added to my series of entries on making sense of your ASP.NET event log error messages.  Note that this is entry #4 in this series.  The previous three entries can be found here:

In that last post, I walked through the PauthorLib.dll and showed you how to crawl through your event log error messages and create a pivot collection.  The result of that initial effort was a nice view into our events:

image

While this certainly got the job done and is a very powerful and compelling view into our data, we need to realize that as our data grows, the number of entries we can put into a single collection is limited.  From the Developer Overview, we see that the maximum number of items we should have in a single collection is 3,000:

image

So, while a simple collection will get the job done for your smaller amounts of data, you will really run into some challenges with your larger datasets like our ASP.NET event log error messages.  To combat this limitation you can create what’s called a Linked Collection.  The idea is that it’s just a way for you to link together related collections in order to provide a seamless experience for your users.  In our case, a natural break for our collections will be based upon the exception type with a summary collection and then a separate collection for each exception type.  If I were to draw this out: 

PivotCollections

Event Log Header Collection Source

The idea behind this structure is that the Exception summary would simply link to each of these exception collections.  First, we’ll create a collection source for our exception summary.  As in our previous collection source (in my last blog post), we inherit from the AbstractCollectionSource class and use the LoadHeaderData() method to add our facet categories to the collection.  In this case, we’ll create two categories – the number of Occurrences and the Urls where the exception occurred.  Another difference is that we are going to pass the already parsed collection of messages into the constructor.  That way we don’t have to parse the event log messages more than once.

class EventLogHeaderCollectionSource : AbstractCollectionSource
{

private IEnumerable<EventLogMessage> m_messages = null;

public EventLogHeaderCollectionSource(IEnumerable<EventLogMessage> messages,
string inputFile)
: base(inputFile)
{

m_messages = messages;


}

#region Facets

private const string OCCURRENCES = "Occurrences";
private const string URLS = "Urls";

#endregion

protected override void LoadHeaderData()
{
this.CachedCollectionData.FacetCategories.
Add(new PivotFacetCategory(OCCURRENCES, PivotFacetType.Number));
this.CachedCollectionData.FacetCategories.
Add(new PivotFacetCategory(URLS, PivotFacetType.String));

this.CachedCollectionData.Name =
"ASP.NET Error Messages - Summary";
this.CachedCollectionData.Copyright =
new PivotLink("Source", "http://www.samuraiprogrammer.com");

}
}

Then, in the LoadItems() method, we provide the logic to generate the PivotItem collection.  The one key item to make note of is the use of the Href property of the PivotItem object.  This is where we specify the collection we wish to link to this item.  Since each of the PivotItems will be a summary of the number of each exception type – we’ll name the sub-collections by its exception type.  For example, NullReferenceException.cxml, SqlException.cxml, etc.

protected override IEnumerable<PivotItem> LoadItems()
{
var results = from log in m_messages
group log by log.Exceptiontype into l
orderby l.Count() descending, l.Key
select new
{
ExceptionType = l.Key,
ExceptionCount = l.Count()
};


int index = 0;
foreach (var result in results)
{
PivotItem item = new PivotItem(index.ToString(), this);
item.Name = result.ExceptionType;
item.Description = "# of Exceptions: " + result.ExceptionCount.ToString();
item.AddFacetValues(OCCURRENCES, result.ExceptionCount);
item.Href = result.ExceptionType + ".cxml";

...

index++;
yield return item;
}

yield break;
}
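
Because the exception type is used directly as a file name in the Href, it may be worth guarding against characters that are not valid in file names.  The following is only a minimal sketch and not part of the Pauthor library: CollectionFileNames and ToCxmlFileName are hypothetical names, and replacing invalid characters with an underscore (and using "Unknown" for empty types) is just one possible choice.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CollectionFileNames
{
    // Hypothetical helper: turns an exception type name into a .cxml file name.
    public static string ToCxmlFileName(string exceptionType)
    {
        if (string.IsNullOrWhiteSpace(exceptionType))
        {
            // Assumption: messages without a parsed type share one fixed name.
            return "Unknown.cxml";
        }

        char[] invalid = Path.GetInvalidFileNameChars();

        // Replace every character the file system rejects with an underscore.
        string safe = new string(exceptionType.Trim()
            .Select(c => invalid.Contains(c) ? '_' : c)
            .ToArray());

        return safe + ".cxml";
    }
}
```

If you use something like this, the same helper would have to produce both the Href value and the name of the file you write to disk, so that the link and the file always agree.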

Event Log Collection Source Redux

Previously, when we generated the pivot collections, we were outputting all of the records into a single collection.  Now that we are generating a collection for each exception type, we will need to put a filter in our exception collection and then incorporate that filter into our item generation.  Other than that, the code we wrote previously remains largely unchanged, so I left the majority of it out and only included the snippets that we care about below.

class EventLogCollectionSource : AbstractCollectionSource
{
private IEnumerable<EventLogMessage> m_messages = null;
private string m_exceptionType = string.Empty;

public EventLogCollectionSource(
IEnumerable<EventLogMessage> messages,
string exceptionType,
string path)
: base(path)
{
m_messages = messages;
m_exceptionType = exceptionType;
}

protected override void LoadHeaderData()
{
...
this.CachedCollectionData.Name =
string.Format("{0} Error Messages", m_exceptionType);
...
}

protected override IEnumerable<PivotItem> LoadItems()
{
var results = (from message in m_messages
where message.Exceptiontype == m_exceptionType
select message);

int index = 0;
foreach (EventLogMessage message in results)
{
PivotItem item =
new PivotItem(index.ToString(), this);
item.Name = message.Exceptiontype;
item.Description = message.Exceptionmessage;

...

index++;
yield return item;
}
yield break;
}
}

Generate and test the collection

Then, the only thing we have left to do is generate and test our linked collections.  I won’t go into a lengthy explanation of how we generate the collections because I did that in the last blog entry.  I will show the broad strokes required to tie this all together, though:

// Load the raw messages into a collection
IEnumerable<EventLogMessage> messages =
LoadEventLogMessages(inputFile).ToList();

// Generate summary pivot collection
EventLogHeaderCollectionSource sourceSummary =
new EventLogHeaderCollectionSource(messages, inputFile);
...
summaryTargetFilter1.Write(sourceSummaryFilter1);

// Get the aggregate results so we know the filters
// for our exception pivot collections
var summaryResults = from log in messages
group log by log.Exceptiontype into l
orderby l.Count() descending, l.Key
select new
{
ExceptionType = l.Key,
ExceptionCount = l.Count()
};

foreach (var resultItem in summaryResults)
{
// Generate pivots for each exception type
EventLogCollectionSource source =
new EventLogCollectionSource(messages,
resultItem.ExceptionType,
inputFile);
...
targetFilter1.Write(sourceFilter1);
}

Once we have this code and everything has been generated, if we open the output folder, we’ll see the following structure:

image

We see our ExceptionSummary pivot collection and all of the deep zoom folders.  So, when we open the Pivot tool, we’ll see a nice parent collection:

image

This gives us a nice breakdown of the number of occurrences for each exception in our source data.  Immediately we see an outlier (more on that later) between the 6,000 and 7,000 item mark and when we select that tile, we see the following:

image

We also see a green “Open” box (surrounded by a red rectangle, my emphasis) which links to our NullReferenceException.cxml.  When we click that box, the tool will immediately open that collection in the UI for our perusal – providing a very similar look to what we saw in the last entry:

image

Closing Thoughts

Now, you may have noticed a contradiction above.  I said that a collection should have no more than 3,000 items and yet, with the NullReferenceException collection, we saw in the summary that it had over 6,000 items.  That is a very good point and will be a subject of a future blog post.  I wanted to illustrate the simple collections and the linked collections before we got into that third type of collection from above – the Dynamic Collection.  Stay tuned!
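
The 3,000-item guideline mentioned above can be checked before any collection is generated.  This is only a rough sketch: CollectionSizeCheck and FindOversized are hypothetical names, and the input is assumed to be one exception type name per logged event (for example, the Exceptiontype value of each parsed message).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionSizeCheck
{
    // Item guideline quoted from the Developer Overview earlier in this post.
    public const int MaxItemsPerCollection = 3000;

    // Returns the exception types whose item count exceeds the guideline,
    // largest first. Each input element is one logged event's exception type.
    public static List<KeyValuePair<string, int>> FindOversized(
        IEnumerable<string> exceptionTypes)
    {
        return exceptionTypes
            .GroupBy(t => t)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .Where(p => p.Value > MaxItemsPerCollection)
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key)
            .ToList();
    }
}
```

Any type this returns is a candidate for further splitting or for a different kind of collection.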

# Monday, October 4, 2010
Monday, October 4, 2010 10:43:49 AM (Central Daylight Time, UTC-05:00) ( ASP.NET | Development | Pivot )

Unless you’ve been hiding under the proverbial rock, you’ve probably seen the recent Pivot hoopla.  If you’re not familiar with it, it’s a way to visualize a large amount of data in a nice filterable format.  The nice thing about it is that it’s really easy to put together a pivot collection and there are a ton of tools available for just this purpose.  Just do a search on CodePlex for Pivot and you’ll get about 40 good results for tools you can use to create a Pivot Collection.

So, I was putting together a proof-of-concept for an internal project and thought I would continue on with my series of blog posts on ASP.NET Error Message event logs with a post on how to visualize this data using a pivot.  You may wish to read parts 1 and 2 here:

So, when I put together my pivot, I worked out a 3 step process:

  1. Figure out what you want to Pivot
  2. Find an API and convert the data
  3. Generate and Test the collection

Let’s begin, shall we?

Figure out what you want to Pivot

The structure for the Pivot Collection is deceptively simple:

<?xml version="1.0"?> 
<Collection Name="Hello World Collection" …>
<FacetCategories>
<FacetCategory Name="Hello World Facet Category One" Type="String"/>
</FacetCategories>
<Items ImgBase="helloworld.dzc">
<Item Img="#0" Id="0" Href="http://www.getpivot.com" Name="Hello World!">
<Description> This is the only item in the collection.</Description>
<Facets>
<Facet Name="Hello World Facet Category One">
<String Value="Hello World Facet Value"/>
</Facet>
</Facets>
</Item>
</Items>
</Collection>

I think about the Items in the Collection in the same way that you might think about an object.  For example, a Car object might have the following properties:

  • Advertising blurb
  • Car and Driver Reviews
  • Color
  • Make
  • Model
  • Engine
  • 0-60mph time
  • Max Speed

The common values like the Color, Make, Model, 0-60mph time and max speed become the facets or attributes that describe your object in relation to other instances of objects.  Things like the advertising blurbs and car and driver reviews or descriptions belong instead as properties of your Item directly in the Description element.

For our data, namely ASP.NET exceptions, we’re going to define an exception as the following:

  • Item
    • Name = Exception Type
    • Description = Exception Message
    • Facets
      • Request Path
      • Stack Trace
      • Event Time
      • Top Method of Stack Trace
      • Top My Code Method of Stack Trace

This should allow us to group and drill through the common properties that might link exceptions together and still provide detailed error information when necessary.
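
To make that mapping concrete, here is a small sketch that builds a single Item element in the same shape as the Hello World sample, using LINQ to XML.  CxmlSketch and BuildItem are hypothetical names, only one facet is shown, and the XML namespaces and image references a real collection needs are deliberately left out (they are elided in the sample above as well).  Later in this post the Pauthor library does this work for us.

```csharp
using System.Xml.Linq;

public static class CxmlSketch
{
    // Builds an <Item> element shaped like the Hello World sample.
    // Assumption: namespaces and the Img attribute are omitted for brevity.
    public static XElement BuildItem(
        int id, string exceptionType, string exceptionMessage, string requestPath)
    {
        return new XElement("Item",
            new XAttribute("Id", id),
            // Name = Exception Type
            new XAttribute("Name", exceptionType),
            // Description = Exception Message
            new XElement("Description", exceptionMessage),
            new XElement("Facets",
                new XElement("Facet",
                    new XAttribute("Name", "Request Path"),
                    new XElement("String",
                        new XAttribute("Value", requestPath)))));
    }
}
```

Calling BuildItem(0, "NullReferenceException", "Object reference not set to an instance of an object.", "/default.aspx").ToString() shows the resulting XML fragment.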

Find an API and code it

The second step here is to find some code/API/tool that we can enhance for our purposes.  There are some great tools published by the Live Labs team – for example:

While both tools could be used in this instance, in part 2 we found that some of our Event Logs we were parsing contained more than 10,000 items and I wanted a bit more control over how I converted the data.  “No touching” is the phrase of the day.  Fortunately, the command line tool was published on CodePlex with an API we can use.  Once you download the product you see that it contains 3 assemblies:

image

The last item there is the PauthorLib.dll which encapsulates many of the extension points within this great tool.  In fact, it exposes about 7 different namespaces for our purposes:

image

For our purposes, we are going to focus on the Streaming set of namespaces.  Why?  Well, this is namely because we are going to be dealing with a lot of data and I didn’t want to load everything into memory before writing it to disk.  If you look at the contents of the Streaming namespace, you’ll see a great class called “AbstractCollectionSource”.  This looks fairly promising because it exposes two main methods:

class EventLogExceptionCollectionSource : AbstractCollectionSource
{
protected override void LoadHeaderData()
{
throw new NotImplementedException();
}

protected override IEnumerable<PivotItem> LoadItems()
{
throw new NotImplementedException();
}
}

Before we do anything, though, we need a constructor.  The constructor will be responsible for taking a string representing the path to our data and passing it to our base class’s constructor.

public EventLogExceptionCollectionSource(string filePath)
: base(filePath)
{

// Do nothing else.

}

Then, the first method, LoadHeaderData(), is where we define our facets (Request Path, Stack Trace, etc.) as well as the data type of each facet.  So, our code will be fairly simple and straightforward:

protected override void LoadHeaderData()
{

this.CachedCollectionData.FacetCategories.Add(
new PivotFacetCategory(STACKTRACE,
PivotFacetType.LongString));
this.CachedCollectionData.FacetCategories.Add(
new PivotFacetCategory(REQUESTPATH,
PivotFacetType.String));
this.CachedCollectionData.FacetCategories.Add(
new PivotFacetCategory(EVENTTIME,
PivotFacetType.DateTime));
this.CachedCollectionData.FacetCategories.Add(
new PivotFacetCategory(TOPMETHOD,
PivotFacetType.String));
this.CachedCollectionData.FacetCategories.Add(
new PivotFacetCategory(TOPAPPMETHOD,
PivotFacetType.String));

this.CachedCollectionData.Name = "Event Log Error Messages";


}

The second method, LoadItems(), is responsible for doing exactly what it suggests – this is where we load the data from whichever source we care about and then convert it into our PivotItem collection.  For our purposes, we’re going to load the XML file we defined in Part 1 of this series into a list of EventLogMessage objects and then convert those EventLogMessage objects into PivotItem objects:

protected override IEnumerable<PivotItem> LoadItems()
{
// Load XML file
XDocument document = XDocument.Load(this.BasePath);

// Populate collection of EventLogMessage objects
var messages = from message in document.Descendants("Message")
select EventLogMessage.Load(message.Value);

int index = 0;
foreach (EventLogMessage message in messages)
{

PivotItem item = new PivotItem(index.ToString(), this);
item.Name = message.Exceptiontype;
item.Description = message.Exceptionmessage;
item.AddFacetValues(REQUESTPATH, message.Requestpath);
item.AddFacetValues(STACKTRACE, message.Stacktrace);
item.AddFacetValues(EVENTTIME, message.Eventtime);
item.AddFacetValues(TOPMETHOD, message.StackTraceFrames[0].Method);
item.AddFacetValues(TOPAPPMETHOD, GetFirstNonMicrosoftMethod(message.StackTraceFrames));

index++;
yield return item;

}
}
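
One detail worth a second look: the Event Time facet is declared as PivotFacetType.DateTime, while the Eventtime property on EventLogMessage is a string.  Depending on how the library treats string values for that facet type, it may be safer to convert the text first.  The sketch below is an assumption-laden example rather than anything from the Pauthor API: EventTimeParser is a hypothetical name, and it assumes the logged time uses month/day/year ordering such as "8/29/2010 6:04:20 AM".

```csharp
using System;
using System.Globalization;

public static class EventTimeParser
{
    // Tries to convert the raw "Event time" text into a DateTime.
    // Assumption: the text uses month/day/year ordering, which the
    // invariant culture also expects.
    public static bool TryParseEventTime(string raw, out DateTime value)
    {
        return DateTime.TryParse(
            (raw ?? string.Empty).Trim(),
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeLocal,
            out value);
    }
}
```

If the servers that produced the log use a different regional format, the culture passed to TryParse would need to change accordingly.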

The key method calls from above are the AddFacetValues(…) method calls.  This method essentially sets the attributes we wish to have our data pivot upon.  This, by itself, isn’t enough to generate our pivot collection – we need to call our code from somewhere.  Since this is a simple app, we’re going to make it a console app.  For our Collection to get generated we need to use a few other objects included in this API:

  • EventLogExceptionCollectionSource – The class we created above.
  • HtmlImageCreationSourceFilter – This class will generate the tile in the Pivot based upon some HTML template we specify.
  • LocalCxmlCollectionTarget – Generates the Collection XML file at the path we specify.
  • DeepZoomTargetFilter – Generates the deep zoom files to support our collection XML file and also enables all of our fancy transitions.

In practice, the code is pretty simple and straightforward and I applaud the people who wrote this library:

private static void GenerateExceptionPivot(string inputFile, string outputFolder)
{
string collectionName = Path.Combine(outputFolder, "MyExceptions.cxml");

EventLogExceptionCollectionSource source =
new EventLogExceptionCollectionSource(inputFile);
HtmlImageCreationSourceFilter sourceFilter1 =
new HtmlImageCreationSourceFilter(source);
sourceFilter1.HtmlTemplate =
"<html><body><h1>{name}</h1>{description}</body></html>";
sourceFilter1.Width = 600;
sourceFilter1.Height = 600;

LocalCxmlCollectionTarget target =
new LocalCxmlCollectionTarget(collectionName);
DeepZoomTargetFilter targetFilter1 =
new DeepZoomTargetFilter(target);
targetFilter1.Write(sourceFilter1);

}

That last statement, targetFilter1.Write(…) is what will actually execute everything and write our resultant files to disk.

Generate and Test the collection

So, now if we run our console application and call that GenerateExceptionPivot(…) method, we’ll get some great output.

image

What’s nice about the Library is that it provides progress as it iterates through your data (in the red rectangle) and also, in the blue rectangle, we see that it’s multi-threaded by default.  This is primarily for the most intensive part of the operation – the creation of the deep zoom artifacts.  If you have one of those newfangled machines with 2+ cores, you can tweak the number of threads that it will spawn for this operation by setting the ThreadCount property of the DeepZoomTargetFilter object.  This may or may not improve your performance but it’s nice that the option is available.

...
DeepZoomTargetFilter targetFilter1 =
new DeepZoomTargetFilter(target);

targetFilter1.ThreadCount = 100;
targetFilter1.Write(sourceFilter1);
...
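
Rather than hard-coding a number, you could derive the value from the machine you are running on.  This is only a sketch under a stated assumption, namely that one thread per logical core is a sensible starting point for CPU-bound work like image generation; ThreadCountSketch and SuggestThreadCount are hypothetical names, and you should measure before settling on a value.

```csharp
using System;

public static class ThreadCountSketch
{
    // Caps the requested worker count at the number of logical cores
    // and never returns less than one.
    public static int SuggestThreadCount(int requested)
    {
        int cores = Math.Max(1, Environment.ProcessorCount);
        return Math.Min(Math.Max(1, requested), cores);
    }
}
```

With that in place, the assignment above could become targetFilter1.ThreadCount = ThreadCountSketch.SuggestThreadCount(100);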

Once our collection has been generated, we can browse it in an explorer.exe window just to get an idea of what our code has wrought:

image

And then to test it, you can just point the Live Labs Pivot application at our “MyExceptions.cxml” file and view the wonderful data.  For example, you can look at the Event Time in the histogram view to see how your exceptions broke down over time.  You can also filter your data by things like the RequestPath (the page that threw the exception) or the Method that was at the top of the callstack.

image

Then, you can zoom in on a specific time slice you care about:

image

Then, if you want to view the details for a specific instance, just click the corresponding tile.  Then, a new side bar will appear on the right hand side with all of the details we stored in this record:

image

We generated a single collection in this blog post.  One thing to keep in mind is that each collection should have no more than 3,000 items.  For collections in which you want to have more than 3,000 items, you should look at potentially creating a Linked Collection.  That will be the subject of an upcoming blog post. 

Until next time!

# Sunday, August 29, 2010
Sunday, August 29, 2010 5:10:07 PM (Central Daylight Time, UTC-05:00) ( ASP.NET | Best Practice | Development )

Sometimes a customer will ask me to look at their site and make some recommendations on what can be improved.  One of the many things I’ll look at is their event logs.  One of the nice things about ASP.NET is that when you encounter an unhandled exception, an event will be placed into your Application event log.  The message of the event log entry will usually include lots of good stuff like the application, path, machine name, exception type, stack trace, etc.  Loads of great stuff and all for free.  For customers that don’t have a centralized exception logging strategy, this can be a gold mine.

The way it usually works is that they will provide me an EVTX from their servers.  If you’re not aware, an EVTX is just an archive of the events from the event log you specify.  By itself, looking at the raw event logs from a server can be quite daunting.  There are usually thousands of entries in the event log and filtering down to what you actually care about can be exhausting.  Even if you do find a series of ASP.NET event log messages, the problem has always been – how do you take all of this great information that’s just dumped into the Message property of the event log entry and put it into a format you can easily report on, generate statistics, etc.  Fortunately, I have a non-painful solution. 

I’ve broken this down into a relatively simple 4-step process:

  • Get the EVTX
  • Generate a useful XML file
  • Parse into an object model
  • Analyze and report on the data

Let’s get to it.

Step 1:  Get the EVTX

This step is pretty short and sweet.  In the Event Log manager, select the “Application” group and then select the “Save All Events As…” option. 

image

That will produce an EVTX file with whatever name you specify.  Once you have the file, transfer it to your machine as you generally do not want to install too many tools in your production environment.

Step 2:  Generate a useful XML file

Now that we have the raw EVTX file, we can get just the data we care about using a great tool called LogParser.  Jeff Atwood did a nice little write-up on the tool but, simply put, it’s the Swiss Army knife of parsing tools.  It can do just about anything data related you would wish using a nice pseudo-SQL language.  We’ll use the tool to pull out just the data from the event log we want and dump it into an XML file.  The query that we can use for this task is daunting in its simplicity:

SELECT Message INTO MyData.xml
FROM '*.evtx'
WHERE EventID=1309

The only other thing we need to tell LogParser is the format the data is coming in and the format to put it into.  This makes our single command the following:

C:\>logparser -i:EVT -o:XML "SELECT Message INTO MyData.xml FROM '*.evtx' WHERE EventID=1309"

This will produce a nice XML file that looks something like the following:

<?xml version="1.0" encoding="ISO-10646-UCS-2" standalone="yes" ?>
<ROOT DATE_CREATED="2010-08-29 06:04:20" CREATED_BY="Microsoft Log Parser V2.2">
<ROW>
<Message>Event code: 3005 Event message: An unhandled exception has occurred...
</Message>
</ROW>
...
</ROOT>

One thing that you may notice is that all of the nicely formatted data from our original event log message is munged together into one unending string.  This will actually work in our favor but more on that in the next step.

Step 3:  Parse into an object model

So, now that we have an XML file with all of our event details, let’s do some parsing.  Since all of our data is in one string, the simplest method is to apply a RegEx expression with grouping to grab the data we care about. 

In a future post, I’ll talk about a much faster way of getting this type of data without a RegEx expression.  After all, refactoring is a way of life for developers.

private const string LargeRegexString = @"Event code:(?<Eventcode>.+)" +
@"Event message:(?<Eventmessage>.+)" +
@"Event time:(?<Eventtime>.+)" +
@"Event time \(UTC\):(?<EventtimeUTC>.+)" +
@"Event ID:(?<EventID>.+)" +
@"Event sequence:(?<Eventsequence>.+)" +
@"Event occurrence:(?<Eventoccurrence>.+)" +
@"Event detail code:(?<Eventdetailcode>.+)" +
@"Application information:(?<Applicationinformation>.+)" +
@"Application domain:(?<Applicationdomain>.+)" +
@"Trust level:(?<Trustlevel>.+)" +
@"Full Application Virtual Path:(?<FullApplicationVirtualPath>.+)" +
@"Application Path:(?<ApplicationPath>.+)" +
@"Machine name:(?<Machinename>.+)" +
@"Process information:(?<Processinformation>.+)" +
@"Process ID:(?<ProcessID>.+)" +
@"Process name:(?<Processname>.+)" +
@"Account name:(?<Accountname>.+)" +
@"Exception information:(?<Exceptioninformation>.+)" +
@"Exception type:(?<Exceptiontype>.+)" +
@"Exception message:(?<Exceptionmessage>.+)" +
@"Request information:(?<Requestinformation>.+)" +
@"Request URL:(?<RequestURL>.+)" +
@"Request path:(?<Requestpath>.+)" +
@"User host address:(?<Userhostaddress>.+)" +
@"User:(?<User>.+)" +
@"Is authenticated:(?<Isauthenticated>.+)" +
@"Authentication Type:(?<AuthenticationType>.+)" +
@"Thread account name:(?<Threadaccountname>.+)" +
@"Thread information:(?<Threadinformation>.+)" +
@"Thread ID:(?<ThreadID>.+)" +
@"Thread account name:(?<Threadaccountname>.+)" +
@"Is impersonating:(?<Isimpersonating>.+)" +
@"Stack trace:(?<Stacktrace>.+)" +
@"Custom event details:(?<Customeventdetails>.+)";

Now that we have our RegEx, we’ll just write the code to match it against a string and populate our class.   While I’ve included the entire regex above, I’ve only included a partial implementation of the class population below.

public class EventLogMessage
{

private static Regex s_regex = new Regex(LargeRegexString, RegexOptions.Compiled);

public static EventLogMessage Load(string rawMessageText)
{

Match myMatch = s_regex.Match(rawMessageText);
EventLogMessage message = new EventLogMessage();
message.Eventcode = myMatch.Groups["Eventcode"].Value;
message.Eventmessage = myMatch.Groups["Eventmessage"].Value;
message.Eventtime = myMatch.Groups["Eventtime"].Value;
message.EventtimeUTC = myMatch.Groups["EventtimeUTC"].Value;
message.EventID = myMatch.Groups["EventID"].Value;
message.Eventsequence = myMatch.Groups["Eventsequence"].Value;
message.Eventoccurrence = myMatch.Groups["Eventoccurrence"].Value;
...
return message;
}

public string Eventcode { get; set; }
public string Eventmessage { get; set; }
public string Eventtime { get; set; }
public string EventtimeUTC { get; set; }
public string EventID { get; set; }
public string Eventsequence { get; set; }
public string Eventoccurrence { get; set; }
...
}
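
To see the named groups at work on something small, here is a cut-down sketch that uses only two of the fields from the full expression and a made-up, shortened message.  RegexGroupDemo and Extract are hypothetical names.  It also shows two things the partial class above does not: checking Match.Success, and trimming the captured value (the greedy .+ captures the spaces around each value too).

```csharp
using System.Text.RegularExpressions;

public static class RegexGroupDemo
{
    // Shortened pattern: only two of the fields from the full expression.
    private static readonly Regex s_demo = new Regex(
        @"Exception type:(?<Exceptiontype>.+)" +
        @"Exception message:(?<Exceptionmessage>.+)" +
        @"Request information:",
        RegexOptions.Compiled);

    // Returns the trimmed value of a named group,
    // or null when the text does not match the pattern at all.
    public static string Extract(string rawMessageText, string groupName)
    {
        Match match = s_demo.Match(rawMessageText);
        return match.Success ? match.Groups[groupName].Value.Trim() : null;
    }
}
```

For a message such as "Exception information: Exception type: InvalidOperationException Exception message: Something failed. Request information: ...", Extract(message, "Exceptiontype") yields the type name without the surrounding spaces.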

The last step is just to read in the XML file and instantiate these objects.

XDocument document = XDocument.Load(@"<path to data>\MyData.xml");

var messages = from message in document.Descendants("Message")
select EventLogMessage.Load(message.Value);
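
If you want to try the reading step without a file on disk, the same Descendants("Message") query works against XML held in a string.  In this sketch MessageReader and ReadMessages are hypothetical names, the XML passed in stands in for MyData.xml, and skipping empty Message elements is an extra assumption on my part.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MessageReader
{
    // Returns the trimmed text of every non-empty <Message> element
    // in LogParser-style XML (ROOT / ROW / Message).
    public static List<string> ReadMessages(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        return document.Descendants("Message")
            .Select(m => m.Value.Trim())
            .Where(text => text.Length > 0)
            .ToList();
    }
}
```

Each returned string can then be handed to EventLogMessage.Load(...) exactly as in the query above.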

Now that we have our objects and everything is parsed just right, we can finally get some statistics and make sense of the data.

Step 4:  Analyze and report on the data

This last step is really the whole point of this exercise.  Fortunately, now that all of the data is in a format we can easily query using our old friend LINQ, the actual aggregates and statistics are trivial.  Everyone’s needs are going to be different, but I’ll provide a few queries that might be useful.

Query 1:  Exception Type Summary

For example, let’s say you wanted to output a breakdown of the various Exception Types in your log file.  The query you would use for that would be something like:

var results = from log in messages
group log by log.Exceptiontype into l
orderby l.Count() descending, l.Key
select new
{
ExceptionType = l.Key,
ExceptionCount = l.Count()
};

foreach (var result in results)
{

Console.WriteLine("{0} : {1} time(s)",
result.ExceptionType,
result.ExceptionCount);

}

This would then output something like:

WebException : 15 time(s)
InvalidOperationException : 7 time(s)
NotImplementedException : 2 time(s)
InvalidCastException : 1 time(s)
MissingMethodException : 1 time(s)

Query 2:  Exception Type and Request URL Summary

Let’s say that you wanted to go deeper and get the breakdown of which URLs generated the most exceptions.  You can just expand the foreach loop in the above snippet to do the following:

foreach (var result in results)
{

Console.WriteLine("{0} : {1} time(s)",
result.ExceptionType,
result.ExceptionCount);

var requestUrls = from urls in messages
where urls.Exceptiontype == result.ExceptionType
group urls by urls.RequestURL.ToLower() into url
orderby url.Count() descending, url.Key
select new
{
RequestUrl = url.Key,
Count = url.Count()
};

foreach (var url in requestUrls){

Console.WriteLine("\t{0} : {1} times ",
url.RequestUrl,
url.Count);
}
}

This then would produce output like this:

WebException : 15 time(s)
    http://localhost/menusample/default.aspx : 11 times
    http://localhost:63188/menusample/default.aspx : 4 times
InvalidOperationException : 7 time(s)
    http://localhost:63188/menusample/default.aspx : 6 times
    http://localhost/menusample/default.aspx : 1 times
NotImplementedException : 2 time(s)
    http://localhost/samplewebsiteerror/default.aspx : 2 times
InvalidCastException : 1 time(s)
    http://localhost:63188/menusample/default.aspx : 1 times
MissingMethodException : 1 time(s)
    http://localhost:63188/menusample/default.aspx : 1 times
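
The nested query above filters the full message list again for every exception type.  For larger logs, one alternative is to group each type's own messages by URL instead.  This is a sketch of that idea, not a drop-in replacement: LoggedError is a hypothetical stand-in for the two EventLogMessage properties involved, and ErrorSummary.Build returns the report lines rather than writing them to the console.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the two EventLogMessage properties used here.
public class LoggedError
{
    public string ExceptionType { get; set; }
    public string RequestUrl { get; set; }
}

public static class ErrorSummary
{
    // Builds report lines in the same shape as the console output above:
    // one line per exception type, then one tab-indented line per URL.
    public static List<string> Build(IEnumerable<LoggedError> errors)
    {
        var lines = new List<string>();

        var byType = errors
            .GroupBy(e => e.ExceptionType)
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key);

        foreach (var typeGroup in byType)
        {
            lines.Add(string.Format("{0} : {1} time(s)",
                typeGroup.Key, typeGroup.Count()));

            // Group only this type's messages; no second pass over everything.
            var byUrl = typeGroup
                .GroupBy(e => e.RequestUrl.ToLowerInvariant())
                .OrderByDescending(g => g.Count())
                .ThenBy(g => g.Key);

            foreach (var urlGroup in byUrl)
            {
                lines.Add(string.Format("\t{0} : {1} times",
                    urlGroup.Key, urlGroup.Count()));
            }
        }

        return lines;
    }
}
```

Writing the result out is then just a matter of looping over the returned lines with Console.WriteLine.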

Query 3:  Exception Type, Request URL and Method Name Summary

You can even go deeper, if you so desire, to find out which of your methods threw the most exceptions.  For this to work, we need to make a slight change to our EventLogMessage class to parse the Stack Trace data into a class.  First, we’ll start with our simple little StackTraceFrame class:

public class StackTraceFrame
{
public string Method { get; set; }

}

Second, add a new property to our EventLogMessage class to hold a List<StackTraceFrame>:

public List<StackTraceFrame> StackTraceFrames { get; set; }

Lastly, add a method (and its caller) to parse out the stack frames and assign the resulting List to the StackTraceFrames property mentioned above:

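public static EventLogMessage Load(string rawMessageText)
{
    Match myMatch = s_regex.Match(rawMessageText);
    EventLogMessage message = new EventLogMessage();
    ...
    message.Stacktrace = myMatch.Groups["Stacktrace"].Value;
    ...
    // Split the raw stack trace into individual frames.
    message.StackTraceFrames = ParseStackTrace(message.Stacktrace);
    return message;
}

// Static so that it can be called from the static Load(...) method.
private static List<StackTraceFrame> ParseStackTrace(string stackTrace)
{
    List<StackTraceFrame> frames = new List<StackTraceFrame>();

    // Each frame in the flattened message is introduced by " at ".
    string[] stackTraceSplit = stackTrace.Split(new string[] { " at " },
        StringSplitOptions.RemoveEmptyEntries);
    foreach (string st in stackTraceSplit)
    {
        // Trim so that whitespace-only pieces are skipped and
        // later StartsWith(...) checks see the method name first.
        string frame = st.Trim();
        if (frame.Length > 0)
        {
            frames.Add(new StackTraceFrame() { Method = frame });
        }
    }
    return frames;
}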

Please Note:  You could enhance the ParseStackTrace(…) method to parse out the source files, line numbers, etc. I’ll leave this as an exercise for you, dear reader.
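
If you would like a head start on that exercise, here is one possible sketch.  It assumes frames look like "Namespace.Type.Method(args) in C:\path\File.cs:line 42", with the " in ...:line N" part missing when no debug symbols were available.  DetailedFrame and FrameParser are hypothetical names and are not part of the code above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical richer frame type; the StackTraceFrame above only has Method.
public class DetailedFrame
{
    public string Method { get; set; }
    public string File { get; set; }   // null when the frame has no source info
    public int? Line { get; set; }     // null when the frame has no source info
}

public static class FrameParser
{
    // Assumed frame shape: "Method(args)" optionally followed by
    // " in <file>:line <number>".
    private static readonly Regex s_frame = new Regex(
        @"^(?<method>.+?)(?: in (?<file>.+):line (?<line>\d+))?\s*$");

    public static DetailedFrame Parse(string frameText)
    {
        string text = (frameText ?? string.Empty).Trim();
        Match match = s_frame.Match(text);
        if (!match.Success)
        {
            // Fall back to treating the whole text as the method.
            return new DetailedFrame { Method = text };
        }

        var frame = new DetailedFrame { Method = match.Groups["method"].Value };
        if (match.Groups["file"].Success)
        {
            frame.File = match.Groups["file"].Value;
            frame.Line = int.Parse(match.Groups["line"].Value);
        }
        return frame;
    }
}
```

Frames without source information simply come back with File and Line left as null.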

Now that we have the infrastructure in place, the query is just as simple.  We’ll just nest this additional query inside of our URL query like so:

foreach (var url in requestUrls){

Console.WriteLine("\t{0} : {1} times ",
url.RequestUrl,
url.Count);

var methods = from method in messages
where string.Equals(method.RequestURL,
url.RequestUrl,
StringComparison.InvariantCultureIgnoreCase)
&&
string.Equals(method.Exceptiontype,
result.ExceptionType,
StringComparison.InvariantCultureIgnoreCase)
group method by method.StackTraceFrames[0].Method into mt
orderby mt.Count() descending, mt.Key
select new
{
MethodName = mt.Key,
Count = mt.Count()
};

foreach (var method in methods)
{
Console.WriteLine("\t\t{0} : {1} times ",
method.MethodName,
method.Count
);
}
}

This would then produce output like the following:

WebException : 15 time(s)
    http://localhost/menusample/default.aspx : 11 times
        System.Net.HttpWebRequest.GetResponse() : 11 times
    http://localhost:63188/menusample/default.aspx : 4 times
        System.Net.HttpWebRequest.GetResponse() : 4 times
InvalidOperationException : 7 time(s)
    http://localhost:63188/menusample/default.aspx : 6 times
        System.Web.UI.WebControls.Menu... : 6 times
    http://localhost/menusample/default.aspx : 1 times
        System.Web.UI.WebControls.Menu... : 1 times

One last thing you may notice is that in the example above, the first frame for each of those exceptions is somewhere in the bowels of the .NET BCL.  You may want to filter this out even further to only return YOUR method.  This can be accomplished very easily with the method below.  It will simply loop through the StackTraceFrame List and return the first method it encounters that does not start with “System.” or “Microsoft.”.

private static string GetMyMethod(List<StackTraceFrame> frames)
{
    foreach (StackTraceFrame frame in frames)
    {
        if (!frame.Method.StartsWith("System.") &&
            !frame.Method.StartsWith("Microsoft."))
            return frame.Method;
    }

    return "No User Code detected.";
}

Then, you can just call that method from the new query we wrote above:

var methods = from method in messages
where ...
group method by
GetMyMethod(method.StackTraceFrames) into mt
...

Finally, with this new snippet in place, we’ll get output like this:

WebException  : 15 time(s)
    http://localhost/menusample/default.aspx : 11 times
        _Default.Page_Load(Object sender, EventArgs e)...: 8 times
        No User Code detected. : 3 times

    http://localhost:63188/menusample/default.aspx : 4 times
        _Default.Page_Load(Object sender, EventArgs e)... : 1 times
        No User Code detected. : 1 times
        WebControls.CustomXmlHierarchicalDataSourceView.Select()... : 2 times
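
To try the grouping without an event log handy, here is a minimal, self-contained sketch.  The ErrorMessage class, the sample data and the CountByUserMethod helper are made up for illustration (the real message type from this series has more members), and the ordinal prefix comparison is my assumption about what you would want here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, minimal stand-ins for the types used in the post.
public class StackTraceFrame
{
    public string Method { get; set; }
}

public class ErrorMessage
{
    public List<StackTraceFrame> StackTraceFrames { get; set; }
}

public static class UserFrameDemo
{
    // Same idea as GetMyMethod: return the first frame that is not framework code.
    public static string GetMyMethod(List<StackTraceFrame> frames)
    {
        foreach (StackTraceFrame frame in frames)
        {
            if (!frame.Method.StartsWith("System.", StringComparison.Ordinal) &&
                !frame.Method.StartsWith("Microsoft.", StringComparison.Ordinal))
            {
                return frame.Method;
            }
        }
        return "No User Code detected.";
    }

    // Group messages by their first user-code frame, most frequent first.
    public static List<KeyValuePair<string, int>> CountByUserMethod(
        IEnumerable<ErrorMessage> messages)
    {
        return (from m in messages
                group m by GetMyMethod(m.StackTraceFrames) into g
                orderby g.Count() descending, g.Key
                select new KeyValuePair<string, int>(g.Key, g.Count())).ToList();
    }

    // Builds one fake message from a list of method names (top frame first).
    private static ErrorMessage Message(params string[] methods)
    {
        return new ErrorMessage
        {
            StackTraceFrames = methods
                .Select(m => new StackTraceFrame { Method = m })
                .ToList()
        };
    }

    public static List<ErrorMessage> SampleMessages()
    {
        return new List<ErrorMessage>
        {
            Message("System.Net.HttpWebRequest.GetResponse()",
                    "_Default.Page_Load(Object sender, EventArgs e)"),
            Message("System.Net.HttpWebRequest.GetResponse()",
                    "_Default.Page_Load(Object sender, EventArgs e)"),
            Message("System.Net.HttpWebRequest.GetResponse()",
                    "System.Web.UI.Control.OnLoad(EventArgs e)")
        };
    }

    public static void Main()
    {
        foreach (var pair in CountByUserMethod(SampleMessages()))
        {
            Console.WriteLine("{0} : {1} times", pair.Key, pair.Value);
        }
    }
}
```

The third sample message has only framework frames, so it lands in the “No User Code detected.” bucket.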

As you can see, the sky’s the limit.

Enjoy!

# Friday, August 20, 2010
Friday, August 20, 2010 11:36:27 PM (Central Daylight Time, UTC-05:00) ( .NET | Best Practice | C# | Code Reviews | Development | Performance )

This is a common topic and I thought I’d write up some thoughts I have on it.  In fact, I was just working with a customer on improving their code reviews and what they should be checking for, and the question arose: “Should performance be targeted during a code review?”  It’s an interesting question.  I’m a big fan of performance testing early and often rather than waiting until the end of a dev cycle, but code reviews, IMO, should focus on logic, maintainability and best practices.  I may be in the minority, and if you look around the web, you’ll see varying opinions on the topic.  For example, one of the PAG articles states:

“Code reviews should be a regular part of your development process. Performance and scalability code reviews focus on identifying coding techniques and design choices that could lead to performance and scalability issues. The review goal is to identify potential performance and scalability issues before the code is deployed. The cost and effort of fixing performance and scalability flaws at development time is far less than fixing them later in the product deployment cycle.

Avoid performance code reviews too early in the coding phase because this can restrict your design options. Also, bear in mind that that performance decisions often involve tradeoffs. For example, it is easy to reduce maintainability and flexibility while striving to optimize code.”

As I mentioned above, I am a huge proponent of performance analysis and optimization many times throughout a typical product development cycle.  I can say with a fair amount of certainty that if you don’t build performance reviews into your project plan at regular intervals, you will hit some problem (or multiple problems) in production and have to refactor some code. 

Circling back to the original question, though, are code reviews the place for performance analysis?  Typically, I’d recommend using them to squash little bits of bad code but maintainability and code-cleanliness should be first and foremost in your minds.  That said, if you see a pattern that you know can be improved, by all means bring it up.  What’s an example of that type of situation? 

Let’s take a look at predicates, specifically their usage in the Find method of a List<T>.  If you’re not aware, the Find() method performs a linear search through the items and returns as soon as it finds the first match.  This makes it an O(n) operation, where “n” is the number of items in the list.  Basically, this means that the more items you have in the list, the longer a Find() operation can potentially take.  So, if we slam about 10,000 elements into a list:

private static List<Data> LoadList()
{
    List<Data> myList = new List<Data>();
    for (int i = 0; i < 10000; i++)
    {
        myList.Add(new Data() { Id = "Id" + i.ToString(),
                                Value = "Value" + i.ToString() });
    }

    return myList;
}

Then, if someone wants to return the instance of the Data class that contains an Id of say “Id10000”, they might write the following code:

static Data Find1(List<Data> myList, string idToFind)
{
    Data data = myList.Find(s =>
        s.Id.ToLower() ==
        idToFind.ToLower());

    return data;
}

Now, keep in mind that the predicate is executed for each element in the List<T> until it finds the instance you care about.  With that in mind, we would probably want to hoist the “idToFind.ToLower()” call above the predicate, since that value isn’t changing.  So, you might end up with something like this:

static Data Find2(List<Data> myList, string idToFind)
{
    idToFind = idToFind.ToLower();

    Data data = myList.Find(s =>
        s.Id.ToLower() ==
        idToFind);

    return data;
}

Another route you may want to go is just to use the string.Equals(…) method to perform the comparison.  That would look like:

static Data Find3(List<Data> myList, string idToFind)
{
    Data data = myList.Find(s =>
        string.Equals(
            s.Id,
            idToFind,
            StringComparison.InvariantCultureIgnoreCase)
        );

    return data;
}

Fact is, the last method IS the fastest way to perform the operation.  I can say that without even needing to run it through a profiler.  But if you don’t believe me…  

Function Name | Elapsed Inclusive Time
...Find1(System.Collections.Generic.List`1<....Data>,string) | 6.34
...Find2(System.Collections.Generic.List`1<....Data>,string) | 4.47
...Find3(System.Collections.Generic.List`1<....Data>,string) | 3.65

That’s something I might put into the category of a micro-optimization AND just good coding practice.  But is this something that should be caught during a code review?  I’d say “yes” because logically it all makes sense and none of the solutions would really hurt maintainability or readability of the code.
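
If you’d rather measure this yourself than take the profiler numbers on faith, a rough Stopwatch harness along these lines is enough to compare the three variants.  The shape of the Data class, the iteration count and the Time helper are assumptions for this sketch; absolute timings will differ by machine and runtime, so none are shown:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Assumed shape of the Data class used in the post.
public class Data
{
    public string Id { get; set; }
    public string Value { get; set; }
}

public static class FindTiming
{
    public static List<Data> LoadList()
    {
        List<Data> myList = new List<Data>();
        for (int i = 0; i < 10000; i++)
        {
            myList.Add(new Data { Id = "Id" + i, Value = "Value" + i });
        }
        return myList;
    }

    // Lower-cases both sides on every predicate call.
    public static Data Find1(List<Data> myList, string idToFind)
    {
        return myList.Find(s => s.Id.ToLower() == idToFind.ToLower());
    }

    // Lower-cases the search value once, outside the predicate.
    public static Data Find2(List<Data> myList, string idToFind)
    {
        idToFind = idToFind.ToLower();
        return myList.Find(s => s.Id.ToLower() == idToFind);
    }

    // Case-insensitive comparison with no intermediate strings.
    public static Data Find3(List<Data> myList, string idToFind)
    {
        return myList.Find(s => string.Equals(
            s.Id, idToFind, StringComparison.InvariantCultureIgnoreCase));
    }

    // Runs the search repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<Data> find, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            find();
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        List<Data> list = LoadList();
        const string id = "ID9999"; // last element, so each search scans the whole list
        const int iterations = 200;

        Console.WriteLine("Find1: {0} ms", Time(() => Find1(list, id), iterations).TotalMilliseconds);
        Console.WriteLine("Find2: {0} ms", Time(() => Find2(list, id), iterations).TotalMilliseconds);
        Console.WriteLine("Find3: {0} ms", Time(() => Find3(list, id), iterations).TotalMilliseconds);
    }
}
```

Searching for the last element is deliberate: it forces every call to walk the entire list, which is where the per-element cost of the predicate shows up most clearly.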
 
So, I’d tag this as a good coding practice.  Other thoughts on the topic?
 
Enjoy!
# Saturday, August 14, 2010
Saturday, August 14, 2010 12:45:39 PM (Central Daylight Time, UTC-05:00) ( ASP.NET | Development | Premier Field Engineer (PFE) )

In my job as a PFE for Microsoft, I read, review and fix a lot of code.  A lot of code.  It’s a large part of what I love about my job.  The code is generally written by large corporations or for public websites.  Every now and again I’ll get pinged on an issue and, after troubleshooting it, it’s pretty clear that the core issue is with some community code.  When I say community code, in this instance, I don’t mean a CodeProject or CodePlex project.  In this case, I am referring to a control that Denis Bauer created and then made available to the community on his website – the “DynamicControlsPlaceholder” control.  This is a great little control that inherits from PlaceHolder, allows you to create dynamic controls on the fly, and then persists the controls you add across subsequent requests – like a postback.

The Problem

The customer was experiencing a problem that could only be replicated in a web farm without sticky sessions turned on.  They found that when a request went from one server to another server in their farm, they would get a FileNotFoundException with the following details:

Type Of Exception:FileNotFoundException
Message:Error on page http://blahblahblah.aspx
Exception Information:System.IO.FileNotFoundException:
Could not load file or assembly 'App_Web_myusercontrol.ascx.cc671b29.ypmqvhaw,
Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies.
The system cannot find the file specified.
File name: 'App_Web_myusercontrol.ascx.cc671b29.ypmqvhaw,
Version=0.0.0.0, Culture=neutral, PublicKeyToken=null'
at System.RuntimeTypeHandle.GetTypeByName(String name,
Boolean throwOnError,
Boolean ignoreCase,
Boolean reflectionOnly,
StackCrawlMark& stackMark)
...
at DynamicControlsPlaceholder.RestoreChildStructure(Pair persistInfo,
Control parent)
at DynamicControlsPlaceholder.LoadViewState(Object savedState)
at System.Web.UI.Control.LoadViewStateRecursive(Object savedState)
...
at System.Web.UI.Control.LoadViewStateRecursive(Object savedState)
at System.Web.UI.Page.LoadAllState()
at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint,
Boolean includeStagesAfterAsyncPoint)

So, we can glean a few things from the error details:

  • They are using the ASP.NET website model (the “app_web_….dll” assembly is the clue).
  • The error is occurring in the RestoreChildStructure method of the DynamicControlsPlaceholder control.

The Research

The way that ASP.NET Websites work is that each component of your site can be compiled into a separate assembly.  The assembly name is randomly generated.  This also means that on two servers, the name of the assemblies can end up being different.  So, an assumption to make is that something is trying to load an assembly by its name.  If we look at the RestoreChildStructure method, we see the following:


Type ucType = Type.GetType(typeName[1], true, true);

try
{
    MethodInfo mi = typeof(Page).GetMethod("LoadControl",
        new Type[2] { typeof(Type), typeof(object[]) });
    control = (Control) mi.Invoke(this.Page, new object[2] { ucType, null });
}
catch (Exception e)
{
    throw new ArgumentException(String.Format("The type '{0}' …",
        ucType.ToString()), e);
}

The important thing to look at here is the Type.GetType(…) call.  Since the code for the control is in a separate assembly from everything else, the “typeName[1]” value MUST BE AN ASSEMBLY-QUALIFIED TYPE NAME.  From the exception details, we can see that it is attempting to load the type from the following assembly:

App_Web_myusercontrol.ascx.cc671b29.ypmqvhaw, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null

The “typeName[1]” variable is loaded from ViewState because that’s where the control persists its child structure.  So, for some reason the fully qualified assembly name is stored in ViewState.  If we look at the code that inserts the value into ViewState (in the PersistChildStructure(…) method), we see:


typeName = "UC:" + control.GetType().AssemblyQualifiedName;

So, here we see the AssemblyQualifiedName is being stored into ViewState – which is then used to persist the controls across postback using the above code.  As I mentioned, this won’t work with an ASP.NET website hosted in a web farm because the assembly qualified name will probably be different from server to server.  We even have a KB article that discusses this issue somewhat.
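
To see the failure mode outside of ASP.NET, here is a small console sketch.  It round-trips an assembly-qualified name for a type that exists locally, then tries one whose assembly does not exist in this process.  The CanResolve helper and the second type name are made up for illustration; the name only mimics the App_Web_… pattern from the exception above:

```csharp
using System;

public static class AssemblyQualifiedNameDemo
{
    // Returns true when Type.GetType can resolve the given name in this process.
    public static bool CanResolve(string assemblyQualifiedTypeName)
    {
        try
        {
            // Same overload the control uses: throwOnError = true, ignoreCase = true.
            return Type.GetType(assemblyQualifiedTypeName, true, true) != null;
        }
        catch (Exception ex)
        {
            // Report whatever the loader throws for an unknown assembly.
            Console.WriteLine("{0}: {1}", ex.GetType().Name, ex.Message);
            return false;
        }
    }

    public static void Main()
    {
        // A name generated on "this server" resolves fine.
        string local = typeof(Uri).AssemblyQualifiedName;
        Console.WriteLine(local);
        Console.WriteLine(CanResolve(local));

        // Hypothetical name generated on "another server": the assembly
        // simply does not exist here, so the lookup fails.
        string other =
            "ASP.myusercontrol_ascx, App_Web_myusercontrol.ascx.00000000.aaaaaaaa, " +
            "Version=0.0.0.0, Culture=neutral, PublicKeyToken=null";
        Console.WriteLine(CanResolve(other));
    }
}
```

Anything persisted across requests needs to mean the same thing on every server in the farm, and a generated assembly name does not.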

The Fix

Fortunately, the fix is pretty simple. 

First, we need to store the path to the User Control instead of the AQN in ViewState.  To do this, you can comment out the “typeName = ….” line from directly above and replace it with:


UserControl uc = control as UserControl;
typeName = "UC:" + uc.AppRelativeVirtualPath;

So, now we store the path to the UserControl in ViewState.  Then, we need to fix the code that actually loads the control.  Replace the code from above in the RestoreChildStructure(…) method with this code:


string path = typeName[1];

try
{
    control = Page.LoadControl(path);
}
catch (Exception e)
{
    throw new ArgumentException(String.Format(
        "The type '{0}' cannot be recreated from ViewState",
        path), e);
}

That’s all there is to it.  Just load the user control from where it is being stored in the site and ASP.NET will take care of loading the appropriate assembly.

Enjoy!

# Monday, August 9, 2010
Monday, August 9, 2010 1:09:32 AM (Central Daylight Time, UTC-05:00) ( Development | OpenXML )

I was working on an internal project a bit ago and one of the requirements was to implement a fancy Word document.  The idea was that all of the editing of the text/code samples/etc. would be done in the application and then the user could just export it to Word to put on any finishing touches and send it off to the customer.  The final report needed to include section headers, page breaks, a table of contents, etc.  There are a number of ways we could have accomplished the task.  There’s the Word automation stuff that relies upon a COM based API, there’s the method of just creating an HTML document and loading that into Word, and then finally there’s the Open XML API.  Now, someone had hacked up a version of this export functionality previously using the Word automation stuff but considering we’re often dealing with 1,000+ page documents – it turned out to be a little slow.  Also, there are some restrictions around using the automation libraries in a server context.  Lastly, since my OpenXML kung-fu is strong, I thought I would take the opportunity to implement a better, more flexible and much faster solution.  For those just starting out, Brian and Zeyad’s excellent blog on the topic is invaluable.

One of the requirements for the export operation was to have Word automagically refresh the table of contents (and other fields) the first time the document is opened.  This was something that took a bit of time to research but you really end up with 2 options:

w:updateFields Element

The “w:updateFields” element is a document-level element that is set in the document settings part and tells Word to update all of the fields in the document:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<w:settings>
<w:updateFields w:val="true" />

</w:settings>

If you’re wondering what the document settings part is – just rename a Word doc from “blah.docx” to “blah.docx.zip” and extract it to a folder on your computer.  In the new folder is a directory called “word”.  In that directory, you should see a file called “settings.xml”:

In that file are all of the document level settings for your docx.  There’s some really great stuff in here.

If you’d like to use the OpenXML SDK to set that value (and you’d be crazy not to), here’s some sample code:

using (WordprocessingDocument document = WordprocessingDocument.Open(path, true))
{
    DocumentSettingsPart settingsPart =
        document.MainDocumentPart.GetPartsOfType<DocumentSettingsPart>().First();

    // Create object to update fields on open
    UpdateFieldsOnOpen updateFields = new UpdateFieldsOnOpen();
    updateFields.Val = new DocumentFormat.OpenXml.OnOffValue(true);

    // Insert object into settings part.
    settingsPart.Settings.PrependChild<UpdateFieldsOnOpen>(updateFields);
    settingsPart.Settings.Save();
}

w:dirty Attribute

This attribute is applied to the field you would like to have refreshed when the document is opened in Word.  It tells Word to only refresh this field the next time the document is opened.  For example, if you want to apply it to a field like your table of contents, just find the w:fldChar and add that attribute:

<w:r>
<w:fldChar w:fldCharType="begin" w:dirty="true"/>
</w:r>

For a simple field, like the document author, you’ll want to add it to the w:fldSimple element, like so:

<w:fldSimple w:instr="AUTHOR \* Upper \* MERGEFORMAT"
w:dirty="true" >
<w:r>
...
</w:r>
</w:fldSimple>
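
If you are working with the raw document XML rather than the SDK classes, a minimal LINQ to XML sketch of the same idea might look like the following.  The MarkFieldsDirty helper and the tiny sample document are assumptions for illustration; in a real package you would load and save word/document.xml instead of a string:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DirtyFieldDemo
{
    // WordprocessingML main namespace (the "w:" prefix).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Adds w:dirty="true" to every "begin" fldChar and every fldSimple.
    // Returns the number of elements that were marked.
    public static int MarkFieldsDirty(XDocument doc)
    {
        var complexFields = doc.Descendants(W + "fldChar")
            .Where(e => (string)e.Attribute(W + "fldCharType") == "begin");
        var simpleFields = doc.Descendants(W + "fldSimple");

        int count = 0;
        foreach (XElement field in complexFields.Concat(simpleFields).ToList())
        {
            field.SetAttributeValue(W + "dirty", "true");
            count++;
        }
        return count;
    }

    public static XDocument SampleDocument()
    {
        string xml =
            "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
            "<w:body><w:p>" +
            "<w:r><w:fldChar w:fldCharType=\"begin\"/></w:r>" +
            "<w:r><w:instrText> TOC </w:instrText></w:r>" +
            "<w:r><w:fldChar w:fldCharType=\"end\"/></w:r>" +
            "<w:fldSimple w:instr=\"AUTHOR\"><w:r><w:t>me</w:t></w:r></w:fldSimple>" +
            "</w:p></w:body></w:document>";
        return XDocument.Parse(xml);
    }

    public static void Main()
    {
        XDocument doc = SampleDocument();
        Console.WriteLine("Marked {0} field(s).", MarkFieldsDirty(doc));
        Console.WriteLine(doc);
    }
}
```

Only the “begin” fldChar is touched; the “end” marker is left alone, matching the XML shown above.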

A caveat or two

Both of these methods will work just fine in Word 2010. 

In Word 2007, though, you need to clear out the contents of the field before the user opens the document.  For example, with a table of contents, Word will normally cache the contents of the TOC in the fldChar element.  This is good, normally, but here it causes a problem. 

For example, in a very simple test document, you would see the following cached data (i.e.:  Heading 1, Heading 2, etc.):

<w:p w:rsidR="00563999" w:rsidRDefault="00050B09">
...
<w:r>
<w:fldChar w:fldCharType="begin"/>
</w:r>
<w:r w:rsidR="00563999">
<w:instrText xml:space="preserve"> TOC \* MERGEFORMAT </w:instrText>
</w:r>
</w:p>
<w:p w:rsidR="00F77370" w:rsidRDefault="00F77370">
...
<w:r>
...
<w:t>Heading 1</w:t>
</w:r>
...
</w:p>
<w:p w:rsidR="00F77370" w:rsidRDefault="00F77370">
...
<w:r>
...
<w:t>Heading 2</w:t>
</w:r>
...
</w:p>
<w:p w:rsidR="00F77370" w:rsidRDefault="00F77370">
...
<w:r>
<w:rPr>
<w:noProof/>
</w:rPr>
<w:fldChar w:fldCharType="end"/>
</w:r>
</w:p>

After you clear out the schmutz, you end up with just the begin element, the definition of the TOC and the end element:

<w:p w:rsidR="00563999" w:rsidRDefault="00563999">
...
<w:r>
<w:fldChar w:fldCharType="begin"/>
</w:r>
<w:r>
<w:instrText xml:space="preserve"> TOC \* MERGEFORMAT </w:instrText>
</w:r>
</w:p>
<w:p w:rsidR="00B63C3C" w:rsidRDefault="00563999" w:rsidP="00B63C3C">
<w:r>
<w:fldChar w:fldCharType="end"/>
</w:r>
...
</w:p>

Once you’ve made the updates, you can safely open up your file in Word 2007 and your fields will update when the document opens.

Big thanks to Zeyad for his tip on trimming out the schmutz.

Just to stress, this is improved in Word 2010 and you no longer need to clear out the cached data in your fields.

Enjoy!

# Wednesday, August 4, 2010
Wednesday, August 4, 2010 1:23:21 AM (Central Daylight Time, UTC-05:00) ( Bing | Development | Google | Online Translator )

An online translator really isn’t all that new.  They’ve been around for at least 8 years or so.  I remember the days when I would use Babelfish for all of my fun translations.  It was a great way to get an immediate translation for something non-critical.  The problem in a lot of cases was grammatical correctness.  Translating word for word isn’t particularly difficult but context and grammar vary so much between languages that it was always challenging to translate entire sentences, paragraphs, passages, etc. from one language to another.

Fortunately the technology has improved a lot over the years. Now, you can somewhat reliably translate entire web pages from one language to another.  I’m not saying it’s without fault – but I am saying that it’s gotten a lot better over time. These days there are a few big players in this space, notably Google Translate, Babelfish and the Bing Translator.  The interesting thing I’ve found is that only Bing actually has a supported API into its translation service.

There are 3 primary ways to interact with the service:

  • HTTP
  • SOAP
  • AJAX

They all seem to expose the same methods but it’s just the way you call them that differs.  For example, the sample code published for the HTTP method looks like:

string appId = "myAppId";
string text = "Translate this for me";
string from = "en";
string to = "fr";

string detectUri = "http://api.microsofttranslator.com/v2/Http.svc/Translate?appId=" + appId +
    "&text=" + text + "&from=" + from + "&to=" + to;
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(detectUri);
WebResponse resp = httpWebRequest.GetResponse();
Stream strm = resp.GetResponseStream();
StreamReader reader = new System.IO.StreamReader(strm);
string translation = reader.ReadToEnd();

Response.Write("The translated text is: '" + translation + "'.");

Then, for the SOAP method:

string result;
TranslatorService.LanguageServiceClient client =
                    new TranslatorService.LanguageServiceClient();
result = client.Translate("myAppId",
                          "Translate this text into German",
                          "en", "de");
Console.WriteLine(result);

And lastly for the AJAX method:

var languageFrom = "en";
var languageTo = "es";
var text = "translate this.";

function translate() {
    window.mycallback = function(response) { alert(response); }

    var s = document.createElement("script");
    s.src = "http://api.microsofttranslator.com/V2/Ajax.svc/Translate?oncomplete=mycallback&appId=myAppId&from="
                + languageFrom + "&to=" + languageTo + "&text=" + text;
    document.getElementsByTagName("head")[0].appendChild(s);
}

Fortunately, it all works as you’d expect – cleanly and simply.  The really nice thing about this (and the Google Translator) is that when faced with straight-up HTML like:

<p class="style">Hello World!</p>

They will both return the following:

<p class="style">¡Hola mundo!</p>

Both translators will keep the HTML tags intact and only translate the actual text.  This undoubtedly comes in handy if you do any large bulk translations.  For example, I’m working with another couple of guys here on an internal (one day external) tool that has a lot of data in XML files with markup.  Essentially we need to translate something like the following:

<Article Id="this does not get translated"
           Title="Title of the article"
           Category="Category for the article"
           >
  <Content><![CDATA[<P>description for the article<BR/>another line </p>]]></Content>
</Article>

The cool thing is that if I just deserialize the above into an object and send the value of the Content member to the service like:

string value = client.Translate(APPID_TOKEN,
                                content, "en", "es");

I get only the content of the HTML translated:

<p>Descripción del artículo<br>otra línea</p>

Pretty nice and easy.  One thing all of the translator services have trouble with is if I just try to translate the entire xml element from the above in one shot.  Bing returns:

<article id="this does not get translated"
         title="Title of the article"
         category="Category for the article">
</article>
    <content><![CDATA[<P>Descripción del artículo<br>otra línea]]</content> >

And Google returns:

<= Id artículo "esto no se traduce"
Título = "Título del artículo"
Categoría = "Categoría para el artículo">

<Content> <! [CDATA [descripción <P> para el artículo <BR/> otra línea </ p >]]>
</ contenido>
</> Artículo

Oh well – I guess no one’s perfect and for now we’ll be forced to deserialize and translate one element at a time.
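
As a rough sketch of that element-at-a-time approach, the following uses LINQ to XML and takes the translation call as a delegate, so it runs without the service.  The TranslateArticle helper is hypothetical, and the upper-casing lambda is only a stand-in for something like client.Translate(APPID_TOKEN, s, "en", "es"):

```csharp
using System;
using System.Xml.Linq;

public static class ArticleTranslator
{
    // Returns a translated copy of the article. The Id attribute is left alone;
    // Title, Category and the Content body each go through the translator.
    public static XElement TranslateArticle(XElement article,
                                            Func<string, string> translate)
    {
        XElement copy = new XElement(article); // deep copy, the input is not modified

        foreach (string name in new[] { "Title", "Category" })
        {
            XAttribute attribute = copy.Attribute(name);
            if (attribute != null)
            {
                attribute.Value = translate(attribute.Value);
            }
        }

        XElement content = copy.Element("Content");
        if (content != null)
        {
            // Send only the markup inside the CDATA section, then wrap it again.
            content.ReplaceNodes(new XCData(translate(content.Value)));
        }

        return copy;
    }

    public static XElement SampleArticle()
    {
        return XElement.Parse(
            "<Article Id=\"this does not get translated\" " +
            "Title=\"Title of the article\" " +
            "Category=\"Category for the article\">" +
            "<Content><![CDATA[<P>description for the article<BR/>another line </p>]]></Content>" +
            "</Article>");
    }

    public static void Main()
    {
        // Stand-in translator so the sketch runs offline.
        Func<string, string> fakeTranslate = s => s.ToUpperInvariant();

        Console.WriteLine(TranslateArticle(SampleArticle(), fakeTranslate));
    }
}
```

Passing the translator in as a Func keeps the XML handling separate from whichever service (or test double) does the actual translation.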

Enjoy!

# Tuesday, August 3, 2010
Tuesday, August 3, 2010 1:50:24 AM (Central Daylight Time, UTC-05:00) ( Development )

Really interesting blog post by the IE team on some of the new DOM traversal features in IE9 (and other browsers).  Oftentimes, you need to traverse the DOM to find a particular element or series of elements.  In the past, you might need to write some recursive JavaScript functions to navigate through the HTML on your page to act upon the elements you care about.

Now, in IE9 (and other browsers that follow the W3C spec), you can use node iterators to get a flat list of the elements that you actually care about.  For example:

// This would work fine with createTreeWalker, as well
var iter = document.createNodeIterator(elm,
                                       NodeFilter.SHOW_ELEMENT,
                                       null,
                                       false);

// The first nextNode() call returns the root (elm) itself,
// so the loop below only touches its descendants.
var node = iter.nextNode();
while (node = iter.nextNode())
{
    node.style.display = "none";
}

The NodeFilter enum by default allows for the following values (from the W3C spec here - http://www.w3.org/TR/2000/REC-DOM-Level-2-Traversal-Range-20001113/traversal.html#Traversal-NodeFilter):

const unsigned long       SHOW_ALL                       = 0xFFFFFFFF;
const unsigned long       SHOW_ELEMENT                   = 0x00000001;
const unsigned long       SHOW_ATTRIBUTE                 = 0x00000002;
const unsigned long       SHOW_TEXT                      = 0x00000004;
const unsigned long       SHOW_CDATA_SECTION             = 0x00000008;
const unsigned long       SHOW_ENTITY_REFERENCE          = 0x00000010;
const unsigned long       SHOW_ENTITY                    = 0x00000020;
const unsigned long       SHOW_PROCESSING_INSTRUCTION    = 0x00000040;
const unsigned long       SHOW_COMMENT                   = 0x00000080;
const unsigned long       SHOW_DOCUMENT                  = 0x00000100;
const unsigned long       SHOW_DOCUMENT_TYPE             = 0x00000200;
const unsigned long       SHOW_DOCUMENT_FRAGMENT         = 0x00000400;
const unsigned long       SHOW_NOTATION                  = 0x00000800;

While this is great – you can also write your own NodeFilter callback function to filter the results even further:

var iter = document.createNodeIterator(elm,
                                       NodeFilter.SHOW_ALL,
                                       keywordFilter,
                                       false);

function keywordFilter(node)
{
    // SHOW_ALL also passes text and comment nodes, which have no attributes.
    if (node.nodeType !== 1)
        return NodeFilter.FILTER_REJECT;

    // getAttribute returns null when there is no alt attribute.
    var altStr = (node.getAttribute('alt') || '').toLowerCase();

    if (altStr.indexOf("flight") != -1 || altStr.indexOf("space") != -1)
        return NodeFilter.FILTER_ACCEPT;
    else
        return NodeFilter.FILTER_REJECT;
}

Really nice and can help make your code simpler to read and faster too!

Enjoy!

Tuesday, August 3, 2010 1:48:17 AM (Central Daylight Time, UTC-05:00) ( .NET | Azure | Development | Logging )

One thing you might encounter when you start your development on Windows Azure is that there is an insane number of options available for logging.   You can view a quick primer here.  One of the things that I like about it is that you don’t necessarily need to learn a whole new API just to use it.  Instead, the logging facilities in Azure integrate really well with the existing Debug and Trace logging API in .NET.  This is a really nice feature and is done very well in Azure.  In fact, setting it up and configuring it takes all of about 5 lines of code.  Actually it’s four lines of code with one line that wraps:

public override bool OnStart()
{
    DiagnosticMonitorConfiguration dmc =
            DiagnosticMonitor.GetDefaultInitialConfiguration();
    dmc.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
    dmc.Logs.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
    DiagnosticMonitor.Start("DiagnosticsConnectionString", dmc);

    // Not part of the diagnostics setup: OnStart() must return a bool.
    return base.OnStart();
}

One specific item to note is the ScheduledTransferPeriod property of the Logs property.  The minimum value you can set for that property is the equivalent of 1 minute.  The only downside to this method of logging is that if your Azure role crashes within that minute, whatever data you have written to your in-built logging will be lost.  This also means that if you are writing exceptions to your logging, that will be lost as well.  That can cause problems if you’d like to know both when your role crashed and why (the exception details).

Before we talk about a way to get around it, let’s review why the Role might crash.  In the Windows Azure world, there are two primary roles that you will use, a Worker and Web role.  Main characteristics and reasons it would crash are below:

Role Type | Analogous To | Why would it crash/restart?
Worker Role | Console Application | Any unhandled exception.
Web Role | ASP.NET Application hosted in IIS 7.0+ | Any unhandled exception thrown on a background thread; StackOverflowException; unhandled exception on finalizer thread.

As you can see, the majority of the reasons why an Azure role would recycle/crash/restart are the same as with any other application: an unhandled exception.  Therefore, to mitigate this issue, we can subscribe to the AppDomain’s UnhandledException event.  This event is fired when your application experiences an exception that is not caught and will fire RIGHT BEFORE the application crashes.  You can subscribe to this event in the Role OnStart() method:

public override bool OnStart()
{
    AppDomain appDomain = AppDomain.CurrentDomain;
    appDomain.UnhandledException +=
        new UnhandledExceptionEventHandler(appDomain_UnhandledException);
    ...
}

You will now be notified right before your process crashes.  The last piece to this puzzle is logging the exception details.  Since you must log the details right when it happens, you can’t just use the normal Trace or Debug statements.  Instead, we will write to the Azure storage directly.  Steve Marx has a good blog entry about printf in the cloud.  While it works, it requires the connection string to be placed right into the logging call.  He mentions that you don’t want that in a production application.  In our case, we will do things a little bit differently.  First, we must add the requisite variables and initialize the storage objects:

private static bool storageInitialized = false;
private static object gate = new Object();
private static CloudBlobClient blobStorage;
private static CloudQueueClient queueStorage;

private void InitializeStorage()
{
    if (storageInitialized)
    {
        return;
    }

    lock (gate)
    {
        if (storageInitialized)
        {
            return;
        }

        // read account configuration settings
        var storageAccount =
            CloudStorageAccount.FromConfigurationSetting("DataConnectionString");

        // create blob container for the error details
        blobStorage =
            storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobStorage.
            GetContainerReference("webroleerrors");

        container.CreateIfNotExist();

        // configure container for public access
        var permissions = container.GetPermissions();
        permissions.PublicAccess =
            BlobContainerPublicAccessType.Container;
        container.SetPermissions(permissions);

        storageInitialized = true;
    }
}

This declares the requisite storage variables; when the InitializeStorage() method is executed, it sets them to the appropriate initialized values.  Lastly, we must call this new method and then write to the storage.  We put this code in our UnhandledException event handler:

void appDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
{
    // Initialize the storage variables.
    InitializeStorage();

    // Get Reference to error container.
    var container = blobStorage.
               GetContainerReference("webroleerrors");

    if (container != null)
    {
        // Retrieve last exception.
        Exception ex = e.ExceptionObject as Exception;

        if (ex != null)
        {
            // Will create a new entry in the container
            // and upload the text representing the
            // exception.
            container.GetBlobReference(
               String.Format(
                  "<insert unique name for your application>-{0}-{1}",
                  RoleEnvironment.CurrentRoleInstance.Id,
                  DateTime.UtcNow.Ticks)
               ).UploadText(ex.ToString());
        }
    }
}

Now, when your Azure role is about to crash, you’ll find an entry in your blob storage with the details of the exception that was thrown.  For example, one of the leading reasons why an Azure Worker Role crashes is because it can’t find a dependency it needs.  So, in that case, you’ll find an entry in your storage with the following details:

System.IO.FileNotFoundException: Could not load file or assembly
    'GuestBook_Data, Version=1.0.0.0, Culture=neutral,
    PublicKeyToken=f8a5fcb6c395f621' or one of its dependencies.
    The system cannot find the file specified.

File name: 'GuestBook_Data, Version=1.0.0.0, Culture=neutral,
    PublicKeyToken=f8a5fcb6c395f621'
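
The naming and formatting part of that handler can be pulled out and exercised without Azure at all.  The sketch below is one way to do it; the CrashEntry class, its method names and the sample instance id are assumptions for illustration, and the console output stands in for the UploadText(…) call:

```csharp
using System;
using System.Globalization;

public static class CrashEntry
{
    // Builds a unique entry name from the application name, the role
    // instance id and a timestamp, mirroring the format string above.
    public static string BuildName(string applicationName,
                                   string instanceId,
                                   DateTime utcNow)
    {
        return String.Format(CultureInfo.InvariantCulture,
            "{0}-{1}-{2}", applicationName, instanceId, utcNow.Ticks);
    }

    // ExceptionObject is typed as object, so handle the case where
    // it is not an Exception instead of silently logging nothing.
    public static string BuildBody(object exceptionObject)
    {
        Exception ex = exceptionObject as Exception;
        if (ex != null)
        {
            return ex.ToString();
        }
        return exceptionObject == null
            ? "Unknown failure (no exception object)."
            : "Non-Exception object thrown: " + exceptionObject;
    }

    public static void Main()
    {
        // In a role this subscription would live in OnStart(); the handler
        // runs right before the process goes down.
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Console.Error.WriteLine("{0}\n{1}",
                BuildName("myapp", Environment.MachineName, DateTime.UtcNow),
                BuildBody(e.ExceptionObject));

        // Show what an entry would look like without crashing the process.
        try
        {
            throw new InvalidOperationException("Simulated failure.");
        }
        catch (Exception ex)
        {
            Console.WriteLine(BuildName("myapp", "WebRole_IN_0", DateTime.UtcNow));
            Console.WriteLine(BuildBody(ex));
        }
    }
}
```

Keeping these two pieces free of storage calls makes them easy to test, and the handler itself stays a thin wrapper around the upload.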

Some other common reasons why an Azure role might crash (especially when you first deploy it) can be found at Anton Staykov’s excellent blog.

Hope this helps.

Enjoy!

# Friday, July 23, 2010
Friday, July 23, 2010 10:51:28 PM (Central Daylight Time, UTC-05:00) ( .NET | .NET Upgrade | Access Database | ASP.NET | Development | Security )

For an internal application at my company, very particular users have the rights to download an MDB that contains links to a replicated instance of a DB.  This was done so they could create their own queries without any assistance from IT - essentially a poor man's Cognos or Reporting Services.  This worked fine in .Net 1.1.

We are currently in the process of upgrading this application to .Net 2.0 and one of the developers on the project brought to my attention that the download was failing with a message stating:

downloadmdfError

She did some research and uncovered this forum post with step by step instructions on how to re-enable the downloading of an Access database.

Please keep in mind, though, that the download restriction was put in place for security purposes - so any page that links to a secured Access DB should itself be secured in a couple of ways:

1.  Code Access Security:

    <PrincipalPermission(SecurityAction.Demand, _
                         Authenticated:=True, _
                         Role:="Secured User")> _

This should be placed in the code-behind on the page with the link to the file.  As you can see, the "Role" property will secure this page to just a user in a particular role.  If you have a page that is accessible to multiple roles, you can stack the class attributes on top of each other as seen here:

    <PrincipalPermission(SecurityAction.Demand, _
                         Authenticated:=True, _
                         Role:="First Role")> _
    <PrincipalPermission(SecurityAction.Demand, _
                         Authenticated:=True, _
                         Role:="Second Role")> _
    Partial Class PagesToBeSecured

2.  File Access Restriction:

Also, by putting a web.config in the folder that houses the Access DB, you can secure access to the contents of a particular folder.  For example with a structure of:

    Root/Downloads/Database/db.mdb

You can put a web.config in the "Database" folder that restricts access to that folder to a particular user/role/etc.  For example, the web.config in this example would look something like:

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
        <system.web>
            <authorization>
                <allow roles="Role 1" />
                <allow roles="Role 2" />
                <deny users="*" />
            </authorization>
        </system.web>
    </configuration>

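As you can see, we are only allowing users in the roles Role 1 and Role 2 to access the contents of this folder.  The rules are evaluated from top to bottom and the first match wins, so the final deny rule catches everyone else.  If any other users attempt to access the file, they will get a Page Cannot Be Found error.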
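If the ordering of the allow and deny elements seems arbitrary, it helps to think of the authorization section as a list that is walked from the top until a rule matches.  The following is only a sketch of that first-match idea, not the ASP.NET implementation: the AuthorizationRule and FolderAccess types are invented for this example, role names are compared exactly, and only the users="*" wildcard is modeled:

```csharp
using System;
using System.Collections.Generic;

// One <allow> or <deny> element from the <authorization> section.
public sealed class AuthorizationRule
{
    public readonly bool Allow;   // true for <allow>, false for <deny>
    public readonly string Role;  // role name, or null for users="*"

    public AuthorizationRule(bool allow, string role)
    {
        Allow = allow;
        Role = role;
    }
}

public static class FolderAccess
{
    // The three rules from the web.config above, in the same order.
    public static List<AuthorizationRule> SampleRules()
    {
        return new List<AuthorizationRule>
        {
            new AuthorizationRule(true, "Role 1"),
            new AuthorizationRule(true, "Role 2"),
            new AuthorizationRule(false, null)
        };
    }

    // Walks the rules top-down and returns the verdict of the first match.
    public static bool IsAllowed(IEnumerable<AuthorizationRule> rules,
                                 ICollection<string> userRoles)
    {
        foreach (AuthorizationRule rule in rules)
        {
            bool matches = rule.Role == null || userRoles.Contains(rule.Role);
            if (matches)
            {
                return rule.Allow;
            }
        }

        // Simplification: when nothing matches, this sketch allows access.
        return true;
    }
}
```

With these rules a user in "Role 2" is allowed, while a user whose only role is "Guest" falls through to the deny rule.  If the deny element were moved to the top, it would match first and nobody would get in.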

In conclusion - if you are going to remove this particular restriction in IIS to allow users to download Access DB files directly from your web application - then you shouldn't completely discount security, and you should implement both of the measures above.

Enjoy!

Friday, July 23, 2010 10:42:12 PM (Central Daylight Time, UTC-05:00) ( .NET | .NET Upgrade | ASP.NET | Development )

This has been documented in numerous places including here, here, here (with an incorrect fix) and here.

Essentially, if you set the Readonly Property of a textbox using either the XHTML or in the code-behind, any changes you make to the value of that textbox client-side (ie:  javascript) will not be persisted across postbacks.  I had found this method extremely effective for Calendar controls and what-not.  You just provide a link to a popup and then when the user selects an item in the popup, you can populate the underlying Textbox with the value from the popup.  Unfortunately, out of the box, that functionality has been taken away.

The code in the TextBox that gives the ReadOnly property this behavior is in a few places:

    Protected Overrides Function SaveViewState() As Object
        If Not Me.SaveTextViewState Then
            Me.ViewState.SetItemDirty("Text", False)
        End If
        Return MyBase.SaveViewState
    End Function

    Private Shadows ReadOnly Property SaveTextViewState() _
                                                As Boolean
        Get

            If (Me.TextMode = TextBoxMode.Password) Then
                Return False
            End If
            If (((MyBase.Events.Item(EventTextChanged) Is Nothing) _
                    AndAlso MyBase.IsEnabled) _
                    AndAlso ((Me.Visible AndAlso Not Me.ReadOnly) _
                    AndAlso (MyBase.GetType Is GetType(TextBox)))) Then
                Return False
            End If

            Return True
        End Get
    End Property

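In the SaveTextViewState property, part of the conditional statement checks the ReadOnly property.  When ReadOnly is set to true, the second condition fails and the property returns True, so the server-side value of Text stays in ViewState and is what gets restored on the postback.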

Also, in the LoadPostData() method, there is another check to make sure that the control's ReadOnly property is not set to true before it loads the data from the form post:

    Protected Overrides Function LoadPostData( _
                    ByVal postDataKey As String, _
                    ByVal postCollection As _
                        NameValueCollection) As Boolean
        MyBase.ValidateEvent(postDataKey)
        Dim text As String = Me.Text
        Dim text2 As String = _
                    postCollection.Item(postDataKey)
        If (Not Me.ReadOnly _
            AndAlso Not [text].Equals(text2, _
                StringComparison.Ordinal)) Then
            Me.Text = text2
            Return True
        End If

        Return False
    End Function

If the ReadOnly property is set to true, then it will not set the Text property of the TextBox to be the value from the Form post.
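To make the effect concrete, here is the same decision boiled down to a pure function.  This is a sketch only: ReadOnlyPostback and ResolveText are names I made up, and nothing here touches the real TextBox.  It simply mirrors the condition in LoadPostData() and returns the text the control ends up with after the postback:

```csharp
using System;

public static class ReadOnlyPostback
{
    // Mirrors the check in LoadPostData(): the posted value is only
    // accepted when the control is not read-only and the value changed.
    public static string ResolveText(bool isReadOnly,
                                     string serverText,
                                     string postedText)
    {
        if (!isReadOnly
            && !serverText.Equals(postedText, StringComparison.Ordinal))
        {
            return postedText;
        }

        return serverText;
    }
}
```

So if a popup calendar writes "7/23/2010" into an empty read-only TextBox on the client, ResolveText(true, "", "7/23/2010") hands back the empty string, which is exactly the lost value described above.  The same call with isReadOnly set to false returns "7/23/2010".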

The fix that people are touting is relatively simple.  In the code behind of the page, you can simply add the attribute manually to the TextBox like this:

    Protected Sub Page_Load(ByVal sender As Object, _
            ByVal e As System.EventArgs) Handles Me.Load
        With Me.txtTest
            .Attributes.Add("readonly", "readonly")
        End With
    End Sub

And while this would work - this just feels like a bandage.  Plus, if you have a big web project you are converting from .net 1.1 to .net 2.0 that contains TextBoxes all over the place that have the ReadOnly=True set, then this would be a lot of code to include on every single page.  Even if you have a shared function somewhere that you can call - that means you still have to call the function for each Textbox that you wish to set as ReadOnly.

An alternative is to create your own TextBox class that inherits from the existing TextBox and handles the ReadOnly property in the fashion mentioned above.  An example of the code for the new TextBox is as follows:

    Namespace WebControls
        Public Class TextBox
            Inherits System.Web.UI.WebControls.TextBox

            Public Overrides Property [ReadOnly]() As Boolean
                Get
                    Return False
                End Get
                Set(ByVal value As Boolean)
                    If value Then
                        Me.Attributes("readonly") = "readonly"
                    Else
                        Me.Attributes.Remove("readonly")
                    End If
                End Set
            End Property

        End Class
    End Namespace

That's fine for new TextBoxes, but what about my hundreds and hundreds of TextBoxes that are already in my application?  Well, .Net 2.0 allows you to use something called tagMappings in your web.config.  Essentially, they are a way to redirect the compiler to use a different class for an existing mapping when it compiles your pages.  From MSDN: 

"Defines a collection of tag types that are remapped to other tag types at compile time."

So, how do we use this neat feature to solve this problem?  Well, in the web.config, we will tell the .Net compiler to use our new TextBox control instead of the existing System.Web.UI.WebControls.TextBox control.  In order to do this, you simply add the following to your web.config:

    <pages>
          <tagMapping>
            <add tagType="System.Web.UI.WebControls.TextBox" 
                 mappedTagType="WebControls.TextBox"/>
          </tagMapping>
    </pages>

The tagType attribute is the existing tag you would like to change (System.Web.UI.WebControls.TextBox) and the mappedTagType is the new TextBox control from above.

With this tag added in the web.config - whenever you have the following in a page:

    <asp:TextBox runat="server" 
                 ID="txtTest" 
                 ReadOnly="true" />

Your application will use the TextBox defined directly above.

Enjoy!

Friday, July 23, 2010 2:02:43 AM (Central Daylight Time, UTC-05:00) ( Database | Development | LINQ | Performance )

So, I was working with a customer who is writing their first application using LINQ. They had previously been bitten by the failure to close and dispose their SqlConnection objects. This is actually a fairly common problem and usually leads to those pesky SqlExceptions detailing that there are no connections left in the pool.

So, since LINQ to SQL abstracts out much of the direct database interaction, they were concerned about when the underlying SqlConnections are closed. I will walk through how I answered their question using a few of my favorite tools:

To start off, I created a simple SQL Table called Users:

UserTable

Then, I created a simple LINQ to SQL dbml:

Linq2SqlDBML

Now that the plumbing is in place, I can write some simple code to return the data from the table and display it to the console window:

    LinqConnectionSampleDataContext db = 
        new LinqConnectionSampleDataContext();

    Table<User> users = db.GetTable<User>();

    IQueryable<User> userQuery =
        from user in users
        orderby user.firstName
        select user;

    foreach (User user in userQuery)
    {
        Console.WriteLine("ID={0}, First Name={1}", 
                            user.id, 
                            user.firstName);
    }

So, now when the application is executed, the output is as follows:

ConsoleSampleOutput

So, since Linq to Sql uses an underlying SqlConnection to do its work, we can set a breakpoint on the Close() method of that class in WinDBG. If you are unfamiliar with this great debugging tool, you can find a simple walkthrough on how to set it up here.

There are a number of ways to set a breakpoint in managed code in WinDBG. Here are the steps that I followed:

Step 1. Launch WinDBG and attach to the process in question.

Windbg_AttachToProcess_RedCircle

Step 2. Load the SOS extension into WinDBG by executing:

.loadby sos mscorwks

Step 3. Set the managed breakpoint using the !bpmd command. For this step, the !bpmd command accepts a variety of parameters. Basically, you can pass it either:

  • MethodDesc address.

  • Combination of Module Name and Managed Function Name

I chose the latter method because it’s relatively quick and I knew exactly what I wanted. So, the syntax for this method is:

!bpmd <module name> <managed function name>

You can get the module name from visiting the SqlConnection page up on MSDN. On this page, we can get the module name and the namespace to the class:

MSDN_sqlconnection

From this, we can get both parameters necessary:

  • Module Name: System.Data.dll
  • Managed Function Name: System.Data.SqlClient.SqlConnection.Close

So, our command in WinDBG becomes:

    !bpmd System.Data.dll System.Data.SqlClient.SqlConnection.Close

Once you enter in this command, you should get output similar to the following in the WinDBG window:

0:014> !bpmd System.Data.dll System.Data.SqlClient.SqlConnection.Close
Found 1 methods...
MethodDesc = 544a0418
Setting breakpoint: bp 5455DC80 [System.Data.SqlClient.SqlConnection.Close()]

Step 4. “Go” in the debugger and wait for your breakpoint to be hit.

For this, the command is simply “g”.

0:014> g

Eventually, your breakpoint will be hit in the debugger and you should get output similar to the following:

Breakpoint 0 hit
eax=5457da68 ebx=04d7e9dc ecx=0185cd30 edx=018e56b0 esi=01870d80 edi=04d7e9a4
eip=5455dc80 esp=04d7e860 ebp=04d7e868 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
System_Data_ni+0xcdc80:
5455dc80 55              push    ebp

Step 5. Print out the call-stack.

The command to print out the call stack in SOS and WinDBG is “!clrstack”:

0:008> !clrstack

This will print out the managed call stack, which turns out to be:

OS Thread Id: 0x1d70 (8)
ESP       EIP     
04d7e860 5455dc80 System.Data.SqlClient.SqlConnection.Close()
04d7e864 77e20586 System.Data.Linq.SqlClient.SqlConnectionManager
                    .CloseConnection()
04d7e870 77e20554 System.Data.Linq.SqlClient.SqlConnectionManager
                    .ReleaseConnection(...)
04d7e87c 77e1da35 System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReaderSession`1[...].Dispose()
04d7e888 77e1ddac System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReaderSession`1[...].CheckNextResults()
04d7e894 77e1df2c System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReaderBase`1[...].Read()
04d7e8a0 77e1ea2d System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReader`2[...].MoveNext()
04d7e8ac 004f1a12 LINQ.SqlConnection.Program.Main(System.String[])

So, if you’re having trouble parsing this, the take away here is that when you iterate through a Linq resultset and you get to the end, the ObjectReaderSession will automatically close the Connection to the database.

Now, this is a simple HelloWorld code sample for retrieving a result-set and there are obviously a number of ways to do the same thing. The customer’s code was closer to the following:

    using (IEnumerator<User> enumerator = 
              context.ExecuteQuery<User>(sqlStatement).GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            // Do something here
        }
    }

In this situation, we get an IEnumerator<T> back from the database call and iterate through it. Now, this part is very important. If you are iterating through the result set to completion – the connection will be closed the same as the above. However, if you do something like this:

    using (IEnumerator<User> enumerator = 
        db.ExecuteQuery<User>(sqlStatement).GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            Console.WriteLine("ID={0}, First Name={1}", 
                enumerator.Current.id, 
                enumerator.Current.firstName);

            // Stop iterating after this record.
            break;
        }
    }

Please note the “break” statement. Essentially, if you are NOT iterating through to completion, the call stack looks like:

OS Thread Id: 0x251c (11)
ESP       EIP     
0522e73c 5455dc80 System.Data.SqlClient.SqlConnection.
                    Close()
0522e740 77e20586 System.Data.Linq.SqlClient.SqlConnectionManager.
                    CloseConnection()
0522e74c 77e20554 System.Data.Linq.SqlClient.SqlConnectionManager.
                    ReleaseConnection(...)
0522e758 77e1da35 System.Data.Linq.SqlClient.ObjectReaderCompiler+
                    ObjectReaderSession`1[...].Dispose()
0522e764 77e1ea12 System.Data.Linq.SqlClient.ObjectReaderCompiler+
                    ObjectReader`2[...].Dispose()
0522e768 00691bde LINQ.SqlConnection.Program.Main(System.String[])

The connection will NOT be closed until you call Dispose() on the ObjectReader (IEnumerator<T>) object. This means that if you happen to write some code without the using statement when returning data like this:

    IEnumerator<User> enumerator =
        db.ExecuteQuery<User>(sqlStatement).GetEnumerator();

    while (enumerator.MoveNext())
    {
        Console.WriteLine("ID={0}, First Name={1}",
            enumerator.Current.id,
            enumerator.Current.firstName);

        // Stop iterating after this record.
        break;
    }

The SqlConnection.Close() method will NOT be called. This is because you have full control over the lifetime of the IEnumerator<T> object and you should know when you are done with it.

Now, along those lines, you may be asking yourself – what if I did something like this:

    LinqConnectionSampleDataContext db = 
        new LinqConnectionSampleDataContext();

    Table<User> users = db.GetTable<User>();

    IQueryable<User> userQuery =
        from user in users
        orderby user.firstName
        select user;

    foreach (User user in userQuery)
    {
        Console.WriteLine("ID={0}, First Name={1}", 
                            user.id, 
                            user.firstName);

        break;
    }

Where you break before you iterate through to completion? In that situation, Dispose() will still be called on the enumerator that the foreach obtains from the IQueryable<T> object. How? Because of how the compiler handles a foreach loop: it wraps the loop in a try block and inserts a finally block that disposes the enumerator once the loop exits. This compiles down to (in IL):

    try
    {
        L_005d: br.s L_0084
        L_005f: ldloc.s CS$5$0002
        L_0061: callvirt instance !0...get_Current()
        L_0066: stloc.3
        L_0067: ldstr "ID={0}, First Name={1}"
        L_006c: ldloc.3
        L_006d: callvirt instance int32 LINQ.SqlConnection.User::get_id()
        L_0072: box int32
        L_0077: ldloc.3
        L_0078: callvirt instance string LINQ.SqlConnection.User::get_firstName()
        L_007d: call void [mscorlib]System.Console::WriteLine(string, object, object)
        L_0082: br.s L_008d
        L_0084: ldloc.s CS$5$0002
        L_0086: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
        L_008b: brtrue.s L_005f
        L_008d: leave.s L_009b
    }
    finally
    {
        L_008f: ldloc.s CS$5$0002
        L_0091: brfalse.s L_009a
        L_0093: ldloc.s CS$5$0002
        L_0095: callvirt instance void [mscorlib]System.IDisposable::Dispose()
        L_009a: endfinally
    }
    L_009b: ret
    .try L_005d to L_008f finally handler L_008f to L_009b

Note the finally block and its call to IDisposable::Dispose(). So, the moral of this story: when you take control of the data yourself, you MUST call Dispose() on the IEnumerator<T> object when you are done with it.
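If you want to see the three cases side by side without a database or a debugger, the sketch below may help.  It is not the LINQ to SQL implementation: DisposeDemo and ReadRows are stand-ins I made up, and the finally block inside the iterator plays the role of SqlConnection.Close().  The ConnectionClosed flag records whether that cleanup ran:

```csharp
using System;
using System.Collections.Generic;

public static class DisposeDemo
{
    // Stand-in for the connection state: true once the cleanup has run.
    public static bool ConnectionClosed;

    // Stand-in for a query result. The finally block runs when the
    // sequence is read to the end or when the enumerator is disposed.
    public static IEnumerable<int> ReadRows()
    {
        ConnectionClosed = false;
        try
        {
            yield return 1;
            yield return 2;
            yield return 3;
        }
        finally
        {
            ConnectionClosed = true;
        }
    }

    // foreach + break: the compiler-generated finally disposes the enumerator.
    public static bool BreakInsideForeach()
    {
        foreach (int row in ReadRows())
        {
            break;
        }
        return ConnectionClosed;
    }

    // using + break: Dispose() is called when the using block exits.
    public static bool BreakInsideUsing()
    {
        using (IEnumerator<int> enumerator = ReadRows().GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                break;
            }
        }
        return ConnectionClosed;
    }

    // No using, no Dispose(): the cleanup never runs.
    public static bool BreakWithoutDispose()
    {
        IEnumerator<int> enumerator = ReadRows().GetEnumerator();
        while (enumerator.MoveNext())
        {
            break;
        }
        return ConnectionClosed;
    }
}
```

The first two methods report that the cleanup ran and the third reports that it did not, which lines up with the call stacks captured in WinDBG above.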

Enjoy!

Friday, July 23, 2010 2:01:29 AM (Central Daylight Time, UTC-05:00) ( Development | Performance )

This is an old topic, but as I work with developers more and more, I find that there is still some gray area on this topic.

Concatenating strings is probably one of the most common tasks that most developers perform in their work-lives. If you've ever done some searching around, though, you'll find that there are 2 main ways that people perform those concatenations:

First, you can use the "+" or the "&" operator with the native string object:

    Sub ConcatWithStrings(ByVal max As Int32)
        Dim value As String = ""
        For i As Int32 = 0 To max
            value += i.ToString()
        Next
    End Sub

Second, you can use the System.Text.StringBuilder() object to perform the concatenation:

    Sub ConcatWithStringBuilder(ByVal max As Int32)
        Dim value As New System.Text.StringBuilder()
        For i As Int32 = 0 To max
            value.Append(i.ToString())
        Next
    End Sub

So, which is better? Well, if you do any Google'ing or Live Search'ing on the topic, you'll find some great articles on the topic. Mahesh Chand has a great article, which I'll quote:

"You can concatenate strings in two ways. First, traditional way of using string and adding the new string to an existing string. In the .NET Framework, this operation is costly. When you add a string to an existing string, the Framework copies both the existing and new data to the memory, deletes the existing string, and reads data in a new string. This operation can be very time consuming in lengthy string
concatenation operations."

Now, I'm a big fan of short descriptions like the above, but I always find that without actually showing what is happening behind the scenes, you might lose some folks in the translation. To illustrate what happens behind the scenes, I wrote a quick console application that uses the first code sample - and then took a memory dump using ADPlus to show what actually gets kept around in memory after the String level concatenation.

I'm not going to go into a lot of detail on what I did to mine through the memory dump, but if you're interested in getting down to this detail on your own - I highly recommend the book Debugging Microsoft .NET 2.0 Applications. After you read that great book, you should read Tess's Great Blog to get even more practice on this area.

In any case, what was uncovered after the memory dump was that with the listing in the first example, there will be a separate String object placed into memory each time you perform a concatenation. Here is an excerpt from that dump:

# Size Value

1 28 "0123"
1 28 "01234"
1 32 "012345"
1 32 "0123456"
1 36 "01234567"
1 36 "012345678"
1 40 "0123456789"
1 44 "012345678910"
1 48 "01234567891011"
1 52 "0123456789101112"
1 56 "012345678910111213"
.....
1 124 "0123456789101112131415161718192021222324252627282930"
1 128 "012345678910111213141516171819202122232425262728293031"
1 132 "01234567891011121314151617181920212223242526272829303132"
1 136 "0123456789101112131415161718192021222324252627282930313233"
1 140 "012345678910111213141516171819202122232425262728293031323334"
1 144 "01234567891011121314151617181920212223242526272829303132333435"
65 17940 "012345678910111213141516171819202122232425262728293031323334353"

First, the command I used only outputs the first 65 characters or so of each String object, so all of the longer strings are grouped together in the last entry. This is why the last entry has a count of 65 instances. In any case, as you can see, a separate copy of the string is placed into memory during each concatenation operation. Ouch. This can get expensive very quickly!

Now what about the StringBuilder operation?

# Size Value

1 52 "0123456789101112"
1 84 "01234567891011121314151617181920"
3 956 "012345678910111213141516171819202122232425262728293031323334353"

Gee, that seems a lot simpler - but if you're paying attention, you'll notice that there are a few other instances of this lengthy string in memory. Any ideas why?

Well, this appears to be my bug. When you instantiate a System.Text.StringBuilder object, one of the constructors allows for an integer parameter called "capacity". If you do not specify an initial capacity, the no-args constructor defaults to "&H10", which is 16 characters. If, during your string operations, you exceed that 16 character capacity, the StringBuilder will create a new string with a capacity of 32 (16 * 2) characters. The next time you perform an operation that needs more than 32 characters, the StringBuilder will double the capacity again, this time to 64 characters. This will continue to happen over and over again as you continually append more characters to the StringBuilder object.

So, what does this mean? Well, it means that we're still not achieving the maximum efficiency here. If we want to build a String like this - without creating extra instances of the base string object - even when using a StringBuilder object, you should specify a capacity in the constructor. To prove this hypothesis, we can rewrite the second code listing to be:

    Sub ConcatWithStringBuilder(ByVal max As Int32)
        Dim value As New System.Text.StringBuilder(193)
        For i As Int32 = 0 To max
            value.Append(i.ToString())
        Next
    End Sub

Now, when we take a memory dump and look for the String objects for the above loop:

# Size Value

1 404 "012345678910111213141516171819202122232425262728293031323334353"

We see that there is only one instance of the String object in memory.

Moral of the story? Even when using the StringBuilder object, if you know the final string is going to be lengthy, you should set an initial capacity to most efficiently perform your string concatenations.
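Where does a number like 193 come from?  It is exactly the number of characters you get from concatenating 0 through 100: ten one-digit numbers, ninety two-digit numbers and one three-digit number (10 + 180 + 3).  When the final length can be worked out up front like this, you can compute it rather than hard-code it.  A minimal sketch, where ConcatSizing and its two methods are illustrative names of my own:

```csharp
using System;
using System.Text;

public static class ConcatSizing
{
    // Total number of characters produced by concatenating 0..max (inclusive).
    public static int RequiredCapacity(int max)
    {
        int total = 0;
        for (int i = 0; i <= max; i++)
        {
            total += i.ToString().Length;
        }
        return total;
    }

    // Same loop as ConcatWithStringBuilder, but sized once before appending.
    public static string Concat(int max)
    {
        StringBuilder value = new StringBuilder(RequiredCapacity(max));
        for (int i = 0; i <= max; i++)
        {
            value.Append(i);
        }
        return value.ToString();
    }
}
```

The extra pass to compute the capacity is not free, so this only pays off when the pieces are cheap to measure.  When they are not, a rough upper bound passed to the constructor is usually a reasonable compromise.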

Enjoy!

Friday, July 23, 2010 12:52:47 AM (Central Daylight Time, UTC-05:00) ( ASP.NET | Database | Development )

I was helping a co-worker today with one of his screens and realized that it may make a useful little tutorial for all those other folks out there. This was somewhat interesting because it follows up pretty well on my previous post as I talk about saving records with a 1 to many relationship. Basically, he needed to develop a screen similar to the following:

image

This is actually a pretty simple page, but there are a few places that might cause a hiccup or two:

  • Grouping items in a checkbox list.
  • Getting the value for each Attribute (ie: the PK of the Attribute itself)

For the first item - grouping inside of a checkbox list. This actually won't work - there is no way to specify groups of checkboxes or additional properties to define checkboxes or any of that...so, the only real option is to put the checkboxes into a GridView control. Now, I've done this type of grouping many times previously and it never really felt right. After Googling the problem with him - "grouping items in a gridview" - the first result is a great series of classes posted here. The only problem with this method is that it is focused on grouping within reports. The difference in this situation is that we wanted the checkboxes to maintain their state between postbacks. If you perform a postback using the article's method, then not only will your checkboxes lose their state (kinda sorta) but the Header row for each group will disappear.

Then, I happened to recall something similar written by Matt Dotson and his use of cell-spanning to accomplish a similar feat. He has a great blog entry posted here to illustrate the concept. In fact, he also has a great CodePlex project here that has a bunch of useful "Real World" ASP.NET web controls in it. One of these useful controls is a GroupingGridView, which is essentially a GridView that performs the actual grid grouping. By default, using Matt's sample, the GridView renders like this:

image

And the XHTML to output this simple page is:

<rwg:GroupingGridView ID="GroupGrid" DataSourceID="PubsDataSource" 
                    AutoGenerateColumns="False" GroupingDepth="2"
                    DataKeyNames="au_id" runat="server">
    <Columns>
        <asp:BoundField HeaderText="State" DataField="state" />
        <asp:BoundField HeaderText="City" DataField="city" />
        <asp:BoundField HeaderText="Last Name" DataField="au_lname" />
        <asp:BoundField HeaderText="First Name" DataField="au_fname" />
        <asp:BoundField HeaderText="Phone" DataField="phone" />
        <asp:BoundField HeaderText="Address" DataField="address" />
        <asp:BoundField HeaderText="Zip Code" DataField="zip" />
        <asp:CheckBoxField HeaderText="Contract" DataField="contract" />
    </Columns>
</rwg:GroupingGridView>

Nice, clean and simple - and it gets us most of the way there. Instead of actually displaying the grouped items in the center of the cell, though, we needed the Group text to display at the top of the cell. This is a very simple change to his source code (available on the CodePlex site). In fact, it's only 1 additional line of code in his SpanCellsRecursive method (see the marked line below):

private void SpanCellsRecursive(int columnIndex, 
                                int startRowIndex, 
                                int endRowIndex)
{
    if (columnIndex >= this.GroupingDepth 
            || columnIndex >= this.Columns.Count)
        return;

    TableCell groupStartCell = null;
    int groupStartRowIndex = startRowIndex;

    for (int i = startRowIndex; i < endRowIndex; i++)
    {
        TableCell currentCell = this.Rows[i].Cells[columnIndex];

        bool isNewGroup = (null == groupStartCell) || 
            (0 != 
                String.CompareOrdinal(currentCell.Text, 
                                    groupStartCell.Text));

        // Added line: show the group text at the top of the spanned cell.
        currentCell.VerticalAlign = VerticalAlign.Top;
        if (isNewGroup)
        {
            if (null != groupStartCell)
            {
                SpanCellsRecursive(columnIndex + 1, 
                                   groupStartRowIndex, i);
            }

            groupStartCell = currentCell;
            groupStartCell.RowSpan = 1;
            groupStartRowIndex = i;
        }
        else
        {
            currentCell.Visible = false;
            groupStartCell.RowSpan += 1;
        }
    }

    SpanCellsRecursive(columnIndex + 1, groupStartRowIndex, endRowIndex);
}

Now that gets us the Grouped items in a GridView for our pretty display. Just using that wonderful GridView gets us a page that looks like this:

image

With only this small amount of code in the XHTML:

    <RWC:GroupingGridView runat="server" ID="GroupingGridView1" 
                AutoGenerateColumns="False" 
                GroupingDepth="1" 
                ShowHeader="False">
         <Columns>
            <asp:BoundField DataField="GroupName" ShowHeader="False" /> 
            <asp:TemplateField>
                <ItemTemplate>
               <asp:checkbox runat="server" 
                             id="checkbox1" 
                             Text='<%# Eval("AttributeName") %>'
                             Checked='<%# DirectCast(Eval("ParentId"),Int32) > 0 %>' />
                </ItemTemplate>
            </asp:TemplateField>                               
        </Columns>
    </RWC:GroupingGridView>

And then you just bind this grid behind the scenes:

    Protected Sub Page_Load(ByVal sender As Object, _
                            ByVal e As System.EventArgs) _
                            Handles Me.Load

        If Not IsPostBack Then

            Dim attributeList As _
                    System.Collections.Generic.IList(Of Attribute)
            attributeList = Me.GetAttributes()

            With Me.GroupingGridView1
                .DataSource = attributeList
                .DataBind()
            End With

        End If
    End Sub

Now, from a look and feel perspective - we're just about done. But what about getting the values out of the checkbox - so that we know:

  1. Whether the checkbox was selected.
  2. What the value of the checkbox is - ie: the primary key of the Attribute object.

The checked property is relatively simple to get out of the page. In the Submit button's event, you loop through the rows in the GridView and extract out the instance of the Checkbox for each row and then check the "Checked" property of the checkbox to determine if the checkbox is selected:

For Each row As GridViewRow In Me.gvGrouping.Rows
  If row.RowType = DataControlRowType.DataRow Then
    Dim obj As CheckBox = _
         TryCast(row.FindControl("chkbxPermissions"), _
                                CheckBox)
    If obj IsNot Nothing Then
        Response.Write(String.Format("Row {0}: checked value is {1} <br>", _
             row.RowIndex.ToString, obj.Checked.ToString))
     End If
  End If
Next

Now this will write to the page the RowIndex and a value indicating whether the checkbox was checked. This will not, though, give me the AttributeId for these checkboxes because the ASP.NET 2.0 CheckBox control does not actually have a "Value" property like the items in a CheckBoxList control do. So, what do we do? Well, there are two solutions:

  1. Use the DataKeyNames property of the GridView to specify that the AttributeId is our key for this GridView and then extract it from the row.
  2. Create a new CheckBox control that will include a new property to hold the Checkbox's Value.

There is no real difference between the two options mentioned above. The actual size of the page is EXACTLY the same, and accessing the GridView's DataKey property is no faster or slower than reading a new property on a new CheckBox control. I'm choosing to go with the latter as I know of a few other situations where a control like this might come in handy.
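For comparison, here is a rough sketch of what the first option could look like. It assumes that DataKeyNames="AttributeId" has been added to the grid's markup, that GroupingGridView derives from GridView, and that AttributeId is an integer. The GridKeyHelper module and GetCheckedAttributeIds function are hypothetical names used only for this illustration:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Web.UI.WebControls

Public Module GridKeyHelper

    ' Hypothetical helper: returns the AttributeId of every checked row.
    ' Assumes the grid declares DataKeyNames="AttributeId" in its markup.
    Public Function GetCheckedAttributeIds(ByVal grid As GridView) As List(Of Int32)
        Dim ids As New List(Of Int32)()

        For Each row As GridViewRow In grid.Rows
            If row.RowType = DataControlRowType.DataRow Then
                Dim box As CheckBox = _
                    TryCast(row.FindControl("chkbxPermissions"), CheckBox)

                If box IsNot Nothing AndAlso box.Checked Then
                    ' DataKeys is indexed by row position; Value returns the
                    ' first (here, the only) key field for that row.
                    ids.Add(Convert.ToInt32(grid.DataKeys(row.RowIndex).Value))
                End If
            End If
        Next

        Return ids
    End Function

End Module
```

In the Submit button's event you would call something like GridKeyHelper.GetCheckedAttributeIds(Me.gvGrouping). The keys travel in the grid's own state rather than in each checkbox, which is why the page weight comes out the same either way.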

So, since it is not already there - let's create a new control that derives from the existing CheckBox control. We'll call this new control a ValueCheckBox control and add a new property:

Namespace WebControls
    Public Class ValueCheckBox
        Inherits CheckBox

        Public Property CheckboxValue() As String
            Get
                Return DirectCast(ViewState("checkboxvalue"), String)
            End Get
            Set(ByVal value As String)
                ViewState("checkboxvalue") = value
            End Set
        End Property
    End Class
End Namespace

Then, we add the new Register tag to the top of our page:

<%@ Register TagPrefix="sp" Namespace="WebControls" %>

And then slightly change our XHTML to include a reference to the new checkbox in our GridView:

<RWC:GroupingGridView runat="server" ID="gvGrouping" 
                AutoGenerateColumns="False" 
                GroupingDepth="1" 
                ShowHeader="False">
         <Columns>
            <asp:BoundField DataField="GroupName" ShowHeader="False" /> 
            <asp:TemplateField>
                <ItemTemplate>
                    <sp:ValueCheckBox runat="server" 
                       ID="chkbxPermissions" 
                       Text='<%# Eval("AttributeName") %>' 
                       Checked='<%# DirectCast(Eval("ParentId"),Int32) > 0 %>' 
                       CheckboxValue='<%# Eval("AttributeId") %>' />          
                </ItemTemplate>
            </asp:TemplateField>                               
        </Columns>
</RWC:GroupingGridView>

Lastly, you tweak your submit button code-behind to get at the new control:

        For Each row As GridViewRow In Me.gvGrouping.Rows
            If row.RowType = DataControlRowType.DataRow Then
                Dim obj As WebControls.ValueCheckBox = _
                    TryCast(row.FindControl("chkbxPermissions"), _
                                WebControls.ValueCheckBox)
                If obj IsNot Nothing Then
                    Dim checked As Boolean = obj.Checked
                    Dim checkboxValue As String = obj.CheckboxValue
                End If
            End If
        Next

And now you're done. My previous post discusses the best way to store these values, so I won't go into that here.

Enjoy!

Friday, July 23, 2010 12:43:03 AM (Central Daylight Time, UTC-05:00) ( Database | Development )

 

If you've been a developer for any period of time, you have probably come across this question: given a one-to-many relationship between tables, what is the best way to insert those values into a database? For example, let's say that you had a Person table and an Attribute table. Each person can have more than one attribute associated with them, hence the 1-to-Many relationship. Your database diagram probably looks something like this:

bs_1_to_many_relationship

So, if given a Person and a list of Attributes for that person, what is the best way to insert rows into the PersonAttributeXref table? Well, if this is the first time that you've done something like this, you probably came up with something like this:

  1. Create a Stored Procedure that accepts a UserId and an AttributeId value and then inserts one row into the PersonAttributeXref table.
  2. Loop through the selected Attributes and call that Stored Procedure once for each item you would like to insert.

So what is wrong with this method? Not a whole lot if you are only inserting a few (1 to 3) rows into the PersonAttributeXref table. However, when inserting more than those few rows into the table, you will begin to notice a performance hit. Why? Well, if you are using something like the Enterprise Library to perform your inserts, your code probably looks a little something like this:

Public Sub InsertAttribute(ByVal userId As Int32, ByVal attributeId As Int32)
    Dim db As Database = DatabaseFactory.CreateDatabase()
    Using dbCommand As Common.DbCommand = _
          db.GetStoredProcCommand("USP_BS_INSERT_PERSONATTRIBUTEXREF")

        db.AddInParameter(dbCommand, "@UserId", DbType.Int32, userId)
        db.AddInParameter(dbCommand, "@AttributeId", DbType.Int32, attributeId)

        Try
            db.ExecuteNonQuery(dbCommand)
        Catch ex As SqlClient.SqlException
            '// Insert error handling here.
        End Try
    End Using
End Sub

The problem with something like this is that each time you call this method, a new connection is established with the database and the insert is then completed. Establishing a connection to the database is very expensive, even with connection pooling. In fact, during some tests that I ran locally, calling this method 100 times took in excess of 937,500 ticks to complete. You also don't have the option of rolling back the rows already inserted if something fails.
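If you want to take this kind of measurement yourself, one way to do it is with a Stopwatch. This is only a sketch and not necessarily how the numbers in this post were gathered. TimingSketch and DoOneInsert are hypothetical names, and DoOneInsert is an empty placeholder for whichever insert method you are testing. The conversion at the end assumes the quoted figures are standard 100-nanosecond .NET ticks:

```vbnet
Imports System
Imports System.Diagnostics

Module TimingSketch

    Sub Main()
        Dim watch As Stopwatch = Stopwatch.StartNew()

        ' Time 100 calls, mirroring the 100-row test described in the post.
        For i As Int32 = 1 To 100
            DoOneInsert(i)
        Next

        watch.Stop()

        ' Elapsed.Ticks is in 100 ns TimeSpan units. ElapsedTicks (not used
        ' here) is in Stopwatch units that depend on Stopwatch.Frequency.
        Console.WriteLine("{0} ticks = {1} ms", _
                          watch.Elapsed.Ticks, _
                          watch.Elapsed.TotalMilliseconds)

        ' Converting a quoted figure, assuming 100 ns ticks:
        ' 937,500 ticks is 93.75 ms.
        Console.WriteLine(TimeSpan.FromTicks(937500).TotalMilliseconds)
    End Sub

    ' Hypothetical placeholder for the real call,
    ' e.g. InsertAttribute(userId, attributeId).
    Private Sub DoOneInsert(ByVal attributeId As Int32)
    End Sub

End Module
```

Running each approach several times and comparing the spread is more telling than any single number, since the first call usually pays one-off costs such as filling the connection pool.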

Well, chances are, you've probably noticed this issue and have tried to overcome it in a variety of ways. In fact, the next thing that I've seen people try is to concatenate the AttributeIds into a delimited string and then pass that string to a single stored procedure:

        Dim attributeIdList As New System.Text.StringBuilder()
        For i As Int32 = 1 To MAXINSERTS
            If attributeIdList.Length > 0 Then
                attributeIdList.Append(",")
            End If
            attributeIdList.Append(i.ToString)
        Next

And then in the Stored Procedure, you parse the string using a simple UDF like the one you can find here and then perform the insert there - all wrapped in a nice little transaction. If you have done something like this successfully, you have probably seen transaction times somewhere in the neighborhood of 625,000 ticks for 100 inserted rows. Wow! That's a nice performance boost, right? Yes, it is - however, I've always been a firm believer that SQL Server is no place to be concatenating and splitting strings. Just because you can do it doesn't mean it should be done. In fact, I'd argue that splitting and concatenating strings makes your procedure a lot less supportable than performing similar actions in .NET code.

Faced with these two options, I'd propose a third. In .NET, you have the option of using Transactions (feel free to look here or here or even here for a description). Transactions in .NET work the same way as Transactions in SQL: you wrap one or more operations in a Transaction, and if all operations succeed, the transaction is committed; if one or more fail, you roll it back. In the context of this example, how would it work? Well, assuming you have an array of AttributeIds, your code would probably look like this:

Public Sub InsertAttributeWithTransaction(ByVal userId As Int32, _
                                          ByVal attributeIdList As Int32())
    Dim db As Database = DatabaseFactory.CreateDatabase()
    Using dbCommand As Common.DbCommand = _
            db.GetStoredProcCommand("USP_BS_INSERT_PERSONATTRIBUTEXREF")

        db.AddInParameter(dbCommand, "@UserId", DbType.Int32, userId)
        ' Placeholder value; the real AttributeId is set inside the loop.
        db.AddInParameter(dbCommand, "@AttributeId", DbType.Int32, -1)

        Using connection As Common.DbConnection = _
                db.CreateConnection()
            connection.Open()
            Dim transaction As Common.DbTransaction = _
                    connection.BeginTransaction()
            Try
                ' Reuse the same command and connection for every insert.
                For i As Int32 = 0 To attributeIdList.Length - 1
                    dbCommand.Parameters("@AttributeId").Value = _
                            attributeIdList(i)
                    db.ExecuteNonQuery(dbCommand, transaction)
                Next
                ' Commit the transaction
                transaction.Commit()

            Catch ex As Exception
                ' Roll back every insert, then rethrow so the caller
                ' knows that nothing was saved.
                transaction.Rollback()
                Throw
            End Try
            ' No explicit Close() is needed: End Using disposes the connection.
        End Using
    End Using
End Sub

As you can see, this is pretty nice and straightforward. You open the connection, begin the transaction, process the inserts and then, depending upon the result of the operations, you either Commit() or Rollback() the transaction. So, what is the benefit of this? Well, aside from the obvious readability (and consequently supportability) improvements, you also get a speed improvement! In fact, for the same 100 rows that I inserted previously using the first or second approach, the total time to complete the transaction here is between 156,250 ticks and 312,500 ticks. Compared with the 937,500 ticks in option 1 and 625,000 ticks in option 2, that is a three- to six-fold improvement over option 1 and two- to four-fold over option 2 - and that's only for 100 rows, which, I'd imagine, is the high end for UI-driven cases.

If anyone has experienced issues with using .NET transactions or performance foibles, I'd love to hear about them. Please feel free to either comment here or e-mail me at greg at samurai programmer dot com.