Samurai Programmer.com

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

Thursday, 29 July 2010

IIS: When to enable web gardening?

Thursday, 29 July 2010 18:50:09 (Central Daylight Time, UTC-05:00) ( Best Practice | IIS | Performance )

I seem to get this question a lot and come across many customer environments where they have enabled web gardening thinking that it will automagically improve the performance for their site/application.

Most of the time, that is not the case. The funny thing is that once I finally convince them that web gardening is not the way to go, they try to apply that same knowledge to other sites and applications in their environment. When this happens, I’ll get an e-mail or phone call asking for some guidelines on when to enable web gardening.

We typically recommend using Web Gardening as a stop-gap (or workaround) for when a customer has a core issue that is limiting their website and web application scalability.

For example, if a customer has a memory issue that is causing OutOfMemoryExceptions in their main website – we may recommend web gardening to spread the load across multiple worker processes while we assist them in resolving the core memory issue. Please note that this would also increase the memory and processor utilization on the server and in some cases might not be viable.

As a best practice, create Web gardens only for Web applications that meet the following criteria (taken from here):

The application runs multi-instantiated, so that a different instance of the application can be assigned to each worker process.
The Web application is not CPU-intensive. If the CPU is the bottleneck, then adding worker processes cannot help improve performance.
The application is subject to synchronous high latency. For example, if an application calls a back-end database and the response is slow, then a Web garden supports other concurrent connections without waiting for the slow connection to complete.

A good discussion of why not to use Web Gardening can be found here as well: http://blogs.technet.com/b/mscom/archive/2007/07/10/gardening-on-the-web-server.aspx

Enjoy!


Are you ready for your next challenge? (We’re hiring!)

Thursday, 29 July 2010 16:21:57 (Central Daylight Time, UTC-05:00) ( Hiring | Microsoft | Premier Field Engineer (PFE) )

First, a few questions:

Do you enjoy helping developers write better code?
Do you enjoy solving complex problems that span multiple technologies?
Do you enjoy optimizing and improving code?
Are you passionate about software development?
Do you enjoy managing your own calendar?
Do you want to make the world a better place?

If the answer to these questions is “YES!”, then please read on.

We are now actively recruiting for 3 Developer Premier Field Engineering positions at Microsoft.

You may ask yourself, what does a Dev PFE do?

We do both proactive and reactive work encompassing a variety of Microsoft’s developer products. In general, the reactive work is where a customer is experiencing a problem (usually in production) and they need someone onsite to help them resolve the issue. The proactive work usually takes the form of knowledge transfer to companies on how they can improve the maintainability of their code, how to debug problems and how to optimize their applications. I have also done a fair number of “proofs of concept” for customers where they just don’t know how to do something or they want me to prove that it can be done.

In the past year, I’ve worked on projects using .NET (1.1 – 4.0), Windows Azure, Internet Explorer, Bing, Bing Translator, Windows 7 and many others.

And the list really does go on. You will work with our largest customers around the world helping them write better code, solving complex issues, teaching them about the latest technology and just making the world a better place.

If this sounds like a good fit for you, click here to go to the Microsoft Careers site to apply.


Sunday, 25 July 2010

Site update July 2010 (aka - Where are all of your posts?)

Sunday, 25 July 2010 12:47:04 (Central Daylight Time, UTC-05:00) ( )

If you have visited my blog anytime in the last 2 weeks – you may have noticed an error page. This was due to my hosting provider “accidentally” deleting my site’s database. This was actually a perfect storm of sorts. All three of these things happened within the past few weeks:

I had recently repaved my main machine and decided to wipe out my database backups. I figured I would just get a fresh backup afterwards.
With some recent travel for work, I did not have a chance to get those database backups.
The hosting company did not have a backup of the database they deleted.

All three of these extenuating circumstances led to this site being in a sorry state of repair.

Fortunately, all is not entirely lost, as both Bing and Google have cached copies of most of my posts. It does leave me in a quandary over whether I should repost everything from the past four years or just start fresh. For now, I’ve decided to repost only those blog entries that had more than 1,000 views. That said, if you remember a blog entry and it hasn’t been reposted, please send me a message at greg [at] samuraiprogrammer dot com and I will get it reposted.

Please enjoy the new site.

Thanks!


Friday, 23 July 2010

How to reset your Flip UltraHD Video camcorder

Friday, 23 July 2010 23:35:28 (Central Daylight Time, UTC-05:00) ( )

I recently ran into a problem with my Flip UltraHD Video camcorder where it would not turn on. Unlike other camcorders in the Flip family, there is no microscopic reset button anywhere on the device.

After e-mailing support, I received a response the next day with the steps to reset the device. For some reason, this information is not listed in their knowledge base on the support site, so I thought I would post it here.

The steps to reset/resolve an issue where your UltraHD will not turn on are as follows:

1. Remove the battery pack from the camcorder.
2. Connect the camcorder to a powered USB port on your computer.
3. When the "Connected" indicator comes on, insert the battery pack into the camcorder.
4. Safely eject your camcorder from your computer.
5. Reconnect your camcorder to your computer.
6. The battery pack should now begin to charge within the camcorder.

Hope this helps someone else out there.


Creating an instructor guide from a PowerPoint deck.

Friday, 23 July 2010 22:58:54 (Central Daylight Time, UTC-05:00) ( Powerpoint | Presentations | VBA )

I recently came across the situation where I had several PowerPoint decks that were VERY well documented. Essentially, each slide had reams of notes in the Notes panel of the deck. This is both good and bad. It was good because for preparation purposes, it was very easy to review the notes as you reviewed the slides. It was bad because on many of the machines I presented from, you could not split the display so that the slides were on one screen and the notes were on another.

This presented a conundrum because it’s nice to be able to have the notes handy as I present the material. That way, I can refer back to the main bullets to make sure I covered everything before moving on. For other workshops I teach, we have an instructor guide on paper that you can have up next to your presenter machine on the podium and all is good to go.

Unfortunately, there was nothing like that set up already for this workshop. Fortunately, PowerPoint 2007 provides a feature to export your presentation to Word 2007. Full steps are located here but essentially you are looking for this screen:

[Screenshot: Creating handouts in Word]

This is a nice solution but the main problem that I had is that it embeds PowerPoint objects into the Word document. This *sounds* like a good idea but it grows the file size of the document to a huge degree. For example, one deck had 36 slides and the file size of the resulting Word document was >50MB!

Fortunately, with some nice VBA code, you can do the following:

Iterate through every shape in the document.
Copy the PowerPoint object.
Paste that object as a JPEG.
Delete the PowerPoint object.

That brings the file size back down to where it should be, from my perspective. For example, in that file I mentioned previously, it brought the size down from 55MB to 642KB. Talk about a tremendous improvement!

The VBA code to use is as follows:

    Sub ConvertPowerPointToImage()
        ' Replace each embedded PowerPoint object with a pasted picture.
        Dim j As InlineShape
        For Each j In Word.ActiveDocument.InlineShapes
            j.Select
            Selection.Copy
            ' DataType 15 is the value used here to paste the copy as a JPEG.
            Selection.PasteSpecial Link:=False, _
                      DataType:=15, _
                      Placement:=wdInLine, _
                      DisplayAsIcon:=False
            j.Delete
        Next j
    End Sub

Just copy/paste the above code into a new module. Then execute the “ConvertPowerPointToImage” macro. That will clean everything up and make the file a lot more manageable.

Enjoy!


.NET 2.0 and Access Databases

For an internal application at my company, certain users have the rights to download an MDB that contains links to a replicated instance of a DB. This was done so they could create their own queries without any assistance from IT, essentially a poor man's Cognos or Reporting Services. This worked fine in .NET 1.1.

We are currently in the process of upgrading this application to .NET 2.0 and one of the developers on the project brought to my attention that the download was failing with a message stating:

She did some research and uncovered this forum post with step by step instructions on how to re-enable the downloading of an Access database.

Please keep in mind, though, that this was done for security purposes - so any page that links to a secured Access DB - should be secured in a variety of ways:

1. Code Access Security:

    ' Requires: Imports System.Security.Permissions
    <PrincipalPermission(SecurityAction.Demand, _
                         Authenticated:=True, _
                         Role:="Secured User")> _
    Partial Class PagesToBeSecured

This should be placed in the code-behind on the page with the link to the file. As you can see, the "Role" property will secure this page to just a user in a particular role. If you have a page that is accessible to multiple roles, you can stack the class attributes on top of each other as seen here:

    <PrincipalPermission(SecurityAction.Demand, _
                         Authenticated:=True, _
                         Role:="First Role")> _
    <PrincipalPermission(SecurityAction.Demand, _
                         Authenticated:=True, _
                         Role:="Second Role")> _
    Partial Class PagesToBeSecured

2. File Access Restriction:

Also, by putting a web.config in the folder that houses the Access DB, you can secure access to the contents of a particular folder. For example with a structure of:

    Root/Downloads/Database/db.mdb

You can put a web.config in the "Database" folder that restricts access to that folder to a particular user/role/etc. For example, the web.config in this example would look something like:

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
        <system.web>
            <authorization>
                <allow roles="Role 1" />
                <allow roles="Role 2" />
                <deny users="*" />
            </authorization>
        </system.web>
    </configuration>

As you can see, we are only allowing users in roles Role 1 and Role 2 to access the contents of this folder. If any other users attempt to access the file, they will get a Page Cannot Be found exception.
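
The order of the allow and deny elements matters because ASP.NET URL authorization checks the rules from top to bottom and stops at the first one that matches. Here is a minimal sketch of that first-match behavior, written in Python purely as a model. The `is_allowed` function and the rule tuples are names I made up for illustration; they are not part of ASP.NET.

```python
# Toy model of first-match URL authorization; this is not ASP.NET code.
# Each rule is (action, kind, value), mirroring <allow roles="..."/> and <deny users="*"/>.
RULES = [
    ("allow", "roles", "Role 1"),
    ("allow", "roles", "Role 2"),
    ("deny", "users", "*"),
]


def is_allowed(user_roles, rules=RULES):
    """Return True if the first rule that matches the user is an allow rule."""
    for action, kind, value in rules:
        if kind == "roles":
            matches = value in user_roles
        else:
            matches = value == "*"  # "*" matches every user
        if matches:
            return action == "allow"
    # Assumption for this sketch only: deny when nothing matches.
    # A real configuration also inherits rules from parent config files.
    return False


print(is_allowed({"Role 1"}))      # prints True
print(is_allowed({"Some Other"}))  # prints False (falls through to deny users="*")
```

If the deny rule were listed first, every user would match it and nobody would get in, which is why it goes last.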

In conclusion - if you are going to remove this particular restriction in IIS to allow users to download Access DB files directly from your web application - then you shouldn't completely discount security, and you should implement both of the measures above.

Enjoy!


ASP.NET 2.0: Textbox ReadOnly Property

Friday, 23 July 2010 22:42:12 (Central Daylight Time, UTC-05:00) ( .NET | .NET Upgrade | ASP.NET | Development )

This has been documented in numerous places including here, here, here (with an incorrect fix) and here.

Essentially, if you set the ReadOnly property of a TextBox, either in the XHTML markup or in the code-behind, any changes you make to the value of that TextBox client-side (i.e., via JavaScript) will not be persisted across postbacks. I had found this technique extremely effective for calendar controls and the like. You just provide a link to a popup and then, when the user selects an item in the popup, you populate the underlying TextBox with the value from the popup. Unfortunately, out of the box, that functionality has been taken away.

The code in the TextBox that gives the ReadOnly property this behavior is in a few places:

    Protected Overrides Function SaveViewState() As Object
        ' Drop Text from ViewState only when it does not need to be tracked.
        If Not Me.SaveTextViewState Then
            Me.ViewState.SetItemDirty("Text", False)
        End If
        Return MyBase.SaveViewState
    End Function

    Private Shadows ReadOnly Property SaveTextViewState() _
                                                As Boolean
        Get

            If (Me.TextMode = TextBoxMode.Password) Then
                Return False
            End If
            If (((MyBase.Events.Item(EventTextChanged) Is Nothing) _
                    AndAlso MyBase.IsEnabled) _
                    AndAlso ((Me.Visible AndAlso Not Me.ReadOnly) _
                    AndAlso (MyBase.GetType Is GetType(TextBox)))) Then
                Return False
            End If

            Return True
        End Get
    End Property

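In the SaveTextViewState property, the conditional returns False (do not keep Text in ViewState) only for a plain, enabled, visible TextBox that has no TextChanged handler and is not read-only. When the ReadOnly property is set to true, that check fails and the server-side Text value is kept in ViewState.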

Also, in the LoadPostData() method, there is another check to make sure that the control's ReadOnly property is not set to true before it loads the data from the form post:

    Protected Overrides Function LoadPostData( _
                    ByVal postDataKey As String, _
                    ByVal postCollection As _
                        NameValueCollection) As Boolean
        MyBase.ValidateEvent(postDataKey)
        Dim text As String = Me.Text
        Dim text2 As String = _
                    postCollection.Item(postDataKey)
        ' The posted value is applied only when the control is not read-only.
        If (Not Me.ReadOnly _
            AndAlso Not [text].Equals(text2, _
                StringComparison.Ordinal)) Then
            Me.Text = text2
            Return True
        End If

        Return False
    End Function

If the ReadOnly property is set to true, then it will not set the Text property of the TextBox to be the value from the Form post.

The fix that people are touting is relatively simple. In the code behind of the page, you can simply add the attribute manually to the TextBox like this:

    Protected Sub Page_Load(ByVal sender As Object, _
            ByVal e As System.EventArgs) Handles Me.Load
        With Me.txtTest
            .Attributes.Add("readonly", "readonly")
        End With
    End Sub

And while this would work - this just feels like a bandage. Plus, if you have a big web project you are converting from .NET 1.1 to .NET 2.0 that contains TextBoxes all over the place with ReadOnly=True set, then this would be a lot of code to include on every single page. Even if you have a shared function somewhere that you can call - that means you still have to call the function for each TextBox that you wish to set as ReadOnly.

An alternative is to create your own TextBox class that inherits from the existing TextBox and handles the ReadOnly property in the fashion mentioned above. An example of the code for the new TextBox is as follows:

    Namespace WebControls
        Public Class TextBox
            Inherits System.Web.UI.WebControls.TextBox

            ' Always report False to the base class and render the
            ' readonly attribute ourselves instead.
            Public Overrides Property [ReadOnly]() As Boolean
                Get
                    Return False
                End Get
                Set(ByVal value As Boolean)
                    If value Then
                        Me.Attributes("readonly") = "readonly"
                    Else
                        Me.Attributes.Remove("readonly")
                    End If
                End Set
            End Property

        End Class
    End Namespace

That's fine for new TextBoxes, but what about my hundreds and hundreds of TextBoxes that are already in my application? Well, .NET 2.0 allows you to use something called tagMapping in your web.config. Essentially, it is a way to redirect the compiler to use a different class for an existing tag type when it compiles your pages. From MSDN:

"Defines a collection of tag types that are remapped to other tag types at compile time."

So, how do we use this neat feature to solve this problem? Well, in the web.config, we will tell the ASP.NET compiler to use our new TextBox control instead of the existing System.Web.UI.WebControls.TextBox control. In order to do this, you simply add the following to your web.config:

    <pages>
      <tagMapping>
        <add tagType="System.Web.UI.WebControls.TextBox"
             mappedTagType="WebControls.TextBox"/>
      </tagMapping>
    </pages>

The tagType attribute is the existing tag you would like to change (System.Web.UI.WebControls.TextBox) and the mappedTagType is the new TextBox control from above.

With this tag added in the web.config - whenever you have the following in a page:

    <asp:TextBox runat="server"
                 ID="txtTest"
                 ReadOnly="true" />

Your application will use the TextBox defined directly above.

Enjoy!


LINQ: When are my database connections closed?

Friday, 23 July 2010 02:02:43 (Central Daylight Time, UTC-05:00) ( Database | Development | LINQ | Performance )

So, I was working with a customer who is writing their first application using LINQ. They had previously been bitten by the failure to close and dispose their SqlConnection objects. This is actually a fairly common problem and usually leads to those pesky SqlExceptions detailing that there are no connections left in the pool.

So, since LINQ to SQL abstracts out much of the direct database interaction, they were concerned about when the underlying SqlConnections are closed. I will walk through how I answered their question using a few of my favorite tools:

To start off, I created a simple SQL Table called Users:

Then, I created a simple LINQ to SQL dbml:

Now that the plumbing is in place, I can write some simple code to return the data from the table and display it to the console window:

    // Requires: using System; using System.Data.Linq; using System.Linq;
    LinqConnectionSampleDataContext db =
        new LinqConnectionSampleDataContext();

    Table<User> users = db.GetTable<User>();

    IQueryable<User> userQuery =
        from user in users
        orderby user.firstName
        select user;

    // Iterate through the full result set and print each row.
    foreach (User user in userQuery)
    {
        Console.WriteLine("ID={0}, First Name={1}",
                            user.id,
                            user.firstName);
    }

So, now when the application is executed, the output is as follows:

So, since LINQ to SQL uses an underlying SqlConnection to do its work, we can set a breakpoint on the Close() method of that class in WinDBG. If you are unfamiliar with this great debugging tool, you can find a simple walkthrough on how to set it up here.

There are a number of ways to set a breakpoint in managed code in WinDBG. Here are the steps that I followed:

Step 1. Launch WinDBG and attach to the process in question.

Step 2. Load the SOS extension into WinDBG by executing:

.loadby sos mscorwks

Step 3. Set the managed breakpoint using the !bpmd command. For this step, the !bpmd command accepts a variety of parameters. Basically, you can pass it either:

A MethodDesc address.
A combination of module name and managed function name.

I chose the latter method because it’s relatively quick and I knew exactly what I wanted. So, the syntax for this method is:

!bpmd <module name> <managed function name>

You can get the module name from visiting the SqlConnection page up on MSDN. On this page, we can get the module name and the namespace to the class:

From this, we can get both parameters necessary:

Module Name: System.Data.dll
Managed Function Name: System.Data.SqlClient.SqlConnection.Close

So, our command in WinDBG becomes:

    !bpmd System.Data.dll System.Data.SqlClient.SqlConnection.Close

Once you enter in this command, you should get output similar to the following in the WinDBG window:

0:014> !bpmd System.Data.dll System.Data.SqlClient.SqlConnection.Close
Found 1 methods...
MethodDesc = 544a0418
Setting breakpoint: bp 5455DC80 [System.Data.SqlClient.SqlConnection.Close()]

Step 4. “Go” in the debugger and wait for your breakpoint to be hit.

For this, the command is simply “g”.

0:014> g

Eventually, your breakpoint will be hit in the debugger and you should get output similar to the following:

Breakpoint 0 hit
eax=5457da68 ebx=04d7e9dc ecx=0185cd30 edx=018e56b0 esi=01870d80 edi=04d7e9a4
eip=5455dc80 esp=04d7e860 ebp=04d7e868 iopl=0         nv up ei pl nz na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
System_Data_ni+0xcdc80:
5455dc80 55              push    ebp

Step 5. Print out the call-stack.

The command to print out the call stack in SOS and WinDBG is “!clrstack”:

0:008> !clrstack

This will print out the managed call stack, which turns out to be:

OS Thread Id: 0x1d70 (8)
ESP       EIP     
04d7e860 5455dc80 System.Data.SqlClient.SqlConnection.Close()
04d7e864 77e20586 System.Data.Linq.SqlClient.SqlConnectionManager
                    .CloseConnection()
04d7e870 77e20554 System.Data.Linq.SqlClient.SqlConnectionManager
                    .ReleaseConnection(...)
04d7e87c 77e1da35 System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReaderSession`1[...].Dispose()
04d7e888 77e1ddac System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReaderSession`1[...].CheckNextResults()
04d7e894 77e1df2c System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReaderBase`1[...].Read()
04d7e8a0 77e1ea2d System.Data.Linq.SqlClient.
        ObjectReaderCompiler+ObjectReader`2[...].MoveNext()
04d7e8ac 004f1a12 LINQ.SqlConnection.Program.Main(System.String[])

So, if you’re having trouble parsing this, the takeaway here is that when you iterate through a LINQ result set and you get to the end, the ObjectReaderSession will automatically close the connection to the database.

Now, this is a simple HelloWorld code sample for retrieving a result-set and there are obviously a number of ways to do the same thing. The customer’s code was closer to the following:

    using (IEnumerator<User> enumerator =
              context.ExecuteQuery<User>(sqlStatement).GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            // Do something here
        }
    }

In this situation, we get an IEnumerator<T> back from the database call and iterate through it. Now, this part is very important. If you are iterating through the result set to completion – the connection will be closed the same as the above. However, if you do something like this:

    using (IEnumerator<User> enumerator =
        db.ExecuteQuery<User>(sqlStatement).GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            Console.WriteLine("ID={0}, First Name={1}",
                enumerator.Current.id,
                enumerator.Current.firstName);

            // Stop iterating after this record.
            break;
        }
    }

Please note the “break” statement. Essentially, if you are NOT iterating through to completion, the call stack looks like:

OS Thread Id: 0x251c (11)
ESP       EIP     
0522e73c 5455dc80 System.Data.SqlClient.SqlConnection.
                    Close()
0522e740 77e20586 System.Data.Linq.SqlClient.SqlConnectionManager.
                    CloseConnection()
0522e74c 77e20554 System.Data.Linq.SqlClient.SqlConnectionManager.
                    ReleaseConnection(...)
0522e758 77e1da35 System.Data.Linq.SqlClient.ObjectReaderCompiler+
                    ObjectReaderSession`1[...].Dispose()
0522e764 77e1ea12 System.Data.Linq.SqlClient.ObjectReaderCompiler+
                    ObjectReader`2[...].Dispose()
0522e768 00691bde LINQ.SqlConnection.Program.Main(System.String[])

The connection will NOT be closed until Dispose() is called on the ObjectReader (IEnumerator<T>) object, which is what the using statement does here. This means that if you happen to write some code without the using statement when returning data like this:

    IEnumerator<User> enumerator =
        db.ExecuteQuery<User>(sqlStatement).GetEnumerator();

    while (enumerator.MoveNext())
    {
        Console.WriteLine("ID={0}, First Name={1}",
            enumerator.Current.id,
            enumerator.Current.firstName);

        // Stop iterating after this record.
        break;
    }

The SqlConnection.Close() method will NOT be called. This is because you have full control over the lifetime of the IEnumerator<T> object and you should know when you are done with it.
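
To make the three cases easier to compare side by side, here is a small model of the behavior observed in the debugger. It is written in Python as a neutral sketch, not as LINQ to SQL code. `FakeConnection` and `FakeReader` are hypothetical stand-ins that I made up, and the `with` block plays the role of the C# using statement.

```python
class FakeConnection:
    """Hypothetical stand-in for SqlConnection; it only tracks open/closed."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


class FakeReader:
    """Hypothetical stand-in for the enumerator returned by GetEnumerator()."""

    def __init__(self, rows, connection):
        self._rows = iter(rows)
        self._connection = connection
        self.current = None

    def move_next(self):
        try:
            self.current = next(self._rows)
            return True
        except StopIteration:
            self.dispose()  # reaching the end releases the connection
            return False

    def dispose(self):
        self._connection.close()

    # Context-manager support plays the role of the C# using statement.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.dispose()
        return False


def read_all_rows(rows):
    """Iterate to completion; return whether the connection is still open."""
    connection = FakeConnection()
    reader = FakeReader(rows, connection)
    while reader.move_next():
        pass
    return connection.is_open


def read_first_row(rows, use_using):
    """Stop after one row; return whether the connection is still open."""
    connection = FakeConnection()
    reader = FakeReader(rows, connection)
    if use_using:
        with reader:
            reader.move_next()  # stop early, dispose runs on block exit
    else:
        reader.move_next()      # stop early, nothing ever disposes the reader
    return connection.is_open


print(read_all_rows(["a", "b"]))          # prints False
print(read_first_row(["a", "b"], True))   # prints False
print(read_first_row(["a", "b"], False))  # prints True (connection left open)
```

Only the last case leaves the connection open, which matches the missing Close() breakpoint hit described above.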

Now, along those lines, you may be asking yourself – what if I did something like this:

    LinqConnectionSampleDataContext db =
        new LinqConnectionSampleDataContext();

    Table<User> users = db.GetTable<User>();

    IQueryable<User> userQuery =
        from user in users
        orderby user.firstName
        select user;

    foreach (User user in userQuery)
    {
        Console.WriteLine("ID={0}, First Name={1}",
                            user.id,
                            user.firstName);

        // Stop iterating after this record.
        break;
    }

Where you break before you iterate through to completion? In that situation, Dispose() will still be called on the enumerator that foreach obtains from the IQueryable<T> object. How? Because of what the C# compiler does with a foreach loop. It wraps the loop in a try block and inserts a finally block that disposes the enumerator. This compiles down to (in IL):

 try{
    L_005d: br.s L_0084
    L_005f: ldloc.s CS$5$0002
    L_0061: callvirt instance !0...get_Current()
    L_0066: stloc.3 
    L_0067: ldstr "ID={0}, First Name={1}"
    L_006c: ldloc.3 
    L_006d: callvirt instance int32 LINQ.SqlConnection.User::get_id()
    L_0072: box int32
    L_0077: ldloc.3 
    L_0078: callvirt instance string LINQ.SqlConnection.User::get_firstName()
    L_007d: call void [mscorlib]System.Console::WriteLine(string, object, object)
    L_0082: br.s L_008d
    L_0084: ldloc.s CS$5$0002
    L_0086: callvirt instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
    L_008b: brtrue.s L_005f
    L_008d: leave.s L_009b
  }finally{
    L_008f: ldloc.s CS$5$0002
    L_0091: brfalse.s L_009a
    L_0093: ldloc.s CS$5$0002
    L_0095: callvirt instance void [mscorlib]System.IDisposable::Dispose()
    L_009a: endfinally
  } 
    L_009b: ret 
    .try L_005d to L_008f finally handler L_008f to L_009b

Note the callvirt to IDisposable::Dispose() inside the finally block (highlighted in red in the original post). So, the moral of this story: when you take control of the data yourself, you MUST call Dispose() on the IEnumerator<T> object when you are done with it.
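
If IL is not your thing, the same shape can be written out by hand. The following is a rough sketch in Python of the try/finally that the IL above describes, not the compiler's actual output. `TrackingEnumerator` is a hypothetical class I made up that simply records whether dispose() was called.

```python
class TrackingEnumerator:
    """Hypothetical enumerator that records whether dispose() was called."""

    def __init__(self, items):
        self._items = list(items)
        self._index = -1
        self.disposed = False

    def move_next(self):
        self._index += 1
        return self._index < len(self._items)

    @property
    def current(self):
        return self._items[self._index]

    def dispose(self):
        self.disposed = True


def foreach_with_break(enumerator):
    """Hand-written equivalent of a foreach loop that breaks after one item."""
    seen = []
    try:
        while enumerator.move_next():
            seen.append(enumerator.current)
            break  # leave the loop early
    finally:
        # Mirrors the null check (brfalse.s) before the Dispose() call in the IL.
        if enumerator is not None:
            enumerator.dispose()
    return seen


e = TrackingEnumerator(["Alice", "Bob"])
print(foreach_with_break(e))  # prints ['Alice']
print(e.disposed)             # prints True
```

The finally block runs whether the loop finishes, breaks, or throws, which is why foreach never leaks the enumerator.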

Enjoy!


String Concatenation – A Performance Story

Friday, 23 July 2010 02:01:29 (Central Daylight Time, UTC-05:00) ( Development | Performance )

This is an old topic, but as I work with developers more and more, I find that there is still some gray area on this topic.

Concatenating strings is probably one of the most common tasks that most developers perform in their work-lives. If you've ever done some searching around, though, you'll find that there are 2 main ways that people perform those concatenations:

First, you can use the "+" or the "&" operator with the native string object:

    Sub ConcatWithStrings(ByVal max As Int32)
        Dim value As String = ""
        For i As Int32 = 0 To max
            value += i.ToString()
        Next
    End Sub

Second, you can use the System.Text.StringBuilder() object to perform the concatenation:

    Sub ConcatWithStringBuilder(ByVal max As Int32)
        Dim value As New System.Text.StringBuilder()
        For i As Int32 = 0 To max
            value.Append(i.ToString())
        Next
    End Sub

So, which is better? Well, if you do any Google'ing or Live Search'ing on the topic, you'll find some great articles on the topic. Mahesh Chand has a great article, which I'll quote:

"You can concatenate strings in two ways. First, traditional way of using string and adding the new string to an existing string. In the .NET Framework, this operation is costly. When you add a string to an existing string, the Framework copies both the existing and new data to the memory, deletes the existing string, and reads data in a new string. This operation can be very time consuming in lengthy string concatenation operations."

Now, I'm a big fan of short descriptions like the above, but I always find that without actually showing what is happening behind the scenes, you might lose some folks in the translation. To illustrate what happens behind the scenes, I wrote a quick console application that uses the first code sample - and then took a memory dump using ADPlus to show what actually gets kept around in memory after the String level concatenation.

I'm not going to go into a lot of detail on what I did to mine through the memory dump, but if you're interested in getting down to this detail on your own - I highly recommend the book Debugging Microsoft .NET 2.0 Applications. After you read that great book, you should read Tess's Great Blog to get even more practice on this area.

In any case, what was uncovered after the memory dump was that with the listing in the first example, there will be a separate String object placed into memory each time you perform a concatenation. Here is an excerpt from that dump:

# Size Value

1 28 "0123"
1 28 "01234"
1 32 "012345"
1 32 "0123456"
1 36 "01234567"
1 36 "012345678"
1 40 "0123456789"
1 44 "012345678910"
1 48 "01234567891011"
1 52 "0123456789101112"
1 56 "012345678910111213"
.....
1 124 "0123456789101112131415161718192021222324252627282930"
1 128 "012345678910111213141516171819202122232425262728293031"
1 132 "01234567891011121314151617181920212223242526272829303132"
1 136 "0123456789101112131415161718192021222324252627282930313233"
1 140 "012345678910111213141516171819202122232425262728293031323334"
1 144 "01234567891011121314151617181920212223242526272829303132333435"
65 17940 "012345678910111213141516171819202122232425262728293031323334353"

First, the command I used outputs the first 65 characters or so within the String object. This is why the last entry has a count of 65 instances. In any case, as you can see, there is a separate copy of each string made into memory during each concatenation operation. Ouch. This can get expensive very quickly!

Now what about the StringBuilder operation?

# Size Value

1 52 "0123456789101112"
1 84 "01234567891011121314151617181920"
3 956 "012345678910111213141516171819202122232425262728293031323334353"

Gee, that seems a lot simpler - but if you're paying attention, you'll notice that there are a few other instances of this lengthy string in memory. Any ideas why?

Well, this appears to be my bug. When you instantiate a System.Text.StringBuilder object, one of the constructors accepts an integer parameter called "Capacity". If you do not specify an initial capacity, the no-args constructor defaults to "&H10" - which is 16 characters. If, during your string operations, you exceed that 16 character capacity, the StringBuilder will create a new string with a capacity of 32 (16 * 2) characters. The next time you perform an operation that needs more than 32 characters, the StringBuilder will double the capacity again - this time to 64 characters. This will continue to happen over and over again as you continually append more characters to the StringBuilder object.
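If you would like to watch that growth happen without taking a memory dump, here is a minimal sketch that records the Capacity property each time it changes. The ObservedCapacities helper and the CapacityDemo module are hypothetical names of my own, and the exact values printed depend on the version of the .NET Framework you run it on, so treat it as an illustration rather than a reference:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module CapacityDemo
    ' Hypothetical helper: appends 0..max to a default StringBuilder and
    ' records each distinct Capacity value it reports along the way.
    Function ObservedCapacities(ByVal max As Int32) As List(Of Int32)
        Dim value As New StringBuilder()            ' no capacity specified
        Dim seen As New List(Of Int32)
        seen.Add(value.Capacity)
        For i As Int32 = 0 To max
            value.Append(i.ToString())
            If value.Capacity <> seen(seen.Count - 1) Then
                seen.Add(value.Capacity)            ' the buffer was regrown
            End If
        Next
        Return seen
    End Function

    Sub Main()
        For Each capacity As Int32 In ObservedCapacities(100)
            Console.WriteLine(capacity)
        Next
    End Sub
End Module
```

Every extra line printed after the first is a point where the StringBuilder had to grow its buffer.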

So, what does this mean? Well, it means that we're still not achieving the maximum efficiency here. If we want to build a String like this - without creating extra instances of the base string object - then even when using a StringBuilder object, we should specify a capacity in the constructor. To prove this hypothesis, we can rewrite the second code listing to be:

    Sub ConcatWithStringBuilder(ByVal max As Int32)
        Dim value As New System.Text.StringBuilder(193)
        For i As Int32 = 0 To max
            value.Append(i.ToString())
        Next
    End Sub

Now, when we take a memory dump and look for the String objects for the above loop:

# Size Value

1 404 "012345678910111213141516171819202122232425262728293031323334353"

We see that there is only one instance of the String object in memory.

Moral of the story? Even when using the StringBuilder object, if you know the final string is going to be lengthy, you should set an initial capacity to most efficiently perform your string concatenations.

Enjoy!


ASP.NET GridView Grouping and Checkbox Lists

Friday, 23 July 2010 00:52:47 (Central Daylight Time, UTC-05:00) ( ASP.NET | Database | Development )

I was helping a co-worker today with one of his screens and realized that it may make a useful little tutorial for all those other folks out there. This was somewhat interesting because it follows up pretty well on my previous post as I talk about saving records with a 1 to many relationship. Basically, he needed to develop a screen similar to the following:

This is actually a pretty simple page, but there are a few places that might cause a hiccup or two:

Grouping items in a checkbox list.
Getting the value for each Attribute (i.e., the PK of the Attribute itself)

The first item - grouping inside of a checkbox list - actually won't work. There is no way to specify groups of checkboxes, or additional properties to define checkboxes, in a CheckBoxList...so the only real option is to put the checkboxes into a GridView control. Now, I've done this type of grouping many times previously and it never really felt right. After Googling the problem with him - "grouping items in a gridview" - the first result was a great series of classes posted here. The only problem with this method is that it is focused on grouping within reports. The difference in this situation is that we wanted the checkboxes to maintain their state between postbacks. If you perform a postback using the article's method, then not only will your checkboxes lose their state (kinda sorta) but the Header row for each group will disappear.

Then, I happened to recall something similar written by Matt Dotson and his use of cell-spanning to accomplish a similar feat. He has a great blog entry posted here to illustrate the concept. In fact, he also has a great CodePlex project here that has a bunch of useful "Real World" ASP.NET webcontrols in it. One of these useful controls is a GroupingGridView. This is essentially a GridView that performs the actual grid grouping for you. By default, using Matt's sample, the GridView renders like this:

And the XHTML to output this simple page is:

<rwg:GroupingGridView ID="GroupGrid" DataSourceID="PubsDataSource" 
                    AutoGenerateColumns="False" GroupingDepth="2"
                    DataKeyNames="au_id" runat="server">
    <Columns>
        <asp:BoundField HeaderText="State" DataField="state" />
        <asp:BoundField HeaderText="City" DataField="city" />
        <asp:BoundField HeaderText="Last Name" DataField="au_lname" />
        <asp:BoundField HeaderText="First Name" DataField="au_fname" />
        <asp:BoundField HeaderText="Phone" DataField="phone" />
        <asp:BoundField HeaderText="Address" DataField="address" />
        <asp:BoundField HeaderText="Zip Code" DataField="zip" />
        <asp:CheckBoxField HeaderText="Contract" DataField="contract" />
    </Columns>
</rwg:GroupingGridView>

Nice, clean and simple - and it gets us most of the way there. Instead of displaying the group text in the center of the cell, though, we needed it to display at the top of the cell. This is a very simple change to his source code (available on the CodePlex site). In fact, it's only 1 additional line of code in his SpanCellsRecursive method (the line that sets currentCell.VerticalAlign below):

private void SpanCellsRecursive(int columnIndex, 
                                int startRowIndex, 
                                int endRowIndex)
{
    if (columnIndex >= this.GroupingDepth 
            || columnIndex >= this.Columns.Count)
        return;

    TableCell groupStartCell = null;
    int groupStartRowIndex = startRowIndex;

    for (int i = startRowIndex; i < endRowIndex; i++)
    {
        TableCell currentCell = this.Rows[i].Cells[columnIndex];

        bool isNewGroup = (null == groupStartCell) || 
            (0 != 
                String.CompareOrdinal(currentCell.Text, 
                                      groupStartCell.Text));

        // Added line: pin the group text to the top of the spanned cell.
        currentCell.VerticalAlign = VerticalAlign.Top;
        if (isNewGroup)
        {
            if (null != groupStartCell)
            {
                SpanCellsRecursive(columnIndex + 1, 
                                   groupStartRowIndex, i);
            }

            groupStartCell = currentCell;
            groupStartCell.RowSpan = 1;
            groupStartRowIndex = i;
        }
        else
        {
            currentCell.Visible = false;
            groupStartCell.RowSpan += 1;
        }
    }

    SpanCellsRecursive(columnIndex + 1, groupStartRowIndex, endRowIndex);
}

Now that gets us the Grouped items in a GridView for our pretty display. Just using that wonderful GridView gets us a page that looks like this:

With only this small amount of code in the XHTML:

    <RWC:GroupingGridView runat="server" ID="GroupingGridView1" 
                AutoGenerateColumns="False" 
                GroupingDepth="1" 
                ShowHeader="False">
         <Columns>
            <asp:BoundField DataField="GroupName" ShowHeader="False" /> 
            <asp:TemplateField>
                <ItemTemplate>
                    <asp:CheckBox runat="server" 
                                  ID="checkbox1" 
                                  Text='<%# Eval("AttributeName") %>'
                                  Checked='<%# DirectCast(Eval("ParentId"),Int32) > 0 %>' />
                </ItemTemplate>
            </asp:TemplateField>                               
        </Columns>
    </RWC:GroupingGridView>

And then you just bind this grid behind the scenes:

    Protected Sub Page_Load(ByVal sender As Object, _
                            ByVal e As System.EventArgs) _
                            Handles Me.Load

        If Not IsPostBack Then

            Dim attributeList As _
                    System.Collections.Generic.IList(Of Attribute)
            attributeList = Me.GetAttributes()

            With Me.GroupingGridView1
                .DataSource = attributeList
                .DataBind()
            End With

        End If
    End Sub

Now, from a look and feel perspective - we're just about done. But what about getting the values out of the checkbox - so that we know:

Whether the checkbox was selected.
What the value of the checkbox is - i.e., the primary key of the Attribute object.

The checked property is relatively simple to get out of the page. In the Submit button's event, you loop through the rows in the GridView and extract out the instance of the Checkbox for each row and then check the "Checked" property of the checkbox to determine if the checkbox is selected:

For Each row As GridViewRow In Me.GroupingGridView1.Rows
    If row.RowType = DataControlRowType.DataRow Then
        ' Find the checkbox declared in the ItemTemplate for this row.
        Dim obj As CheckBox = _
            TryCast(row.FindControl("checkbox1"), CheckBox)
        If obj IsNot Nothing Then
            Response.Write(String.Format("Row {0}: checked value is {1} <br>", _
                row.RowIndex.ToString, obj.Checked.ToString))
        End If
    End If
Next

Now this will write to the page the RowIndex and a value indicating whether the checkbox was checked. This will not, though, give me the AttributeId for these checkboxes because the ASP.NET 2.0 CheckBox control does not actually have a "Value" property the way the items in a CheckBoxList control do. So, what do we do? Well, there are two solutions:

Use the DataKeyNames property of the GridView to specify that the AttributeId is our key for this GridView and then extract it from the row.
Create a new CheckBox control that will include a new property to hold the Checkbox's Value.

Given the two options mentioned above - there are no real differences. The actual size of the page is EXACTLY the same and the speed to access the GridView's DataKey property versus a new property on a new CheckBox control is no different. I'm choosing to go with the latter as I know of a few other situations where a control like this might come in handy.
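For completeness, here is a rough sketch of what the first option might look like. It assumes that you add DataKeyNames="AttributeId" to the grid's markup and that the GroupingGridView can be treated as a regular GridView; the GetCheckedAttributeIds helper and the DataKeyOption module are hypothetical names used only for illustration:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Web.UI.WebControls

Public Module DataKeyOption
    ' Hypothetical helper: returns the AttributeId of every checked row.
    ' Assumes DataKeyNames="AttributeId" is set on the grid and that the
    ' checkbox inside the TemplateField has the ID passed in checkBoxId.
    Public Function GetCheckedAttributeIds(ByVal grid As GridView, _
                                           ByVal checkBoxId As String) As List(Of Int32)
        Dim checkedIds As New List(Of Int32)
        For Each row As GridViewRow In grid.Rows
            If row.RowType = DataControlRowType.DataRow Then
                Dim box As CheckBox = TryCast(row.FindControl(checkBoxId), CheckBox)
                If box IsNot Nothing AndAlso box.Checked Then
                    ' DataKeys is indexed by row position; Value holds the
                    ' single key field named in DataKeyNames.
                    checkedIds.Add(Convert.ToInt32(grid.DataKeys(row.RowIndex).Value))
                End If
            End If
        Next
        Return checkedIds
    End Function
End Module
```

With this approach there is no custom control to write, but the key lives on the grid rather than on the checkbox itself.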

So, since it is not already there - let's create a new control that derives from the existing CheckBox control. We'll call this new control a ValueCheckBox control and add a new property:

Namespace WebControls
    Public Class ValueCheckBox
        Inherits CheckBox

        Public Property CheckboxValue() As String
            Get
                Return DirectCast(ViewState("checkboxvalue"), String)
            End Get
            Set(ByVal value As String)
                ViewState("checkboxvalue") = value
            End Set
        End Property
    End Class
End Namespace

Then, we add the new Register tag to the top of our page:

<%@ Register TagPrefix="sp" Namespace="WebControls" %>

And then slightly change our XHTML to include a reference to the new checkbox in our GridView:

<RWC:GroupingGridView runat="server" ID="gvGrouping" 
                AutoGenerateColumns="False" 
                GroupingDepth="1" 
                ShowHeader="False">
         <Columns>
            <asp:BoundField DataField="GroupName" ShowHeader="False" /> 
            <asp:TemplateField>
                <ItemTemplate>
                    <sp:ValueCheckBox runat="server" 
                       ID="chkbxPermissions" 
                       Text='<%# Eval("AttributeName") %>' 
                       Checked='<%# DirectCast(Eval("ParentId"),Int32) > 0 %>' 
                       CheckboxValue='<%# Eval("AttributeId") %>' />          
                </ItemTemplate>
            </asp:TemplateField>                               
        </Columns>
</RWC:GroupingGridView>

Lastly, you tweak your submit button code-behind to get at the new control:

        For Each row As GridViewRow In Me.gvGrouping.Rows
            If row.RowType = DataControlRowType.DataRow Then
                Dim obj As WebControls.ValueCheckBox = _
                    TryCast(row.FindControl("chkbxPermissions"), _
                                WebControls.ValueCheckBox)
                If obj IsNot Nothing Then
                    Dim checked As Boolean = obj.Checked
                    Dim checkboxValue As String = obj.CheckboxValue
                End If
            End If
        Next

And now you're done. My previous post discusses the best way to store these values, so I won't go into that level.

Enjoy!


One-to-Many: What is the best way to store the values?

Friday, 23 July 2010 00:43:03 (Central Daylight Time, UTC-05:00) ( Database | Development )

If you've been a developer for any period of time, you have probably come across this question. If you're given a one-to-many relationship in a table, what is the best way to insert those values into a database? For example, let's say that you had a Person table and an Attribute table. Each person can have more than one attribute associated with them, hence the 1-to-Many relationship. Your database diagram is probably something that looks like this:

So, if given a Person and a list of Attributes for that person, what is the best way to insert rows into the PersonAttributeXref table? Well, if this is the first time that you've done something like this, you probably came up with something like this:

Create a Stored Procedure that accepts a UserId and AttributeId value and then insert one row into the PersonAttributeXref table.
For each Attribute selected, you simply loop through them and call out to that Stored Procedure once for each item you would like to insert.

So what is wrong with this method? Not a whole lot if you are only inserting a few (1 to 3) rows into the PersonAttributeXref table. However, when inserting more than those few rows into the table, you will begin to notice a performance hit. Why? Well, if you are using something like the Enterprise Library to perform your inserts, your code probably looks a little something like this:

Public Sub InsertAttribute(ByVal userId As Int32, ByVal attributeId As Int32)
     Dim db As Database = DatabaseFactory.CreateDatabase()
     Using dbCommand As Common.DbCommand = _
           db.GetStoredProcCommand("USP_BS_INSERT_PERSONATTRIBUTEXREF")

        db.AddInParameter(dbCommand, "@UserId", DbType.Int32, userId)
        db.AddInParameter(dbCommand, "@AttributeId", DbType.Int32, attributeId)

        Try
             db.ExecuteNonQuery(dbCommand)
        Catch ex As SqlClient.SqlException
            '// Insert error handling here.
        End Try
    End Using
End Sub

The problem with something like this is that each time you call out to this method, a new connection is established with the database and the insert is then completed. The act of establishing a connection out to the database is very expensive - even using Connection Pools. In fact, during some tests that I ran locally, calling out to this method 100 times took in excess of 937,500 ticks to complete. You also don't have the option of rolling back the rows inserted if something fails.

Well, chances are, you've probably noticed this issue and have tried to overcome it in a variety of ways. In fact, the next thing that I've seen people try is to concatenate the attributeIds into a delimited string and then pass that string to a single stored procedure:

        Dim attributeIdList As New System.Text.StringBuilder()
        For i As Int32 = 1 To MAXINSERTS
            If attributeIdList.Length > 0 Then
                attributeIdList.Append(",")
            End If
            attributeIdList.Append(i.ToString)
        Next

And then in the Stored Procedure, you perform the String parsing using a simple UDF like that which you can find here and then you perform the insert there - all wrapped in a nice little transaction. If you have done something like this successfully, you have probably seen transaction times somewhere in the neighborhood of 625,000 ticks for 100 inserted rows. Wow! That's a nice performance boost, right? Yes, it is - however, I've always been a firm believer that SQL Server is no place to be concatenating and splitting strings. Just because you can do it doesn't mean it should be done. In fact, I'd argue that splitting and concatenating strings makes your procedure a lot less supportable than performing similar actions in .NET code.

Faced with these two options, I'm likely to propose a third. In .NET, you have the option of using Transactions (feel free to look here or here or even here for a description). Transactions in .NET work the same way as Transactions in SQL. Basically, you wrap one or more operations in a Transaction and if all operations succeed, then the transaction is committed; if one or more fail, then you roll it back. In the context of this example, though, how would it work? Well, assuming you have an array of AttributeIds, your code would probably look like this:

Public Sub InsertAttributeWithTransaction(ByVal userId As Int32, _
                                          ByVal attributeIdList As Int32())
    Dim db As Database = DatabaseFactory.CreateDatabase()
    Using dbCommand As Common.DbCommand = _
            db.GetStoredProcCommand("USP_BS_INSERT_PERSONATTRIBUTEXREF")

        db.AddInParameter(dbCommand, "@UserId", DbType.Int32, userId)
        db.AddInParameter(dbCommand, "@AttributeId", DbType.Int32, -1)

        Using connection As Common.DbConnection = _
                db.CreateConnection()
            connection.Open()
            Dim transaction As Common.DbTransaction = _
                    connection.BeginTransaction()
            Try
                For i As Int32 = 0 To attributeIdList.Length - 1
                    dbCommand.Parameters("@AttributeId").Value = _
                            attributeIdList(i)
                    db.ExecuteNonQuery(dbCommand, transaction)
                Next
                ' Commit the transaction
                transaction.Commit()

            Catch ex As Exception
                ' Rollback transaction
                transaction.Rollback()
            End Try

            connection.Close()
        End Using
    End Using
End Sub

As you can see, this is pretty nice and straight-forward. You instantiate the connection, create the transaction, process the inserts and then depending upon the result of the operations, you either Commit() or Rollback() the transaction. So, what is the benefit of this? Well, aside from the obvious readability (and consequently supportability) improvements - you also get a speed improvement too! In fact, for the same 100 rows that I inserted previously using the first or second approach, the total time to complete the transaction here is between 156,250 ticks and 312,500 ticks. Compared with the 937,500 ticks in Option 1 and 625,000 ticks in Option 2, that represents an incredible speed improvement - and that's only for 100 rows, which, I'd imagine, is the high end in UI-defined cases.
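If you want to run this kind of comparison against your own database, one possible way to collect tick counts is sketched below. This is not necessarily how the numbers above were gathered; the MeasureTicks helper and the TimingSketch module are hypothetical names, and the sketch assumes .NET 3.5 or later for the Action delegate:

```vb
Imports System
Imports System.Diagnostics

Public Module TimingSketch
    ' Hypothetical helper: runs the supplied work once and returns the
    ' elapsed time expressed in 100-nanosecond ticks (TimeSpan ticks).
    Public Function MeasureTicks(ByVal work As Action) As Long
        Dim timer As Stopwatch = Stopwatch.StartNew()
        work()
        timer.Stop()
        ' Elapsed is a TimeSpan, so .Ticks is in 100 ns units; this is
        ' not the same unit as Stopwatch.ElapsedTicks.
        Return timer.Elapsed.Ticks
    End Function
End Module
```

You would pass it a parameterless Sub that performs the 100 inserts (for example with AddressOf) and compare the values returned for each approach over several runs.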

If anyone has experienced issues with using .NET transactions or performance foibles, I'd love to hear about them. Please feel free to either comment here or e-mail me at greg at samurai programmer dot com.


Input Validation

Friday, 23 July 2010 00:35:39 (Central Daylight Time, UTC-05:00) ( ASP.NET )

It seems that I always run into this issue when performing code reviews. As has been well documented in various locations on the web - all input is evil, until proven otherwise. Some people attempt to mitigate this risk by providing dropdowns or calendar controls to coach users into entering the right values. While this often works, it is still coaching the users and should not be seen as a replacement for solid validation of the user entered data.

For example, imagine you have a textbox control on a page that is supposed to accept a Date value. To "coach" the users into entering an appropriate value, you provide a calendar popup control. Now, when the data has been posted back to the server to be stored in the database, do you:

A) Assume that because we have given the users a calendar popup control, their input will be exactly as you might expect.

B) Perform some additional validation that the textbox has a value entered. If there is a value, assume that it is entered correctly.

C) Imagine that there really isn't that calendar aid and validate the input as if it were entered entirely by hand by the users themselves.

If you answered "C", congratulations, you are correct. The fact is, although we provide those wonderful entry "aids" to assist our users, they will never guarantee correct input. Here is the wrong way to do it:

Dim dateValue As DateTime
dateValue = Date.Parse(Me.txtDateInput.Text)

What happens if the Date Input textbox comes across as an empty string? You will get a big fat exception. Now, in .NET 1.1, there were certain methods that you would code to validate input - or if you were using VB, you could use one of the methods in the wonderful Microsoft.VisualBasic namespace. In .NET 2.0, though, there is a wonderful additional method on each one of these primitive types: "TryParse(....)".

This is really a great little method that takes as input 2 parameters:

String representing the value to parse.
DateTime/Integer/etc. representing where to store the final value - if the input could be parsed.

The great thing about this method is that it returns a Boolean value indicating whether it could successfully parse the value. Therefore, using this new method, the code above could be amended as follows:

Dim dateValue As DateTime

If Date.TryParse(Me.txtDateInput.Text, dateValue) Then
    '// Input is valid.  You can use the dateValue 
    '// variable safely.
Else
    '// Input is NOT valid.  Please direct the user to 
    '// enter a value correctly.
End If

Really powerful! Here's another example - using values obtained from a QueryString as the offending code:

        Dim id As Int32 = -1
        If Request.QueryString("id") IsNot Nothing Then
            id = Int32.Parse(Request.QueryString("id"))
        End If

This looks good, right? It does - and it might even function correctly about 70% of the time. What about that other 30%, though? What happens if a user decides to mess around with the QueryString values? Instead of a number, they decide that it should be a text value instead. Exception! Using the TryParse(...) methods, though, you can do something nice and easy like this:

        Dim id As Int32 = -1
        If Not Int32.TryParse(Request.QueryString("id"), id) Then
            '// Notify user of invalid querystring value.
        End If

This is nice, clean and efficient - plus, it has the added bonus of handling the situation where the QueryString ID value returns nothing.
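One thing to keep in mind: when TryParse fails, it sets the output variable to 0, so the -1 assigned in the snippet above does not survive a failed parse. If you rely on a sentinel value, a small wrapper keeps that explicit. Here is a minimal sketch; the ParseIdOrDefault helper and the TryParseDemo module are hypothetical names used only for illustration:

```vb
Imports System

Module TryParseDemo
    ' Hypothetical helper: parses raw into an Int32, falling back to
    ' fallback when raw is Nothing, empty, or not a valid integer.
    Function ParseIdOrDefault(ByVal raw As String, ByVal fallback As Int32) As Int32
        Dim parsed As Int32
        If Int32.TryParse(raw, parsed) Then
            Return parsed
        End If
        ' On failure TryParse leaves parsed at 0, so return the caller's
        ' fallback explicitly instead of relying on a pre-assigned value.
        Return fallback
    End Function

    Sub Main()
        Dim samples() As String = {"42", "abc", "", Nothing}
        For Each raw As String In samples
            ' Prints 42, then -1 for each of the three invalid inputs.
            Console.WriteLine(ParseIdOrDefault(raw, -1))
        Next
    End Sub
End Module
```

In a page, you could call it as ParseIdOrDefault(Request.QueryString("id"), -1) and treat -1 as "no valid id supplied".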

Really powerful!

These are just the basics on input validation (and I can go on and on and on), but I think you can see my point.

Enjoy!



Step 1. Launch WinDBG and attach to the process in question.

Step 2. Load the SOS extension into WinDBG by executing:

Step 4. “Go” in the debugger and wait for your breakpoint to be hit.

Step 5. Print out the call-stack.