Samurai Programmer.com

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

Sunday, 17 October 2010

Yield Return…a little known but incredible language feature

Sunday, 17 October 2010 17:54:49 (Central Daylight Time, UTC-05:00) ( .NET | Performance )

“We shall neither fail nor falter; we shall not weaken or tire…give us the tools and we will finish the job.” – Winston Churchill

I don’t often blog about specific language features but over the past few weeks I’ve spoken to a few folks that did not know of the “yield” keyword and the “yield return” and “yield break” statements, so I thought it might be a good opportunity to shed some light on this little known but extremely useful C# feature. Chances are, you’ve probably indirectly used this feature before and just never known it.

We’ll start with a problem I’ve seen plague many applications. Oftentimes, you’ll call a method that returns a List<T> or some other concrete collection. The method probably looks something like this:

public static List<string> GenerateMyList()
{
    List<string> myList = new List<string>();

    for (int i = 0; i < 100; i++)
    {
        myList.Add(i.ToString());
    }

    return myList;
}

I’m sure your logic is going to be significantly more complex than what I have above but you get the idea. There are a few problems and inefficiencies with this method. Can you spot them?

The entire List<T> must be stored in memory.
Its caller must wait for the List<T> to be returned before it can process anything.
The method itself returns a List<T> back to its caller.

As an aside – with public methods, you should strive to not return a List<T> in your methods. Full details can be found here. The main idea here is that if you choose to change the method signature and return a different collection type in the future, this would be considered a breaking change to your callers.

In any case, I’ll focus on the first two items in the list above. If the List<T> that is returned from the GenerateMyList() method is large then that will be a lot of data that must be kept around in memory. In addition, if it takes a long time to generate the list, your caller is stuck until you’ve completely finished your processing.

Instead, you can use that nifty “yield” keyword. This allows the GenerateMyList() method to return items to its caller as they are being processed. This means that you no longer need to keep the entire list in memory and can just return one item at a time until you get to the end of your returned items. To illustrate my point, I’ll refactor the above method into the following:

private static IEnumerable<string> GenerateMyList()
{
    for (int i = 0; i < 100; i++)
    {
        string value = i.ToString();
        Console.WriteLine("Returning {0} to caller.", value);
        yield return value;
    }

    Console.WriteLine("Method done!");

    yield break;
}

A few things to note in this method. The return type has been changed to IEnumerable<string>, an interface that exposes an enumerator, which allows the caller to cycle through the results in a foreach or while loop. In addition, the “yield return value” statement hands that item to the caller at that point, not when the entire method has completed its processing. This allows for a very exciting caller-callee relationship. For example, if I call this method like so:

IEnumerable<string> myList = GenerateMyList();

foreach (string listItem in myList)
{
    Console.WriteLine("Item: " + listItem);
}

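The output would be (abbreviated):

Returning 0 to caller.
Item: 0
Returning 1 to caller.
Item: 1
...
Returning 99 to caller.
Item: 99
Method done!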

Thus showing that each item gets returned at the time we processed it. So, how does this work? Well, at compile time, the C# compiler generates a class that implements the iterator’s behavior. Essentially, this means that the GenerateMyList() method body gets placed into that class’s MoveNext() method. In fact, if you open up the compiled assembly in Reflector, you can see that plumbing in place (comments are mine and some code was omitted for clarity’s sake):

private bool MoveNext()
{
    this.<>1__state = -1;
    this.<i>5__1 = 0;
    // My for loop has changed to a while loop.
    while (this.<i>5__1 < 100)
    {
        // Sets a local value
        this.<value>5__2 = this.<i>5__1.ToString();
        // Here is my Console.WriteLine(...)
        Console.WriteLine("Returning {0} to caller.", this.<value>5__2);
        // Here is where the current member variable
        // gets stored.
        this.<>2__current = this.<value>5__2;
        this.<>1__state = 1;
        // We return "true" to the caller so it knows
        // there is another record to be processed.
        return true;
        ...
    }
    // Here is my Console.WriteLine() at the bottom
    // when we've finished processing the loop.
    Console.WriteLine("Method done!");
    break;
}

Pretty straightforward. Of course, the real power is that the compiler converts the “yield return <blah>” into a nice clean enumerator with a MoveNext(). In fact, if you’ve used LINQ, you’ve probably used this feature without even knowing it. Consider the following code:

private static void OutputLinqToXmlQuery()
{
    XDocument doc = XDocument.Parse
                        (@"<root><data>hello world</data><data>goodbye world</data></root>");

    var results = from data in doc.Descendants("data")
                  select data;

    foreach (var result in results)
    {
        Console.WriteLine(result);                
    }
}

The “results” object will, by default, be of type “WhereSelectEnumerableIterator”, which exposes a MoveNext() method. In fact, that is also why the results object doesn’t allow you to do something like this:

var results = from data in doc.Descendants("data")
              select data;

var bad = results[1];

IEnumerable<T> does not expose an indexer that lets you go straight to a particular element, because the full collection hasn’t been generated yet. Instead, you would do something like this:

var results = from data in doc.Descendants("data")
              select data;            

var good = results.ElementAt(1);

And then under the covers, the ElementAt(int) method will just keep calling MoveNext() until it reaches the index you specified. Something like this:

Note: this is my own code and is NOT from the .NET Framework – it is merely meant to illustrate a point.

public static XElement MyElementAt(this IEnumerable<XElement> elements, 
                                   int index)
{
    int counter = 0;
    using (IEnumerator<XElement> enumerator = 
                                 elements.GetEnumerator())
    {            
        while(enumerator.MoveNext()){
            if (counter == index)
                return enumerator.Current;
            counter++;
        }
    }

    return null;
}
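
To make the "only as much work as the caller asks for" point concrete, here is a minimal sketch. The LazyDemo class, its Produced counter, and the Endless() and UntilBlank() methods are all hypothetical names of my own, made up for illustration. It assumes the built-in ElementAt(int) walks an iterator like this one item at a time, and it also shows “yield break” ending a sequence early:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many items the iterator has actually produced.
    public static int Produced;

    // Hypothetical iterator: an endless sequence of numbered strings.
    // Nothing runs until a caller starts pulling items via MoveNext().
    public static IEnumerable<string> Endless()
    {
        int i = 0;
        while (true)
        {
            Produced++;
            yield return (i++).ToString();
        }
    }

    // Asks for the element at index 3; the iterator is only advanced
    // far enough to reach it, even though the sequence never ends.
    public static string Fourth()
    {
        Produced = 0;
        return Endless().ElementAt(3);
    }

    // "yield break" ends the sequence early: stop at the first empty line.
    public static IEnumerable<string> UntilBlank(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            if (line.Length == 0)
            {
                yield break;
            }
            yield return line;
        }
    }
}
```

Calling LazyDemo.Fourth() should return "3" and leave Produced at 4, which is the whole point: a List<T>-returning version of Endless() could never return at all.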

Hope this helps to demystify some things and put another tool in your toolbox.

Until next time.

Friday, 20 August 2010

Micro optimization or just good coding practice?

This is a common topic and I thought I’d write up some thoughts I have on it. In fact, I was just working with a customer on improving their code reviews and what they should be checking for, and the question arose - “Should performance be targeted during a code review?” It’s an interesting question. I’m a big fan of performance testing early and often rather than waiting until the end of a dev cycle, but code reviews, IMO, should focus on logic, maintainability and best practices. I may be in the minority, and if you look around the web, you’ll see varying opinions on the topic. For example, one of the PAG articles states:

“Code reviews should be a regular part of your development process. Performance and scalability code reviews focus on identifying coding techniques and design choices that could lead to performance and scalability issues. The review goal is to identify potential performance and scalability issues before the code is deployed. The cost and effort of fixing performance and scalability flaws at development time is far less than fixing them later in the product deployment cycle.

Avoid performance code reviews too early in the coding phase because this can restrict your design options. Also, bear in mind that that performance decisions often involve tradeoffs. For example, it is easy to reduce maintainability and flexibility while striving to optimize code.”

As I mentioned above, I am a huge proponent of performance analysis and optimization many times throughout a typical product development cycle. I can say with a fair amount of certainty that if you don’t build performance reviews into your project plan at regular intervals, you will hit some problem (or multiple problems) in production and have to refactor some code.

Circling back to the original question, though, are code reviews the place for performance analysis? Typically, I’d recommend using them to squash little bits of bad code but maintainability and code-cleanliness should be first and foremost in your minds. That said, if you see a pattern that you know can be improved, by all means bring it up. What’s an example of that type of situation?

Let’s take a look at predicates, specifically their usage in the Find method of a List<T>. If you’re not aware, the Find() method performs a linear search through the items until it finds the first match, then it returns. This makes it an O(n) operation, where “n” is the number of items in the list. Basically, this means that the more items you have in the list, the longer a Find() operation can potentially take. So, if we slam 10,000 elements into a list:

private static List<Data> LoadList()
{
    List<Data> myList = new List<Data>();
    for (int i = 0; i < 10000; i++)
    {
        myList.Add(new Data() { Id = "Id" + i.ToString(), 
                                Value = "Value" + i.ToString() });
    }

    return myList;
}

Then, if someone wants to return the instance of the Data class that contains an Id of, say, “Id9999” (the last element, and so the worst case for a linear search), they might write the following code:

static Data Find1(List<Data> myList, string idToFind)
{
    Data data = myList.Find(s => 
                        s.Id.ToLower() == 
                        idToFind.ToLower());

    return data;
}

Now, keep in mind that the predicate is executed for each element in the List<T> until it finds the instance you care about. With that in mind, we would probably want to hoist the “idToFind.ToLower()” call out of the predicate, since that value isn’t changing. So, you might end up with something like this:

static Data Find2(List<Data> myList, string idToFind)
{

    idToFind = idToFind.ToLower();

    Data data = myList.Find(s => 
                            s.Id.ToLower() == 
                            idToFind);

    return data;
}

Another route you may want to go is just to use the string.Equals(…) method to perform the comparison. That would look like:

static Data Find3(List<Data> myList, string idToFind)
{

    Data data = myList.Find(s => 
                            string.Equals(
                                s.Id, 
                                idToFind, 
                                StringComparison.
                                    InvariantCultureIgnoreCase)
                           );

    return data;

}

Fact is, the last method IS the fastest of the three, since it avoids allocating a lowercased copy of every Id it inspects. I can say that without even needing to run it through a profiler. But if you don’t believe me…

Function Name	Elapsed Inclusive Time
...Find1(System.Collections.Generic.List`1<....Data>,string)	6.34
...Find2(System.Collections.Generic.List`1<....Data>,string)	4.47
...Find3(System.Collections.Generic.List`1<....Data>,string)	3.65

That’s something I might put into the category of a micro-optimization AND just good coding practice. But is this something that should be caught during a code review? I’d say “yes” because logically it all makes sense and none of the solutions would really hurt maintainability or readability of the code.

So, I’d tag this as a good coding practice. Other thoughts on the topic?
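
One further variant that may be worth measuring is sketched below. It swaps InvariantCultureIgnoreCase for StringComparison.OrdinalIgnoreCase, which skips culture-sensitive rules entirely. This is my own hypothetical Find variant (FindOrdinal, in a made-up FindDemo class), not one of the profiled methods above, and it assumes the Ids are plain identifiers where culture rules don’t matter. I have not profiled it here, so treat any speedup as something to verify yourself:

```csharp
using System;
using System.Collections.Generic;

public class Data
{
    public string Id { get; set; }
    public string Value { get; set; }
}

public static class FindDemo
{
    // Same shape as LoadList() above: Id0 .. Id9999.
    public static List<Data> LoadList()
    {
        var list = new List<Data>();
        for (int i = 0; i < 10000; i++)
        {
            list.Add(new Data { Id = "Id" + i, Value = "Value" + i });
        }
        return list;
    }

    // Hypothetical variant: ordinal, case-insensitive comparison.
    // Assumption: Ids are plain identifiers, so culture rules are not needed.
    public static Data FindOrdinal(List<Data> list, string idToFind)
    {
        return list.Find(s => string.Equals(
            s.Id, idToFind, StringComparison.OrdinalIgnoreCase));
    }
}
```

Like Find3, this allocates nothing per element and still returns null when no Id matches.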

Enjoy!

Tuesday, 03 August 2010

Azure: Why did my role crash?

Tuesday, 03 August 2010 01:48:17 (Central Daylight Time, UTC-05:00) ( .NET | Azure | Development | Logging )

One thing you might encounter when you start your development on Windows Azure is that there is an insane number of options available for logging. You can view a quick primer here. One of the things that I like about it is that you don’t necessarily need to learn a whole new API just to use it. Instead, the logging facilities in Azure integrate really well with the existing Debug and Trace logging API in .NET. This is a really nice feature and is done very well in Azure. In fact, setting it up and configuring it takes all of about 5 lines of code. Actually it’s four lines of code with one line that wraps:

public override bool OnStart()
{
    DiagnosticMonitorConfiguration dmc =
            DiagnosticMonitor.GetDefaultInitialConfiguration();
    dmc.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
    dmc.Logs.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
    DiagnosticMonitor.Start("DiagnosticsConnectionString", dmc);

    // OnStart() must return a value; defer to the base implementation.
    return base.OnStart();
}

One specific item to note is the ScheduledTransferPeriod property of the Logs property. The minimum value you can set for that property is the equivalent of 1 minute. The only downside to this method of logging is that if your Azure role crashes within that minute, whatever data you have written to your in-built logging will be lost. This also means that if you are writing exceptions to your logging, that will be lost as well. That can cause problems if you’d like to know both when your role crashed and why (the exception details).

Before we talk about a way to get around it, let’s review why the Role might crash. In the Windows Azure world, there are two primary roles that you will use, a Worker and Web role. Main characteristics and reasons it would crash are below:

Role Type	Analogous To	Why would it crash/restart?
Worker Role	Console Application	Any unhandled exception.
Web Role	ASP.NET Application hosted in IIS 7.0+	Any unhandled exception thrown on a background thread; a StackOverflowException; an unhandled exception on the finalizer thread.

As you can see, most of the reasons why an Azure role would recycle/crash/restart are the same as with any other application: an unhandled exception. Therefore, to mitigate this issue, we can subscribe to the AppDomain’s UnhandledException event. This event fires when your application experiences an exception that is not caught, RIGHT BEFORE the application crashes. You can subscribe to this event in the role’s OnStart() method:

public override bool OnStart()
{
    AppDomain appDomain = AppDomain.CurrentDomain;
    appDomain.UnhandledException +=
        new UnhandledExceptionEventHandler(appDomain_UnhandledException);
    ...
}

You will now be notified right before your process crashes. The last piece to this puzzle is logging the exception details. Since you must log the details right when it happens, you can’t just use the normal Trace or Debug statements. Instead, we will write to the Azure storage directly. Steve Marx has a good blog entry about printf in the cloud. While it works, it requires the connection string to be placed right into the logging call, and he mentions that you don’t want that in a production application. In our case, we will do things a little bit differently. First, we must add the requisite variables and initialize the storage objects:

private static bool storageInitialized = false;
private static object gate = new Object();
private static CloudBlobClient blobStorage;
private static CloudQueueClient queueStorage;

private void InitializeStorage()
{
    if (storageInitialized)
    {
        return;
    }

    lock (gate)
    {
        if (storageInitialized)
        {
            return;
        }

        // read account configuration settings
        var storageAccount =
            CloudStorageAccount.FromConfigurationSetting("DataConnectionString");

        // create blob container for the error reports
        blobStorage =
            storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobStorage.
            GetContainerReference("webroleerrors");

        container.CreateIfNotExist();

        // configure container for public access
        var permissions = container.GetPermissions();
        permissions.PublicAccess =
            BlobContainerPublicAccessType.Container;
        container.SetPermissions(permissions);

        storageInitialized = true;
    }
}

This will instantiate the requisite logging variables and then when the InitializeStorage() method is executed, we will set these variables to the appropriate initialized values. Lastly, we must call this new method and then write to the storage. We put this code in our UnhandledException event handler:

void appDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
{
    // Initialize the storage variables.
    InitializeStorage();

    // Get Reference to error container.
    var container = blobStorage.
               GetContainerReference("webroleerrors");

    if (container != null)
    {
        // Retrieve last exception.
        Exception ex = e.ExceptionObject as Exception;

        if (ex != null)
        {
            // Will create a new entry in the container
            // and upload the text representing the
            // exception.
            container.GetBlobReference(
               String.Format(
                  "<insert unique name for your application>-{0}-{1}",
                  RoleEnvironment.CurrentRoleInstance.Id,
                  DateTime.UtcNow.Ticks)
               ).UploadText(ex.ToString());
        }
    }
}

Now, when your Azure role is about to crash, you’ll find an entry in your blob storage with the details of the exception that was thrown. For example, one of the leading reasons why an Azure Worker Role crashes is because it can’t find a dependency it needs. So, in that case, you’ll find an entry in your storage with the following details:

System.IO.FileNotFoundException: Could not load file or assembly
    'GuestBook_Data, Version=1.0.0.0, Culture=neutral,
    PublicKeyToken=f8a5fcb6c395f621' or one of its dependencies.
    The system cannot find the file specified.

File name: 'GuestBook_Data, Version=1.0.0.0, Culture=neutral,
    PublicKeyToken=f8a5fcb6c395f621'
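
If you want to try the handler pattern outside of Azure first, a minimal sketch follows. Everything in it is an assumption of mine for illustration: the CrashLogger class, its method names, and the choice to write to a local directory instead of blob storage. It only mirrors the two ideas above, namely building an entry name from application name, instance id and UTC ticks, and capturing the exception text at the moment the UnhandledException event fires:

```csharp
using System;
using System.IO;

public static class CrashLogger
{
    // Hypothetical helper: mirrors the "<name>-{instance id}-{ticks}"
    // naming scheme used for the blob entries above.
    public static string BuildEntryName(string appName, string instanceId, DateTime utcNow)
    {
        return String.Format("{0}-{1}-{2}", appName, instanceId, utcNow.Ticks);
    }

    // Formats the report text; ExceptionObject is typed as object,
    // so fall back gracefully when it is not an Exception.
    public static string Describe(object exceptionObject)
    {
        Exception ex = exceptionObject as Exception;
        return ex != null
            ? ex.ToString()
            : "Non-exception object thrown: " + exceptionObject;
    }

    // Subscribes to the AppDomain event and writes one text file per crash.
    // Assumption: 'directory' exists and is writable by the process.
    public static void Register(string appName, string instanceId, string directory)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            string name = BuildEntryName(appName, instanceId, DateTime.UtcNow);
            File.WriteAllText(
                Path.Combine(directory, name + ".txt"),
                Describe(e.ExceptionObject));
        };
    }
}
```

Keeping the naming and formatting in small pure methods makes them easy to check without having to crash a process.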

Some other common reasons why an Azure role might crash (especially when you first deploy it) can be found at Anton Staykov’s excellent blog.

Hope this helps.

Enjoy!

Friday, 23 July 2010

.NET 2.0 and Access Databases

For an internal application at my company, very particular users have the rights to download an MDB that contains links to a replicated instance of a DB. This was done so they could create their own queries without any assistance from IT - essentially a poor man's Cognos or Reporting Services. This worked fine in .Net 1.1.

We are currently in the process of upgrading this application to .Net 2.0 and one of the developers on the project brought to my attention that the download was failing with a message stating:

She did some research and uncovered this forum post with step by step instructions on how to re-enable the downloading of an Access database.

Please keep in mind, though, that this was done for security purposes - so any page that links to a secured Access DB - should be secured in a variety of ways:

1. Code Access Security:

<PrincipalPermission(SecurityAction.Demand, _
                     Authenticated:=True, _
                     Role:="Secured User")> _

This should be placed in the code-behind on the page with the link to the file. As you can see, the "Role" property will secure this page to just a user in a particular role. If you have a page that is accessible to multiple roles, you can stack the class attributes on top of each other as seen here:

<PrincipalPermission(SecurityAction.Demand, _
                     Authenticated:=True, _
                     Role:="First Role")> _
<PrincipalPermission(SecurityAction.Demand, _
                     Authenticated:=True, _
                     Role:="Second Role")> _
Partial Class PagesToBeSecured

2. File Access Restriction:

Also, by putting a web.config in the folder that houses the Access DB, you can secure access to the contents of a particular folder. For example with a structure of:

Root/Downloads/Database/db.mdb

You can put a web.config in the "Database" folder that restricts access to that folder to a particular user/role/etc. For example, the web.config in this example would look something like:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <system.web>
        <authorization>
            <allow roles="Role 1" />
            <allow roles="Role 2" />
            <deny users="*" />
        </authorization>
    </system.web>
</configuration>

As you can see, only users in Role 1 and Role 2 may access the contents of this folder. Any other user who attempts to access the file will get a Page Cannot Be Found error.
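
The same allow-list idea can also be checked imperatively in code, for example if the file is served through a download handler instead of a direct link. The sketch below is an assumption of mine, not part of the original application: DownloadGuard and CanDownload are hypothetical names, and the role names are simply copied from the web.config above. It relies only on the standard IPrincipal interface:

```csharp
using System.Security.Principal;

public static class DownloadGuard
{
    // Role names copied from the web.config example; adjust to your own.
    private static readonly string[] AllowedRoles = { "Role 1", "Role 2" };

    // Returns true only for an authenticated user in one of the allowed
    // roles, mirroring "allow Role 1, allow Role 2, deny everyone else".
    public static bool CanDownload(IPrincipal user)
    {
        if (user == null || user.Identity == null || !user.Identity.IsAuthenticated)
        {
            return false;
        }

        foreach (string role in AllowedRoles)
        {
            if (user.IsInRole(role))
            {
                return true;
            }
        }

        return false;
    }
}
```

In an ASP.NET page you would typically pass in the current user's principal (for example, the page's User property) before streaming the file.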

In conclusion - if you are going to remove this particular restriction in IIS to allow users to download Access DB files directly from your web application - then you shouldn't completely discount security, and you should implement both of the measures above.

Enjoy!

ASP.NET 2.0: Textbox ReadOnly Property

Friday, 23 July 2010 22:42:12 (Central Daylight Time, UTC-05:00) ( .NET | .NET Upgrade | ASP.NET | Development )

This has been documented in numerous places including here, here, here (with an incorrect fix) and here.

Essentially, if you set the ReadOnly property of a TextBox, either in the XHTML or in the code-behind, any changes you make to the value of that TextBox client-side (i.e., via JavaScript) will not be persisted across postbacks. I had found this technique extremely effective for Calendar controls and what-not: you just provide a link to a popup, and when the user selects an item in the popup, you populate the underlying TextBox with the value from the popup. Unfortunately, out of the box, that functionality has been taken away.

The TextBox code responsible for this behavior is in a few places:

Protected Overrides Function SaveViewState() As Object
    If Not Me.SaveTextViewState Then
        Me.ViewState.SetItemDirty("Text", False)
    End If
    Return MyBase.SaveViewState
End Function

Private Shadows ReadOnly Property SaveTextViewState() _
                                            As Boolean
    Get

        If (Me.TextMode = TextBoxMode.Password) Then
            Return False
        End If
        If (((MyBase.Events.Item(EventTextChanged) Is Nothing) _
                AndAlso MyBase.IsEnabled) _
                AndAlso ((Me.Visible AndAlso Not Me.ReadOnly) _
                AndAlso (MyBase.GetType Is GetType(TextBox)))) Then
            Return False
        End If

        Return True
    End Get
End Property

In the SaveTextViewState property, part of the conditional statement checks the ReadOnly property: saving the Text to ViewState is skipped only when the control is not read-only, so a read-only TextBox always keeps its server-side Text value in ViewState.

Also, in the LoadPostData() method, there is another check to make sure that the control's ReadOnly property is not set to true before it loads the data from the form post:

Protected Overrides Function LoadPostData( _
                ByVal postDataKey As String, _
                ByVal postCollection As _
                    NameValueCollection) As Boolean
    MyBase.ValidateEvent(postDataKey)
    Dim text As String = Me.Text
    Dim text2 As String = _
                postCollection.Item(postDataKey)
    If (Not Me.ReadOnly _
        AndAlso Not [text].Equals(text2, _
            StringComparison.Ordinal)) Then
        Me.Text = text2
        Return True
    End If

    Return False
End Function

If the ReadOnly property is set to true, then it will not set the Text property of the TextBox to be the value from the Form post.
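
To see the effect in isolation, here is a small stand-alone model of that check. It is only a sketch: FakeTextBox and PostDataModel are hypothetical types of my own and not ASP.NET classes. The method follows the same condition as the LoadPostData() code above, so you can watch a posted value get dropped when ReadOnly is true:

```csharp
using System;

// Hypothetical stand-in for the control's state; not an ASP.NET type.
public sealed class FakeTextBox
{
    public bool ReadOnly { get; set; }
    public string Text { get; set; }
}

public static class PostDataModel
{
    // Same condition as the framework code above: the posted value is
    // applied only when the control is not read-only and the value changed.
    public static bool LoadPostData(FakeTextBox box, string postedValue)
    {
        string current = box.Text ?? string.Empty;

        if (!box.ReadOnly &&
            !current.Equals(postedValue, StringComparison.Ordinal))
        {
            box.Text = postedValue;
            return true;
        }

        return false;
    }
}
```

With ReadOnly set to true, a posted value of "new" leaves Text at whatever the server last stored, which is exactly the symptom described above.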

The fix that people are touting is relatively simple. In the code behind of the page, you can simply add the attribute manually to the TextBox like this:

Protected Sub Page_Load(ByVal sender As Object, _
        ByVal e As System.EventArgs) Handles Me.Load
    With Me.txtTest
        .Attributes.Add("readonly", "readonly")
    End With
End Sub

And while this would work, it just feels like a bandage. Plus, if you are converting a big web project from .NET 1.1 to .NET 2.0 that has TextBoxes with ReadOnly=True all over the place, this would be a lot of code to include on every single page. Even if you have a shared function somewhere that you can call, you still have to call it for each TextBox that you wish to set as ReadOnly.

An alternative is to create your own TextBox class that inherits from the existing TextBox and handles the ReadOnly property in the fashion mentioned above. An example of the code for the new TextBox is as follows:

Namespace WebControls
    Public Class TextBox
        Inherits System.Web.UI.WebControls.TextBox

        Public Overrides Property [ReadOnly]() As Boolean
            Get
                Return False
            End Get
            Set(ByVal value As Boolean)
                If value Then
                    Me.Attributes("readonly") = "readonly"
                Else
                    Me.Attributes.Remove("readonly")
                End If
            End Set
        End Property

    End Class
End Namespace

That's fine for new TextBoxes, but what about my hundreds and hundreds of TextBoxes that are already in my application? Well, .Net 2.0 allows you to use something called tagMappings in your web.config. Essentially, they are a way to redirect the compiler to use a different class for an existing mapping when it compiles your pages. From MSDN:

"Defines a collection of tag types that are remapped to other tag types at compile time."

So, how do we use this neat feature to solve this problem? Well, in the web.config, we will tell the .Net compiler to use our new TextBox control instead of the existing System.Web.UI.WebControls.TextBox control. In order to do this, you simply add the following to your web.config:

<pages>
      <tagMapping>
        <add tagType="System.Web.UI.WebControls.TextBox"
             mappedTagType="WebControls.TextBox"/>
      </tagMapping>
</pages>

The tagType attribute is the existing tag you would like to change (System.Web.UI.WebControls.TextBox) and the mappedTagType is the new TextBox control from above.

With this tag added in the web.config - whenever you have the following in a page:

<asp:TextBox runat="server"
             ID="txtTest"
             ReadOnly="true" />

Your application will use the TextBox defined directly above.

Enjoy!
