Home | About Me | Developer PFE Blog | Become a Developer PFE

Contact

Categories

On this page

Visualizing Azure performance counters…
Azure: Why did my role crash?

Archive

Blogroll

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

Sign In

# Monday, September 27, 2010
Monday, September 27, 2010 12:19:44 AM (Central Daylight Time, UTC-05:00) ( Azure | Perfmon )

I’ve blogged before about how some things are different in the cloud, namely Azure.  That post dealt with finding out why your Azure Role crashed by using some of the logging facilities and storage available in Azure.  That’s kind of a worst case scenario, though.  Often, you’ll just want to gauge the health of your application.  In the non-cloud world, you’d just use the Windows Performance Monitor to capture performance counters that tell you things like your CPU utilization, memory usage, request execution time, etc.  All of these counters are great to determine your overall application and server health as well as helping you troubleshoot problems that may have arisen with your application. 

I’m often surprised at how many developers have never used Perfmon before or think that it is purely a task that administrators need to care about.  Performance is everyone’s responsibility and a tool like Perfmon is invaluable if you need to gain an accurate understanding of your application’s performance. 

When you’ve needed to log counters in the past, you would have probably become familiar with this dialog:

image

In the Azure world, you can add counters to be collected in a variety of ways.  A popular way is just to add the counters you want in your OnStart() event. 

public override bool OnStart()
{
DiagnosticMonitorConfiguration dmc = DiagnosticMonitor.GetDefaultInitialConfiguration();

// Add counter(s) to collect.
dmc.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration()
{
CounterSpecifier = @"\Processor(*)\*",
SampleRate = TimeSpan.FromSeconds(5)
});

// Set transfer period and filter.
dmc.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

// Start logging, using the connection string in your config file.
DiagnosticMonitor.Start("DiagnosticsConnectionString", dmc);

...

return base.OnStart();
}

This gets the job done quite nicely and even though we’re sampling our performance counter every 5 seconds, the data won’t get written to our persistent storage until we hit the 5 minute mark.

So, let’s fast forward a bit.  You’ve added counters to monitor and now you want to view some of this data.   Fortunately, there are a few tools at your disposal:

For example, using the Azure Management Tool, you can navigate to the Diagnostics/Analysis section to view the performance monitor counters you stored. 

image

Herein lies one of the challenges you may encounter.  Azure is kind of a different beast and it doesn’t store the data in storage using the typical BLG format.  Instead, if you look at the raw data, it will be in an XML or tabular format and may look like:

image

While this is nice and clean, the problem you’ll hit is the visualization of the data.  Most organizations use Perfmon to visualize the counter data you record.  Alas, in this case you can’t use it that way right out of the box.  Fortunately, my friend Tom Fuller and I realized this gap and he wrote a plug-in to the Azure Management Tool that will transform this data from the format above into a native BLG format.  He wrote up a great blog entry with step-by-step instructions and the source code here. You can also download a compiled version of the extension here (along with the Windows Azure MMC) here.  To install the plugin, just copy the DLL into the \WindowsAzureMMC\release folder:

image

Then, when you open up the tool, you’ll see another option in that dropdown in the Diagnostics > Analysis window:

image

Then, when you click show, you’ll get a nice popup telling you to click here to view your data in Perfmon:

image

Then, when you click on it, Perfmon will start with all of the counters added to the display:

image

A few things to note:

  • It will store the resulting BLG file at the following path:

c:\Users\<user name>\AppData\Local\Temp\<File Name>.blg

The BLG can be used in a variety of applications.  For example, you can use it with the great PAL tool.  

  • All of the counters will appear in the graph by default.  Depending upon the amount of data you’re capturing, this could give you data overload.  Just something to be aware of.

Until next time.

# Tuesday, August 3, 2010
Tuesday, August 3, 2010 1:48:17 AM (Central Daylight Time, UTC-05:00) ( .NET | Azure | Development | Logging )

One thing you might encounter when you start your development on windows-azure-logo_1-f2e19cWindows Azure is that there is an insane number of options available for number of options for logging.   You can view a quick primer here.  One of the things that I like about it is that you don’t necessarily need to learn a whole new API just to use it.  Instead, the logging facilities in Azure integrates really well with the existing Debug and Trace Logging API in .NET.  This is a really nice feature and is done very well in Azure.  In-fact to set-up and configure it is all of about 5 lines of code.  Actually it’s four lines of code with one line that wraps:

   1: public override bool OnStart() 
   2: { 
   3:     DiagnosticMonitorConfiguration dmc =          
   4:             DiagnosticMonitor.GetDefaultInitialConfiguration(); 
   5:     dmc.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1); 
   6:     dmc.Logs.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
   7:     DiagnosticMonitor.Start("DiagnosticsConnectionString", dmc); 
   8: } 

One specific item to note is the ScheduledTransferPeriod property of the Logs property.  The minimum value you can set for that property is the equivalent of 1 minute.  The only downside to this method of logging is that if your Azure role crashes within that minute, whatever data you have written to your in-built logging will be lost.  This also means that if you are writing exceptions to your logging, that will be lost as well.  That can cause problems if you’d like to know both when your role crashed and why (the exception details).

Before we talk about a way to get around it, let’s review why the Role might crash.  In the Windows Azure world, there are two primary roles that you will use, a Worker and Web role.  Main characteristics and reasons it would crash are below:

Role Type Analogous Why would it crash/restart?
Worker Role Console Application
  • Any unhandled exception.
Web Role ASP.NET Application hosted in IIS 7.0+
  • Any unhandled exception thrown on a background thread.
  • StackOverflowException
  • Unhandled exception on finalizer thread.

As you can see, the majority of the reasons why an Azure role would recycle/crash/restart are essentially the same as with any other application – essentially an unhandled exception.  Therefore, to mitigate this issue, we can subscribe to the AppDomain’s UnhandledException Event.  This event is fired when your application experiences an exception that is not caught and will fire RIGHT BEFORE the application crashes.  You can subscribe to this event in the Role OnStart() method:

   1: public override bool OnStart()
   2: {
   3:     AppDomain appDomain = AppDomain.CurrentDomain;
   4:     appDomain.UnhandledException += 
   5:         new UnhandledExceptionEventHandler(appDomain_UnhandledException);
   6:     ...
   7: }

You will now be notified right before your process crashes.  The last piece to this puzzle is logging the exception details.  Since you must log the details right when it happens, you can’t just use the normal Trace or Debug statements.  Instead, we will write to the Azure storage directly.  Steve Marx has a good blog entry about printf in the cloud.  While it works, it requires the connection string to be placed right into the logging call.  He mentions that you don’t want that in a production application.  in our case, we will do things a little bit differently.  First, we must add the requisite variables and initialize the storage objects:

   1: private static bool storageInitialized = false;
   2: private static object gate = new Object();
   3: private static CloudBlobClient blobStorage;
   4: private static CloudQueueClient queueStorage;
   5:  
   6: private void InitializeStorage()
   7: {
   8:    if (storageInitialized)
   9:    {
  10:        return;
  11:    }
  12:  
  13:    lock (gate)
  14:    {
  15:       if (storageInitialized)
  16:       {
  17:          return;
  18:       }
  19:  
  20:  
  21:       // read account configuration settings
  22:       var storageAccount = 
  23:         CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
  24:  
  25:       // create blob container for images
  26:       blobStorage = 
  27:           storageAccount.CreateCloudBlobClient();
  28:       CloudBlobContainer container = blobStorage.
  29:           GetContainerReference("webroleerrors");
  30:  
  31:       container.CreateIfNotExist();
  32:  
  33:       // configure container for public access
  34:       var permissions = container.GetPermissions();
  35:       permissions.PublicAccess = 
  36:            BlobContainerPublicAccessType.Container;
  37:       container.SetPermissions(permissions);
  38:  
  39:       storageInitialized = true;
  40:   }
  41: }

This will instantiate the requisite logging variables and then when the InitializeStorage() method is executed, we will set these variables to the appropriate initialized values.  Lastly, we must call this new method and then write to the storage.  We put this code in our UnhandledException event handler:

   1: void appDomain_UnhandledException(object sender, UnhandledExceptionEventArgs e)
   2: {
   3:     // Initialize the storage variables.
   4:     InitializeStorage();
   5:  
   6:     // Get Reference to error container.
   7:     var container = blobStorage.
   8:                GetContainerReference("webroleerrors");
   9:  
  10:  
  11:     if (container != null)
  12:     {
  13:         // Retrieve last exception.
  14:         Exception ex = e.ExceptionObject as Exception;
  15:  
  16:         if (ex != null)
  17:         {
  18:             // Will create a new entry in the container
  19:             // and upload the text representing the 
  20:             // exception.
  21:             container.GetBlobReference(
  22:                String.Format(
  23:                   "<insert unique name for your application>-{0}-{1}",
  24:                   RoleEnvironment.CurrentRoleInstance.Id,
  25:                   DateTime.UtcNow.Ticks)
  26:                ).UploadText(ex.ToString());
  27:         }
  28:     }
  29:     
  30: }

Now, when your Azure role is about to crash, you’ll find an entry in your blog storage with the details of the exception that was thrown.  For example, one of the leading reasons why an Azure Worker Role crashes is because it can’t find a dependency it needs.  So, in that case, you’ll find an entry in your storage with the following details:

   1: System.IO.FileNotFoundException: Could not load file or assembly 
   2:     'GuestBook_Data, Version=1.0.0.0, Culture=neutral, 
   3:     PublicKeyToken=f8a5fcb6c395f621' or one of its dependencies. 
   4:     The system cannot find the file specified.
   5:  
   6: File name: 'GuestBook_Data, Version=1.0.0.0, Culture=neutral, 
   7:     PublicKeyToken=f8a5fcb6c395f621'

You can find some other common reasons why an Azure role might crash (especially when you first deploy it) can be found at Anton Staykov’s excellent blog:

Hope this helps.

Enjoy!