Samurai Programmer.com - Performance

PInvoke Error in .NET 4: Array size control parameter index is out of range

Greg Varveris — Sat, 19 Nov 2011 13:45:14 GMT

So in a code-base I was working in yesterday, we use PInvoke to call out to the Performance Data Helper (PDH) APIs to collect performance information for machines without using Perfmon. One of those PInvoke calls looked like this:

/*
PDH_STATUS PdhExpandCounterPath(
    LPCTSTR szWildCardPath,
    LPTSTR mszExpandedPathList,
    LPDWORD pcchPathListLength
);
*/

[DllImport("pdh.dll", CharSet = CharSet.Unicode)]
private static extern PdhStatus PdhExpandCounterPath(
    string szWildCardPath,
    [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 3)] char[] mszExpandedPathList,
    ref uint pcchPathListLength
);

In .NET 3.5 and below, this PInvoke call works perfectly fine. In .NET 4.0, though, I saw this exception:

System.Runtime.InteropServices.MarshalDirectiveException:
Cannot marshal 'parameter #2': Array size control parameter index is out of range.
   at System.Runtime.InteropServices.Marshal.InternalPrelink(IRuntimeMethodInfo m)
   at System.Runtime.InteropServices.Marshal.Prelink(MethodInfo m)

So, can you identify what’s wrong in the code above?

Well, the Array size control parameter index indicates the zero-based parameter that contains the count of the array elements, similar to size_is in COM. Because the marshaler cannot determine the size of an unmanaged array, you have to pass it in as a separate parameter. So in the call above, parameter #2, we specify “SizeParamIndex = 3” to reference the pcchPathListLength parameter to set the length of the array. So what’s the catch?

Well, since the SizeParamIndex is a zero-based index, the three parameters have indexes 0, 1 and 2, so index 3 points at a fourth parameter that doesn’t exist. So, to fix this, we just change the “SizeParamIndex=3” to “SizeParamIndex=2” to reference the pcchPathListLength:

/*
PDH_STATUS PdhExpandCounterPath(
    LPCTSTR szWildCardPath,
    LPTSTR mszExpandedPathList,
    LPDWORD pcchPathListLength
);
*/

[DllImport("pdh.dll", CharSet = CharSet.Unicode)]
private static extern PdhStatus PdhExpandCounterPath(
    string szWildCardPath,
    [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)] char[] mszExpandedPathList,
    ref uint pcchPathListLength
);

It looks like in .NET 3.5 and below, though, we allowed you to reference either a 1-based or a zero-based index, but in .NET 4.0 we buttoned that up a bit and force you to use the zero-based index. Big thanks to my co-worker and frequent collaborator, Zach Kramer, for his assistance in looking at this issue.
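Since the marshaler only complains when the method is prelinked or first called, a mistake like this can sit unnoticed for a while. A minimal sketch of a reflection check that could run in a unit test follows. The SizeParamIndexCheck class and its method names are hypothetical, and the return type is simplified to uint because the PdhStatus type isn’t shown here; the declaration is only inspected, never called.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical helper (not from the original code-base): checks that every
// SizeParamIndex on a P/Invoke declaration refers to a parameter that exists.
public static class SizeParamIndexCheck
{
    // Same shape as the corrected declaration. The return type is simplified
    // to uint; this method is inspected through reflection and never invoked.
    [DllImport("pdh.dll", CharSet = CharSet.Unicode)]
    private static extern uint PdhExpandCounterPath(
        string szWildCardPath,
        [MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 2)] char[] mszExpandedPathList,
        ref uint pcchPathListLength);

    public static bool HasValidSizeParamIndexes(MethodInfo method)
    {
        ParameterInfo[] parameters = method.GetParameters();
        foreach (ParameterInfo parameter in parameters)
        {
            MarshalAsAttribute marshalAs = parameter.GetCustomAttribute<MarshalAsAttribute>();
            if (marshalAs == null || marshalAs.Value != UnmanagedType.LPArray)
                continue;

            // Zero-based: valid indexes run from 0 to parameters.Length - 1.
            // Limitation: an unspecified SizeParamIndex is not distinguished
            // from an explicit one in this sketch.
            if (marshalAs.SizeParamIndex < 0 || marshalAs.SizeParamIndex >= parameters.Length)
                return false;
        }
        return true;
    }

    public static bool CheckPdhDeclaration()
    {
        MethodInfo method = typeof(SizeParamIndexCheck).GetMethod(
            "PdhExpandCounterPath", BindingFlags.NonPublic | BindingFlags.Static);
        return HasValidSizeParamIndexes(method);
    }
}
```

Calling SizeParamIndexCheck.CheckPdhDeclaration() should return true for the declaration above; with SizeParamIndex = 3 on a three-parameter method the range check would fail.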

Until Next Time!

The (potentially) dark side of parallelism…

Greg Varveris — Sun, 05 Dec 2010 18:52:17 GMT

Read a great blog entry by Scott Hanselman recently talking about the parallel dilemma that I’m sure we’ll see folks face in the future with the (old/new) Parallel classes. I wanted to add a few things to this discussion as he focused on the mechanics of the parallel requests but maybe not the potential effects it could have on the macro view of your application. This was originally written as an e-mail I sent to my team but thought others might find it interesting.

There will be an inclination by people to use the new Parallel functionality in .NET 4.0 to easily spawn operations onto numerous background threads. That will generally be okay for console/winform/wpf apps – but could also be potentially bad for ASP.NET apps as the spawned threads could take away from the processing power and threads available to process new webpage requests. I’ll explain more on that later.

For example, by default, when you do something like Parallel.ForEach(…) or some such, the parallel library starts firing Tasks to the thread pool so that it can best utilize the processing power available on your machine (oversimplification but you get the idea). The downside is that the thread pool contains a finite number of worker threads available to a process. Granted, you have about 100 threads per logical processor in .NET 4 – but it’s worth noting.
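The actual limits depend on the host and the runtime version, so it may be worth reading them from the process you care about rather than relying on a rule of thumb. A minimal sketch, using a hypothetical ThreadPoolLimits helper class around the ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads calls:

```csharp
using System;
using System.Threading;

// Hypothetical helper: reports the thread pool limits of the current process.
public static class ThreadPoolLimits
{
    // Upper bound on worker threads the pool will create in this process.
    public static int MaxWorkerThreads()
    {
        int maxWorker, maxIo;
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        return maxWorker;
    }

    // One-line summary: the maximum, how many are free right now, and the CPU count.
    public static string Describe()
    {
        int maxWorker, maxIo, freeWorker, freeIo;
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);
        return string.Format(
            "worker threads: {0} max, {1} available; logical processors: {2}",
            maxWorker, freeWorker, Environment.ProcessorCount);
    }
}
```

Logging ThreadPoolLimits.Describe() from inside a request handler and from a console app on the same machine is a quick way to see how much the numbers differ between hosts.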

While Scott’s entry talks about the new way to implement the Async pattern, I’ve already seen a bunch of folks use the “Parallel” class because it abstracts away some of the plumbing of the Async operations and that ease of use could become problematic.

For example, consider this code:

string[] myStrings = { "hello", "world", "you", "crazy", "pfes", "out", "there" };

Parallel.ForEach(myStrings, myString =>
{
    System.Console.WriteLine(DateTime.Now + ":" + myString +
        " - From Thread #" +
        Thread.CurrentThread.ManagedThreadId);
    Thread.Sleep(new Random().Next(1000, 5000));
});

This is a very simple implementation of Parallel’izing a foreach that just writes some string output with an artificial delay. Output would be something like:

11/16/2010 2:40:05 AM:hello - From Thread #10
11/16/2010 2:40:05 AM:crazy - From Thread #11
11/16/2010 2:40:05 AM:there - From Thread #12
11/16/2010 2:40:06 AM:world - From Thread #13
11/16/2010 2:40:06 AM:pfes - From Thread #14
11/16/2010 2:40:06 AM:you - From Thread #12
11/16/2010 2:40:07 AM:out - From Thread #11

Note the multiple thread ids and extrapolate that out to a server that has more than just my paltry 2 CPUs. This can be potentially problematic for ASP.NET applications as you have a finite number of worker threads available in your worker process and they must be shared across not just one user but hundreds (or even thousands). So, we might see that spawning an operation across tons of threads can potentially reduce the scalability of your site.

Fortunately, there is a ParallelOptions class where you can set the degree of parallel’ism. Updated code as follows:

string[] myStrings = { "hello", "world", "you", "crazy", "pfes", "out", "there" };

ParallelOptions options = new ParallelOptions();
options.MaxDegreeOfParallelism = 1;

Parallel.ForEach(myStrings, options, myString =>
{
    // Nothing changes here
    ...
});

This would then output something like:

11/16/2010 2:40:11 AM:hello - From Thread #10
11/16/2010 2:40:12 AM:world - From Thread #10
11/16/2010 2:40:16 AM:you - From Thread #10
11/16/2010 2:40:20 AM:crazy - From Thread #10
11/16/2010 2:40:23 AM:pfes - From Thread #10
11/16/2010 2:40:26 AM:out - From Thread #10
11/16/2010 2:40:29 AM:there - From Thread #10

Since I set the MaxDegreeOfParallelism to “1”, we see that it just uses the same thread over and over. Within reason, that setting *should* correspond to the number of threads it will use to handle the request.
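Thread ids are only a rough proxy for that setting. A more direct check is to count how many loop bodies are running at the same moment. The sketch below does that with a hypothetical ParallelismProbe helper (the names and the 20 ms sleep are illustrative assumptions, not part of the test described in this post):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: measures how many loop bodies ran at the same time.
public static class ParallelismProbe
{
    public static int MeasurePeakConcurrency(string[] items, int maxDegreeOfParallelism)
    {
        ParallelOptions options = new ParallelOptions();
        options.MaxDegreeOfParallelism = maxDegreeOfParallelism;

        int running = 0; // loop bodies currently executing
        int peak = 0;    // highest value of "running" seen so far

        Parallel.ForEach(items, options, item =>
        {
            int now = Interlocked.Increment(ref running);

            // Raise the high-water mark if this body pushed it higher.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
                // Another thread updated the peak first; re-read and retry.
            }

            Thread.Sleep(20); // stand-in for real work
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

With MaxDegreeOfParallelism set to 1 the peak is 1; with a higher value the peak never exceeds the limit, although it can be lower if the scheduler decides not to use every slot.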

Applying to a website

So, let’s apply the code from the above to a simple website and compare the difference between the full parallel implementation and the non-parallel implementation. The test I used ran for 10 minutes with a consistent load of 20 users on a dual-core machine running IIS 7.

In all of the images below, the blue line (or baseline) represents the single-threaded implementation and the purple line (or compared) represents the parallel implementation.

We’ll start with the request execution time. As we’d expect, the time to complete the request decreases significantly with the parallel implementation.

But what is the cost from a thread perspective? For that, we’ll look at the number of physical threads:

As we’d also expect, there is a significant increase in the number of threads used in the process. We go from ~20 threads in the process to a peak of almost 200 threads throughout the test. Seeing as this was run on a dual-core machine, we’ll have a maximum of 200 worker threads available in the thread pool. After those threads become depleted, you often see requests start getting queued, waiting for a thread to become available. So, what happened in our simple test? We’ll look at the requests queued value for that:

We did, in fact, start to see a small number of requests become queued throughout our test. This indicates that some requests started to pile up waiting for a thread to become available.

Please note that I’m NOT saying that you should not use Parallel operations in your website. You saw in the first image that the actual request execution time decreased significantly from the non-parallel implementation to the parallel implementation. But it’s important to note that nothing is free and while parallelizing your work can and will improve the performance of a single request, it should also be weighed against the potential performance of your site overall.

Until next time.

Yield Return…a little known but incredible language feature

Greg Varveris — Sun, 17 Oct 2010 22:54:49 GMT

“We shall neither fail nor falter; we shall not weaken or tire…give us the tools and we will finish the job.” – Winston Churchill

I don’t often blog about specific language features but over the past few weeks I’ve spoken to a few folks that did not know of the “yield” keyword and the “yield return” and “yield break” statements, so I thought it might be a good opportunity to shed some light on this little known but extremely useful C# feature. Chances are, you’ve probably indirectly used this feature before and just never known it.

We’ll start with the problem I’ve seen that plagues many applications. Often times, you’ll call a method that returns a List<T> or some other concrete collection. The method probably looks something like this:

public static List<string> GenerateMyList()
{
    List<string> myList = new List<string>();

    for (int i = 0; i < 100; i++)
    {
        myList.Add(i.ToString());
    }

    return myList;
}

I’m sure your logic is going to be significantly more complex than what I have above but you get the idea. There are a few problems and inefficiencies with this method. Can you spot them?

The entire List<T> must be stored in memory.
Its caller must wait for the List<T> to be returned before it can process anything.
The method itself returns a List<T> back to its caller.

As an aside – with public methods, you should strive to not return a List<T> in your methods. Full details can be found here. The main idea here is that if you choose to change the method signature and return a different collection type in the future, this would be considered a breaking change to your callers.

In any case, I’ll focus on the first two items in the list above. If the List<T> that is returned from the GenerateMyList() method is large then that will be a lot of data that must be kept around in memory. In addition, if it takes a long time to generate the list, your caller is stuck until you’ve completely finished your processing.

Instead, you can use that nifty “yield” keyword. This allows the GenerateMyList() method to return items to its caller as they are being processed. This means that you no longer need to keep the entire list in memory and can just return one item at a time until you get to the end of your returned items. To illustrate my point, I’ll refactor the above method into the following:

private static IEnumerable<string> GenerateMyList()
{
    for (int i = 0; i < 100; i++)
    {
        string value = i.ToString();
        Console.WriteLine("Returning {0} to caller.", value);
        yield return value;
    }

    Console.WriteLine("Method done!");

    yield break;
}

A few things to note in this method. The return type has been changed to an IEnumerable<string>. This is one of those nifty interfaces that exposes an enumerator. This allows its caller to cycle through the results in a foreach or while loop. In addition, the “yield return value” statement will return that item to its caller at that point and not when the entire method has completed its processing. This allows for a very exciting caller-callee type relationship. For example, if I call this method like so:

IEnumerable<string> myList = GenerateMyList();

foreach (string listItem in myList)
{
    Console.WriteLine("Item: " + listItem);
}

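The output would be (abbreviated):

Returning 0 to caller.
Item: 0
Returning 1 to caller.
Item: 1
...
Returning 99 to caller.
Item: 99
Method done!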

Thus showing that each item gets returned at the time we processed it. So, how does this work? Well, at compile time, the compiler generates a class to implement the behavior in the iterator. Essentially, this means that the GenerateMyList() method body gets placed into the MoveNext() method. In fact, if you open up the compiled assembly in Reflector, you see that plumbing in place (comments are mine and some code was omitted for clarity’s sake):

private bool MoveNext()
{
    this.<>1__state = -1;
    this.<i>5__1 = 0;
    // My for loop has changed to a while loop.
    while (this.<i>5__1 < 100)
    {
        // Sets a local value
        this.<value>5__2 = this.<i>5__1.ToString();
        // Here is my Console.WriteLine(...)
        Console.WriteLine("Returning {0} to caller.", this.<value>5__2);
        // Here is where the current member variable
        // gets stored.
        this.<>2__current = this.<value>5__2;
        this.<>1__state = 1;
        // We return "true" to the caller so it knows
        // there is another record to be processed.
        return true;
        ...
    }
    // Here is my Console.WriteLine() at the bottom
    // when we've finished processing the loop.
    Console.WriteLine("Method done!");
    break;
}
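The decompiled listing is abridged, so as a rough hand-written equivalent, here is a sketch of an enumerator that tracks its position in a state field the way the generated class does. CountingEnumerator is a hypothetical name and the structure is simplified; it is not what the compiler actually emits.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the compiler-generated iterator class.
public sealed class CountingEnumerator : IEnumerator<string>
{
    private readonly int _limit;
    private int _i;
    private int _state; // 0 = not started, 1 = suspended inside the loop, -1 = finished
    private string _current;

    public CountingEnumerator(int limit)
    {
        _limit = limit;
    }

    public string Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0: _i = 0; break;  // first call: run the loop initializer
            case 1: _i++; break;    // resume after the previous "yield return"
            default: return false;  // already finished
        }

        if (_i < _limit)
        {
            _current = _i.ToString(); // the value handed to the caller
            _state = 1;
            return true;              // tell the caller there is an item
        }

        _state = -1;
        return false;                 // loop is done, nothing more to return
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Each MoveNext() call does just enough work to produce the next item and then returns, which is why the caller sees items one at a time instead of waiting for the whole list.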

Pretty straightforward. Of course, the real power is that the compiler converts the “yield return <blah>” into a nice clean enumerator with a MoveNext(). In fact, if you’ve used LINQ, you’ve probably used this feature without even knowing it. Consider the following code:

private static void OutputLinqToXmlQuery()
{
    XDocument doc = XDocument.Parse
        (@"<root><data>hello world</data><data>goodbye world</data></root>");

    var results = from data in doc.Descendants("data")
                  select data;

    foreach (var result in results)
    {
        Console.WriteLine(result);
    }
}

The “results” object will, by default, be of type “WhereSelectEnumerableIterator”, which exposes a MoveNext() method. In fact, that is also why the results object doesn’t allow you to do something like this:

var results = from data in doc.Descendants("data")
              select data;

var bad = results[1];

IEnumerable<T> does not expose an indexer that lets you go straight to a particular element in the collection, because the full collection hasn’t been generated yet. Instead, you would do something like this:

var results = from data in doc.Descendants("data")
              select data;

var good = results.ElementAt(1);

And then under the covers, the ElementAt(int) method will just keep calling MoveNext() until it reaches the index you specified. Something like this:

Note: this is my own code and is NOT from the .NET Framework – it is merely meant to illustrate a point.

public static XElement MyElementAt(this IEnumerable<XElement> elements,
    int index)
{
    int counter = 0;
    using (IEnumerator<XElement> enumerator =
        elements.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            if (counter == index)
                return enumerator.Current;
            counter++;
        }
    }

    return null;
}

Hope this helps to demystify some things and put another tool in your toolbox.

Until next time.

Don’t guess when it comes to performance…a RegEx story.

Greg Varveris — Sun, 19 Sep 2010 18:30:02 GMT

Many times, when I work with a customer, it’s because they’ve tried to accomplish something and need a little extra help. Often, this falls into the application optimization area. For example, a few years ago, I had a customer that was developing a rather sophisticated SharePoint workflow that had some custom code that would process and merge two Excel spreadsheets together. They were using Excel 2007 so their merging was being done using the excellent Open XML SDK. To their credit, the application did what it needed to do – but it took about an hour to process these spreadsheets. The developers on the project knew about the performance problems but as so often happens, they thought they knew where the bottlenecks were and how they should approach optimizing it. So, they started injecting some tracing into their code and worked hard to optimize this lengthy process. After a while, though, they had only shaved a few seconds off of that 60 minute time and while they did show some improvement – they knew they needed to get the processing done even faster for this to become a viable solution for their organization. So, they sent me a simple repro of the code and together in just a few short days, we were able to get the processing from 60 minutes to under a minute. That’s a BIG win. Big like the Titanic big.

I love these types of engagements because, well, I like to make things faster. The biggest problem that I see, though, is that some people shy away from using the tools in optimization scenarios because they’ve been so invested in their code that they think they know why it’s not performing. This is the “psychic” effect and the mindset is usually something like this:

“I wrote this darn code and while I was writing it – I knew this method could be improved so now I’m going to finally optimize the darn thing.”

It sounds good in-theory, right? You wrote the code, so you should know how to improve it and really, what’s a tool going to tell you that you don’t already know? In truth, it can tell you quite a bit. In other situations, the developers will add some instrumentation (via tracing/debugging statements) to what they perceive as the critical code paths and the resultant timing points to one area when the problem really resides in a completely different section of code. That’s right, folks, your tracing statements may be lying to you. So what’s a developer to do? Well, use the tools, of course – the right tool for the right job, as they say. Let me expand upon that in the context of some code that yours truly wrote a little while ago.

The Problem

A few short weeks ago, I posted an entry about parsing your ASP.NET event log error messages. For those that didn’t read it, we just convert the EventLog messages to XML, parse the messages using RegEx.Match and then generate some statistics on them. The code I provided on that blog entry appeared to be very fast when it was processing a few records – but as the XML files grew, I started noticing that it seemed to take longer and longer to process the results. So, I did what any developer does initially: I added some instrumentation to the code to see if I could figure out the problem. This usually takes the form of something like the following:

Console.WriteLine("Start: Load XDocument." + DateTime.Now.ToString());
XDocument document = XDocument.Load(@"C:\Work\TestData\AllEVTX\MyData.xml");
Console.WriteLine("End: Load XDocument." + DateTime.Now.ToString());

Console.WriteLine("Start: Load Messages into object." + DateTime.Now.ToString());
var messages = from message in document.Descendants("Message")
               select EventLogMessage.Load(message.Value);
Console.WriteLine("End: Load Messages into object." + DateTime.Now.ToString());

Console.WriteLine("Start: Query Objects." + DateTime.Now.ToString());
var results = from log in messages
              group log by log.Exceptiontype into l
              orderby l.Count() descending, l.Key
              select new
              {
                  ExceptionType = l.Key,
                  ExceptionCount = l.Count()
              };
Console.WriteLine("End: Query Objects." + DateTime.Now.ToString());

Console.WriteLine("Start: Output Exception Type Details." + DateTime.Now.ToString());
foreach (var result in results)
{
    Console.WriteLine("{0} : {1} time(s)",
        result.ExceptionType,
        result.ExceptionCount);
}
Console.WriteLine("End: Output Exception Type Details." + DateTime.Now.ToString());

As you can see, all I’ve done is taken the code and slapped some Console.WriteLine statements with a DateTime.Now call to find out when the operation starts and when it completes. If I run this code I get the following timings:

Start: Load XDocument.9/19/2010 11:28:08 AM
End: Load XDocument.9/19/2010 11:28:09 AM
Start: Load Messages into object.9/19/2010 11:28:09 AM
End: Load Messages into object.9/19/2010 11:28:09 AM
Start: Query Objects.9/19/2010 11:28:09 AM
End: Query Objects.9/19/2010 11:28:09 AM
Start: Output Exception Type Details.9/19/2010 11:28:09 AM
End: Output Exception Type Details.9/19/2010 11:28:49 AM

This is clearly a problem: the final output step takes 40 seconds while everything before it finishes in about a second. Now, the question is why?

The Research

This might lead you to believe that the simple loop I have to get the ExceptionType and the ExceptionCount was the root of all evil, so to speak. The problem, though, is that there’s not really a lot you can do to improve this:

foreach (var result in results)
{
    Console.WriteLine("{0} : {1} time(s)",
        result.ExceptionType,
        result.ExceptionCount);
}

Oh sure, you could use the excellent Parallel functionality in .NET 4 to send off the work to alternate threads. So, you might do something like this:

Parallel.ForEach(results, result =>
{
    Console.WriteLine("{0} : {1} time(s)",
        result.ExceptionType,
        result.ExceptionCount);
});

But if you re-run the test, you get the following results:

Start: Load XDocument.9/19/2010 11:46:20 AM
End: Load XDocument.9/19/2010 11:46:20 AM
Start: Load Messages into object.9/19/2010 11:46:20 AM
End: Load Messages into object.9/19/2010 11:46:20 AM
Start: Query Objects.9/19/2010 11:46:20 AM
End: Query Objects.9/19/2010 11:46:20 AM
Start: Output Exception Type Details.9/19/2010 11:46:20 AM
End: Output Exception Type Details.9/19/2010 11:46:54 AM

Wait a second…that’s still 34 seconds, barely better than the 40 we started with. That’s just not right. At this point, you should take a step back and tell yourself “STOP GUESSING AND JUST USE THE PROFILER.” So, always being one to listen to myself, I kick off the Visual Studio profiler and immediately it shows us the “hot path”. This was actually a nice improvement in the VS2010 profiler. Right on the summary page, it will show us our most expensive call paths:

But wait, it’s pointing to the RegEx.Match(…). Why is it pointing to that? From my own metrics above, I see that the loading of the Message strings into an object takes less than a second to execute. Well, the real reason is that LINQ uses deferred execution, a kind of lazy loading. Basically, defining a query doesn’t run it; the work happens when you enumerate the results. Yes, that’s an over-simplification but in our case, it means that my EventLogMessage.Load(…) method won’t be called until I actually try to do something with the data, which is inside the output loop. So, now, armed with this information, I can take a look at my Load() method and see what it’s actually doing and how it’s using the RegEx.Match(…) functionality:

Match myMatch = s_regex.Match(rawMessageText);

EventLogMessage message = new EventLogMessage();

message.Eventcode = myMatch.Groups["Eventcode"].Value;
message.Eventmessage = myMatch.Groups["Eventmessage"].Value;
message.Eventtime = myMatch.Groups["Eventtime"].Value;
message.EventtimeUTC = myMatch.Groups["EventtimeUTC"].Value;
...

return message;

So, basically we’re using the Match(…) to take that Message property and parse it out into the properties of the EventLogMessage object. The fact that this is the slowest part of the code shouldn’t necessarily shock you. Jeff Atwood wrote up a good blog entry a few years ago on something like this. If we look just at the RegEx.Match(…) in the profiler, we see that the problem isn’t necessarily with each Match call but the overall cost with 10,000+ calls:

Function Name	Number of Calls	Min Elapsed Inclusive Time	Avg Elapsed Inclusive Time	Max Elapsed Inclusive Time
System.Text.RegularExpressions.Regex.Match(string)	10,881	1.01	3.27	684.29
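The reason the timing statements blamed the output loop can be reproduced in isolation. The sketch below counts how often a Select projection runs before and after the query is enumerated. DeferredExecutionDemo is a hypothetical class, and the doubling lambda simply stands in for an expensive call such as EventLogMessage.Load(…).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo: shows that a LINQ projection runs on enumeration,
// not when the query is defined.
public static class DeferredExecutionDemo
{
    // Returns { calls after defining the query, calls after ToList() }.
    public static int[] CountSelectorCalls()
    {
        int calls = 0;

        // Defining the query does not invoke the lambda.
        IEnumerable<int> query = Enumerable.Range(0, 5).Select(n =>
        {
            calls++;      // stand-in for an expensive parse
            return n * 2;
        });
        int afterDefinition = calls;

        // Enumerating (here via ToList) is what actually runs it.
        List<int> materialized = query.ToList();
        int afterToList = calls;

        return new[] { afterDefinition, afterToList };
    }
}
```

CountSelectorCalls() returns { 0, 5 }. One practical consequence for hand-rolled timing: calling ToList() on a query inside the timed section charges the cost to the step that defines the work rather than to whichever loop happens to enumerate it first.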

The Fix

So, now that we know the problem and the reason for it – what’s a Dev to do? Well, at this point, we should be thinking of an alternate way of performing this parsing without using RegEx. The simplest method is just to use some general string parsing, so we’ll start off with replacing our large RegEx pattern with a simple string array:

public static readonly string[] ParserString = new string[] {
    @"Event code:",
    @"Event message:",
    @"Event time:",
    @"Event time (UTC):",
    @"Event ID:",
    ...
    @"Stack trace:",
    @"Custom event details:"};

This array will be used in our fancy new GetValue method:

private static string GetValue(string rawMessageText, int Key)
{
    int startLoc = rawMessageText.IndexOf(ParserString[Key]);
    int endLoc;
    if (Key + 1 == ParserString.Length)
        endLoc = rawMessageText.Length;
    else
        endLoc = rawMessageText.IndexOf(ParserString[Key + 1], startLoc);

    return rawMessageText.Substring(startLoc + ParserString[Key].Length,
        endLoc - startLoc - ParserString[Key].Length);
}

This method accepts our raw message string and a key index. This method just finds the existence of a string like “Event message:” in our raw message and then finds the index of the next string, like “Event time:” and subtracts the two to get at the value of the field. For example, given the following string:

“… Event message: An unhandled exception has occurred. Event time: …”

Here “Event message:” and “Event time:” are the keys, and “An unhandled exception has occurred.” is the string between them. The idea for the above GetValue(…) method was actually provided by a fellow PFE engineer, Richard Lang, during a late night chat session.
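GetValue(…) assumes every key is present in the message; if IndexOf returns -1, the Substring call will throw. A defensive variant of the same idea might look like the sketch below. MarkerParser and ValueBetween are hypothetical names, and returning null for a missing key, trimming the result, and using ordinal comparison are assumptions of this sketch rather than behavior of the original method.

```csharp
using System;

// Hypothetical, defensive variant of the key-to-key parsing idea.
public static class MarkerParser
{
    // Returns the text between startMarker and endMarker, or null when
    // startMarker is missing. A missing (or null) endMarker means
    // "read to the end of the text".
    public static string ValueBetween(string text, string startMarker, string endMarker)
    {
        int start = text.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0)
            return null;
        start += startMarker.Length;

        int end = endMarker == null
            ? -1
            : text.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0)
            end = text.Length;

        return text.Substring(start, end - start).Trim();
    }
}
```

For the sample string above, ValueBetween(text, "Event message:", "Event time:") yields “An unhandled exception has occurred.”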

The last step to this process is just to call the GetValue(…) method from our new Load method:

public static EventLogMessage Load(string rawMessageText)
{
    EventLogMessage message = new EventLogMessage();

    int Key = 0;
    message.Eventcode = GetValue(rawMessageText, Key++);
    message.Eventmessage = GetValue(rawMessageText, Key++);
    message.Eventtime = GetValue(rawMessageText, Key++);
    ...
    return message;
}

So, now we’ve essentially removed the need for RegEx by implementing our own string parsing algorithm. Once we compile the code and run our application again, we see some major improvements:

Start: Load XDocument.9/19/2010 12:55:31 PM
End: Load XDocument.9/19/2010 12:55:31 PM
Start: Load Messages into object.9/19/2010 12:55:31 PM
End: Load Messages into object.9/19/2010 12:55:31 PM
Start: Query Objects.9/19/2010 12:55:31 PM
End: Query Objects.9/19/2010 12:55:31 PM
Start: Output Exception Type Details.9/19/2010 12:55:31 PM
End: Output Exception Type Details.9/19/2010 12:55:33 PM

We essentially improved the processing of our records from 40 seconds to about 2 seconds. I’d say that’s a pretty big improvement. Even better, we can use our Parallel.ForEach(…) code from above to make this even faster since we’re no longer bound by the RegEx parser:

Start: Load XDocument.9/19/2010 1:00:13 PM
End: Load XDocument.9/19/2010 1:00:13 PM
Start: Load Messages into object.9/19/2010 1:00:13 PM
End: Load Messages into object.9/19/2010 1:00:13 PM
Start: Query Objects.9/19/2010 1:00:13 PM
End: Query Objects.9/19/2010 1:00:13 PM
Start: Output Exception Type Details.9/19/2010 1:00:13 PM
End: Output Exception Type Details.9/19/2010 1:00:14 PM

So now it takes just about a second to process these records. Considering it’s processing over 10,000 event log messages, I’d say this is acceptable performance, for now.

Closing Comments

I just want to say a few things real quick. RegEx is not inherently evil. It is still one of the fastest and easiest methods to consume and parse data. You should not feel the need to go back to your own applications that are working just fine and refactor all of your code to strip out the RegEx expressions. It just so happens that sometimes too much of a good thing can be bad for your health. In our case, with over 10,000 RegEx.Match(…) calls in rapid succession, the RegEx appeared to be our bottleneck. This may or may not be the root cause of your own performance problems. The key takeaway from this blog entry should be that you should NOT guess when it comes to optimizing code paths. Instead, you should use the tools available at your disposal to find the bottleneck.

Until next time.

Micro optimization or just good coding practice?

Greg Varveris — Sat, 21 Aug 2010 04:36:27 GMT

This is a common topic and I thought I’d write up some thoughts I have on it. In-fact, I was just working with a customer on improving their code reviews and what they should be checking for and the question arose - “Should performance be targeted during a code review?” It’s an interesting question. I’m a big fan of performance testing early and often and not waiting until the end of a dev cycle but code reviews, IMO, should focus on logic, maintainability and best practices. I may be in the minority and if you look around the web, you’ll see varying opinions on the topic. For example, one of the PAG articles states:

“Code reviews should be a regular part of your development process. Performance and scalability code reviews focus on identifying coding techniques and design choices that could lead to performance and scalability issues. The review goal is to identify potential performance and scalability issues before the code is deployed. The cost and effort of fixing performance and scalability flaws at development time is far less than fixing them later in the product deployment cycle.

Avoid performance code reviews too early in the coding phase because this can restrict your design options. Also, bear in mind that performance decisions often involve tradeoffs. For example, it is easy to reduce maintainability and flexibility while striving to optimize code.”

As I mentioned above, I am a huge proponent of performance analysis and optimization many times throughout a typical product development cycle. I can say with a fair amount of certainty that if you don’t build performance reviews into your project plan at regular intervals, you will hit some problem (or multiple problems) in production and have to refactor some code.

Circling back to the original question, though, are code reviews the place for performance analysis? Typically, I’d recommend using them to squash little bits of bad code but maintainability and code-cleanliness should be first and foremost in your minds. That said, if you see a pattern that you know can be improved, by all means bring it up. What’s an example of that type of situation?

Let’s take a look at predicates, specifically their usage in the Find method of a List<T>. If you’re not aware, the Find() method performs a linear search through all of the items until it finds the first match – then it returns. This makes it an O(n) operation where “n” is the number of items in the list. Basically, this means that the more items you have in the list, the longer a Find() operation can potentially take. So, if we slam about 10,000 elements into a list:

private static List<Data> LoadList()
{
    List<Data> myList = new List<Data>();
    for (int i = 0; i < 10000; i++)
    {
        myList.Add(new Data() { Id = "Id" + i.ToString(),
            Value = "Value" + i.ToString() });
    }

    return myList;
}

Then, if someone wants to return the instance of the Data class that contains an Id of say “Id9999” (the last item in the list, so the worst case for a linear search), they might write the following code:

static Data Find1(List<Data> myList, string idToFind)
{
    Data data = myList.Find(s =>
        s.Id.ToLower() ==
        idToFind.ToLower());

    return data;
}

Now, keep in mind that the predicate is executed for each element in the List<T> until it finds the instance you care about. With that in mind, we would probably want to refactor out the “idToFind.ToLower()” above the predicate since that value isn’t changing. So, you might end-up with something like this:

static Data Find2(List<Data> myList, string idToFind)
{
    idToFind = idToFind.ToLower();

    Data data = myList.Find(s =>
        s.Id.ToLower() ==
        idToFind);

    return data;
}

Another route you may want to go is just to use the string.Equals(…) method to perform the comparison. That would look like:

static Data Find3(List<Data> myList, string idToFind)
{
    Data data = myList.Find(s =>
        string.Equals(
            s.Id,
            idToFind,
            StringComparison.InvariantCultureIgnoreCase)
    );

    return data;
}

Fact is, the last method IS the fastest of the three ways to perform the operation. I can say that without even needing to run it through a profiler. But if you don’t believe me…

Function Name	Elapsed Inclusive Time
...Find1(System.Collections.Generic.List`1<....Data>,string)	6.34
...Find2(System.Collections.Generic.List`1<....Data>,string)	4.47
...Find3(System.Collections.Generic.List`1<....Data>,string)	3.65

That’s something I might put into the category of a micro-optimization AND just good coding practice. But is this something that should be caught during a code review? I’d say “yes” because logically it all makes sense and none of the solutions would really hurt maintainability or readability of the code.
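If you want to repeat the comparison on your own machine, a small Stopwatch harness is usually enough for a first look. The sketch below is a stand-alone approximation: the Data class is reconstructed from how it is used above, and FindComparison, its method names and the Time helper are hypothetical. Absolute numbers will differ from the profiler figures, so treat any result as indicative only.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Reconstructed from usage in the snippets above; the real class isn't shown.
public sealed class Data
{
    public string Id { get; set; }
    public string Value { get; set; }
}

// Hypothetical harness for comparing the three Find variants.
public static class FindComparison
{
    public static List<Data> LoadList(int count)
    {
        List<Data> list = new List<Data>();
        for (int i = 0; i < count; i++)
            list.Add(new Data { Id = "Id" + i, Value = "Value" + i });
        return list;
    }

    // ToLower() on both sides, evaluated for every element.
    public static Data FindWithToLower(List<Data> list, string id)
    {
        return list.Find(s => s.Id.ToLower() == id.ToLower());
    }

    // The constant side is lowered once, outside the predicate.
    public static Data FindWithHoistedToLower(List<Data> list, string id)
    {
        string lowered = id.ToLower();
        return list.Find(s => s.Id.ToLower() == lowered);
    }

    // No temporary strings: a case-insensitive comparison per element.
    public static Data FindWithEquals(List<Data> list, string id)
    {
        return list.Find(s => string.Equals(
            s.Id, id, StringComparison.InvariantCultureIgnoreCase));
    }

    // Runs the lookup repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<Data> find, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            find();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

A typical use would be var list = FindComparison.LoadList(10000); followed by FindComparison.Time(() => FindComparison.FindWithEquals(list, "ID9999"), 100) for each variant, comparing the three elapsed times over several runs.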

So, I’d tag this as a good coding practice. Other thoughts on the topic?

Enjoy!