Multicore Programming in .NET
Introduction
Your CPU meter shows a bottleneck. One core is running at 100 percent, but all the other cores are idle. Your application is CPU-bound, but the code execution is using only a fraction of the computing power of your multicore system. Sound like a familiar problem?
The solution to this problem is parallel programming. Traditionally, programmers have written sequential code, which is familiar but often fails to meet a modern application's performance goals. To use the system's processing power efficiently, we need to split the application's execution into pieces that can run at the same time.
In the past, spawning threads by hand was the way to execute code in parallel, but it brought its own problems: thread synchronization, locks, deadlocks, and a minefield of subtle, hard-to-reproduce software defects.
The .NET Framework 4 and later versions provide enhanced support for parallel programming. The features include a new runtime, new class library types, new diagnostic tools, tasks, Parallel.For, Parallel LINQ (PLINQ), and concurrent collections.
Potential Parallelism
Potential parallelism means that a program is written so that it runs faster when parallel hardware is available and roughly as fast as an equivalent sequential program when it is not. In .NET, this is achieved via
1. Task Parallel Library (TPL) and
2. Parallel LINQ (PLINQ)
The Task Parallel Library (TPL)
TPL aims to make developers more productive by simplifying the process of adding parallelism and concurrency to an application. It consists of a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces.
Parallel LINQ (PLINQ)
PLINQ is a parallel implementation of LINQ to Objects. It combines the simplicity and readability of LINQ syntax with the power of parallel programming. PLINQ implements the full set of LINQ standard query operators as extension methods in the System.Linq namespace and has additional operators for parallel operations.
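As a minimal sketch of what a PLINQ query can look like, the following example opts an ordinary LINQ query into parallel execution with AsParallel. The class PlinqSketch and the method CountSquaresDivisibleByThree are hypothetical names chosen for illustration only.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical helper: counts how many squares of 1..n are divisible by 3.
    public static int CountSquaresDivisibleByThree(int n)
    {
        return Enumerable.Range(1, n)
            .AsParallel()                   // opt in to parallel execution
            .Select(x => (long)x * x)       // runs on multiple cores when available
            .Count(square => square % 3 == 0);
    }

    public static void Main()
    {
        Console.WriteLine(CountSquaresDivisibleByThree(30));   // prints 10
    }
}
```

Apart from the AsParallel call, the query reads the same as its sequential LINQ counterpart.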
Parallel Loops
The .NET parallel framework includes the Parallel.For and Parallel.ForEach methods, which provide parallel looping. Both assume that the steps of the loop body are independent of one another: the steps must not communicate by writing to shared variables.
1. Use the Parallel.For method to iterate over a range of integer indices.
2. Use the Parallel.ForEach method to iterate over user-provided values.
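When a loop seems to need a shared accumulator, one possible way to keep the iterations independent is the Parallel.For overload that carries a thread-local subtotal. The sketch below assumes a simple sum-of-squares workload; the class and method names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class IndependentIterationsSketch
{
    // Hypothetical example: sums i * i for i in [0, n) without letting
    // iterations write to a shared variable inside the loop body.
    public static long SumOfSquares(int n)
    {
        long total = 0;
        Parallel.For<long>(0, n,
            () => 0L,                                             // initial subtotal for each worker
            (i, loopState, subtotal) => subtotal + (long)i * i,   // touches only the local subtotal
            subtotal => Interlocked.Add(ref total, subtotal));    // merged once per worker, atomically
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(10));   // prints 285
    }
}
```

The only shared write happens in the final merge step, and it is protected by Interlocked.Add.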
Parallel.For
Parallel.For is a static method with multiple overloaded versions. The Parallel.For method does not guarantee any particular order of execution.
Example of a sequential for loop in C#:
int n = 100;
for (int i = 0; i < n; i++)
{ }
To optimally use the multiple cores available in the CPU, replace the for keyword with a call to the Parallel.For method and convert the body of the loop into a lambda expression:
int n = 100;
Parallel.For(0, n, i =>
{ });
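To make the pattern concrete, here is a small sketch with a non-empty loop body. It assumes a workload where each iteration writes only to its own array slot, so the iterations stay independent; ParallelForSketch and SquareRoots are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Hypothetical example: fills results[i] with the square root of i.
    public static double[] SquareRoots(int n)
    {
        double[] results = new double[n];
        Parallel.For(0, n, i =>
        {
            // Each iteration writes to a distinct element, so no
            // synchronization is needed and order does not matter.
            results[i] = Math.Sqrt(i);
        });
        return results;
    }

    public static void Main()
    {
        double[] roots = SquareRoots(100);
        Console.WriteLine(roots[81]);   // prints 9
    }
}
```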
Parallel.ForEach
Parallel.ForEach is a static method with multiple overloaded versions. As with Parallel.For, the iterations must be independent: the loop body must only make updates to fields of the particular instance that's passed to it.
Example of a sequential foreach loop in C#:
IEnumerable<MyObject> myEnumerable = ...
foreach (var obj in myEnumerable)
{ }
To take advantage of multiple cores, replace the foreach keyword with a call to the Parallel.ForEach method:
IEnumerable<MyObject> myEnumerable = ...
Parallel.ForEach(myEnumerable, obj =>
{ });
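The following sketch illustrates the rule that the loop body updates only the instance it receives. The Account class, its Balance field, and the ApplyInterest method are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical element type used only for this illustration.
public class Account
{
    public decimal Balance;
}

public static class ParallelForEachSketch
{
    public static void ApplyInterest(IEnumerable<Account> accounts, decimal rate)
    {
        Parallel.ForEach(accounts, account =>
        {
            // Only the instance passed to this iteration is modified.
            account.Balance += account.Balance * rate;
        });
    }

    public static void Main()
    {
        List<Account> accounts = new List<Account>
        {
            new Account { Balance = 100m },
            new Account { Balance = 200m }
        };
        ApplyInterest(accounts, 0.10m);
        Console.WriteLine(accounts[0].Balance == 110m);   // prints True
    }
}
```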
Breaking Out of Loops Early
Breaking out of loops is a familiar part of sequential iteration. It's less common in parallel loops, but you'll sometimes need to do it. Both Parallel.For and Parallel.ForEach methods support the loop state Break and Stop methods.
Example of the sequential case:
int n = ...
for (int i = 0; i < n; i++)
{
    // ...
    if (/* stopping condition is true */)
        break;
}
We have two options available with Parallel Loops:
1. Parallel Break
2. Parallel Stop
Parallel Break
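The Parallel.For and Parallel.ForEach methods have an overload that provides a ParallelLoopState object as a second argument to the loop body. We can break the loop by calling the Break method of the ParallelLoopState object. As with a sequential break, iterations with indices lower than the breaking iteration are still allowed to run; only higher-indexed iterations that have not yet started are skipped.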
Here's an example:
int n = ...
Parallel.For(0, n, (i, loopState) =>
{
    if (/* stopping condition is true */)
    {
        loopState.Break();
        return;
    }
});
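A runnable sketch of Break, assuming a search for the first negative value in an array, might look like the following. The ParallelLoopResult returned by Parallel.For exposes the lowest index at which Break was called through its LowestBreakIteration property; the class and method names here are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelBreakSketch
{
    // Hypothetical example: returns the index of the first negative value,
    // or null when the array contains none.
    public static long? FirstNegativeIndex(int[] values)
    {
        ParallelLoopResult result = Parallel.For(0, values.Length, (i, loopState) =>
        {
            if (values[i] < 0)
            {
                loopState.Break();   // lower-indexed iterations still run
                return;
            }
        });
        return result.LowestBreakIteration;
    }

    public static void Main()
    {
        Console.WriteLine(FirstNegativeIndex(new int[] { 5, 3, -1, 7, -9 }));   // prints 2
    }
}
```

Because every iteration below the breaking index still runs, the reported index is the lowest match even when several iterations call Break.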
Parallel Stop
There are also situations, such as unordered searches, where we want the loop to stop as quickly as possible after the stopping condition is met. To stop a loop in this way, call the ParallelLoopState class's Stop method instead of the Break method. Unlike Break, Stop does not guarantee that iterations with lower indices will run. When the Stop method is called, the index value of the iteration that caused the stop isn't available. Here is an example of parallel stop:
int n = ...
Parallel.For(0, n, (i, loopState) =>
{
    if (/* stopping condition is true */)
    {
        loopState.Stop();
        return;
    }
});
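Since the loop result does not report which iteration called Stop, the body has to record its own finding. The sketch below assumes an unordered search for any occurrence of a target value and stores the index with Interlocked.Exchange; the names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelStopSketch
{
    // Hypothetical example: returns the index of some occurrence of target,
    // or -1 when the value is absent. With duplicates, any match may win.
    public static int FindAnyIndexOf(int[] values, int target)
    {
        int found = -1;
        Parallel.For(0, values.Length, (i, loopState) =>
        {
            if (values[i] == target)
            {
                Interlocked.Exchange(ref found, i);   // record the hit ourselves
                loopState.Stop();                     // ask the loop to end as soon as possible
                return;
            }
        });
        return found;
    }

    public static void Main()
    {
        int[] values = { 4, 8, 15, 16, 23, 42 };
        Console.WriteLine(FindAnyIndexOf(values, 23));   // prints 4
    }
}
```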
Tasks
A task is a sequential operation: a set of actions that run one after another. Different tasks, however, can often run in parallel with each other. In .NET, a task is also an object with properties and methods of its own. Parallel.Invoke is the simplest expression of the parallel task pattern. It creates a new parallel task for each delegate in its params array argument list. The Invoke method returns when all the tasks are finished.
Here's some sequential code:
DoLeft();
DoRight();
If the methods are independent, you can use the Invoke method of the Parallel class to call them in parallel. This is shown in the following code.
Parallel.Invoke(DoLeft, DoRight);
Alternatively, create and start the tasks explicitly and then wait for them to finish:
Task t1 = Task.Factory.StartNew(DoLeft);
Task t2 = Task.Factory.StartNew(DoRight);
Task.WaitAll(t1, t2);
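The two approaches can be compared side by side in a small sketch. SumTo stands in for the independent DoLeft and DoRight work and, like the other names here, is a hypothetical helper; each task writes to its own array slot, so the tasks share no mutable state.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelTasksSketch
{
    // Hypothetical CPU-bound work: sums the integers 1..n.
    public static long SumTo(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    // Parallel.Invoke returns only after both delegates have finished.
    public static long[] WithInvoke()
    {
        long[] results = new long[2];
        Parallel.Invoke(
            () => { results[0] = SumTo(1000); },
            () => { results[1] = SumTo(2000); });
        return results;
    }

    // Equivalent version with explicitly created tasks.
    public static long[] WithStartNew()
    {
        long[] results = new long[2];
        Task left = Task.Factory.StartNew(() => { results[0] = SumTo(1000); });
        Task right = Task.Factory.StartNew(() => { results[1] = SumTo(2000); });
        Task.WaitAll(left, right);
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(WithInvoke()[0]);     // prints 500500
        Console.WriteLine(WithStartNew()[1]);   // prints 2001000
    }
}
```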
Tasks can also return values. A Task<TResult> exposes its return value through the Result property. In the following example, Type, XYZ, and the GetLov methods are placeholders:
Type grossStatus = new Type();
Type tissueType = new Type();
Type processingMode = new Type();
// Create the tasks
Task<Type> task1 = Task<Type>.Factory.StartNew(() => GetLov1(grossStatus));
Task<Type> task2 = Task<Type>.Factory.StartNew(() => GetLov2(tissueType));
Task<Type> task3 = Task<Type>.Factory.StartNew(() => GetLov3(processingMode));
// Wait until all three tasks have finished
Task.WaitAll(task1, task2, task3);
// Use the results; each variable reads from its own task
grossStatus = task1.Result;
tissueType = task2.Result;
processingMode = task3.Result;
// Private methods which do the main work (XYZ stands for the real result)
private Type GetLov1(Type grossStatus) { return XYZ; }
private Type GetLov2(Type tissueType) { return XYZ; }
private Type GetLov3(Type processingMode) { return XYZ; }
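A self-contained variant of the same pattern, assuming a simple numeric workload in place of the placeholder types, could look like this. TaskResultSketch, SumTo, and ComputeAll are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskResultSketch
{
    // Hypothetical work item: sums the integers 1..n.
    public static long SumTo(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    public static long[] ComputeAll()
    {
        // Start three tasks that each produce a value.
        Task<long> task1 = Task<long>.Factory.StartNew(() => SumTo(10));
        Task<long> task2 = Task<long>.Factory.StartNew(() => SumTo(100));
        Task<long> task3 = Task<long>.Factory.StartNew(() => SumTo(1000));

        Task.WaitAll(task1, task2, task3);

        // Each result is read from the task that computed it.
        return new long[] { task1.Result, task2.Result, task3.Result };
    }

    public static void Main()
    {
        long[] results = ComputeAll();
        Console.WriteLine(results[0] + " " + results[1] + " " + results[2]);   // prints 55 5050 500500
    }
}
```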
Task Cancellation
In .NET, a cancellation request does not forcibly end a task. Instead, tasks use a cooperative cancellation model. This means that a running task must poll for the existence of a cancellation request at appropriate intervals and then shut itself down by calling back into the library. The .NET Framework uses two separate types for this: one allows a program to request cancellation, and the other checks for cancellation requests. Instances of the CancellationTokenSource class are used to request cancellation, while CancellationToken values indicate whether cancellation has been requested. Here's an example:
CancellationTokenSource cts = new CancellationTokenSource();
CancellationToken token = cts.Token;
Task myTask = Task.Factory.StartNew(() =>
{
    for (...)
    {
        token.ThrowIfCancellationRequested();
        // Body of for loop.
    }
}, token);
// ... elsewhere ...
cts.Cancel();
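The sketch below fills in the pieces the fragment leaves out, under stated assumptions: the loop bound and the Thread.Sleep call are stand-ins for real work, and the names are hypothetical. It also shows one way the caller can observe the outcome: waiting on a canceled task throws an AggregateException that wraps an OperationCanceledException, and the task ends in the Canceled state.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationSketch
{
    public static TaskStatus RunAndCancel()
    {
        CancellationTokenSource cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task worker = Task.Factory.StartNew(() =>
        {
            for (int step = 0; step < 1000000; step++)
            {
                token.ThrowIfCancellationRequested();   // cooperative polling point
                Thread.Sleep(1);                        // stand-in for a unit of real work
            }
        }, token);

        cts.Cancel();   // request cancellation; the task is not forcibly ended

        try
        {
            worker.Wait();
        }
        catch (AggregateException ex)
        {
            // Treat cancellation as handled; anything else is rethrown.
            ex.Handle(inner => inner is OperationCanceledException);
        }
        return worker.Status;
    }

    public static void Main()
    {
        Console.WriteLine(RunAndCancel());   // prints Canceled
    }
}
```

Passing the same token to StartNew lets the library associate the thrown exception with the request, which is why the task ends as Canceled rather than Faulted.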
MSDN References
http://msdn.microsoft.com/en-us/library/dd460693.aspx