Asked  7 Months ago    Answers:  5   Viewed   77 times

I'm new to parallel programming. There are two classes available in .NET: Task and Thread.

So, my questions are:

  • What is the difference between those classes?
  • When is it better to use Thread over Task (and vice-versa)?



Thread is a lower-level concept: if you're directly starting a thread, you know it will be a separate thread, rather than executing on the thread pool etc.

Task is more than just an abstraction of "where to run some code" though - it's really just "the promise of a result in the future". So as some different examples:

  • Task.Delay doesn't need any actual CPU time; it's just like setting a timer to go off in the future
  • A task returned by WebClient.DownloadStringTaskAsync won't take much CPU time locally; it's representing a result which is likely to spend most of its time in network latency or remote work (at the web server)
  • A task returned by Task.Run() really is saying "I want you to execute this code separately"; the exact thread on which that code executes depends on a number of factors.

Note that the Task<T> abstraction is pivotal to the async support in C# 5.

In general, I'd recommend that you use the higher level abstraction wherever you can: in modern C# code you should rarely need to explicitly start your own thread.

Tuesday, June 1, 2021
answered 7 Months ago

Thread pool will provide benefits for frequent and relatively short operations by

  • Reusing threads that have already been created instead of creating new ones (an expensive process)
  • Throttling the rate of thread creation when there is a burst of requests for new work items (I believe this is only in .NET 3.5)
    • If you queue 100 thread pool tasks, it will only use as many threads as have already been created to service these requests (say 10 for example). The thread pool will make frequent checks (I believe every 500ms in 3.5 SP1) and if there are queued tasks, it will make one new thread. If your tasks are quick, then the number of new threads will be small and reusing the 10 or so threads for the short tasks will be faster than creating 100 threads up front.

    • If your workload consistently has large numbers of thread pool requests coming in, then the thread pool will tune itself to your workload by creating more threads in the pool by the above process so that there are a larger number of threads available to process requests

    • check Here for more in depth info on how the thread pool functions under the hood

Creating a new thread yourself would be more appropriate if the job were going to be relatively long running (probably around a second or two, but it depends on the specific situation)

@Krzysztof - Thread Pool threads are background threads that will stop when the main thread ends. Manually created threads are foreground by default (will keep running after the main thread has ended), but can be set to background before calling Start on them.

Tuesday, June 1, 2021
answered 7 Months ago

Parallel.For doesn't divide the input into n pieces (where n is the MaxDegreeOfParallelism); instead it creates many small batches and makes sure that at most n are being processed concurrently. (This is so that if one batch takes a very long time to process, Parallel.For can still be running work on other threads. See Parallelism in .NET - Part 5, Partioning of Work for more details.)

Due to this design, your code is creating and throwing away dozens of Dictionary objects, hundreds of List objects, and thousands of String objects. This is putting enormous pressure on the garbage collector.

Running PerfMonitor on my computer reports that 43% of the total run time is spent in GC. If you rewrite your code to use fewer temporary objects, you should see the desired 4x speedup. Some excerpts from the PerfMonitor report follow:

Over 10% of the total CPU time was spent in the garbage collector. Most well tuned applications are in the 0-10% range. This is typically caused by an allocation pattern that allows objects to live just long enough to require an expensive Gen 2 collection.

This program had a peak GC heap allocation rate of over 10 MB/sec. This is quite high. It is not uncommon that this is simply a performance bug.

Edit: As per your comment, I will attempt to explain the timings you reported. On my computer, with PerfMonitor, I measured between 43% and 52% of time spent in GC. For simplicity, let's assume that 50% of the CPU time is work, and 50% is GC. Thus, if we make the work 4× faster (through multi-threading) but keep the amount of GC the same (this will happen because the number of batches being processed happened to be the same in the parallel and serial configurations), the best improvement we could get is 62.5% of the original time, or 1.6×.

However, we only see a 1.25× speedup because GC isn't multithreaded by default (in workstation GC). As per Fundamentals of Garbage Collection, all managed threads are paused during a Gen 0 or Gen 1 collection. (Concurrent and background GC, in .NET 4 and .NET 4.5, can collect Gen 2 on a background thread.) Your program experiences only a 1.25× speedup (and you see 30% CPU usage overall) because the threads spend most of their time being paused for GC (because the memory allocation pattern of this test program is very poor).

If you enable server GC, it will perform garbage collection on multiple threads. If I do this, the program runs 2× faster (with almost 100% CPU usage).

When you run four instances of the program simultaneously, each has its own managed heap, and the garbage collection for the four processes can execute in parallel. This is why you see 100% CPU usage (each process is using 100% of one CPU). The slightly longer overall time (2.3s for all vs 2.05s for one) is possibly due to inaccuracies in measurement, contention for the disk, time taken to load the file, having to initialise the threadpool, overhead of context switching, or some other environment factor.

Sunday, August 1, 2021
answered 5 Months ago

The pthread_key_create and friends are much older, and thus supported on more systems.

The __thread is a relative newcomer, is generally much more convenient to use, and (according to Wikipedia) is supported on most POSIX systems that still matter: Solaris Studio C/C++, IBM XL C/C++, GNU C, Clang and Intel C++ Compiler (Linux systems).

The __thread also has a significant advantage that it is usable from signal handlers (with the exception of using __thread from dlopened shared library, see this bug), because its use does not involve malloc (with the same exception).

Monday, August 2, 2021
answered 5 Months ago

Inside Task.Delay, it looks like this (the single parameter (int) version just calls the below version):

public static Task Delay(int millisecondsDelay, CancellationToken cancellationToken)
    if (millisecondsDelay < -1)
        throw new ArgumentOutOfRangeException("millisecondsDelay", Environment.GetResourceString("Task_Delay_InvalidMillisecondsDelay"));
    if (cancellationToken.IsCancellationRequested)
        return FromCancellation(cancellationToken);
    if (millisecondsDelay == 0)
        return CompletedTask;
    DelayPromise state = new DelayPromise(cancellationToken);
    if (cancellationToken.CanBeCanceled)
        state.Registration = cancellationToken.InternalRegisterWithoutEC(delegate (object state) {
            ((DelayPromise) state).Complete();
        }, state);
    if (millisecondsDelay != -1)
        state.Timer = new Timer(delegate (object state) {
            ((DelayPromise) state).Complete();
        }, state, millisecondsDelay, -1);
    return state;

As you can hopefully see:

    if (millisecondsDelay == 0)
        return CompletedTask;

Which means it always returns a completed task, and therefore your code will always continue running past that particular await line.

Friday, August 13, 2021
answered 4 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :