Comparing Go and C#

Solving a simple asynchronous task

Posted by Friðrik Runólfsson on 17 April 2015

Lately I have been writing a lot of Go code. Go is an extremely interesting language designed for concurrency. To support that, the language provides channels, goroutines and select statements. To give some insight into Go and demonstrate these concepts, I thought it would be interesting to solve a simple asynchronous task in both Go and C#.

Before diving into the details of each implementation, I think it's helpful to give a quick overview of the two code blocks that I ended up with. There are some similarities and some differences. For me the most important thing is that when writing Go code, I don't need to think about thread safety. I simply use channels, goroutines and select statements (more about that later) and can be sure that only one goroutine has access to the data at any given time. The concurrency is built into the language.

In C#, on the other hand, I need to think about which blocks of code can be accessed by multiple threads at the same time. We of course have thread-safe data structures such as ConcurrentBag which help, but I need to know when to use them instead of the standard .NET structures. Asynchronous C# code is easily readable, and there I think C# has some advantage over Go.

Channels take some time to get used to, but once you get used to reading Go code containing channels, I think Go code is not far behind in terms of readability. Looking at the Go implementation, we have three main functions with clear responsibilities: parseFile, getUrls and selectResults. These functions communicate using channels. Writing code with channels imposes a certain structure: you are in a way forced to write smaller functions that do one thing at a time. That, in my opinion, is a good thing.
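
For readers new to these primitives, here is a minimal, standalone sketch (not part of the solution below) of the core idea: one goroutine hands a value to another over a channel, with no locks involved.

package main

import "fmt"

func main() {
    ch := make(chan string)
    go func() {
        // The send blocks until the main goroutine is ready to receive.
        ch <- "hello from a goroutine"
    }()
    fmt.Println(<-ch)
}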

### The Task

Given a file containing a list of URLs:

  1. Download the content for each URL
  2. Record the content size and the time it took
  3. Return a single collection containing all the download results.

This should be done asynchronously so that a single slow page does not block the other downloads. As input I selected the top Icelandic websites from Modernus.

### The C# implementation

First we read the file, iterate through each URL and create a new Task for it using the DownloadAsync method. There we download the data for the URL and return a DownloadResult object. Using the ContinueWith method, we create an asynchronous code block that is executed after the Task completes. This code block is responsible for adding the result from the Task to the collection we want to return.

Here we have to be careful, since this collection (results) will be accessed from multiple threads, so we need to think about thread safety. We could use locking around the results.Add call, but that could create a bottleneck, so instead we opted for a thread-safe collection, ConcurrentBag.

After we have created all the tasks, we call Task.WaitAll, which waits for all the tasks to complete execution. When they are all finished, our result collection contains a DownloadResult object for each of the URLs we downloaded, and we can return it. This is a fairly common pattern in C#: you have a set of activities you need to complete, so to increase performance you create more than one Task running asynchronously, and all the Tasks return their results to a single collection.


using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

public static List<DownloadResult> DownloadUrls(string path)
{
    var result = new ConcurrentBag<DownloadResult>();
    List<Task<DownloadResult>> tasks = ParseFileToTasks(path, result);
    Task.WaitAll(tasks.ToArray());
    return result.ToList();
}

private static List<Task<DownloadResult>> ParseFileToTasks(string path, ConcurrentBag<DownloadResult> results)
{
    var tasks = new List<Task<DownloadResult>>();
    foreach (string url in File.ReadLines(path))
    {
        tasks.Add(DownloadAsync(url, results));
    }
    return tasks;
}

private static Task<DownloadResult> DownloadAsync(string url, ConcurrentBag<DownloadResult> results)
{
    var task = Task.Run(() =>
    {
        try
        {
            using (WebClient client = new WebClient())
            {
                Stopwatch stopwatch = Stopwatch.StartNew();
                byte[] data = client.DownloadData(url);
                stopwatch.Stop();
                return new DownloadResult()
                {
                    Url = url,
                    Time = stopwatch.ElapsedMilliseconds,
                    Success = true,
                    SizeInBytes = data.Length
                };
            }
        }
        catch (Exception)
        {
            return new DownloadResult()
            {
                Url = url,
                Time = int.MaxValue,
                SizeInBytes = 0,
                Success = false
            };
        }
    });

    // Return the continuation rather than the original task, so that
    // Task.WaitAll also waits until the result has been added to the
    // shared collection.
    return task.ContinueWith(res =>
    {
        results.Add(res.Result);
        return res.Result;
    });
}

public class DownloadResult
{
    public string Url { get; set; }
    public long Time { get; set; }
    public bool Success { get; set; }
    public int SizeInBytes { get; set; }
}

### The Go implementation

First we create two channels, one channel of strings (in) and one channel of GetResult structs (out). You can think of a channel as a pipeline for communicating between goroutines: you can send and receive data on these channels. It is also worth mentioning that channels have a configurable buffer size, which indicates how many elements can be sent to the channel before a send blocks. In our case the buffer size has an insignificant effect on performance, since both operations that receive from our channels are simple and not expected to be time-consuming.
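
To illustrate the buffering behaviour, here is a small standalone sketch (again, not taken from the solution below): with a buffer of two, two sends succeed immediately, and a third send would block until a receive frees a slot.

package main

import "fmt"

func main() {
    ch := make(chan int, 2) // buffer size 2: two sends succeed without a receiver
    ch <- 1
    ch <- 2
    // A third send here would block until a receive frees a slot.
    fmt.Println(<-ch, <-ch) // prints: 1 2
}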

After we have created the channels, we call the parseFile function, which goes through the input file asynchronously and sends each URL on the in channel. After we have gone through the entire file, we close the channel. The getUrls function handles the values sent on the in channel: we iterate over the in channel and for each value we receive, we create a new goroutine. That goroutine is responsible for downloading the URL and sending the result on the out channel.

We then have another function that's responsible for receiving values on the out channel, the selectResults function. There we have a Go select statement. Select statements operate on channels, and in them you define actions depending on which channel is ready to send or receive (one way to think of select statements is as switch statements over channel operations). Here we receive on the out channel and append each value received off the channel to the results collection until there are no more values on the channel. To know when we are done, we use the WaitGroup type: for each goroutine we create, we call Add() to increment the number of goroutines to wait for, and when a goroutine finishes, we call Done(). We then call Wait() on the WaitGroup to block until all goroutines are done, at which point we can close the out channel.
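
To show the switch-like flavour of select with more than one case, here is a small standalone sketch; the solution below only ever receives on a single channel, out, so its select has just one case.

package main

import (
    "fmt"
    "time"
)

func main() {
    a := make(chan string)
    b := make(chan string)
    go func() { time.Sleep(10 * time.Millisecond); a <- "from a" }()
    go func() { time.Sleep(20 * time.Millisecond); b <- "from b" }()
    for i := 0; i < 2; i++ {
        select { // runs whichever case is ready first
        case msg := <-a:
            fmt.Println(msg)
        case msg := <-b:
            fmt.Println(msg)
        }
    }
}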


package main

import (
    "bufio"
    "io/ioutil"
    "net/http"
    "os"
    "sync"
    "time"
)

const MaxInt64 int64 = 9223372036854775807 // used as the Time value for failed downloads
const InChannelSize int = 118

// GetResult mirrors the C# DownloadResult; the field names here are
// inferred from how the struct is used below.
type GetResult struct {
    Success     bool
    Url         string
    SizeInBytes int
    Time        int64 // milliseconds
}

func DownloadUrls(path string) []GetResult {
    var wg sync.WaitGroup
    in := make(chan string, InChannelSize)
    out := make(chan GetResult)
    results := []GetResult{}

    // Count getUrls itself in the WaitGroup so that the out channel cannot
    // be closed before getUrls has finished draining the in channel.
    wg.Add(1)
    go parseFile(path, in)
    go getUrls(in, out, &wg)
    go closeChannelWhenFinished(out, &wg)
    selectResults(out, &results)
    return results
}

func parseFile(path string, in chan string) {
    file, err := os.Open(path)
    if err != nil {
        panic(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        url := scanner.Text()
        in <- url
    }
    close(in)
}

func getUrls(in <-chan string, out chan<- GetResult, wg *sync.WaitGroup) {
    for url := range in {
        wg.Add(1)
        go func(u string) {
            defer wg.Done()
            startTime := time.Now()
            resp, err := http.Get(u)
            if err == nil {
                defer resp.Body.Close()
                bytes, err := ioutil.ReadAll(resp.Body)
                if err == nil {
                    duration := time.Since(startTime)
                    out <- GetResult{resp.StatusCode == 200, u, len(bytes), duration.Nanoseconds() / 1000000}
                } else {
                    out <- GetResult{false, u, 0, MaxInt64}
                }
            } else {
                out <- GetResult{false, u, 0, MaxInt64}
            }
        }(url)
    }
    wg.Done() // matches the wg.Add(1) in DownloadUrls
}

func closeChannelWhenFinished(c chan GetResult, wg *sync.WaitGroup) {
    wg.Wait()
    close(c)
}

func selectResults(c chan GetResult, results *[]GetResult) {
    for {
        select {
        case x, more := <-c:
            if more {
                *results = append(*results, x)
            } else {
                return
            }
        }
    }
}

### [Update]

Honeyboy Wilson asked in a comment why I used Task instead of async/await. There was no particular reason for that choice other than that I used Tasks in the first version of the C# code I wrote and never got around to refactoring it to use async/await. So I took some time to implement the solution using async/await. Although the two implementations are equivalent, the syntax differs. My personal opinion is that the first C# version is more readable, but this version ended up shorter and more compact.


public static async Task<List<DownloadResult>> DownloadUrlsAsync(string path)
{
    var results = new List<DownloadResult>();
    List<Task<DownloadResult>> tasks = File.ReadLines(path).Select(a => DownloadAsync(a)).ToList();
    while (tasks.Any())
    {
        Task<DownloadResult> task = await Task.WhenAny(tasks);
        tasks.Remove(task);
        results.Add(task.Result);
    }
    return results;
}

private static async Task<DownloadResult> DownloadAsync(string url)
{
    try
    {
        using (WebClient client = new WebClient())
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            byte[] data = await client.DownloadDataTaskAsync(new Uri(url));
            stopwatch.Stop();
            return new DownloadResult()
            {
                Url = url,
                Time = stopwatch.ElapsedMilliseconds,
                Success = true,
                SizeInBytes = data.Length
            };
        }
    }
    catch (Exception x)
    {
        Console.WriteLine("error downloading {0}: {1}", url, x);
        return new DownloadResult()
        {
            Url = url,
            Time = int.MaxValue,
            SizeInBytes = 0,
            Success = false
        };
    }
}


Photo credit: Brynjudalsá, Iceland. Own photo.