Post

ASP.NET Core Playground - 2. Entity Framework Core

So we have our 1.5GB table and we connected it to our Entity Framework Core. What should we do with it now?

At first let’s try a super simple scenario. Imagine that, for some reason, we want to break few rules and our WebAPI will serve models returned from Entity Framework Core directly and allow ASP.NET Core to convert them to JSON with it’s default behavior. Now let’s checkout some endpoints and let me explain to you how exactly they work.

This is not a benchmark so please ignore the exact times shown in the logs below. This scenario uses to many parts (like docker, SqlServer, network interface) for the timings to be consistent and conclusive. Each of them can have a temporary slowdown, and Spotify I was running in the background doesn’t help.

Realize the data with ToArray or ToArrayAsync

In this endpoint I force the Entity Framework Core to download all the data from the table and put it inside a real in memory array. This approach will not be loved by Garbage Collector, because after every call it will have to free 1.5GB of allocated heap memory at once.

1
2
3
4
5
6
7
8
9
10
11
[HttpGet("Array")]
public async Task<IEnumerable<WeatherForecast>> GetArray(CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();

    Console.WriteLine($"Start SQL"); stopwatch.Restart();
    var data = await _dbContext.WeatherForecasts.ToArrayAsync(cancellationToken);
    Console.WriteLine($"End SQL - {stopwatch.ElapsedMilliseconds} ms");

    return data;
}

Console logs after using curl command to fetch the data:

1
2
3
4
5
6
Start SQL
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [w].[Id], [w].[Date], [w].[Summary], [w].[TemperatureC]
      FROM [WeatherForecasts] AS [w]
End SQL - 25951 ms

This code behaves in an awful way. At first the SQL query is executed and the memory needs to be allocated and filled with data. This procedure, every time the endpoint is called, takes around 25 seconds. What is worse the endpoint doesn’t return any data for all that time so the consumer will have to wait for it, and probably even time out the request. We also have to think about all the garbage that needs to be collected, and resources that this will take.

Realize the data with ToList or ToListAsync

Same as above, but this time we use a list instead of above.

1
2
3
4
5
6
7
8
9
10
11
[HttpGet("List")]
public async Task<IEnumerable<WeatherForecast>> GetList(CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();

    Console.WriteLine($"Start SQL"); stopwatch.Restart();
    var data = await _dbContext.WeatherForecasts.ToListAsync(cancellationToken);
    Console.WriteLine($"End SQL - {stopwatch.ElapsedMilliseconds} ms");

    return data;
}

Console logs after using curl command to fetch the data:

1
2
3
4
5
6
Start SQL
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [w].[Id], [w].[Date], [w].[Summary], [w].[TemperatureC]
      FROM [WeatherForecasts] AS [w]
End SQL - 24585 ms

Since List is just an Array with some useful methods the behavior and timing is exactly the same as when using ToArray or ToArrayAsync.

Streaming the data with AsEnumerable

Documentation says that AsEnumerable can be used to stream the data. It will load only one row of data at a time, return it and then forget it. This means that there is no need to wait. ASP.NET Core can process and send out each row of data immediately when it’s received. The RAM usage is minimal, and even though Garbage Collector will eventually have to also free a total of 1.5GB of memory it will be able to do it in small chunks as often as it wants.

1
2
3
4
5
6
7
8
9
10
11
[HttpGet("Enumerable")]
public IEnumerable<WeatherForecast> GetEnumerable(CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();

    Console.WriteLine($"Start SQL"); stopwatch.Restart();
    var data = _dbContext.WeatherForecasts.AsEnumerable();
    Console.WriteLine($"End SQL - {stopwatch.ElapsedMilliseconds} ms");

    return data;
}

Console logs after using curl command to fetch the data:

1
2
3
4
5
6
Start SQL
End SQL - 0 ms
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [w].[Id], [w].[Date], [w].[Summary], [w].[TemperatureC]
      FROM [WeatherForecasts] AS [w]

This time there is no delay! First row is loaded close to instantly and ASP.NET returns it as soon as it’s available. Endpoint feels much more responsive, and there is no risk of timeout on the consumer side.

Streaming the data with AsAsyncEnumerable

Entity Framework Core also provides us with asynchronous enumerable that we can simply return from the endpoint just as we did above with AsEnumerable.

1
2
3
4
5
6
7
8
9
10
11
[HttpGet("AsyncEnumerable")]
public IAsyncEnumerable<WeatherForecast> GetAsyncEnumerable(CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();

    Console.WriteLine($"Start SQL"); stopwatch.Restart();
    var data = _dbContext.WeatherForecasts.AsAsyncEnumerable();
    Console.WriteLine($"End SQL - {stopwatch.ElapsedMilliseconds} ms");

    return data;
}

Console logs after using curl command to fetch the data:

1
2
3
4
5
6
Start SQL
End SQL - 0 ms
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [w].[Id], [w].[Date], [w].[Summary], [w].[TemperatureC]
      FROM [WeatherForecasts] AS [w]

The endpoint behaves exactly the same as when we were using AsEnumerable.

Streaming the data with an Enumerator

There is another way to handle streamed data. Simply by writing yield return compiler will do a lot of work for us. It will generate code for an Enumerator class that will freeze the thread while waiting for new row from the database, and allows us to perform operations on each row.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[HttpGet("Enumerator")]
public IEnumerable<WeatherForecast> GetEnumerator(CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();

    Console.WriteLine($"Start SQL"); stopwatch.Restart();
    var data = _dbContext.WeatherForecasts.AsEnumerable();
    Console.WriteLine($"End SQL - {stopwatch.ElapsedMilliseconds} ms");

    foreach (var d in data)
    {
        yield return d;
    }
}

Console logs after using curl command to fetch the data:

1
2
3
4
5
6
Start SQL
End SQL - 0 ms
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [w].[Id], [w].[Date], [w].[Summary], [w].[TemperatureC]
      FROM [WeatherForecasts] AS [w]

Streaming the data with an AsyncEnumerator

We can also go fully asynchronous. The yield return combined with await foreach makes the compiler generate an AsyncEnumerator class that behaves in almost the same way as Enumerator but instead of freezing the thread it simply awaits new row of data. This way the thread can be acquired by for example other concurrent requests and the risk of emptying the thread pool is minimal.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[HttpGet("AsyncEnumerator")]
public async IAsyncEnumerable<WeatherForecast> GetAsyncEnumerator([EnumeratorCancellation] CancellationToken cancellationToken)
{
    var stopwatch = new Stopwatch();

    Console.WriteLine($"Start SQL"); stopwatch.Restart();
    var data = _dbContext.WeatherForecasts.AsAsyncEnumerable();
    Console.WriteLine($"End SQL - {stopwatch.ElapsedMilliseconds} ms");

    await foreach (var d in data)
    {
        yield return d;
    }
}

Console logs after using curl command to fetch the data:

1
2
3
4
5
6
Start SQL
End SQL - 0 ms
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [w].[Id], [w].[Date], [w].[Summary], [w].[TemperatureC]
      FROM [WeatherForecasts] AS [w]

Conclusion

Now I know that using ToList and ToArray is a big mistake in case of simply returning the data, even though from my other tests, that I will describe in later posts, there are many cases when they are useful.

In 2023 streaming is the thing! We need to use more of that!

This post is licensed under CC BY 4.0 by the author.