Historically, generating and returning a CSV file via .NET with CsvHelper might look something like:
using var connection = new SqlConnection(".."); // Connection string
var query = "SELECT * FROM Users";
var data = await connection.QueryAsync<ReportRow>(query);
var memoryStream = new MemoryStream();
using (var streamWriter = new StreamWriter(memoryStream, leaveOpen: true))
using (var csvWriter = new CsvWriter(streamWriter, CultureInfo.InvariantCulture))
{
    csvWriter.WriteRecords(data);
}
memoryStream.Seek(0, SeekOrigin.Begin);
return File(memoryStream, "text/csv", "my-report.csv"); // File(...) disposes the stream for us
This works well enough, but it has some important drawbacks that only become apparent once you're dealing with large files or heavy traffic.
The Problem
Given the above snippet, we can draw some observations:
- All of the data is buffered from the database via QueryAsync and allocated to the data collection. This also blocks execution until fully buffered, meaning the code won't continue on until everything has been read from the DB.
- The entire data collection is written to the memoryStream sequentially and in a blocking fashion. Execution won't continue until all rows are written.
- This entire process is sequential. The HTTP file result won't start returning to the client browser until all prior steps are complete.
Overall, this is a lot of waiting around as we gradually step through each part of the code.
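To make the buffering concrete: QueryAsync materializes the entire result set before it returns, so peak memory grows with the row count. A minimal sketch (the row count here is purely illustrative):
// Buffered: every row is materialized before a single byte of CSV is written.
var data = await connection.QueryAsync<ReportRow>("SELECT * FROM Users");
// If Users holds 1,000,000 rows, 'data' now holds 1,000,000 ReportRow
// instances in memory; enumerating it again does not hit the database.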
The Solution
With a few small changes, we can gain some nice performance boosts here.
await using var sqlConnection = new SqlConnection(".."); // Connection string
var query = "SELECT * FROM Users";
var data = sqlConnection.QueryUnbufferedAsync<ReportRow>(query);
Response.StatusCode = 200;
Response.ContentType = "text/csv";
Response.Headers.Add(
"Content-Disposition",
"attachment; filename=\"my-report.csv\"");
await using (var stream = Response.Body)
await using (var streamWriter = new StreamWriter(stream))
await using (var csvWriter = new CsvWriter(streamWriter, CultureInfo.InvariantCulture))
{
    // Unlike WriteRecords, WriteRecord doesn't emit the header row, so write it first.
    csvWriter.WriteHeader<ReportRow>();
    await csvWriter.NextRecordAsync();
    await foreach (var row in data)
    {
        csvWriter.WriteRecord(row);
        await csvWriter.NextRecordAsync();
    }
    await csvWriter.FlushAsync();
}
return new EmptyResult();
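For context, both snippets are assumed to live inside a standard MVC controller action; the controller, route, and method name below are hypothetical, but they show the usings the code relies on:
using System.Globalization;
using CsvHelper;
using Dapper;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Data.SqlClient;

public class ReportsController : Controller
{
    [HttpGet("reports/users")] // hypothetical route
    public async Task<IActionResult> DownloadUsersReport()
    {
        // ... streaming snippet from above goes here ...
        return new EmptyResult();
    }
}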
What we have changed:
- Swapped to using QueryUnbufferedAsync from Dapper. This returns an IAsyncEnumerable<T> and lets us continue on with our execution.
- Instead of writing to a new MemoryStream to hold our formatted output, we write directly to the Response.Body stream.
- We consume our IAsyncEnumerable in an await foreach, writing each row directly to the CsvWriter and making full use of the async power we've gained.
- We manually set the Response properties that were previously handled by return File(...), such as StatusCode, ContentType, etc.
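Both snippets assume a ReportRow model that Dapper maps from the Users table's columns by name; a minimal sketch, with property names invented for illustration:
// Hypothetical shape of the rows being exported.
public class ReportRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime CreatedAt { get; set; }
}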
Why?
With the above changes...
- Our execution continues throughout the code, not blocking on any one particular line. Ultimately, the final return new EmptyResult() executes very quickly (~20ms after Controller invocation) while the file continues to be streamed to completion afterwards.
- We don't allocate and re-allocate different collections or memory streams; there's no wasted memory here. As an added side-effect, we're never allocating the entire data set at once. This allows us to generate and return files of practically limitless size!
How?
Under the hood, this makes use of the chunked Transfer-Encoding header, which is essentially streaming: the response omits the Content-Length header and instead streams all the data, chunk by chunk, to completion.
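A quick way to see this in action is to inspect the response headers from a client. A sketch, assuming the hypothetical endpoint above is hosted locally:
using System.Net.Http;

using var client = new HttpClient();
// ResponseHeadersRead returns as soon as the headers arrive,
// without buffering the (potentially huge) body.
using var response = await client.GetAsync(
    "https://localhost:5001/reports/users", // hypothetical URL
    HttpCompletionOption.ResponseHeadersRead);
Console.WriteLine(response.Headers.TransferEncodingChunked); // True
Console.WriteLine(response.Content.Headers.ContentLength);   // empty - no Content-Length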