How one line of code improved performance over 900x

Over time, some of my export logic started taking longer and longer to perform. At its worst, on large sets of data, it was taking 15 minutes to perform the entire export. Obviously, this isn’t ideal, so I began investigating.

The project is developed in C#/.NET and uses Entity Framework (EF) as an object-relational mapper. Entity Framework can be amazing to get applications up and running quickly… But, as the object models and database get more complicated, the effort grows to keep logic performing well.

In this case, I’m dealing with about a hundred top-level entities, which have several thousand child entities, and perhaps another thousand grand-child entities.

The export logic is complex and performs loops many times on the [grand]child entities. I did not want EF to hit the database on every iteration, because that could mean tens of thousands of single hits to the database. To avoid this, I followed some online guidance and pre-loaded the entities like so:

this.objects = rep.Objects.Where(y => ids.Contains(y.Id))
  .Include(y => y.ChildObject)
  .Include(y => y.ChildObjects.Select(x => x.GrandChild1))
  .Include(y => y.ChildObjects.Select(x => x.GrandChild2))
  .Include(y => y.ChildObjects.Select(x => x.GrandChildrenList))
  .AsEnumerable();

It took me about three hours of debugging nested logic (with SQL Server Profiler running on a separate screen) before I started realizing that the database was being hit to ask for data about my top-level object, which I had thought I was pre-loading.

I took another look at the above section of code and decided to double-check that it is doing what I thought it was doing, and swapped out .AsEnumerable() with .ToList(). I didn’t notice any dramatic difference (not right away, at least) because I was deep down in the logic trying to figure out why it was asking for information about my top-level entity on the one single line of code. But, I did notice that my nested logic was no longer asking for top-level entity information from the database.

Wanting to see the impact of that minor change on the overall export, I ran the export from end-to-end without debugging any further and found that my 15 minute wait time dropped to less than 10 seconds.

It seems some online information led me astray, believing that .AsEnumerable() would force EF to pre-fetch all the data in my query. Instead, I am now relying on .ToList() to do this. It would be nice if EF came out with an easy-to-use extension method .PreFetch() (something more specifically named to do what I want).

1 thought on “How one line of code improved performance over 900x”

Leave a Reply

Your email address will not be published.

Humanity Verification *Captcha loading...