Bucketizing – A simple approach for solving hidden memory issues

By Ori Pearl, updated June 4, 2025

Sometimes, seemingly simple loops may hide memory consumption bugs. Let’s look at the following C# code snippet that’s responsible for doing maintenance on a list of users.

    long[] userIds = GetUserIdsForMaintenance();
    using (DbContext dbContext = new DbContext())
    {
        foreach (long id in userIds)
        {
            User user = dbContext.GetUser(id);
            // ... Do maintenance on user ...
        }
    }

As implied, each dbContext.GetUser(id) call issues a DB query that fetches a User. Many popular O/R mapping frameworks, such as Entity Framework or NHibernate, cache the entities they fetch from the DB, so in our example every fetched User may stay in the framework's first-level cache for as long as the context is alive (more about first-level caching: Entity Framework, NHibernate).

When our userIds list is very long, this cache can quickly fill up to a point where we run out of memory and receive an OutOfMemoryException.

How Bucketizing can help with memory issues

One way to avoid these memory issues without turning off the caching feature is to periodically clear the cache before it fills up.

An easy way to do that is to split our userIds into buckets and initialize a new DbContext instance for each bucket:

    IEnumerable<long> userIds = GetUserIdsForMaintenance();
    foreach (IEnumerable<long> idBucket in userIds.Bucketize(5000))
    {
        using (DbContext dbContext = new DbContext())
        {
            foreach (long id in idBucket)
            {
                User user = dbContext.GetUser(id);
                // ... Do maintenance on user ...
            }
        }
    }

What we see here is a new extension method called Bucketize that splits the long userId list into buckets, each containing 5,000 IDs.

When handling each bucket, we create a new DbContext instance. Once the using block for a bucket ends, nothing references the old DbContext any more, so the garbage collector can collect the entire object, cache included, and free its memory.
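To make the effect concrete, here is a minimal sketch that compares the peak number of cached entities with one long-lived context against one context per bucket. The FakeDbContext and User types are hypothetical stand-ins for a real O/R mapper (they only imitate a first-level cache with a dictionary), and Enumerable.Chunk (available in .NET 6 or later) stands in for Bucketize so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; a real one would map to a DB row.
public sealed class User
{
    public long Id { get; set; }
}

// Hypothetical stand-in for an O/R mapper context: every fetched entity
// stays in an in-memory dictionary until the context is disposed.
public sealed class FakeDbContext : IDisposable
{
    private readonly Dictionary<long, User> _cache = new Dictionary<long, User>();

    public int CachedCount => _cache.Count;

    public User GetUser(long id)
    {
        if (!_cache.TryGetValue(id, out var user))
        {
            user = new User { Id = id }; // pretend this came from the DB
            _cache[id] = user;
        }
        return user;
    }

    public void Dispose() => _cache.Clear();
}

public static class CacheDemo
{
    // One context for all IDs: the cache ends up holding every user.
    public static int PeakWithSingleContext(IEnumerable<long> userIds)
    {
        int peak = 0;
        using (var dbContext = new FakeDbContext())
        {
            foreach (long id in userIds)
            {
                dbContext.GetUser(id);
                peak = Math.Max(peak, dbContext.CachedCount);
            }
        }
        return peak;
    }

    // One context per bucket: the cache never holds more than bucketSize users.
    public static int PeakWithBuckets(IEnumerable<long> userIds, int bucketSize)
    {
        int peak = 0;
        foreach (long[] idBucket in userIds.Chunk(bucketSize))
        {
            using (var dbContext = new FakeDbContext())
            {
                foreach (long id in idBucket)
                {
                    dbContext.GetUser(id);
                    peak = Math.Max(peak, dbContext.CachedCount);
                }
            }
        }
        return peak;
    }

    public static void Main()
    {
        long[] userIds = Enumerable.Range(1, 20000).Select(i => (long)i).ToArray();
        Console.WriteLine(PeakWithSingleContext(userIds)); // 20000
        Console.WriteLine(PeakWithBuckets(userIds, 5000)); // 5000
    }
}
```

In this toy model the bucket size directly bounds the peak cache size. With a real framework the bound is less exact, since related entities may be loaded and cached as well, but the principle is the same.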

What does the Bucketize code look like?

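    using System.Collections.Generic;

    // Extension methods must live in a static class; the class name here is arbitrary.
    public static class EnumerableExtensions
    {
        // Lazily splits vals into consecutive lists of at most bucketSize elements.
        public static IEnumerable<IEnumerable<T>> Bucketize<T>(this IEnumerable<T> vals, int bucketSize)
        {
            var currentList = new List<T>();
            foreach (var element in vals)
            {
                if (currentList.Count == bucketSize)
                {
                    yield return currentList;
                    currentList = new List<T>();
                }
                currentList.Add(element);
            }
            // Emit the final, possibly partial bucket; skip it when the source was empty.
            if (currentList.Count > 0)
            {
                yield return currentList;
            }
        }
    }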

As you can see, Bucketize is an extension method on IEnumerable<T> that uses the yield keyword to produce each bucket only when it is requested, rather than iterating over the entire collection up front.
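The lazy behaviour can be checked with a small sketch. The CountingSource helper and the Pulled counter are hypothetical and exist only to record how many elements have been read from the source; the Bucketize body repeats the logic described above so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BucketizeDemo
{
    // Same logic as the article's Bucketize, repeated to keep this snippet self-contained.
    public static IEnumerable<IEnumerable<T>> Bucketize<T>(this IEnumerable<T> vals, int bucketSize)
    {
        var currentList = new List<T>();
        foreach (var element in vals)
        {
            if (currentList.Count == bucketSize)
            {
                yield return currentList;
                currentList = new List<T>();
            }
            currentList.Add(element);
        }
        if (currentList.Count > 0)
        {
            yield return currentList;
        }
    }

    // Hypothetical counter: how many elements have been pulled from the source so far.
    public static int Pulled;

    // Hypothetical source that yields IDs 1..count and counts each one as it is read.
    public static IEnumerable<long> CountingSource(int count)
    {
        for (long id = 1; id <= count; id++)
        {
            Pulled++;
            yield return id;
        }
    }

    public static void Main()
    {
        Pulled = 0;
        // Take only the first bucket; the rest of the source is never enumerated.
        List<long> first = CountingSource(1000).Bucketize(3).First().ToList();
        Console.WriteLine(string.Join(",", first)); // 1,2,3
        Console.WriteLine(Pulled);                   // 4
    }
}
```

Only four of the 1,000 elements are read: three to fill the first bucket, plus one more because a full bucket is handed out when the next element arrives.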

“Bucketizing” large data collections can help us overcome memory issues that are sometimes hidden behind seemingly simple-looking loops.
