How introducing a state in Azure Functions punches you in the face

This post is a longer explanation of an issue I reported a few days ago. To make a long story short - I was trying to fix a bug in my function, where almost every call resulted in an error logged to Application Insights. The exception was a simple "Collection was modified; enumeration operation may not execute". I knew what was failing and where, but I couldn't work out why. I'll try to explain it a little and, while I'm at it, tell you why state in Azure Functions is a bad, bad thing.

As you can see, each call of my function was polluted by an exception

The background

As you probably know (and if not, you'll get it in a second), before an Azure Function can actually be triggered, it has to be initialized. There's a ScriptHostManager class, which is created at the very beginning of the first call to your function - it's responsible for handling the whole host. Once initialized, it waits and listens for one of three things:

  • incoming calls (triggers)
  • a signal for restart
  • an exception (which, by the way, will cause a restart)

The host is kept alive for a fixed amount of time and, if nothing triggers it during that time, it is shut down. Of course it can be scaled out if needed, but that's not relevant here.

The host is also responsible for digesting all the metadata and descriptions related to your functions. It will read the function.json file to get information about bindings, find an entry point and validate it against a configuration file. All these operations happen only once, when the host is created, and once it knows everything it needs, it can invoke your function again and again. However, this is where a problem can occur.

The habits

Let's consider the following function:

using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

public class FooFunction
{
	public static void Run(string myQueueItem,
		ICollector<FooTable> fooTable,
		TraceWriter log)
	{
		// Do some processing...
		fooTable.Add(new FooTable());
	}
}

So far so good - nothing unexpected happens. In fact, this example will work just fine. But let's say we'd like to refactor it a little and extract the logic responsible for processing an item:

using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

public class FooFunction
{
	// Shared by every invocation running on this host - this is the state.
	private static ICollector<FooTable> _fooTable;

	public static void Run(string myQueueItem,
		ICollector<FooTable> fooTable,
		TraceWriter log)
	{
		_fooTable = fooTable;

		Proc(myQueueItem);
	}

	private static void Proc(string myQueueItem)
	{
		// Do some processing...

		_fooTable.Add(new FooTable());
	}
}

At first glance there's nothing wrong with this design. However, we've just introduced a kind of state, which will last as long as the host running our function is alive. It could still work, though, and we can live unaware of this flaw. This is how I introduced this bug into my system - I was refactoring my functions, as always with a little help from ReSharper, and at some point just moved the ICollector<T> parameter to a static field.

The problem

As I mentioned, initially you can live with state in your function and not even see any problems. If you're using e.g. TimerTrigger, it will most likely work - you need just one instance of the function, called at a specific interval. However, what about triggering a function for each queue item? For each HTTP request? Or for each event in Service Bus? In those scenarios your function will be called concurrently and all calls will access your static field simultaneously, overwriting the collector stored by other calls and adding to a collection that another call is still using. Sooner or later you'll end up with a pretty "Collection was modified; enumeration operation may not execute" everywhere in your logs. This is also why the problem won't affect you at the beginning - if traffic is low enough, the function isn't triggered fast enough to actually make this exception happen.
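To make the failure mode a bit more tangible, here is a minimal, deterministic sketch of it. It doesn't use the real Functions host at all: BufferingCollector is a hypothetical stand-in I made up for the demo, and the idea that a collector buffers items in a list and enumerates them when flushing is my assumption, not something taken from the host's source code. The interleaving of two invocations is simulated by hand instead of with real threads, so the result is the same on every run.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the entity type used in the examples above.
public class FooTable { }

// Hypothetical collector: buffers items and enumerates them on Flush.
// (Assumption for the demo, not the real binding implementation.)
public class BufferingCollector
{
    private readonly List<FooTable> _items = new List<FooTable>();

    public int Count => _items.Count;

    public void Add(FooTable item) => _items.Add(item);

    // onItem lets the demo inject "another invocation" in the middle of a flush.
    public void Flush(Action<FooTable> onItem)
    {
        foreach (var item in _items) onItem(item);
    }
}

public static class SharedStateDemo
{
    // The problematic state: a single field shared by every invocation.
    private static BufferingCollector _fooTable;

    // Invocation A is overtaken by invocation B before it reaches Proc().
    public static (int A, int B) OverwriteScenario()
    {
        var collectorA = new BufferingCollector();
        var collectorB = new BufferingCollector();

        _fooTable = collectorA;         // A: Run() stores its collector
        _fooTable = collectorB;         // B: Run() overwrites the field
        _fooTable.Add(new FooTable());  // A: Proc() writes into B's collector
        _fooTable.Add(new FooTable());  // B: Proc()

        return (collectorA.Count, collectorB.Count);
    }

    // A late invocation adds through the static field while a flush is enumerating.
    public static string FlushScenario()
    {
        var collector = new BufferingCollector();
        _fooTable = collector;
        _fooTable.Add(new FooTable());

        try
        {
            collector.Flush(_ => _fooTable.Add(new FooTable()));
            return "no error";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }
}
```

In OverwriteScenario() the item produced by invocation A lands in the collector that belongs to invocation B, so A's collector stays empty. In FlushScenario() the list is modified while it is being enumerated, which is exactly the situation in which List<T> throws an InvalidOperationException.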

Please keep in mind that the Azure Functions documentation makes it rather clear that state should be avoided, so if you introduce it, well... you deserve to be punished :)

The solution

The solution to this problem is pretty simple - just pass the inputs of your entry point as parameters to the other methods in your function. This will help keep your logs clean (an interesting fact is that the Monitor tab of your function won't show this problem - each call will be marked as successful while the error count keeps growing!) and save you from other potential problems related to sharing state within a function.
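Applied to the FooFunction example, the fix could look more or less like the sketch below. To keep it compilable on its own, I declare a local ICollector<T> stand-in that mirrors the Add method of the real Microsoft.Azure.WebJobs.ICollector<T>, and I leave out the TraceWriter parameter. ListCollector<T> is a hypothetical test double, not part of any SDK.

```csharp
using System.Collections.Generic;

// Stand-in for the entity type used in the examples above.
public class FooTable { }

// Local stand-in mirroring Microsoft.Azure.WebJobs.ICollector<T>,
// declared here only so the sketch compiles without the SDK.
public interface ICollector<T>
{
    void Add(T item);
}

// Hypothetical test double that records what was added.
public class ListCollector<T> : ICollector<T>
{
    public List<T> Items { get; } = new List<T>();

    public void Add(T item) => Items.Add(item);
}

public static class FooFunction
{
    public static void Run(string myQueueItem, ICollector<FooTable> fooTable)
    {
        // No static field: the collector travels with the call.
        Proc(myQueueItem, fooTable);
    }

    private static void Proc(string myQueueItem, ICollector<FooTable> fooTable)
    {
        // Do some processing...
        fooTable.Add(new FooTable());
    }
}
```

Because every invocation only touches the collector it received as an argument, two calls running side by side can no longer see each other's output.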

 

Migrating schema and data in Azure Table Storage

Recently I faced a problem where I had to change and adjust the schema of tables stored in Azure Table Storage. The challenge was to automate the changes so I wouldn't have to perform them manually on each environment. This is why I created a simple library called AzureTableStorageMigrator, which helps with such tasks and eases the whole process.

The basics

The basic idea was to create two things:

  • a simple fluent API, which will take care of chaining all tasks
  • a table which will hold all migration metadata

The current version (1.0) gives you the following operations:

  • void Insert<T>(string tableName, T entity, bool createIfNotExists = false)
  • void DeleteTable(string tableName)
  • void CreateTable(string tableName)
  • void RenameTable<T>(string originTable, string destinationTable)
  • void Delete<T>(string tableName, T entity)
  • void Delete(string tableName, string partitionKey)
  • void Delete(string tableName, string partitionKey, string rowKey)
  • void Clear(string tableName)

and when you take a look at the example of usage:

var migrator = new Migrator();
migrator.CreateMigration(_ =>
{
  _.CreateTable("table1");
  _.CreateTable("table2");
  _.Insert("table1", new DummyEntity { PartitionKey = "pk", RowKey = DateTime.UtcNow.Ticks.ToString(), Name = "foo"});
  _.Insert("table1", new DummyEntity { PartitionKey = "pk", RowKey = DateTime.UtcNow.Ticks.ToString(), Name = "foo2"});
  _.Insert("table2", new DummyEntity { PartitionKey = "pk", RowKey = DateTime.UtcNow.Ticks.ToString(), Name = "foo"});
}, 1, "1.1", "My first migration!");

you'll see that it's pretty straightforward and self-describing.

The way it works is very simple - each CreateMigration() call is described by three values: its id, version number and description. Each time this method is called, it adds a new record to the versionData table, so that the metadata is saved and the same migration won't be run twice.
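The run-once bookkeeping described above can be pictured with a small in-memory model. This is only an illustration of the idea, not ATSM's actual implementation: InMemoryMigrator and MigrationRecord are hypothetical names, the dictionary merely plays the role of the versionData table, and identifying a migration by its id alone is my assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical metadata row; the real versionData schema may differ.
public class MigrationRecord
{
    public int Id { get; set; }
    public string Version { get; set; }
    public string Description { get; set; }
}

// Hypothetical in-memory model of "run each migration only once".
public class InMemoryMigrator
{
    // Plays the role of the versionData table, keyed by migration id (assumption).
    private readonly Dictionary<int, MigrationRecord> _versionData =
        new Dictionary<int, MigrationRecord>();

    public int RecordedCount => _versionData.Count;

    // Returns true if the migration ran, false if it was skipped.
    public bool CreateMigration(Action migration, int id, string version, string description)
    {
        if (_versionData.ContainsKey(id))
        {
            return false; // already recorded, so don't run it again
        }

        migration();

        _versionData[id] = new MigrationRecord
        {
            Id = id,
            Version = version,
            Description = description
        };
        return true;
    }
}
```

Calling CreateMigration() twice with the same id runs the migration body only the first time. The second call finds the existing record and skips the work.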

Why should I use it?

In fact it's not a matter of what you "should" do but rather what is "good" for your project. Versioning is generally a good idea, especially if you follow a CI/CD approach, where the goal is to deploy and roll back with ease. If you perform migrations by hand, you'll eventually face a situation where a rollback is either very time-consuming or almost impossible.

It's good to remember that making your database a part of your repository (in terms of storing the schema, not the data, of course) is considered a good practice and is an important part of many modern projects.

What's next?

I published ATSM because I couldn't find a similar tool that would help me version tables in Table Storage easily. Some new features will surely be added in the future. If you find this project interesting, feel free to post an issue or a request - I'll be more than happy to discuss it.