Mad, Beautiful Ideas
Runtime-Built LINQ Clauses Building Expression Trees

Microsoft's Language Integrated Queries (LINQ) introduced in .NET is an amazing tool, however, there are several types of requests that can be...difficult to build in the default engines. The problem I encountered was where I had a string of data which was being treated as an array of single-character codes that I would want to query for a subset. The data in my database could appear as 'ABCDE', and I'd want to match it if it contains A or D, or if it contains neither.

Schedules of Classes Footnote Selection

This implementation is being used on Washington State University's Schedules of Classes Search (pictured above), to handle the footnote search listed near the bottom of the search form. My constraints are as follows:

  1. Footnotes is stored in the database as a CHAR(5), but is to be treated more as an Array than a String
  2. I want the records that match either ANY of the user selected options, or NONE of the user selected options, depending on whether or not they choose the 'exclude footnotes' option
  3. This should be convertable to SQL and run on the SQL Server

Point 3 is important. Using LINQ-to-SQL I had the option to pull back more records from the database and do the additional filtering on the client, but in circumstances where a user was only searching by footnotes, this would result in an enormous delay as we transfer back way too much data from the database to be filtered in a much slower .NET layer. SQL is designed for data searching, and we should let it do the work.

My initial implementation worked in the .NET layer, since I wanted something that 'just worked' and looked something like this:

        foreach (char c in footnotes) {
            queryLocal = queryLocal.Where((s => s.Footnotes.IndexOf(c) != 1) == !excludeFootnotes);
        }
        

Unfortunately, this didn't even work. For one, it was slow, especially when footnotes were the only search term. For another, it only represents AND relationships, when my requirement was for OR relationships. Oh, and the s.Footnotes.IndexOf thing is because LINQ-to-SQL can't translate the Contains method, but this is a minor issue.

For many people, unfortunately, this probably would probably be a non-starter. However, what LINQ does internally is convert the Lambda expression you provide it, into an Expression Tree, which allows the C# (or VB.NET) code to be translated to SQL. With that in mind, what's stopping you from building your own custom expression tree? Nothing...except the willingness to step just a little ways down the rabbit hole.

Let's first build the function we want to add to our WHERE clause out using the following prose, which does not include the 'exclusion' flag from the requirements:

LINQ Expression Tree Steps Example

For each character in the user input, we want to test that character to see if it is contained in a string of input from the database. If any character in the user input is found in the database input, the row should be returned. For user input 'BD', the following boolean expression should be generated: ('B' in databaseInput) || ('D' in databaseInput).

If the exclusion flag is set, the expression will be: ('B' not in databaseInput) && ('D' not in databaseInput), which can be more easily represented by the homomorphism: false && ('B' in databaseInput || 'D' in databaseInput). This homomorphism is important, because now we have two forms of the same boolean expression. These can essentially be represent as follows: !exclude && ('B' in databaseInput || 'D' in databaseInput), which allows me to build a single expression regardless of the value of my exclusion flag.

This can not be directly plugged in LINQ because I don't even know the length of the user input at compile time. But let's start with an input of 1, and look what it takes to build that simple conditional. First, some book-keeping.

        var param = Expression.Parameter(typeof(Database.SectionInfo), "sectionInfo");
        var indexOfMethod = typeof(string).GetMethod("IndexOf", newType[] { typeof(char) });
        

Now, the boolean expressions we have above are useful, but what we're really getting ready to do is build an expression tree. Let's begin with a single footnote conditional and see what that would look like. As a boolean expression, we have ('B' in dataBaseinput). In C#/LINQ-to-SQL, this will look like si.Footnotes.indexOf('B') != -1

This gets me a parameter I can use which represents the SectionInfo table in my database that I will be querying, as well as a reference to the IndexOf method that I require for my test. You'll need an Expression.Parameter for each argument you'll need to add at runtime, and a Reflective reference to any methods. There is no way around this. Next, let's build the first request. Keep in mind that Expressions can only operate on other Expressions, so we need to use the System.Linq.Expressions.Expression factory methods to generate our expressions.

        // Step 1. Lookup the Footnote Properties
        // The param value is the one declared in the 'book-keeping' section above.
        var footnotesProperty = Expression.Property(param, "Footnotes");
        
        // Step 2. Call IndexOfMethod on Property
        // indexOfMethod was also declared above
        // Expression.Constant converts a static value into an Expression
        var methodCall = Expression.Call(footnotesProperty, indexOfMethod, Expression.Constant(inputCharacter));
        
        // Step 3. Test return of method
        Expression.NotEqual(methodCall, Expression.Constant(-1));
        
        // All together now:
        Expression.NotEqual(
                Expression.Constant(-1),
                Expression.Call(
                        Expression.Property(param, "Footnotes"),
                        indexOfMethod,
                        Expression.Constant(inputCharacter)));
        

Next, we want to expand this to include a list of elements, which can be wrapped easily in a ForEach loop:

        var optionsString = "BDF";
        // Build the first request
        var builtExpression = 
                Expression.NotEqual(
                        Expression.Constant(-1),
                        Expression.Call(
                                Expression.Property(param, "Footnotes"),
                                indexOfMethod,
                                Expression.Constant(optionsString[0])));
        
        // Build the subsequent OR parts
        foreach(char option in optionsString.substring(1)) {
            builtExpression = Expression.Or(
                    builtExpression,
                    Expression.NotEqual(
                            Expression.Constant(-1),
                            Expression.Call(
                                    Expression.Property(param, "Footnotes"),
                                    indexOfMethod,
                                    Expression.Constant(optionsString[0]))));
        }
        

Of course, this should be rewritten according the DRY-principle:

        private Expression checkForFootnote(ParameterExpression param, char footnoteCode) {
            static MethodInfo indexOfMethod = typeof(string).GetMethod("IndexOf", newType[] { typeof(char) });
        
            return Expression.NotEqual(
                                Expression.Constant(-1),
                                Expression.Call(
                                        Expression.Property(param, "Footnotes"),
                                        indexOfMethod,
                                        Expression.Constant(footnoteCode)));
        }
        
        var optionsString = "BDF";
        // Build the first request
        var builtExpression = checkForFootnote(param, optionsString[0]);
        
        // Build the subsequent OR parts
        foreach(char option in optionsString.substring(1)) {
            builtExpression = Expression.Or(
                    builtExpression,
                    checkForFootnote(param, option);
        }
        

This gets us 90% of the way to where I need to be. The only thing we're missing is the exclude option. In my implementation, exclude is equal to true when I want it active, and false otherwise, which means it must be negated in order to meet the requirement above. Remember in this case, the boolean expression we want is: !exclude && (footnoteCheck) where footnoteCheck is the expression constructed above.

        builtExpression = Expression.And(
                builtExpression,
                Expression.Not(Expression.Constant(exclude)));
        

At this point, we've built the entire expression in a way that can be sent to LINQ for conversion into dynamically generated SQL. After this was done, I saw a dramatic increase in speed on footnote-only searches via our search page as I'm letting my SQL Server do the work it was designed to do. More than that, I've been able to deconstruct a method to show that nearly ANY data processing problem can be solved in LINQ, only requiring a bit more thought.