February 2011 Archives

Adding a Column to a YUI3 DataTable

In December, I wrote instructions on how to add additional columns to a YUI2 DataTable while using YUI2in3. Since then, YUI 3.3.0 was released, and with it the first beta release of YUI3’s DataTable. Recently, I decided to take some time to upgrade my implementation to use the new DataTable. Knowing that my needs were relatively simple and straightforward, I figured it would be a good opportunity to test the new API and make it fit my needs.

The following implementation follows my YUI2 implementation exactly. It takes an existing HTML table, converts it to a DataSource, and then plugs that into a new DataTable instance. YUI3, with its plugin architecture, makes certain aspects of this easier than it was in YUI2. In spite of that, it’s clear that the YUI3 DataTable API is still pretty fresh and probably needs some work. This post is to serve both as documentation for the current state of YUI3 DataTable and as an examination of where improvements could be made to the API.

Let’s start with the module list:

Y.use('datatable', 'datasource-xmlschema', 'datasource-local', function(Y) {

To begin, I’m not going to be using any of the datatable plugins, like datatable-sort, but I will address them later. First, we need to set up the DataSource. In YUI2, there was a datasource type for HTML tables, which would parse the columns out in order. Currently, no such datasource exists for YUI3, but it can be easily mapped to an XML Schema:

var dataSource = new Y.DataSource.Local({
    source: Y.Node.getDOMNode(Y.one('#tableId tbody')),
    plugins: [
        {
            fn: Y.Plugin.DataSourceXMLSchema,
            cfg: {
                schema: {
                    resultListLocator: "tr",
                    resultFields: [
                        { key: "abbr", locator: "td[1]" },
                        { key: "name", locator: "td[2]" },
                        { key: "loc", locator: "td[3]" },
                        { key: "loc_href", locator: "td[3]/a/@href" }
                    ]
                }
            }
        }
    ]
});
Y.one('#tableId').remove();

This will remove the table from the DOM, while still keeping it in memory for manipulation, which will be important shortly. This code is taken almost directly from the YUI3 examples for DataSource, but it also marks one of the first places I ran into a caveat with YUI3 DataSource.

In my data, the third column was optionally a link. In the YUI2 DataSource for HTML tables, the column was read as essentially the innerHTML of that third column. However, the way DataSourceXMLSchema is currently implemented, it will always take the textContent of the Node before it takes the XML representation. As such, I had to grab the href attribute off the link (if it exists), which I’ll be able to use with a custom formatter next, when we build the DataTable.

var table = new Y.DataTable.Base({
    columnset: [
        { key: "abbr", label: "Abbreviation", sortable: true },
        { key: "name", label: "Building Name", sortable: true },
        { key: "loc", label: "Location", sortable: true, formatter: function (obj) {
                var data = obj.data;
                if (data.loc_href) {
                    return Y.Lang.sub('<a href="{loc_href}">{loc}</a>', data);
                } else {
                    return obj.value;
                }
            }
        }
    ],
    plugins: [
        { fn: Y.Plugin.DataTableDataSource, cfg: { datasource: dataSource } }
    ]
}).render('#datatable');
table.datasource.load();

This is pretty straightforward, and most of the definition should be familiar to anyone who has used YUI2’s DataTable. The only thing of note is the formatter function on the columnset definition list. It is able to take advantage of the fact that any unmatched fields in the datasource will be undefined, allowing me to use that as a condition for selective formatting. In an implementation of DataSourceHTMLTableSchema, which I may do for the Gallery, the value will be the innerHTML, and not its text.

So far, so good. The API differs only in its use of plugins, aside from the fact that DataSource clearly prefers to deal with data returned from IO calls instead of HTML, which is a fairly minor inconvenience in my case. The work done here by Tilo Mitra and Jenny Han Donnelly was, up to this point, excellent. However, there are some caveats to come in adding the additional column.

The first step to adding the column is to add it to the datasource. The source can be accessed through table.datasource.get('datasource').get('source'). This returns the TBODY DOM Node, which is the other problem with the DataSourceXMLSchema method that I’m using. I’ve broken the YUI3 abstraction by being handed a raw DOM Node, something which doesn’t seem consistent with the rest of the library. Again, this is something a DataSchema designed for HTML tables would be able to handle more appropriately.

var schema = table.datasource.get('datasource').schema.get('schema');
schema.resultFields.push({ key: "new", locator: "td[4]" });
table.datasource.get('datasource').schema.set('schema', schema);

Y.Array.each(table.datasource.get('datasource').get('source').rows, function (row) {
    var node = new Y.Node(row);
    node.append("<td>New Column!</td>");
});

Aside from the brief cognitive dissonance of getting a raw DOM Node, and the verbosity of getting properties off of nested plugins, this is not a difficult process. Actually adding that column to the DataTable is where things get more difficult. In YUI2, it was as simple as calling table.insertColumn() with a new column definition. In YUI3, doing this required me to read a lot of the internals of DataTable, which was not completely pleasant. But let’s start with the code.

var columnset = table.get('columnset'), columns = columnset.get('definitions');
columns.push({ key: "new", label: "This is a new column", sortable: true});
columnset.set('definitions', columns);
columnset.initializer();
table.set('columnset', columnset);
table.datasource.load();

Looking at it now, it’s a bit anti-climactic. It isn’t that much code, but writing it took a deep understanding of the internals, so I’m going to highlight a few of the most important calls. The value pushed onto the columns array is the same as the column definition used when building the DataTable. Internally, however, YUI builds up a ColumnSet and a collection of Y.Column instances. It would, no doubt, be possible to create the Y.Column instance directly and append it to the ColumnSet, but by modifying the definitions and reinitializing the set, I don’t need to know the details of how it’s set up. Setting the columnset attribute on the table forces the table to generate the table headers for the new columns, and finally we need to reload the data.

The above code came from a desire to do as little work as possible in the backend, as adding a new column seems like it should be fairly easy. However, in truth, while I’m not creating a new ColumnSet object, I’m doing all the work except creating a new object. I could just as easily take the definitions array and pass it into table.set('columnset', columns). This does a bit more work, but probably not much, and saves a couple of lines of code in my source, looking perhaps a bit simpler.

var columns = table.get('columnset').get('definitions');
columns.push({ key: "new", label: "This is a new column", sortable: true});
table.set('columnset', columns);
table.datasource.load();

Either way, this is not an ideal API. To start, an ‘addColumn’ or ‘insertColumn’ method is required, either on the DataTable or on the ColumnSet. Putting it on the ColumnSet makes a lot of sense, but that would still require calling set('columnset', ...) on the DataTable to ensure that the DataTable was updated with the new column definitions; alternatively, the ColumnSet should fire a change event that the DataTable can respond to.
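Until such a method exists, the shorter sequence above could be wrapped in a small helper. This is only a sketch: addColumn is a hypothetical function name, not part of YUI, and it assumes the table has the DataTableDataSource plugin available as table.datasource, as in the snippets above.

```javascript
// Hypothetical helper (not part of YUI): append one column definition to a
// DataTable and refresh it, using only the calls shown in this post.
function addColumn(table, definition) {
    // Copy the definitions so the existing array is not mutated in place.
    var columns = table.get('columnset').get('definitions').concat([definition]);

    // Hand the plain definitions array back to the table, as described above.
    table.set('columnset', columns);

    // Re-run the DataSource so the new field shows up in the rows.
    table.datasource.load();
    return table;
}
```

Usage would look like addColumn(table, { key: "new", label: "This is a new column", sortable: true }). The helper hides the three-step dance, but it does nothing about the Recordset replacement problem described next.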

There is, however, one potentially big issue with DataTable in its current implementation. When table.datasource.load() is called, it creates a new RecordSet object, which holds the actual underlying data for the DataTable. This is, in part, because DataSource is mostly stateless. It’s designed to be pointed at a collection of data, and to return that collection to a callback function. RecordSet actually contains that data. The problem with the current implementation of the DataSource plugin for DataTable is that it completely replaces the RecordSet in use, which can actually break other plugins on DataTable.

For instance, I was using the DataTableSort plugin to allow my headers to be clickable to sort by any column. This plugin works by augmenting the DataTable’s Recordset with the RecordsetSort plugin to help with its implementation. However, since DataSource replaces the Recordset instead of modifying its data (even if that were to be by emptying it and reloading it), every call to table.datasource.load() must be followed by replugging RecordsetSort into the Recordset, which is not expected behaviour. I also found DataTableSort to be painfully slow, but that might be an implementation detail on my end; I haven’t determined that yet.

YUI3 DataTable doesn’t support editing yet. Row Clicks are not explicitly supported (though easily delegated). In short, if you just need to display data and want to allow some interaction, such as sorting and filtering, then YUI3 DataTable will probably meet your needs, but it has a ways to go, and the API is liable to change going forward. Still, it’s a solid core to work from, and I look forward to seeing it moving forward, especially if these API issues I ran across can be fixed.

Book Review: RESTful Web Services Cookbook

O’Reilly Media is really fond of the ‘Cookbook’ format of technical guide. I’m guessing it sells well for them, and I’m glad because I happen to enjoy it. They’re good for reference for solving specific kinds of problems, and if you’re like me, they provide a solid ‘learn-by-example’ mechanism to learn how to work with a new technology or technique. It’s not ideal for everything, but it’s good. I’ve had web-service stuff hanging in the back of my head for a while, so I decided to pull the “RESTful Web Services Cookbook” by Subbu Allamaraju [1] to stimulate my thinking on the matter.

And it worked well. This particular cookbook is of the ‘Patterns’ variety, as there is absolutely zero code anywhere in the book (XML and HTTP headers don’t count). This never really feels like a weakness, since HTTP and XML (and even JSON) are all well supported across virtually every programming environment you could imagine.

While these cookbooks don’t need to be read from cover to cover, and this one is no exception, I found it comfortable to read that way. The chapters are organized such that they start with the simplest things that you need to know to work in REST, and then slowly layer on more detail and nuance that help to make a really stellar web service. I suppose that you could be building a service while reading this book, though I found it short enough that I’d suggest reading at least any potentially relevant recipes while in the design phase, since I think some recipes are easier to integrate into an existing design than others. Depending on your tolerance for changing the API, or how easily you can keep multiple version endpoints, this may be more or less of a problem for you.

I started reading this very shortly after I had put together an API for Washington State University Schedule data [2], and I found this really inspiring. Our heaviest API user is using a few hundred megabytes per day, and the Queries chapter pointed me toward a way to reduce that usage dramatically by giving them more querying options. That incidentally also supports my plans for a mobile site that can be completely API driven in the near future, and saves me from creating a ton of specialized endpoints when query parameters just make more sense. A lot of this book isn’t relevant to what I’m doing at this moment, since I’m working with GET-only resources right now, but the more I play around with REST, the more I like it as a mechanism for web services. There is an elegance there that I don’t see in other web-based API standards, and it’s not really even a standard on its own; it just leverages other existing standards.

I think this is a great book for anyone working in REST. I have a few others on my reading list that I need to get to, but this one has definitely pushed my thinking on this technology in a way that has me really interested in how I can best design my services. I can’t say that it’s definitive, but I was really happy with having read it, and I look forward to reading more on the topic.

  1. http://oreilly.com/catalog/9780596801694
  2. http://schedules.wsu.edu/API/

When a Keyword....isn't.

JavaScript is a beautiful language, but it has some bad parts, like any language. JavaScript’s Bad Parts can basically be broken down into two categories: those that are confusing, and those that are dangerous. This post is specifically about one of those dangerous features.

Recently, I found a subtle bug [1] in YUI where the array methods defined in NodeList would fail with edge-case values, such as 0. The bug was on a line of source reading as follows:

while (arg = arguments[i++]) {

This is a common trick, one often used in C/C++, where you can perform an assignment from the array and test for the value’s ‘truthiness’ all in a single statement. The problem is that JavaScript has a broad definition of what is ‘false’ in the language. The list of ‘false’ values, when used in conditionals, includes: boolean false, 0, '' (the empty string), NaN, null, and undefined. However, there are situations where one of these ‘false’ values would be perfectly valid.
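To make the failure mode concrete, here is a small sketch. The collectArgs function is a hypothetical stand-in for the kind of loop quoted above, not the actual YUI source.

```javascript
// The six falsy values listed above: each one is false in a conditional.
var falsy = [false, 0, '', NaN, null, undefined];
var allFalsy = falsy.every(function (value) { return !value; });

// Hypothetical stand-in for the loop in question: copy arguments until the
// assignment-as-condition yields a falsy value.
function collectArgs() {
    var out = [], i = 0, arg;
    while (arg = arguments[i++]) {
        out.push(arg);
    }
    return out;
}

var fine = collectArgs(1, 2, 3); // all three values are copied
var cut = collectArgs(0, 2);     // stops at the leading 0, so nothing is copied
```

The loop cannot tell "there are no more arguments" apart from "this argument happens to be falsy", which is exactly the distinction the fix has to restore.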

I ran into this bug when working on my gallery-node-extras [2] module, specifically the nextAll and prevAll methods, which get all the siblings preceding or following a node that match a given selector (or any valid argument to filter, which includes functions if you use my gallery-nodelist-extras [3]). In prevAll, I get all the siblings that precede the current node with the following snippet:

    var list = this.ancestor().get('children');
    list = list.slice(0, list.indexOf(this));

In my tests, this failed: the call to slice was returning all the children, because the first argument, 0, is falsy, so the while loop doesn’t execute at all. As a result, slice was called on the NodeList with no arguments, returning a shallow copy. This bug has been fixed in git master [4], but Matt Sweeney didn’t take my suggested fix.

// My fix:
while ((arg = arguments[i++]) !== undefined) {

// What Matt Committed
while (typeof(arg = arguments[i++]) !== 'undefined') {

I was a bit confused. My version was more compact, and it seemed that it should be the better solution. However, I respect Matt a lot, so I knew that there must have been something I was missing. It turns out that in JavaScript, the value of the keyword undefined can be…redefined. So, if you set undefined to something ridiculous, say the value of Pi multiplied by Avogadro’s number [5], my code would find itself in an infinite loop!
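The difference between the two checks can be shown without touching the global at all. In this sketch (compare is a hypothetical function written for illustration), a parameter named undefined shadows the real value inside the function, which is one way the identifier can end up holding something else.

```javascript
// Hypothetical demonstration: the parameter named `undefined` shadows the
// real value for everything inside this function.
function compare(undefined) {
    var missing; // never assigned, so it holds the genuine undefined value
    return {
        byIdentifier: missing !== undefined,     // compares against the argument
        byTypeof: typeof missing !== 'undefined' // ignores the shadowed name
    };
}

var result = compare(42);
// result.byIdentifier is true: the identifier check thinks a value is present
// result.byTypeof is false: typeof correctly reports that nothing is there
```

In the while loop above, the identifier-based test would therefore stay true past the end of the arguments, while the typeof test still terminates.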

So, why is undefined allowed to be variable? I have no idea. If anyone can tell me how this came about, I’d LOVE to hear it. The ‘null’ keyword can’t have its value changed, so this just seems like a ridiculous oversight. If you’re using this oddity of the language, stop. You’re a bad person. And I don’t mind saying so.

Firefox 4 has made undefined read-only, as demonstrated by the following script.

undefined = true;

console.log(undefined);
console.log(typeof(undefined));

test2 = function() {
    undefined = true;
    console.log(undefined);
}

test = function() {
    "use strict";
    console.log(undefined);
}

console.log(undefined);
test();
console.log(undefined);
test2();
console.log(undefined);

/// Outputs in Firefox 4 Beta 10
/// undefined
/// undefined
/// undefined
/// undefined
/// undefined
/// undefined
/// undefined

/// Outputs in Chrome 9.0.597.86
/// true
/// boolean
/// true
/// true
/// true
/// true
/// true

The console.log statement inside of my ‘strict’ block is still tainted by the non-strict reassignment of undefined in Chrome. Even if you get rid of the first reassignment, calling test2() above redefines it for test(). Hopefully, when Chrome supports ES5 strict mode, it will protect undefined as aggressively as Firefox 4. Incidentally, when in strict mode, trying to assign to undefined raises a TypeError due to its read-only nature.
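The strict-mode TypeError can be observed directly. This is a minimal sketch for an engine that implements ES5 strict mode and a read-only global undefined; assignInStrictMode is a hypothetical name used only here.

```javascript
// Try to overwrite the global `undefined` from strict code and report
// what kind of failure, if any, the engine produces.
function assignInStrictMode() {
    "use strict";
    try {
        undefined = true; // the global property is non-writable in ES5
        return "no error";
    } catch (e) {
        return e instanceof TypeError ? "TypeError" : "other error";
    }
}

var outcome = assignInStrictMode();
```

In non-strict code the same assignment fails silently on such an engine, which is why the Firefox 4 output above shows undefined on every line without any error.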

Why ECMAScript 5 did not promote undefined from what appears to be a variable into a protected keyword, like null is (Chrome balks at attempts to assign to null), I have no idea. Protection of undefined should not be down to setting the read-only flag on the undefined property of the window object, especially when null is already appropriately protected. This is the kind of language change that I think should have been made, because any code it may have broken isn’t just bad, it is dangerous. There are already people calling for the death of JavaScript due to such poor security decisions [6], a sentiment I don’t necessarily agree with wholesale, but I think that the language can be made better by fixing this sort of thing. Even at the risk of breaking code.

At the very least, it would have been lovely to see this addressed in ECMAScript 5’s new ‘strict mode’ [7]. Using strict mode is simple, though it’s opt-in. Basically, the first statement in your JavaScript file, or in a given function, must be the string "use strict". For YUI3, which is built on the module pattern, this allows you to enable strict mode for your module easily, since your module code is wrapped either in an add function call or a use call. Unfortunately, strict mode is only enforced at the moment in Firefox 4, which is still in beta.

Had this been enforced in strict mode, then we could be assured that undefined was actually, well, undefined in all strict-mode scripts. We could have expected a reference error if, by some accidental idiocy, we attempted to change the value of undefined. And there would have been reason to begin enforcement of ES5 strict mode through a page-level meta tag which could have set a flag requiring all code on the page be strict. That way, strict mode is still opt-in, but I can be notified if I’ve either written, or am trying to use, shit code that I should probably have some concerns with.

  1. YUI3 Ticket #2529933
  2. Gallery Node Extras Information
  3. Gallery NodeList Extras Information
  4. YUI3 Commit Record for Ticket #2529933
  5. Avogadro’s Constant Wikipedia
  6. JavaScript Security Problems Slide Deck
  7. Douglas Crockford on Strict Mode

Runtime-Built LINQ Clauses Building Expression Trees

Microsoft’s Language Integrated Query (LINQ), introduced in .NET, is an amazing tool. However, there are several types of requests that can be…difficult to build in the default engines. The problem I encountered was where I had a string of data which was being treated as an array of single-character codes that I would want to query for a subset. The data in my database could appear as ‘ABCDE’, and I’d want to match it if it contains A or D, or if it contains neither.

Schedules of Classes Footnote Selection

This implementation is being used on Washington State University’s Schedules of Classes Search (pictured above), to handle the footnote search listed near the bottom of the search form. My constraints are as follows:

  1. Footnotes is stored in the database as a CHAR(5), but is to be treated more as an Array than a String
  2. I want the records that match either ANY of the user selected options, or NONE of the user selected options, depending on whether or not they choose the ‘exclude footnotes’ option
  3. This should be convertible to SQL and run on the SQL Server

Point 3 is important. Using LINQ-to-SQL I had the option to pull back more records from the database and do the additional filtering on the client, but in circumstances where a user was only searching by footnotes, this would result in an enormous delay as we transfer back way too much data from the database to be filtered in a much slower .NET layer. SQL is designed for data searching, and we should let it do the work.

My initial implementation worked in the .NET layer, since I wanted something that ‘just worked’ and looked something like this:

foreach (char c in footnotes) {
    queryLocal = queryLocal.Where(s => (s.Footnotes.IndexOf(c) != -1) == !excludeFootnotes);
}

Unfortunately, this didn’t even work. For one, it was slow, especially when footnotes were the only search term. For another, it only represents AND relationships, when my requirement was for OR relationships. Oh, and the s.Footnotes.IndexOf thing is because LINQ-to-SQL can’t translate the Contains method, but this is a minor issue.

For many people, unfortunately, this would probably be a non-starter. However, what LINQ does internally is convert the lambda expression you provide into an Expression Tree, which allows the C# (or VB.NET) code to be translated to SQL. With that in mind, what’s stopping you from building your own custom expression tree? Nothing…except the willingness to step just a little ways down the rabbit hole.

Let’s first build the function we want to add to our WHERE clause out using the following prose, which does not include the ‘exclusion’ flag from the requirements:

LINQ Expression Tree Steps Example

For each character in the user input, we want to test that character to see if it is contained in a string of input from the database. If any character in the user input is found in the database input, the row should be returned. For user input ‘BD’, the following boolean expression should be generated: (‘B’ in databaseInput) || (‘D’ in databaseInput).

If the exclusion flag is set, the expression will be: (‘B’ not in databaseInput) && (‘D’ not in databaseInput), which by De Morgan’s law is equivalent to: !(‘B’ in databaseInput || ‘D’ in databaseInput). This equivalence is important, because both forms of the query now share the same inner expression and differ only in whether it is negated. They can be represented together as: (‘B’ in databaseInput || ‘D’ in databaseInput) == !exclude, which allows me to build a single expression regardless of the value of my exclusion flag.
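The ANY/NONE requirement is easy to check outside of LINQ. The sketch below models it in plain JavaScript purely to illustrate the boolean logic; matchesAny and keepRow are hypothetical names, and nothing here is translated to SQL.

```javascript
// Plain-JavaScript model of the footnote filter; not LINQ.
// `footnotes` stands for the database column, `selected` for the user's codes.
function matchesAny(footnotes, selected) {
    return selected.split('').some(function (code) {
        return footnotes.indexOf(code) !== -1;
    });
}

// One expression for both modes: keep a row when the ANY-match result
// equals !exclude.
function keepRow(footnotes, selected, exclude) {
    return matchesAny(footnotes, selected) === !exclude;
}

var rows = ['ABCDE', 'ACE', 'D', ''];
var included = rows.filter(function (r) { return keepRow(r, 'BD', false); });
var excluded = rows.filter(function (r) { return keepRow(r, 'BD', true); });
// included → ['ABCDE', 'D'], excluded → ['ACE', '']
```

Every row lands in exactly one of the two lists, which is the behaviour the ‘exclude footnotes’ option needs.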

That boolean expression cannot be plugged directly into LINQ, because I don’t even know the length of the user input at compile time. But let’s start with an input of length 1, and look at what it takes to build that simple conditional. First, some book-keeping.

var param = Expression.Parameter(typeof(Database.SectionInfo), "sectionInfo");
var indexOfMethod = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });

Now, the boolean expressions we have above are useful, but what we’re really getting ready to do is build an expression tree. Let’s begin with a single footnote conditional and see what that would look like. As a boolean expression, we have (‘B’ in databaseInput). In C#/LINQ-to-SQL, this will look like si.Footnotes.IndexOf('B') != -1.

This gets me a parameter I can use which represents the SectionInfo table in my database that I will be querying, as well as a reference to the IndexOf method that I require for my test. You’ll need an Expression.Parameter for each argument you’ll need to add at runtime, and a Reflective reference to any methods. There is no way around this. Next, let’s build the first request. Keep in mind that Expressions can only operate on other Expressions, so we need to use the System.Linq.Expressions.Expression factory methods to generate our expressions.

// Step 1. Lookup the Footnote Properties
// The param value is the one declared in the 'book-keeping' section above.
var footnotesProperty = Expression.Property(param, "Footnotes");

// Step 2. Call IndexOfMethod on Property
// indexOfMethod was also declared above
// Expression.Constant converts a static value into an Expression
var methodCall = Expression.Call(footnotesProperty, indexOfMethod, Expression.Constant(inputCharacter));

// Step 3. Test return of method
Expression.NotEqual(methodCall, Expression.Constant(-1));

// All together now:
Expression.NotEqual(
        Expression.Constant(-1),
        Expression.Call(
                Expression.Property(param, "Footnotes"),
                indexOfMethod,
                Expression.Constant(inputCharacter)));

Next, we want to expand this to include a list of elements, which can be wrapped easily in a foreach loop:

var optionsString = "BDF";
// Build the first request
var builtExpression = 
        Expression.NotEqual(
                Expression.Constant(-1),
                Expression.Call(
                        Expression.Property(param, "Footnotes"),
                        indexOfMethod,
                        Expression.Constant(optionsString[0])));

// Build the subsequent OR parts
foreach(char option in optionsString.Substring(1)) {
    builtExpression = Expression.Or(
            builtExpression,
            Expression.NotEqual(
                    Expression.Constant(-1),
                    Expression.Call(
                            Expression.Property(param, "Footnotes"),
                            indexOfMethod,
                            // Test the current character, not the first one again
                            Expression.Constant(option))));
}

Of course, this should be rewritten according to the DRY principle:

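// C# has no static locals, so the reflected method lives in a class-level field.
private static readonly MethodInfo indexOfMethod = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });

private Expression checkForFootnote(ParameterExpression param, char footnoteCode) {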
    return Expression.NotEqual(
                        Expression.Constant(-1),
                        Expression.Call(
                                Expression.Property(param, "Footnotes"),
                                indexOfMethod,
                                Expression.Constant(footnoteCode)));
}

var optionsString = "BDF";
// Build the first request
var builtExpression = checkForFootnote(param, optionsString[0]);

// Build the subsequent OR parts
foreach(char option in optionsString.Substring(1)) {
    builtExpression = Expression.Or(
            builtExpression,
            checkForFootnote(param, option));
}

This gets us 90% of the way to where I need to be. The only thing we’re missing is the exclude option. In my implementation, exclude is equal to true when I want it active, and false otherwise. When it is active we want the rows where the footnote check fails, so the boolean expression we want is: (footnoteCheck) == !exclude, where footnoteCheck is the expression constructed above.

builtExpression = Expression.Equal(
        builtExpression,
        Expression.Constant(!exclude));

At this point, we’ve built the entire expression in a way that can be sent to LINQ for conversion into dynamically generated SQL. After this was done, I saw a dramatic increase in speed on footnote-only searches via our search page as I’m letting my SQL Server do the work it was designed to do. More than that, I’ve been able to deconstruct a method to show that nearly ANY data processing problem can be solved in LINQ, only requiring a bit more thought.

  • http://msdn.microsoft.com/en-us/library/system.linq.expressions.expression.aspx