Mad, Beautiful Ideas


Recently in Computing Category

A Comparison of JavaScript Compressors


At YUIConf this year, I mentioned to someone that I was playing around in Node with statically analyzing YUI to aid in crafting use statements that included no more than was strictly necessary. That’s an interesting idea, but is probably not very realistic in pure static analysis, and it’s really not the subject of this post. In that conversation, I was told about UglifyJS, a JavaScript compressor written for NodeJS that works by creating an abstract syntax tree of the JavaScript before minifying it. It seemed that an AST could help in my static analysis, so I started looking at it.

In the README file, under the “Compression - How Good is it?” heading, some basic figures were provided using two libraries (DynarchLIB and jQuery) and three minifiers (UglifyJS, YUICompressor, and Google’s Closure Compiler). You can go read his results there, but I take issue with his data on a couple of points. First, we aren’t told which versions of the libraries or the compressors he used. Recent versions of the YUICompressor have some significant performance improvements that I wanted to test as well. Second, YUICompressor was deliberately designed to make a few decisions that improve the gzip compressibility of a program, so in my opinion you can’t really evaluate its compression without considering gzip.

Let me first describe my methodology. For the timing tests, I ran each compressor 1000 times, using the UNIX time program to get a runtime for each run, and then took the arithmetic mean of the runtimes for each library and compressor combination to produce the figures in the following charts. I also ran each compressor one additional time to save a minified version to disk, and gzipped a copy of the minified file for the file-size comparison. This was done with the aid of a shell script. The tests were run on my workstation, an AMD Athlon 64 X2 Dual Core 6000+ running Ubuntu 10.10, but of course the interesting part is the relationships, not the raw numbers.
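To make that concrete, here is a rough Java stand-in for the shell script. The compressor command line and file names below (yuicompressor.jar, jquery.js, out.min.js) are placeholders for illustration, not the exact invocations I used:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.GZIPOutputStream;

// Sketch of the benchmark loop: time the compressor over many runs, then
// gzip the minified output once to compare pre- and post-gzip sizes.
// The command below is a placeholder; substitute the real invocation.
public class CompressorBenchmark {
    public static void main(String[] args) throws Exception {
        String[] command = { "java", "-jar", "yuicompressor.jar", "-o", "out.min.js", "jquery.js" };
        int runs = 1000;
        long totalNanos = 0;

        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            Process p = new ProcessBuilder(command).inheritIO().start();
            p.waitFor();                                  // wait for the compressor to finish
            totalNanos += System.nanoTime() - start;
        }
        System.out.printf("mean runtime: %.1f ms%n", totalNanos / runs / 1e6);

        // gzip the minified output once for the file-size comparison
        File minified = new File("out.min.js");
        File gzipped = new File("out.min.js.gz");
        try (FileInputStream in = new FileInputStream(minified);
             GZIPOutputStream out = new GZIPOutputStream(new FileOutputStream(gzipped))) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        System.out.println("minified: " + minified.length() + " bytes, gzipped: " + gzipped.length() + " bytes");
    }
}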

I tested the following libraries:

  • jQuery 1.4.3
  • Mootools Core 1.3
  • SimpleYUI 3.3.0pr1

Using the following compressors:

  • YUICompressor 2.4.5pre in command-line mode (Distributed with YUI Builder)
  • YUICompressor 2.4.5pre in server mode
  • UglifyJS 0.0.1
  • Google Closure Compiler from 2010.11.10, running with the SIMPLE_OPTIMIZATIONS flag

I want to address speed first. In the README, the UglifyJS author says that Uglify is about twice as fast as YUICompressor and more than four times as fast as Closure Compiler. One of the biggest criticisms of Closure Compiler is its speed, and YUICompressor has definitely suffered a similar reputation. However, in recent builds, Reid Burke of Yahoo! modified YUICompressor to run as a server that you can POST files to using cURL, meaning you don’t pay the Java startup cost on every invocation. This has dramatically improved its performance.

[Chart: Time to Compress; raw data linked in the original post]

UglifyJS’s speed is very impressive, given that it’s a pure-JavaScript implementation. Anyone still think JavaScript is too slow? However, when you take out the startup overhead by running YUICompressor in server mode, it destroys UglifyJS. And Closure is just slow, but everyone knows that. There is one fascinating anomaly in the data above, and that is jQuery. There are those who argue that jQuery’s code standards are somewhat shoddy, and that this makes minifying it properly difficult, which may account for why YUICompressor takes so long to compress it. Reid also tells me that YUICompressor uses a slightly different set of defaults in server mode than in command-line mode. It would be interesting to find out what it is about jQuery’s source that triggers such an awful regression.

Uglify’s process of building an AST of the program seems to be fairly resilient against poorly formed inputs, which is probably why its compression time seems to depend on file size more than anything else. While this means that Uglify is consistent, it does still raise a few questions for me about its resilience, but more on that later. For now, let’s look at compression.

[Chart: File Size Comparisons; raw data linked in the original post]

This comparison is going to be a lot more relevant for most users, since minification is done rarely, while serving the files is done constantly. Again, the raw data is linked, but what’s more interesting is the comparison visible in the bars above. Closure and Uglify are pretty similar, with Closure usually winning on uncompressed file size and Uglify pulling ahead once gzip is applied. YUICompressor definitely falls short here, coming in roughly 20% less efficient before gzip and ~15% less efficient after gzip.

I am not saying not to use YUICompressor. Hell, I am looking at integrating it into our build process at work. Unlike Closure and Uglify, YUICompressor can also handle CSS, so you have one fewer application to keep around for minification. There is a far more important point: YUICompressor is more conservative, and it changes less frequently. Closure is known to break code under certain circumstances, and Uglify is not covered by as many tests as its creator would like.

Will Uglify break your code? I’m not sure. Its mechanism and process look fairly safe, but it remains untested as far as many are concerned. I am interested in building a set of compressor test inputs that verify code is still functional and behaves consistently after compression, but I haven’t begun that project yet. If these tests show anything, it’s that YUICompressor should be used in server mode if you care about its speed, that it can definitely improve its minification ratio, and that Uglify may be a good place to look for inspiration.
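As a sketch of the kind of test I have in mind: run a test case before and after minification under a JavaScript engine and compare the observable output. This assumes node is on the PATH and that the two test files named below exist; it is an illustration of the idea, not the eventual test suite.

import java.io.BufferedReader;
import java.io.InputStreamReader;

// Sketch: compare the stdout of a script and its minified version under node.
// Assumes each test case prints its observable behavior to stdout.
public class MinifierRoundTrip {
    static String run(String scriptPath) throws Exception {
        Process p = new ProcessBuilder("node", scriptPath)
                .redirectErrorStream(true)
                .start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        p.waitFor();
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        String original = run("testcase.js");          // placeholder test input
        String minified = run("testcase.min.js");      // same input after minification
        if (original.equals(minified)) {
            System.out.println("PASS: behavior unchanged after minification");
        } else {
            System.out.println("FAIL: output differs after minification");
        }
    }
}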

In the future, I hope that a set of confirmation tests for JavaScript minifiers will be available that ensures that the code they output behaves as expected.

Thoughts on YUI vs. jQuery

There have been several discussions about how YUI can better compete in the JavaScript framework space. The biggest has been held on Quora, in a thread entitled “How could YUI improve its image compared to jQuery, MooTools, etc.”. The question garnered enough attention that John Resig, creator of jQuery, the de facto leader in this space, felt inclined to respond.

John’s comments begin fairly reasonably. The YUI project does need to centralize on a single location. Currently, the forums, gallery, and bug trackers are on YUI Library, while the core documentation still lives on the Yahoo! Developer Network (http://developer.yahoo.com/yui/). However, this is already being addressed, as demoed by Allen Rabinovich at the YUI Open Hours from September 29th (video), and it’s an issue I know I raised back when YUI Library was launched.

More than that, the redesign of YUI Library really seems to be centered around highlighting the gallery, in effect showing off gallery modules as nearly on par with the core library (though still filterable and clearly marked), which goes a long way toward making community contributions visible. Plus, the core team has been getting better at looking at external submissions, though there are a few places I’d still like to see them improve in that respect, and no, I’m not talking about my builder agenda.

Now, for the rest, I’m going to try hard not to simply parrot Nicholas Zakas’ comments, though I agree with pretty much everything he has to say. The very idea that YUI’s association with Yahoo! is a weakness is simply ridiculous.

Admittedly, there are a lot of open source projects which are corporately sponsored that are hampered by that association. MySQL’s development didn’t seem as rich after Sun bought it, and now Oracle is reducing what the open source edition is capable of. Microsoft has a lot of MS-PL code that they can’t take external contributions for, making their code drops nearly useless. Google’s Android is much the same way, since the work being done by Google Engineers only seems to be made public in periodic large code drops.

Particularly through the Gallery, YUI has gotten more and more open to external contributions. Personally, I’ve even gotten a few small changes into the core of YUI 3. What makes an Open Source project is not its sponsors, but its community. It’s why Canonical has been receiving so much flak lately from the GNOME community. They aren’t playing nice with the rest of the community, in part because they seem to think that innovation can’t happen quickly enough in that environment. I’m split on the issue, but that’s not the subject of this post.

Yahoo! has built a solid community around YUI, and that community has gotten a lot stronger in the last year. Just looking at the Gallery shows a healthy number of contributions, most from outside of Yahoo!. While the core is mostly commits from inside, it isn’t entirely, and that’s the way of most Open Source projects: most commits come from a small core of developers. And Nicholas is right, having YUI inside of Yahoo! drives some pretty awesome development. The core Rich Text Editor and Autocomplete modules for YUI3 were driven by internal needs, and will now benefit all of us.

Personally, I don’t view YUI and jQuery as occupying the same space. jQuery has always felt to me like a DSL for the DOM; a person could use it just fine without knowing any JavaScript, and the people I know who love it most seem to fall into that mold. YUI3 requires at least basic skill with the language, and more if you really want to exploit and extend it. There are some serious web sites built with YUI, and it has excelled at that. I suspect more people will take note of that sooner rather than later, especially if we can improve the documentation that introduces a user to the library.

It’s not an us versus them argument, and couching it in those terms is going to do us a lot more harm than good.

Palouse Code Camp 2010 Wrap-Up

This Saturday was the first ever Palouse Code Camp, hopefully the first in a long line. We’d been planning it seriously for eight months or so, though we’d been knocking the idea around for over a year. We did not draw the crowd we’d hoped for (we had ~30 people), so it was a very small event, but those who attended seemed to enjoy it, so I think it still has to be counted a success. Our largest failure in advertising was clearly with the students, as we had virtually no student representation, something we’ve identified ways to fix for next year. Our sponsorship was also dramatically lower than we’d hoped, but generous donations from Microsoft and WSU’s Social & Economic Sciences Research Center ensured that all our expenses for the year were covered, even leaving a bit left over to keep us afloat until we start fundraising for next year (which will start much sooner).

We really appreciated all our speakers. Jack Stephens from Spokane gave a sort of overview of LINQ, drawn in part from his talk on using LINQ with DataSets. Dave Sargent, an organizer, talked about Website Performance (based in part on a talk I gave a few years ago) and MS SQL Server Administration. WSU Professor Robert Lewis gave his Introduction to Python talk, which I think has finally convinced Catherine that she really ought to learn Python for her research work. Mithun Dhar, our regional Developer Evangelist from Microsoft, came out to talk about some of what Microsoft is doing in the near term, and to give all our attendees a free month of Windows Azure service. Jason Hurdlow, another organizer, redid his XSLT talk, focusing this time a bit more on XPath, I think.

However, I want to give a very special thanks to Mark Michaelis, who volunteered to do “as many talks as we needed” and gladly gave us four: MSBuild, PowerShell, MVVM with WPF, and Pragmatic Unit Testing. I didn’t get a chance to attend any of Mark’s talks, but I do believe they were very well attended. We’d only made contact with Mark a bit over a week before the event, and his support was amazing.

Myself, I gave two talks. The first was an update of the Introduction to YUI3 talk I gave at Portland Code Camp earlier this year, updated for YUI 3.2.0 (and of course, YUI 3.3.0pr1 was tagged in git today). I had around ten people, an amazing turnout given the size of the event, and was ecstatic. This talk focused more on SimpleYUI, but I made sure to touch on the module pattern, as I’m still not entirely comfortable with SimpleYUI. The code is up on GitHub, and the slides are on SlideShare. For those who attended the talk, I’d really appreciate any ratings you can provide.

My second talk was about the YUI3 Component Framework and module creation, and it only had a single attendee, but he was willing to stay, and I wanted to talk about it, so I went ahead. Slides here.

We learned a lot to improve for next year, and a lot of the groundwork is done, so I fully expect to have a good success in 2011.

Paper Review: Robust Flexible Handling of Inputs with Uncertainty

This week I read A Framework for Robust and Flexible Handling of Inputs with Uncertainty, written by Carnegie Mellon researchers Julia Schwarz, Scott E. Hudson, and Jennifer Mankoff, and Microsoft Research’s Andrew D. Wilson. I grabbed this paper because I’ve seen several references to it recently from people working on touch interfaces, such as Canonical with the new multitouch support they’ve made a priority for Ubuntu 11.04. However, while the information in this paper is relevant to touch interfaces, I think there are a lot of lessons in it for writers of UI frameworks in general.

The basis of the method explored in this paper is fairly simple. Knowing that some events are hard to pin down to a single target, like a user touching a space that overlaps multiple buttons in a UI, the framework supports handling multiple candidate events until it’s able to determine which one should actually be handled, based on probabilities that change as conditions change. The first example provided is a user touching near the edge of a UI window that has a desktop icon also under the window edge. Is the user trying to move the desktop icon, or resize the window? Under the framework, both events are handled until it can be determined which one was the ‘real’ event. Too much vertical motion? The user was almost certainly not trying to resize the window, so move the desktop icon. Can’t determine? Don’t do either.
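A toy sketch of that dispatch idea is below. The names, probability model, and thresholds are all made up for illustration; the paper’s actual framework tracks distributions over input events and defers final actions until one interpretation is sufficiently likely.

import java.util.ArrayList;
import java.util.List;

// Toy sketch: keep every plausible interpretation of an ambiguous input alive,
// update each one's probability as more input arrives, and only act once a
// single interpretation clearly dominates (or act on none of them).
public class UncertainDispatch {
    interface Interpretation {
        double probability(double dx, double dy); // likelihood given motion so far
        void commit();                            // perform the action
    }

    private final List<Interpretation> candidates = new ArrayList<>();

    void add(Interpretation i) { candidates.add(i); }

    void onDrag(double dx, double dy) {
        Interpretation best = null;
        double bestP = 0, secondP = 0;
        for (Interpretation i : candidates) {
            double p = i.probability(dx, dy);
            if (p > bestP) { secondP = bestP; bestP = p; best = i; }
            else if (p > secondP) { secondP = p; }
        }
        // commit only when one interpretation dominates; otherwise keep waiting
        if (best != null && bestP > 0.9 && secondP < 0.1) {
            best.commit();
            candidates.clear();
        }
    }

    public static void main(String[] args) {
        UncertainDispatch dispatch = new UncertainDispatch();
        // mostly-horizontal motion suggests a resize, mostly-vertical suggests moving the icon
        dispatch.add(new Interpretation() {
            public double probability(double dx, double dy) { return Math.abs(dx) / (Math.abs(dx) + Math.abs(dy) + 1e-9); }
            public void commit() { System.out.println("resize window edge"); }
        });
        dispatch.add(new Interpretation() {
            public double probability(double dx, double dy) { return Math.abs(dy) / (Math.abs(dx) + Math.abs(dy) + 1e-9); }
            public void commit() { System.out.println("move desktop icon"); }
        });
        dispatch.onDrag(1, 40); // mostly vertical: the icon-move interpretation wins
    }
}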

Certainly, this makes a lot of sense for touch. When designing touch UIs, the current thinking is that touchable controls should be large enough that the user is unlikely to miss. One of the working examples in the paper involves the user’s finger basically covering three small buttons, one of which happens to be disabled, and the framework correctly determining the right course of action, which, based on their example, might not have worked out if that third button had been enabled.

This research was most interesting because the authors were also able to provide examples of how this system could improve the user experience for vision- and motor-impaired users. Plus, it can apply to simple mouse interactions as well. In the default GNOME themes for Ubuntu since 10.04, the border of a window is almost impossible to grab for resizing, so unless you use the corners, which may not always be what you want, resizing windows is a real chore.

The nice thing about this paper is that it really does describe a framework, and I’m a bit disappointed that their .NET code doesn’t seem to be available. But because it is a framework, the bulk of the development would need to be done by the UI toolkit developers, and sensible default probability functions could be defined that most application developers would never need to modify.

In a sense, I’m a bit disappointed by all the attention this paper has gotten in the ‘touch’ interface communities, because I really think it’s as important, if not more so, for accessibility, and for improving ease of use for impaired users and even the rest of us. It may add a bit of overhead, both for the system and for developers, but the authors mention that they built an accessible set of text inputs in about 130 lines of code, which determine whether input makes sense for a given field via regular expressions. Adding support for voice input took about another 30 lines of code.
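Their code isn’t available, so the following is only a guess at the flavor of that idea: each field advertises a regular expression for input that “makes sense” for it, and an uncertain piece of text (typed or recognized from speech) is routed to whichever field plausibly claims it. The field names and regexes here are invented for illustration.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Guess at the flavor of the paper's accessible text inputs: route uncertain
// text (typed or recognized speech) to whichever field it plausibly matches.
public class RegexRoutedInput {
    private final Map<String, Pattern> fields = new LinkedHashMap<>();

    void addField(String name, String regex) {
        fields.put(name, Pattern.compile(regex));
    }

    String route(String candidateText) {
        for (Map.Entry<String, Pattern> field : fields.entrySet()) {
            if (field.getValue().matcher(candidateText).matches()) {
                return field.getKey();  // first field for which the input makes sense
            }
        }
        return null;                    // ambiguous or nonsensical: don't commit
    }

    public static void main(String[] args) {
        RegexRoutedInput form = new RegexRoutedInput();
        form.addField("zip code", "\\d{5}");
        form.addField("phone", "\\d{3}-\\d{3}-\\d{4}");
        System.out.println(form.route("99163"));         // zip code
        System.out.println(form.route("509-555-0100"));  // phone
        System.out.println(form.route("not a number"));  // null: no field claims it
    }
}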

My hope, after reading this paper, is that the multitouch work happening in Ubuntu over the next few months will positively impact the rest of the system as well. I think that’s reasonable, since I’ve heard there is a huge accessibility focus at the developer summit this week. There is still a lot of work to be done in this space, I think, particularly as we move more toward gesture-based inputs, but there are definitely places where it could be applied today, and I think it would best be done at the UI toolkit layer, so that it’s available as broadly as possible.

Next week’s paper: Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews.

Language Support for Lightweight Transactions

For this week, I read Language Support for Lightweight Transactions, a 2003 paper by Cambridge researchers Harris and Fraser. The paper outlines the development of software transactional memory as a feature built into the Java programming language. I had first read about STM as a feature of Haskell while reading Beautiful Code, which is what pulled me into this paper.

Concurrency is a big problem in modern computing. We’ve reached a point where processors aren’t getting faster, but we’re getting more cores. On one hand, this is great for multi-application environments, because each application can have full access to a core. In practice it doesn’t quite work out that way, since on Intel chips the cache is shared between the cores (which actually works better for concurrent programs), and some applications really do need that additional speed.

STM is a mechanism for defining small atomic operations that you can be assured every other thread will see as a single operation. For a simple example:

class Account
{
    private int balance;

    Account(int initialValue)
    {
        balance = initialValue;
    }

    public int getBalance() { return balance; }

    // Read-modify-write: without synchronization this is not atomic.
    public void deposit(int amount)
    {
        balance += amount;
    }

    public void withdraw(int amount)
    {
        this.deposit(-amount);
    }

    public void transfer(int amount, Account toAccount)
    {
        this.withdraw(amount);
        toAccount.deposit(amount);
    }
}

Now, in a multi-threaded environment, the above code could have problems, because deposit actually involves three steps: read the balance, add the amount, and write the balance back. If two deposits are made to the same account at roughly the same time, both threads could read the same starting balance, and one of the updates would be lost.

With STM, you indicate somehow that these are ‘atomic’ functions, and the system checks that the value in memory hasn’t changed since you read it, to ensure that your updated amount is correct; if it has changed, the transaction aborts and can be retried. It’s a similar practice to transactions in a SQL database, and as such it does add overhead to the operations. But so does traditional locking with mutexes and semaphores.
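A minimal sketch of that read-validate-retry idea, using a single AtomicInteger from java.util.concurrent rather than the paper’s word-based STM; the paper generalizes this to whole multi-word transactions, but the optimistic flavor is the same: commit only if nothing changed since you read, otherwise throw the work away and retry.

import java.util.concurrent.atomic.AtomicInteger;

// Optimistic, lock-free deposit: read, compute, and commit with compareAndSet,
// retrying if another thread changed the balance in between.
public class OptimisticAccount {
    private final AtomicInteger balance = new AtomicInteger(0);

    public void deposit(int amount) {
        while (true) {
            int seen = balance.get();                    // read
            int updated = seen + amount;                 // compute the new value
            if (balance.compareAndSet(seen, updated)) {  // commit only if unchanged
                return;
            }
            // another thread changed the balance since we read it: retry
        }
    }

    public int getBalance() { return balance.get(); }

    public static void main(String[] args) throws InterruptedException {
        OptimisticAccount account = new OptimisticAccount();
        Thread a = new Thread(() -> { for (int i = 0; i < 100000; i++) account.deposit(1); });
        Thread b = new Thread(() -> { for (int i = 0; i < 100000; i++) account.deposit(1); });
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(account.getBalance()); // always 200000, and no locks were taken
    }
}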

This is where the findings of the paper were most interesting. In their trials, they compared a highly tuned hash-table implementation using traditional locking (either a single lock for the entire table or fine-grained locking) against a simple implementation using STM. They found that the per-operation overhead in the STM case was actually pretty small, and was outweighed by the fact that STM was essentially non-blocking, only needing to redo work on the off chance that the same records were being updated at the same time.

With language support, which in Java would look like this:

class Account
{
    private int balance;

    Account(int initialValue)
    {
        balance = initialValue;
    }

    public int getBalance() { return balance; }

    public void deposit(int amount) 
    {
        atomic {
            balance += amount;
        }
    }

    public void withdraw(int amount)
    {
        atomic {
            this.deposit(-amount);
        }
    }

    public void transfer(int amount, Account toAccount)
    {
        atomic {
            this.withdraw(amount);
            toAccount.deposit(amount);
        }
    }
}

The atomic keyword, at least in the research compiler this paper was based on (whose source can be found here), handles creating the transaction as well as enforcing it. With STM, deadlocks are basically impossible: if a transaction fails, it can be re-executed after waiting a random interval, and the program moves on from there. Using mutexes, the transfer function could easily deadlock through a programmer error.

This paper was really interesting because the Haskell version makes a big deal about not allowing functions that perform IO inside of transactions, and the Beautiful Code discussion of the feature made it sound like STM practically required a purely functional language. This paper showed that clearly isn’t the case, and it has made me very curious about other ways to use this technique.

Next week’s paper will be A Framework for Robust and Flexible Handling of Inputs with Uncertainty.

Twitter Under Crisis

This week, I read Yahoo! Research’s paper Twitter Under Crisis: Can We Trust What We RT?, written by a trio of researchers out of Chile and Spain. The analysis covers all tweets sent on Twitter from February 27 to March 2, 2010, most likely sourced through Yahoo!’s access to Twitter’s ‘firehose’.

The crisis was the major earthquake (apparently the seventh worst ever recorded) that hit Chile on February 27, and the tsunamis that hit shortly afterward. They filtered the data to focus on accounts based in Chile (relying largely on timezone settings, the most reliable indicator they had available), plus a few other factors, both to limit the data set and to focus on the users most directly affected by the disaster.

The Chilean earthquake was interesting because, as a disaster, it did a ton of damage to Chile’s telecommunications infrastructure. As you’d expect, the traffic spiked early, quickly overtaking discussion of other events around Chile at the time, but petered out within a week or so. However, the idea that people use Twitter to trade news about disasters is hardly news.

What they found was largely unsurprising. Most people (68%) only sent out one or two tweets specifically about the disaster. The most active tweeters about the disaster generally had the most followers, but they were also generally news outlets covering the story (the top was an account named “BreakingNews”). One thing that really surprised me was the relatively low number of retweets about the disaster, but I suppose people in the heart of it weren’t spending a lot of time reading their Twitter feeds for things to resend.

The keyword analysis was also fascinating, showing that Twitter could be used to gauge the progress of a disaster. The first day was all about the earthquake, the resulting tsunamis, and people dying. Days two and three focused on looking for missing people, and day four had a ton of discussion about the NASA story saying the earthquake was so powerful it actually disrupted the rotation of the Earth, making days approximately 1.26 microseconds shorter.

The next interesting part of the analysis looked at the discussion of fourteen rumours spotted in the Twitter data, seven proven true and seven false. This is a small data set, but the findings are interesting: people were far more likely to question or deny the false rumours (though, oddly, there were still a lot of affirmations of them). This will require more study, but with enough data, it appears Twitter can be used as a reasonable predictor of the truth of a claim circulating on it.

There were interesting findings in this paper, but for the most part, I think it’s a starting point. The findings are promising, in that if you had full access to Twitter’s Firehose, you could form a lot of reasonable conclusions from the data hitting Twitter over the course of the disaster.

Next week, I’m going to be reading Language Support for Lightweight Transactions, which I’ll post a link to, if it’s in an Open Access journal. The paper serves as the basis of features like Haskell’s Software Transactional Memory.

Becoming a Better Developer: Reading Papers


If there is one place I think undergraduate education is really lacking (across pretty much all fields, but I’m going to focus on Computer Science), it’s that undergraduates don’t read research papers. I’m sure some programs are different, but Catherine didn’t read any until the last year of her undergrad, and that might have been related to the course being offered for graduate credit more than anything. I’m sure I was never assigned any to read, though I did from time to time.

Computer Science is unique in this respect for two reasons: there is as much research going on in corporate research groups as in academia, if not more; and so many of the journals in our field are open access that we can get at the material without paying.

Periodically, I’ve made an effort to read a paper a week. I’m planning on returning to that, and I think I’ll post a summary of my thoughts each week, as I did last week. With that in mind, next week’s paper is going to be Twitter Under Crisis: Can we trust what we RT?, so if you want to read and discuss it, then I’ll be glad to engage in the discussion.

Sentiment Analysis

I learned about two awesome things this morning. First, Yahoo! Research is on Twitter. Second, there is a computer science conference celebrating women in computing named after Grace Hopper. The link from Yahoo! Research pointed to a paper from that conference on managing communities by automatically monitoring their moods, based on the premise that a community that tends toward being very sad and angry tends to discourage participation. This does need to be balanced against too much happiness, which, aside from simply being treacle, can imply that debate and discussion are simply not welcome in the community.

The paper, entitled Anger Management: Using Sentiment Analysis to Manage Online Communities, presents the findings of Yahoo!’s Elizabeth Churchill and Pomona College’s Sara Owsley Sood as they analyzed comments left on a sample of news stories to determine whether each comment was relevant and what its overall tone was. The most interesting discussion I saw centered on the differences in language used in different problem spaces: ‘cold’ is a positive word when talking about drinks, but negative when talking about people, for instance.

The research is, as most research is, built on other work. The relevance section is based on a 1988 paper that I need to read; they took the algorithm from that paper and used the article text as the source to compare the comments against in order to generate a relevance score. I’m guessing the analysis is done in a Bayesian fashion, but what was really interesting about this particular method of relevance analysis is how it could be applied to comment filtering on a blog or something similar. Lately, I’ve been deleting a lot of spam comments here that actually looked like they might have been legitimate, until I saw which post they were attached to.
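I can’t reproduce the 1988 algorithm here, but as a crude stand-in for the idea, a relevance score might simply measure how much of a comment’s vocabulary overlaps the article text; the example strings below are invented.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Crude stand-in for a relevance score: the fraction of a comment's words
// that also appear in the article text. Not the algorithm from the paper,
// just an illustration of scoring comments against the article as a source.
public class RelevanceScore {
    static Set<String> words(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
    }

    static double relevance(String article, String comment) {
        Set<String> articleWords = words(article);
        Set<String> commentWords = words(comment);
        if (commentWords.isEmpty()) return 0.0;
        int shared = 0;
        for (String word : commentWords) {
            if (articleWords.contains(word)) shared++;
        }
        return (double) shared / commentWords.size();
    }

    public static void main(String[] args) {
        String article = "The earthquake in Chile damaged telecommunications infrastructure";
        System.out.println(relevance(article, "The earthquake damaged so much infrastructure")); // high
        System.out.println(relevance(article, "Buy cheap watches online"));                      // low
    }
}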

The mood analysis was very interesting, though much of this paper was based on other papers that I have not read. However, they seemed to split the analysis into three categories: Happy, Sad, and Angry. Personally, I would like to see another dimension to the analysis, Hostility, that would attempt to detect the difference between someone who is passionate about a subject (which can be easily mistaken for anger) and someone who has gone hostile. But in my experience, the more dangerous thing in a community is not ‘anger’, but hostility. Still, to be able to do a near real-time analysis of mood based on text, which could potentially flag a moderator, has some interesting uses. Again, I suspect this analysis is fundamentally Bayesian.
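I don’t know which classifier the authors actually used, but a word-counting, Bayes-flavored mood scorer along these lines is roughly what I picture; the seed words below are invented purely for illustration.

import java.util.HashMap;
import java.util.Map;

// Bayes-flavored sketch of a mood scorer: count how often each word appears in
// hand-labeled Happy/Sad/Angry seed text, then score new comments by summing
// smoothed log-probabilities and picking the best mood.
public class MoodScorer {
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    private final Map<String, Integer> totals = new HashMap<>();

    void train(String mood, String text) {
        counts.putIfAbsent(mood, new HashMap<>());
        for (String word : text.toLowerCase().split("\\W+")) {
            counts.get(mood).merge(word, 1, Integer::sum);
            totals.merge(mood, 1, Integer::sum);
        }
    }

    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String mood : counts.keySet()) {
            double score = 0;
            for (String word : text.toLowerCase().split("\\W+")) {
                int count = counts.get(mood).getOrDefault(word, 0);
                // add-one smoothing so unseen words don't zero out a mood
                score += Math.log((count + 1.0) / (totals.get(mood) + 1.0));
            }
            if (score > bestScore) { bestScore = score; best = mood; }
        }
        return best;
    }

    public static void main(String[] args) {
        MoodScorer scorer = new MoodScorer();
        scorer.train("happy", "great love wonderful thanks awesome");
        scorer.train("sad", "miss lost sorry terrible cry");
        scorer.train("angry", "hate furious stupid awful ridiculous");
        System.out.println(scorer.classify("this is awesome, thanks everyone")); // happy
    }
}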

It may seem that I find this work derivative, since I keep mentioning other papers I need to read. All academic writing leads to other sources that must be read to fully understand a topic, and this is also a short paper. It combines a few techniques into a potentially very useful automated system to aid moderators (probably not replace them, yet), and it shows it succinctly. It also raises a lot more questions than it can answer at this time, in that the data coming out of such a system could be used to help analyze and predict trends in a community before small problems become big ones. A short paper, yes, but one that may well serve as a pivot in moderation systems.

If there is a weakness to this system, it’s the same as with any Bayesian modeling system: it requires a fair amount of domain-specific seed data to be able to determine mood. If a group were to implement this today, they’d need to spend a substantial amount of time training the algorithm on their domain for it to be most useful. Hopefully, corpora of data can be built up around the various domains to make training these sorts of algorithms easier.

More ANOVA Data in R

In Catherine’s phylogenetic research, she has recently needed to do some ANOVA analysis on a data set for her current project. Luckily, R has its stats module, which has good support for this analysis via its cor function. However, the cor function only returns the correlation matrix.

There is other information generated by this analysis that is relevant to the work Catherine is doing, and it is returned by the cor.test function. She was mostly interested in the p-value, but cor.test also makes a few other data fields available.

To meet Catherine’s immediate need, I wrote the following function, which returns a list of matrices of results from cor.test: the first holds the p-values, the second the t-values, then the parameter (degrees of freedom), and finally the correlation estimates.

corValues <- function(x) {
    if (!is.matrix(x))
        x <- as.matrix(x)

    size <- attributes(x)$dim[2]
    p = matrix(nrow=size, ncol=size)
    t = matrix(nrow=size, ncol=size)
    df = matrix(nrow=size, ncol=size)
    cor = matrix(nrow=size, ncol=size)

    i <- 1
    while(i <= size) {
        j <- i
        while (j <= size) {
            rv <- cor.test(x[,i], x[,j])
            t[i,j] = rv$statistic
            t[j,i] = rv$statistic
            df[i,j] = rv$parameter
            df[j,i] = rv$parameter
            p[i,j] = rv$p.value
            p[j,i] = rv$p.value
            cor[i,j] = rv$estimate
            cor[j,i] = rv$estimate
            j <- j + 1
        }
        i <- i + 1
    }
    list(p, t, df, cor)
}

It’s noticeably slower than the cor implementation, but it works fast enough. Mainly, I’d like to see this cleaned up to the point that it can at least take arguments the way the built-in methods do, but if you’ve got a matrix of data you want more than just correlation values for, the above works fairly well.

Latent Sexism in Technology

Recently, Google posted a new video and blog post entitled “Grandmother’s Guide to Video Chat”. They’ve even included printable instructions. Aside from the 1950s-reminiscent visuals of a little old lady, and the word ‘grandma’ in a few places, it’s just a pretty good description of setting up Google Voice & Video Chat. Now, I’ve never bothered with the software since it doesn’t have a Linux version, but easy video conferencing is important for a lot of people. Since my wife and I will likely be moving away from our parents in the next few years, before we have any children, I suspect that once we do have kids, we’ll be on video chat of some kind pretty regularly.

So, why do I bring it up? At first I just found it interesting that Google decided to go the Grandma route. It reminded me that we almost always talk about people who are uncomfortable with technology by talking about our mothers and grandmothers. In the Linux world, for years we’ve talked about the ‘Mom Test’ to determine when the OS was ready for non-geek consumption. People on Planet Ubuntu still bring up the Mom Test on a regular basis. This prompted me to ask: why do we only ever talk about our mothers when it comes to problems with technology?

I got the following response on Twitter:

@foxxtrot Because people want to take care of their mothers, so it is more frustrating when their mom can’t use technology?

It’s probably a reasonably valid point, but I know I spend a hell of a lot more time explaining how things on the computer work to my father than to my mother.

What I find most interesting is that the technology industry has been working to promote women in technology via Ada Lovelace Day, the setting up of Ubuntu Women, and competitions to bring attention to women using Linux. The point being that we recognize the stigma against women in our discipline, yet the fact that we still talk about the ‘Mom Test’ for technology, or Google’s new ‘Grandmother’ guide, suggests a latent sexism that hasn’t left the industry.

Perhaps I’m reading too much into Google’s campaign. I’m positive that they intended it as nothing more than a humorous set of instructions for setting up your computer for video chat, and it might not have been any better as a “Grandfather’s Guide”. Any generalization is going to be at least partially insulting to someone, and all I’m really trying to suggest is that we need to be fully cognizant of the implications of such generalizations; in computing, the generalization that women aren’t good with (or interested in) computers is a problematic one. I don’t have statistics handy about the gender distribution in computer science programs or similar disciplines. I do know that even five years ago, as I was finishing up my undergrad, it was still very low. I suspect the trend is up, but if we really want to make that distribution more even, it’s important we’re careful about this particular generalization year round.