July 2008 Archives

Have No Privacy, Do No Evil

Google has always touted itself as a company that will do no evil. It’s part of their posted Corporate Philosophy. However, in light of the recent revelation that Google keeps months of user-identifiable YouTube logs, and their recent claim that complete privacy is an unreasonable expectation, some people (okay, Slashdot) seem to wonder how accurate that claim is.

Admittedly, much of the commentary on the Slashdot article is standard Slashdot pants-wetting, but some of it is interesting. Ultimately, I’m not terribly sympathetic to the plaintiffs in the above lawsuit. Do I think they should be able to have their private drive removed from Google’s Street View? Yes. However, they’re suing for $25,000 for reasons including ‘mental distress’. Granted, the suit isn’t too ridiculous, since the sum of money is relatively low, and depending on how much they argue their property value has dropped, the amount may even be sensible. But I very much doubt there has been any drop in their property value related to this photographing.

However, people are starting to notice and think about how much Google actually knows about us. Some people do all their web searching with Google, keep all their e-mail in GMail, keep their calendars in Google Calendar, put their documents in Google Docs, and on and on. And Google uses all of this data to form as complete a picture of a user as they can. To date, it doesn’t appear that they’re using this data to target advertising directly at me, but that’s likely because it’s more efficient to aggregate my behavior with that of people like me than to target me individually.

Google does a lot of good, and I’m not trying to convince anyone to never use Google. I use Google for almost all of my web searching. Google Ads appear on this site. I have a GMail account. I’m interested in developing for and using Android. And I agree with Google’s sentiment that complete privacy is very, very difficult, maybe even impossible, to have in this day and age. But when there is a reasonable argument, such as a private drive, Google should be more receptive to removing that information from their cache. Plus, at my last job, some idiot programmer had, at some point, created an unprotected PHP page that would print out a ton of customer data, including names, addresses, and credit card numbers (if we still had them). Google found and indexed this, and I removed the page immediately once a customer discovered it while ego-searching. Highly embarrassing, and it took nearly 48 hours for the data to be removed from Google’s cache. More embarrassing for me and my company (I had only recently taken control of the website), but I think Google should hold some shame as well.

Due to Google’s success, and the realization of how much they know, some people are preparing to move forward with plans to take Google down. Cuil is the latest attempt, and their engine and layout are interesting, even if the results aren’t always terribly accurate. Plus, Yahoo! has really improved their search relevancy over the last few years. In short, Google isn’t the end-all anymore, and there are privacy concerns with using their services, but those concerns exist everywhere you go on the Web. If you want to maintain any level of privacy, you may want to spread your online identity across as many services as possible. It’s less convenient, but convenience very rarely implies security or privacy. If you want privacy, you need to be willing to work at it.

All Quiet on the Android Front

The Big G has gone silent on the issue of Android these past few months. Very little of substance has come out since the Google I/O conference in late May. Even worse? The SDK hasn’t been updated since early March, the latest version being m5-rc15 as of this writing.

This wouldn’t be so annoying if there weren’t evidence to suggest that Google made updated SDKs available to the winners of the Android Developer Challenge, but not to anyone else. There have been known bugs in the SDK since March with no visible movement on them, and I refuse to believe that Google’s development has gone static. Others are upset about this as well.

Why is this a problem? Simple. The iPhone. Google was in a position at the beginning of this year to make a lot of progress on the developer-goodwill front. Android is a better development platform than the iPhone: it has fewer restrictions on applications, it has a means to share data between programs, it doesn’t lock the developer into one single distribution model, and it doesn’t lock the user into a single hardware platform. Will there be a specified baseline Android phone? Very likely. It’s easier for me as a developer if I have a reasonable expectation of a certain level of functionality, and the HTC Dream seems to be that baseline.

Plus, Android developers can actually talk about developing for Android, something that iPhone developers still can’t do. But the iPhone is apparently taking over the smartphone market. People are excited, consumers are signing up in droves, and developers are scrambling to get on the platform. Of course, there is a second part to this: the rush of developers to the iPhone has undoubtedly sold more Macs, as you must be on an Intel-based Mac to run the SDK. My only Mac is PowerPC, so since Apple has given me the finger, I’d just like to pass them a nice big fuck you in return.

But the iPhone is here today. And while Android is still on track for the end of the year, those of us who didn’t enter or win the Developer Challenge need some love too. And we need it before handsets hit the market. If Google wants any chance to unseat the quickly growing iPhone, they need to help us. We want to talk about Android. Give us something to talk about. We know you’re working hard, but you’re just not meeting our needs.

Moving is Almost Done

No Whole Food Adventures this week; frankly, Catherine and I ate out most days this last week because we just couldn’t be bothered. We’ve begun moving across town to WSU’s Family & Graduate Housing, which has a slightly higher base cost per month, but we aren’t paying for cable or electricity, so we’re going to save money overall. Still, it’s been a ton of work, and I’m pretty glad it’s almost over.

Luckily, the big move didn’t preclude me from competing in Google’s Code Jam this weekend. Unfortunately, I messed up, and won’t be continuing in the competition this year.

In Round 1A, on Friday, I could have continued, but I spent an hour on Problem B and decided to jump ship to Problem C for the last 40 minutes. Unfortunately, Problem C, which was to grab the three digits immediately to the left of the decimal point of (3 + sqrt(5))^n for a given value of n, was impossible to solve using Python’s standard-precision doubles, and I don’t think Python shipped with a high-precision math library that would have helped; at least, I couldn’t find one. The real irony, at least to me, was that had I implemented this in C on my 64-bit system, the extended-precision long doubles there might have been precise enough, at least for the small set. There were some interesting matrix-based solutions, but regrettably, my knowledge of number theory isn’t good enough to make sense of what they were doing.
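
In hindsight, the number theory isn’t that deep. Since 3 + sqrt(5) and 3 - sqrt(5) are the roots of x^2 = 6x - 4, the sum a(n) = (3 + sqrt(5))^n + (3 - sqrt(5))^n is always an integer obeying a(n) = 6*a(n-1) - 4*a(n-2), and because 0 < (3 - sqrt(5))^n < 1, the integer part of (3 + sqrt(5))^n is just a(n) - 1. That sidesteps the precision problem entirely. Here’s a minimal Python sketch of that idea, reconstructed after the fact rather than anything I actually submitted:

    def last_three_digits(n):
        """Last three digits of floor((3 + sqrt(5)) ** n).

        a(n) = (3 + sqrt(5))**n + (3 - sqrt(5))**n is an integer with
        a(0) = 2, a(1) = 6, and a(n) = 6*a(n-1) - 4*a(n-2). Because
        0 < (3 - sqrt(5))**n < 1, floor((3 + sqrt(5))**n) = a(n) - 1,
        so the whole thing can be done modulo 1000.
        """
        prev, cur = 2, 6                     # a(0) and a(1), modulo 1000
        if n == 0:
            return "%03d" % ((prev - 1) % 1000)
        for _ in range(n - 1):
            prev, cur = cur, (6 * cur - 4 * prev) % 1000
        return "%03d" % ((cur - 1) % 1000)

    print(last_three_digits(5))              # prints "935"; (3 + sqrt(5))**5 is about 3935.9997

For the large input the same recurrence can be driven by fast matrix exponentiation, which I assume is what those matrix-based solutions were doing.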

Round 1B, on Saturday, involved problems that my tired body and sleep-deprived mind couldn’t solve in a timely fashion, at least in Python. Maybe I should have been working in Perl, or C, for some of these problems. Oh well, I’ll be practicing, and studying up on problems, for next year’s Code Jam.

ICFP Wrap-Up and Code Jam Upcoming

It’s now been almost two weeks since the ICFP Contest ended, and I just haven’t taken the time to write about the experience. Heath and I worked together this year on the solution, which was to design an AI to communicate with a “Mars” rover as it attempted to navigate a boulder- and crater-strewn landscape to find its way home. Oh, and there are Martians who want nothing more than to smash your delightful little rover to pieces.

The problem spec is fairly simple, but in its simplicity there is an amazing amount of depth in the ways to pursue the problem. We opted (or I did, and Heath followed) for a Python implementation, as I love Python for prototyping, and the problem was less constrained by raw performance and memory efficiency than the last few years’ problems.

I began by writing a multi-threaded app, where one thread would read data from the socket and parse out informational packets from the server, updating a global world object with things like obstacle locations and the current speed and heading. This left our control thread plenty of time to do all the analysis it wished over the current state when making decisions. Luckily, we had plenty of time to make decisions.
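
For the curious, the shape of it was roughly the sketch below: a reader thread owns the socket, chops the stream into messages, and folds them into a shared world object under a lock, while the control loop reads that state whenever it wants to decide something. The message layout and field positions here are simplified stand-ins, not the actual contest protocol.

    import socket
    import threading

    class World(object):
        """Shared picture of the rover's situation, updated by the reader thread."""
        def __init__(self):
            self.lock = threading.Lock()
            self.x = self.y = 0.0
            self.speed = 0.0
            self.heading = 0.0

    def reader(sock, world):
        """Read ';'-terminated messages off the socket and update the world.

        The field positions below are illustrative only."""
        buf = ""
        while True:
            data = sock.recv(4096)
            if not data:
                break
            buf += data.decode("ascii")
            while ";" in buf:
                msg, buf = buf.split(";", 1)
                fields = msg.split()
                if fields and fields[0] == "T":        # a telemetry update
                    with world.lock:
                        world.x = float(fields[1])
                        world.y = float(fields[2])
                        world.heading = float(fields[3])
                        world.speed = float(fields[4])

    def start_reader(host, port, world):
        """Connect to the simulator and run the reader in a background thread."""
        sock = socket.create_connection((host, port))
        thread = threading.Thread(target=reader, args=(sock, world))
        thread.daemon = True
        thread.start()
        return sock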

Unfortunately, we never did much with all that spare time. We’d talked about implementing a predictive system which would attempt to ‘guess’ where the rover would be at roughly the halfway point between updates, as this would allow sending sensible commands more often. We didn’t get that far, though.
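
The predictive piece we never built would have amounted to simple dead reckoning, something like the sketch below, which assumes the heading arrives in degrees and that speed and heading stay roughly constant over the half-interval:

    import math

    def predict_position(x, y, speed, heading_deg, dt):
        """Guess where the rover will be dt seconds from now, assuming its
        current speed and heading hold for the whole interval."""
        heading = math.radians(heading_deg)
        return (x + speed * math.cos(heading) * dt,
                y + speed * math.sin(heading) * dt)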

It was a good problem, though it’s been years since I’ve had to concern myself this much with trigonometry. We were constantly revisiting the math, trying to ensure that our calculations were correct and sensible, and trying to reconcile Python’s expectations with the server’s expectations. In the end, we had a decent solution that did pretty well avoiding craters and boulders, but was ignorant of Martians.

This leads me to another issue I had. We were only given three test cases, though we could generate JSON to create our own. The test cases we had were fairly thorough, but a part of me wishes there had been more. My more fundamental issue was that the reference server we were developing and testing against was likely somewhat gimped compared to the one scheduled to be used in the tournament that determines the winner. In particular, we were mostly safe ignoring the Martians on the demo server, as they were really, really foolish. In the tournament, that could go either way.

Still, it was a great challenge. Simple, but not too much so, and I was glad to see a challenge where I didn’t spend most of Saturday rewriting my engine because it had originally been just too damn slow. Now, we have to wait until the ICFP conference itself (which I may try to get to this year) for the results.

Now, we have Google’s Code Jam. I qualified in a decent position, having only done two of the three problems in the qualifier. The problems were nice and simple, but again not trivially so, and I feel that Google’s data sets do an excellent job of testing. They really love to hit the edge cases, like ‘no data’ and ‘worst-case’ data.

The first problem was: given a list of search engines, and a list of queries to run, what is the minimal number of ‘context switches’ needed, under the assumption that a search engine should never be used to search for itself? My initial solution to this, a recursive brute-force method, was painful, mostly because one of the test cases involved two search engines with queries that just flipped back and forth between the two.

My final solution was to take the list of search engines and, for each member of that list, check how far into the query list its first appearance was, then pick the engine with the longest such distance. Use that one, then perform a context switch by removing all the queries up to the offending one, and do it again. Nice, simple, and fast.
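
An equivalent, and perhaps easier to code, way to run the same greedy is to scan the query list once and start a new segment whenever every engine has appeared since the last switch; each new segment past the first costs one switch. A minimal sketch (the names are made up, and this is a reconstruction rather than my contest code):

    def min_switches(engines, queries):
        """Minimum number of context switches for the search-engine problem.

        One engine can serve a run of queries only if it never appears in
        that run, so a switch is forced exactly when every engine has shown
        up since the last switch."""
        switches = 0
        seen = set()
        for query in queries:
            seen.add(query)
            if len(seen) == len(engines):    # no usable engine left for this run
                switches += 1
                seen = {query}               # this query starts the next run
        return switches

    # Two engines, queries ping-ponging between them: four switches.
    print(min_switches(["A", "B"], ["A", "B", "A", "B", "A"]))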

The second problem was one of trains traveling between two stations, and how many trains would be needed to make a sequence of trips. This was much easier. I built a list of trips: departure time, arrival time, and departing station. I sorted this list by departure time and then took each element in turn, checking whether a train was available at the departing station by the departure time. If one was, I removed that train from that station’s list and added a new entry at the destination station, ready for departure at the arrival time plus the turnaround time. If a train was not available, I noted that one needed to be created at the given station, and did the same appending at the destination. Clean and simple.
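
That simulation is short enough to sketch out. Assuming the trips arrive as (departure, arrival, origin) tuples with the origin named 'A' or 'B', and the turnaround time in the same units, something like this does the bookkeeping (again a reconstruction of the approach, not my contest code):

    import heapq

    def trains_needed(turnaround, trips):
        """Count how many trains must start at each station.

        trips: iterable of (depart, arrive, origin) with origin 'A' or 'B'.
        Greedy: handle trips in departure order; reuse an idle train at the
        origin if one is ready in time, otherwise a brand new train starts there.
        """
        ready = {"A": [], "B": []}           # min-heaps of times trains become free
        started = {"A": 0, "B": 0}
        for depart, arrive, origin in sorted(trips):
            destination = "B" if origin == "A" else "A"
            if ready[origin] and ready[origin][0] <= depart:
                heapq.heappop(ready[origin])          # an idle train takes this trip
            else:
                started[origin] += 1                  # no train ready: start a new one
            heapq.heappush(ready[destination], arrive + turnaround)
        return started["A"], started["B"]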

The third problem was a probability problem that I couldn’t remember what to do with. The question was: what is the likelihood that a tennis racket of a given size (expressed as inner and outer radii), with strings of a given radius spaced a given distance apart, hits a circular fly of a given radius? No time element is involved.

Given that, I finally decided this was likely a matter of first determining whether the fly could fit anywhere within the racket without getting hit (fly bigger than the racket, or bigger than the space between strings), and then determining the relationship between the area consumed by the racket and the area available to the fly. I didn’t get this done before the time limit, and never did finish it, but I decided it would come down to computing the area of the ring, which is a fairly straightforward equation (just the area of the outer circle minus the area of the inner circle), and then the area of the strings for one quarter of the racket, multiplying that second area by four.

I got caught up in implementation details on this, as I worried about things like the area under an arc, and derivatives and the like, but I really ought to finish that code…

But, I qualified, and Round 1 is this weekend. I compete first on Friday, and if I fail to make the top 800 in Round 1A, my second chance is Saturday morning. Wish me luck; if I do well here, that’s just one step closer to a trip to Google!

ASP.NET Web Forms Weirdness

I’m not a big fan of ASP.NET. Classic ASP I’m pretty ambivalent about, because it’s basically just PHP with VBScript and/or JScript. ASP.NET MVC I actually like, because it functions in a web-sensible manner, and I can generate all of the HTML or other output by hand, or by whatever means I want. But the standard ASP.NET model is confusing and irritating to me.

ASP.NET was designed around the assumption that it would be used by Windows application developers who suddenly found themselves needing to develop for the Web. This is clear in the way ASP.NET controls resemble the Windows.Forms controls found in .NET, particularly in their event handlers. The framework attempts to create state in a stateless environment, which is interesting, but reads as quite the anti-pattern to anyone who has done much work on the Web.

As I’ve talked about before, I’ve been working on an ASP.NET WebPart to be placed into Microsoft Office SharePoint Server 2007. Due to the final destination of the code, I really have no choice but to follow the ASP.NET Web Forms model, and I didn’t expect it to be that problematic, as I was merely converting a Web Part written (sloppily) for SharePoint 2003 to run on 2007. I probably could have just dropped the DLL into the new SharePoint, but we also wanted to add some additional security features, so we took the problem a bit further, and I’ve been trying to update the Web Part to compile against the .NET 3.5 Libraries.

As nearly as I can tell, this should have been a seamless upgrade. However, I’ve run into a few problems that are either bugs in the Microsoft code, or my own lack of understanding of the ASP.NET Page Life Cycle.

As nearly as I can tell, ASP.NET begins a postback by trying to rebuild the state of the page as it was before the postback, then running the event handlers, then setting the page up again with the new state and rendering it. If I’m wrong about this, please let me know, but it really does seem like the most likely explanation for what I’m witnessing. This would probably be less frustrating if the initialization methods didn’t require special if blocks to handle the differences in execution paths between a postback and a fresh page view.

Running on my assumption about ASP.NET’s execution flow, I’ve had a hell of a time nailing down two odd bugs. The first centered on a DataGrid which was bound to a SqlRecordSet. We’d set the DataKeyField property of the DataGrid to “Id”, one of the values from the SQL statement, but one that was not bound to a column for display on screen. For some reason, the DataKeys member of the DataGrid simply wasn’t being filled in. I finally figured out that accessing the DataKeys.Count field right after the DataGrid.DataBind() call seemed to fix the issue. I have absolutely no idea why, as the Count field is read-only, and I don’t actually do anything with the value. But it worked, and frankly that was enough for that one.

The current issue that I’m having revolves around a DropDownList. The application is a fairly simple database front-end, not unlike what you could build with Microsoft Access. We offer three modes of search: by Student ID (which must be exact), by Student Last Name (which can be partial), or by the sport the student is playing. The application is used to help ensure that student athletes are meeting their NCAA requirements, hence the sports part. For some reason, when searching by student last name, or by certain (but not all, frustratingly enough) sports, it becomes impossible to look up any but the first student on the search results list. The search results are used to populate a DropDownList, which has its AutoPostBack property set to true, so that a postback is triggered via JavaScript whenever its value is changed. It defaults to the first student on the list, so data will always come up.

The problem that I’m having is that, save for a few oddball cases, the postback is happening, but the system is convinced that the DropDownList has no items, at least during the Event Handler which handles the DropDownList’s OnChange event. By the time the page renders, this list is populated again with the information it should have. Attempting to access the Items.Count property on the DropDownList immediately after population has failed to fix the issue this time. Though I’m okay with that really. It honestly bothers me that it worked previously.

I’m very, very close at this time to just looking in the Request.Form collection to pull the data that I need from there in the Event Handler, though I’ve refused so far in an attempt to properly follow the ASP.NET model. I just can’t help but feel that the standard ASP.NET Web Forms model is deeply flawed. Not only because it assumes a particular methodology, but also because it’s very difficult to debug these Web Parts because I can’t easily deploy a debuggable Web Part DLL, particularly when said Web Part depends on running within SharePoint, and would not likely run elsewhere.

If anyone can help me with this, please leave a comment. And if you’re a Microsoftie who happens to work on ASP.NET, maybe you can drop some knowledge on me and my readers about what the hell is going on in the ASP.NET execution pathway.

Political Activism in the Digital Age

We live in a fascinating time, politically speaking. Never before has information about the inner doings of our elected officials and courts been more available. We have blogs, where people post news or their impressions of it; sites like Wikileaks, which try to expose the secrets of all sorts of groups; and so much more. Legal briefs are constantly shared online and can be disseminated faster than ever before thanks to the ease of digital copying.

However, while we can more easily than ever before watch the government, our ability to communicate with them is being severely hampered. This is a matter of policy, but also a matter of the fact that digital communication is so much easier to ignore.

My mother recently included me on a political e-mail chain letter, which was meant to urge people to sign a petition and, once it reached ~1,000 names, send it to a White House drop box to convince Bush to veto a piece of legislation. There are two problems with this. One, Bush would almost certainly not veto the legislation, since it is at least partially in line with policies he’s been pushing for years. Two, small batches of e-mail are easily ignored.

The chain letter (for that’s all it is: a bandwidth-wasting chain letter) should be urging people to e-mail the White House’s comment box themselves. If you want to send a message, there should be thousands of e-mails reaching that box, each from a unique IP and address (and with a cryptographic signature, ideally) to prove the point. But even that doesn’t change the fact that e-mail is easily ignored.

And rightly so. E-mail doesn’t require much effort. It doesn’t require much thought. It doesn’t require any work to submit it, so while it doesn’t make the opinions expressed through it less valid, it does make them seem less so.

When the Electronic Frontier Foundation is trying to really get Congress’ attention, they don’t use e-mail. They write letters. They make phone calls. These things are a tiny bit harder, cost a tiny amount more, but are infinitely harder to ignore. A thousand names on an e-mail are not even relevant when compared to a thousand letters sitting in the mail, or a thousand phone messages left, demanding that things be done differently.

People are unhappy with the government. I am too. I don’t feel that they’re generally doing anything resembling a good job. But the only way to change the system is to change the people involved. Make sure they know how you feel on the issues, and if they won’t support your positions, try to get someone into office who will.

One of my core positions is that career lobbyists have no place in Washington, DC, so I tend to prefer politicians who share that feeling. But there are a lot of issues, and since everyone has different opinions, the political system always comes down to which candidate has the fewest opinions you dislike. Just because you’ll never find the perfect candidate (except for yourself, of course) doesn’t mean you can’t do your part to make sure your opinion is heard. A polite but firm phone call is a lot harder for any politician to ignore.

Whole Foods Adventures: Grains

Recently in modern food science, there has been a big push toward whole grains. I generally feel this is a good thing. Whole wheat (or other whole grains) just tends to taste better, and I’m fortunate enough to be descended from a Western European farming culture where quite a lot of the diet was based on grains. Of course, this does mean that there are plenty of foods enjoyed around the world that my body isn’t really designed for, so don’t think I’m talking down to anyone.

However, as great as whole grains are, the modern food doctrine likely contains only half the story. Yes, as claimed, our ancestors did tend to eat a lot of whole grains, and didn’t have access to the processed flours and the like we use so much of today. However, if one surveys the indigenous peoples alive today, or looks at grain recipes throughout history, it is rare that they ate the grains without first soaking them (at least overnight), usually in some sort of traditional (meaning rich in lactobacilli) milk product.

It turns out that the bran (outer coat) of a whole grain is rich in phytic acid. The problem is that phytic acid likes to bind to minerals in our digestive tract, which prevents them from being absorbed. There is evidence to suggest that people who take to eating a large amount of whole grains, but who don’t soak them first, will tend to develop long-term health problems due to mineral deficiency.

Luckily, this is an easy problem to solve. Catherine and I have porridge at least three mornings a week, often more, as it’s a quick breakfast. I begin the night before by adding ¼ cup of oats to ¼ cup of warm water and a teaspoon of buttermilk. I let that soak while we sleep, and when I get up in the morning, I add ¼ cup of milk and put it over medium heat. Stir it occasionally until it just begins to boil, at which point you’ll want to turn down the heat and stir more often. Serve with butter and a bit of sugar. Coconut oil, which is usually solid at room temperature, makes a fantastic replacement for the butter.

It’s just that easy, and the enzymatic and bacterial action that occurred during that soak means you’ll get better vitamin and mineral absorption out of those grains, and truly get the benefits of those whole grains. Peoples all over Africa, Asia, and South America still use these fermentation techniques on their grains, so please, give it a try. You might even find that you like it better.

GTK+ 3.0

There has been a lot of discussion about the proposed GTK+ 3.0 plan since Imendio presented it at GUADEC last week. Oddly, the presentation is months old, but it seems that most people didn’t become aware of it until the GUADEC presentation. Imendio, a consulting firm that does a lot of GTK+ work, feels that GTK+ needs to move forward, and that their vision is the best means for that. Unfortunately, their vision involves breaking the Application Binary Interface (ABI) stability promise of GTK+ 2.x. Seriously, it’s on page 5 of the presentation linked above.

Why is this a problem? It means that developers will always have to struggle to keep up with the ABI. Independent Software Vendors (ISVs), who have favored GTK+ on Linux to date, would absolutely not put up with this kind of shit. Luckily, Miguel de Icaza recognizes that Imendio’s plans are short-sighted, particularly in that the current ‘plan’ involves no new features. Breaking the ABI for GTK+ 3.0 isn’t necessarily a bad thing in itself; major version releases change ABIs, and it’s possible to install the 2.x series alongside the 3.x series. Hell, I still have a copy of GTK+ 1.2 on my system, though few applications use it.

Windows is successful because of ISVs. There is really no doubt of this. Unfortunately for Microsoft, some ISVs are beginning to become interested in Apple’s platform, which has a better programming interface. As ISVs begin to support the Macintosh, the current status quo will continue to shift. However, part of the reason that developers love Microsoft’s platform so much is that Microsoft works amazingly hard to prevent ABI breakage. In the early days, this meant buying every application for Windows and making sure that the old applications worked on the new version. Later, it meant making sure that, even when fixing bugs, the OS would still function as an application expected, even if the application depended on the bad behavior. There is a very, very good chance that I can take a Windows 1.0 application and run it on XP (Vista actually did change things, so I won’t claim Vista).

Everyone remembers DLL Hell, which was largely a symptom of Microsoft’s approach to this problem, but ultimately it kept ISVs happy, and kept software coming out for the platform. Apple’s biggest weakness with developers is that they have been breaking the ABI with every new version of OS X. They usually try to give developers an opportunity to test against the new version of the OS before they release, but this doesn’t always work out. If anything is going to slow the adoption of the Mac platform among desktop application developers, this is it.

And Imendio feels that this is the direction GTK+ should take. Part of this I’m fine with: breaking the ABI when it is necessary to move forward because of an actual design flaw in the previous version, or because some new functionality is genuinely impossible to provide otherwise. Occasionally, bringing satellite libraries into the core is worthwhile as well. However, Imendio is not talking about either of these things.

So far, the only things suggested for the 3.0 roadmap are getting rid of public fields in favor of getters and setters, and removing anything marked as deprecated. Okay. Great. Both of these sound like reasonable things to do. But what then? What does doing these things gain you that you can’t already do in GTK+ 2.x? How does this give you the ability to stack widgets? Build non-standard UI? What the hell is ‘non-standard UI’ (actually, I think they want WPF-like functionality)? Why do you think we need physics support in GTK+? How do you intend to make creating widgets easier? What kind of OS-level functionality do you think belongs in GTK+? And what’s the deal with tabs?

Okay, so the tabs thing was a joke at KDE’s expense. Back at the point…

Can GTK+ be improved? Almost certainly. However, before it’s worth discussing too much, you need to be able to provide a good case for the move. You need to show developers how GTK+ 3.0 will actually improve things for them. Emmanuele Bassi seems to feel that GTK+ 3.0 is absolutely necessary. However, I haven’t read anything that presents a strong case for how GTK+ 3.0 will actually improve GTK+. In fact, Emmanuele even seems to acknowledge that the current GTK+ 3.0 proposal will be feature-weak.

The real danger with GTK+ 3.0 at this point is that the GTK+ team will begin moving in that direction and let the proven GTK+ 2.x fall by the wayside, ignored as the team tries to move forward without a plan that ISVs can get behind. Perhaps Emmanuele is right, and the ISVs don’t seem interested in shaping GTK+. However, before embarking on a new major version, the GTK+ team needs to reach out to those ISVs, and make the discussion open to all users of the platform. The GTK+ platform needs to be shaped by the ISVs who use it, and frankly, it should probably be shaped largely by the commercial ISVs. If anything, this is Qt’s biggest leg up in the toolkit war: it has always been a commercial library, with a commercial company backing it and responding to commercial customer needs.

GTK+ began its days as a toolkit for the GIMP, and that’s basically all it was used for until Miguel and Nat decided to use it for the nascent GNOME project. Since then, GNOME has been the primary force driving GTK+ development. But because of the ABI-compatibility promise in GTK+ 2.x, there are more ISVs using GTK+ than even Qt on the Linux platform. That number is still abysmally small, though. Don’t fork off GTK+ 3.0 until there is a solid picture of what GTK+ 3.0 needs to be in order to supply the functionality that ISVs demand but GTK+ 2.x is unable to provide.

I don’t think anyone is completely opposed to the idea of GTK+ 3.0. We just want to see a real, honest-to-god plan and reason before that direction is taken.

SharePoint 2007 Web Part Security

Microsoft’s SharePoint is a team-oriented portal system that Microsoft has been pushing heavily due to its integration with the Microsoft Office suite of products. To this end, Microsoft offers a lot of built-in web parts: discussion lists, blogs, simple forums, and quite a bit more. However, occasionally users will require something more out of SharePoint than is provided by the core functionality.

To fill this void, a developer can write Web Parts, which are small web applications that fit inside larger pages and are either completely self-contained or can communicate with other web parts on the same page. Even better, you can write a web part using the .NET 2.0 Web Parts framework, allowing a general-purpose Web Part to function on non-SharePoint .NET portals. This is useful for web parts that are largely meant for the consumption of public data, say a Flickr Web Part, or a weather Web Part. Sometimes, though, you’re going to want to use a Web Part for other purposes, for instance as a front-end to a database. For that, you need more control: you need to be able to limit users’ access. You need to tie more tightly into SharePoint. Luckily, this can be done easily, with virtually no code changes.

Oddly enough, you don’t even have to inherit from SharePoint’s native Web Part class (Microsoft.SharePoint.WebPartPages.WebPart) in order to get access to the SharePoint Context object (Microsoft.SharePoint.SPContext). In my opinion, it is generally safer to inherit from Microsoft.SharePoint.WebPartPages.WebPart if you use anything in the Microsoft.SharePoint Namespace, as your Web Part will already only work in SharePoint, so you might as well. Unfortunately, this won’t prevent a SharePoint Web Part from loading into a non-SharePoint portal, as Microsoft.SharePoint.WebPartPages.WebPart inherits from System.Web.UI.WebControls.WebParts.WebPart, meaning that if you want your Web Part to always fail gracefully, you’ll require something like this:

Protected Overrides Sub Render(ByVal Output As System.Web.UI.HtmlTextWriter)
    ' SPContext.Current is Nothing when the part is hosted outside SharePoint.
    If Microsoft.SharePoint.SPContext.Current Is Nothing Then
        ' Output a friendly error message and bail out instead of throwing.
        Output.Write("This Web Part must be hosted within SharePoint.")
        Return
    End If
    ' ... normal SharePoint-dependent rendering goes here ...
End Sub
Of course, you have to be careful about anything that might use the SPContext object earlier in the Web Part execution cycle (CreateChildControls is a common culprit), but this allows you to fail gracefully, which is particularly important if you intend to distribute your Web Part at all.

Okay, so you’ve got a Web Part that you want to control user access to in a fine-grained way. As noted above, you do this by checking the current SPContext object, which is exposed as a static property, so you shouldn’t instantiate it yourself. One of the biggest problems I ran into with SharePoint was determining whether the Access Denied page my users were experiencing was related to my code or to actual access controls within SharePoint. Due to this difficulty, you’re going to want to make sure you [have the MSDN handy](http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.aspx), or you’re likely to spend a lot of time butting your head against the wall.

One potentially easy mistake is checking the wrong objects to determine permissions. I had begun working under the assumption that to determine a user’s permissions, we’d want to check the permissions that user has on the site (or sub-site) that they’re on, and there is a method, CheckForPermissions, on the SPSite object that seems made for exactly that. The following code certainly looks like it should work:
SPContext.Current.Web.Site.CheckForPermissions(SPContext.Current.Web.ReusableAcl, permissionsMask)

This only appears to work for users who have Full Control of the site in question. And Microsoft’s API documentation doesn’t appear to offer any commentary on why that might be. If you know, please, leave a comment. Let me know. This is another unfortunate case of a Microsoft API having, or appearing to have, several methods to accomplish a given task, and I really hope that the documentation improves.

It turned out that I’d been overthinking the problem. The SPWeb object gives you the exact method you need, though its name, while descriptive, wasn’t quite as discoverable. It turns out that the proper way to check a user’s permissions is this:


                SPContext.Current.Web.DoesUserHavePermissions(permissionsMask)

It’s just that easy. The nice part is that you can check the permissions against the flags in the SPBasePermissions enum. That enumerated type doesn’t have any permissions specific to Web Parts, but I found that, for my purposes at least, using the *ListItem set of permissions (ViewListItems, EditListItems, and so on) was the best means of controlling access, since for our purposes users who couldn’t modify the lists also weren’t going to be allowed to modify the data in the Web Part. If that won’t work for you, you may need to create a sub-site, which luckily isn’t very hard; otherwise, I don’t know how to create custom permissions, and I’m not entirely sure it’s possible.

Maybe the reason I couldn’t find a reference to this in my Google searches is that other people simply weren’t having this problem, but I was, and I do hope this will help someone else. I don’t know why the SPSite.CheckForPermissions method exists; if it won’t work when a non-owner is on the page, it isn’t terribly useful for limiting permissions. Just remember to use the correct object, and everything will be fine.

Doctor Horrible's Sing-Along Blog

Joss Whedon is at it again, this time with an online broadcast of a three-act musical he wrote during the Writers’ Strike. Hey, clearly, some people just do it because they love it. Doctor Horrible’s Sing-Along Blog is a superhero story pitting the evil Doctor Horrible (played by Neil Patrick Harris) against the hero Captain Hammer (played by [Nathan Fillion](http://www.imdb.com/name/nm0277213/)). Both of whom are competing for the love of the beautiful Penny (played by Felicia Day).

The project is written by Joss and his half-brother Zack, who wrote on Deadwood, and the music was composed by his half-brother Jed. Joss and Zack have worked together previously on the Nervouscircus series of YouTube videos, which I had never heard of before today, and which I will likely be consuming rapidly for a while in my love for all things Joss.

The visual style of Doctor Horrible reminds me greatly of the 1930s and 40s era superhero and monster films, though with perhaps a touch more satirical bent. The first Act went online today, July 15th, with Act 2 due on the 17th, and Act 3 on the 19th. However, the video will only be available online for free until Sunday, July 20th, so if you want to catch this special event, you’ll have to hurry! Luckily, after the fact you’ll be able to get the video off iTunes (I’d say buy, but it’s clearly just a prohibitive license), or on DVD.

The writing appears clever and funny, the acting is well done, and the music is nice to listen to. If you’re a fan of Joss Whedon’s work, as I have been since I first discovered Firefly, you owe it to yourself to at least watch this free webcast. And if you like it, please buy the video when it’s available. YouTube has shown that film production by independent artists is not only possible, but a great cultural boon (admittedly, 90% of what’s on YouTube is crap, and at least half of what becomes popular on YouTube is crap too). I may be disappointed that the only downloadable version of Doctor Horrible is going to be burdened with oppressive DRM, but that won’t stop me from buying the DVD, and enjoying the show for free until the 20th.

Whole Food Adventures: Kvass

Kvass is a lacto-fermented beverage invented by the Russians. It’s a simple mixture of whey (rich in lactic-acid-producing bacteria), water, sugar, and, traditionally, bread, usually a hearty bread like rye. The word Kvass is derived from a Russian word for ‘leaven’, a clear throwback to the use of bread in the production of the drink, and it does have a very mild alcohol content: so mild that it’s considered fine for consumption by children, probably on par with Kombucha.

However, you don’t have to use bread. Any flavorful, starchy vegetable should be able to do the trick. Nourishing Traditions, the book which prompted this series of posts, suggests making the drink with beets. The natural sugar content of beets should be similar to that of most breads, and they’re a good source of vitamins A, C, and B-complex. Plus, you get the flavor of the beets themselves.

So, whether you want a beet- or a bread-flavored drink, Kvass is an easy way to get into home fermentation. Just cube the base you’re building your drink from, filling a jar about one third full of bread or other vegetables, then fill the container with water plus maybe a half-cup of whey per quart. Leave it on the counter for a few days, and then transfer it to the fridge. Easy.

Want more flavor? It’s traditional to add a wide variety of herbs and spices at the beginning of the fermentation process. I think that a few sprigs of mint in with a batch of beets should be really tasty, and might well be our next attempt at Kvass.

Kvass is largely an Eastern European drink, and I do like the story of Kvass in Latvia during the 1990s. Through the early 1990s, you could buy Kvass off street vendors all through the country, but after the Soviet Union fell, health laws were passed which made the street vending illegal. Coca-Cola swooped in and quickly dominated the market. In 1998, commercial Kvass bottlers opened up, and managed to take Coke from 65% to 44% market share in less than two years. Coke’s response? They got into the Kvass bottling business. Admittedly, a lot of commercial Kvass is apparently made with mostly sugar and water, and likely loses a lot of the health benefits of home-made Kvass, but it’s still likely to be healthier than cola.

Yahoo! Wants to be the BOSS

Aside from having the most bitching name for a service enabling Web 2.0 mashups in recent memory, Yahoo!’s announcement yesterday of BOSS, the Build-your-Own-Search-Service API, opens up some interesting possibilities.

Yahoo! was once upon a time the great name in the search engine game. Everyone used them. Then Altavista came along for a short period, and finally Google showed up on the scene and quickly became the 800-pound gorilla of search. Sure, Microsoft has tried to push Live Search recently, but when I find myself reaching for a different search engine just to search Microsoft’s own sites, I know I’ll never fully embrace Live Search. So, Google it’s been for most of the last decade.

Still, Google’s been slipping on certain searches recently, mostly due to the number of people who’ve learned to game Google. I understand why they have: Google is the one everyone uses. Googling has quickly become the standard verb for searching the web, and let’s face it, Googling is catchier than Yahooing could ever be. Google does a decent job of detecting people who are gaming them and preventing the games from working, but it has definitely taken its toll.

Yahoo! is using this opportunity to revitalize their search, but after reading about BOSS, I don’t think Yahoo!’s resurgence is going to come from becoming a better search engine, which they may well do, but through the mashups that BOSS will allow. BOSS, as its name suggests, is a web service that Yahoo! is making available that you can call to get access to Yahoo! search results. Once you have them, you can do anything you want with them: reorder them, drop results you don’t like, whatever. Then you present them to the user, on your own page. Unlike Google’s Custom Search, which is useful, BOSS makes you the search engine site; you just get to use Yahoo!’s data.
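
Conceptually, a BOSS mashup boils down to: fetch Yahoo!’s results as JSON, rework them however you like, and render them yourself. The sketch below shows the idea in Python; the endpoint URL and response field names are my recollection of the BOSS documentation rather than something I’ve verified, and you’d need your own application ID.

    import json
    import urllib.request
    from urllib.parse import quote

    # Assumed shape of the BOSS v1 web-search endpoint; check Yahoo!'s docs.
    BOSS_URL = "http://boss.yahooapis.com/ysearch/web/v1/{query}?appid={appid}&format=json"

    def boss_search(query, appid):
        """Fetch raw web results for a query (response field names assumed)."""
        url = BOSS_URL.format(query=quote(query), appid=appid)
        with urllib.request.urlopen(url) as response:
            data = json.loads(response.read().decode("utf-8"))
        return data["ysearchresponse"]["resultset_web"]

    def rerank(results, favored_word):
        """The whole point of BOSS: the results are yours to reorder.
        Here we simply float results whose titles mention a favored word."""
        return sorted(results,
                      key=lambda r: favored_word.lower() not in r.get("title", "").lower())

    # results = rerank(boss_search("kvass recipes", "YOUR_APP_ID"), "fermentation")

The interesting part isn’t the fetch; it’s that the re-ranking, filtering, and presentation are entirely yours.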

The launch mashups of BOSS are interesting, though not terribly revolutionary, at least to me. Me.dium seeks to watch what users are searching for to create an experience based on buzz-words. Think Digg for Search, I guess. Cluuz seeks to create a tag-cloud based searching experience. I just see this as more Web 2.0 bullshit with fancy names and some garbage about the community, but Yahoo! is allowing companies to try to reinvent search without having to actually reinvent search. Search engines are a pain in the ass. They require a huge amount of data. Being able to leverage something else for that data is simply awesome.

BOSS has the potential to change the way we search, and I have a few ideas in that space. We’ll see what comes out of it, and frankly, I think this is an awesome move on Yahoo!’s part. Let’s see how it plays out. Oh, and it’s one more reason I really hope that Icahn fails.

Spokane Burning Down

Spokane is on fire. It apparently began at around 5:20 this afternoon, and has grown to consume several hundred acres in about four hours. Due to the location, my wife’s family has had to evacuate.

With any luck, the fire will be out soon, and no more homes will be consumed. It will be incredibly strange to try to return to the Dishman Hills after this is done, based solely on what I’ve heard so far.

No cause reported. Just fire. It’s been quite a few years since fire season has struck so dangerously close to home for me.

Upcoming Programming Contests

Summer has come, and with it a bevy of Programming Contests for all who are interested. In addition to the already mentioned Underhanded C Contest, there are two major contests starting in the next few weeks.

First is the International Conference on Functional Programming’s (ICFP) annual programming contest. The 2008 contest begins on Friday at noon Pacific Time, and has been organized by Portland State University and the University of Chicago. Unfortunately, we don’t know what the challenge is yet, as they won’t announce it until the last minute, but the last few years have been interesting. I’m hoping to once again get The Bobcat Hackers together to work over the weekend. All we really know is that a Live CD image will be available, and our programs must run in that environment.

With more information available is Google’s Code Jam, scheduled to start next week. Unlike the ICFP contest, there is real money behind this one, but it’s an individual sport: no teams. Plus, there is a qualifying round beginning next Wednesday. Like the ICFP contest, there is a tight time limit, though luckily the Code Jam problems are smaller.

The only way to really keep up on Programming is to practice. Events like these are great practice. I’ll be giving my impressions once the contests are over, but in the meantime, take part. You’ll be glad you did.

Microhoo! Reloaded

I’ve neglected to talk about Microsoft’s well-known bid to buy Yahoo! earlier this year. People were excited by the news, since it arguably vastly over-valued Yahoo! at the time; hell, even with the immediate $10 jump in the stock price after the announcement, shares still traded below Microsoft’s buyout offer. However, Yahoo!’s Board of Directors managed to successfully resist the buyout. Of course, stockholders were pissed about this, since they were more concerned with getting Microsoft’s money than with what was good for Yahoo! or the industry. The stock price dropped over the month of June back to almost its six-month low.

Carl Icahn, who had absolutely no association with Yahoo! until the Microsoft deal started going south, bought up a bunch of Yahoo! stock in mid-May and has proceeded to attempt to unseat the current Board of Directors for their ‘irrational’ reaction to Microsoft’s offer and the disservice that he feels that they did their shareholders by refusing it. Needless to say the ~50 million shares of Yahoo! stock that Icahn owns are going to net him a solid bit of cash if he can force Yahoo! to sell.

Now, news is starting to hit the street that Icahn has been in talks with Steve Ballmer, who has allegedly indicated that he might be interested in making another offer if the Board is changed. Mind you, this is pretty much hearsay from Icahn, and it’s entirely possible he’s exaggerating comments Ballmer has made (it’s very likely they’ve been talking) to drive Yahoo!’s price back up, though what he’s quoted Ballmer as saying is pretty vague already.

Icahn’s reasons for his actions are simple: he is looking out for the money. Unfortunately, so are the majority of Yahoo! shareholders, but such is life with a publicly traded company. And if you look at what would cause the best short-term increase in value for the Yahoo! shareholders, taking the money is a great idea. But is it good for those shareholders in the long term? Is it good for the industry in the long term? As for the shareholders, it’s hard to say at this point whether Yahoo! will ever be the powerhouse they once were. But then, it’s likely that this is largely an image problem. Yahoo! has an enormous number of users, who are incredibly loyal. Yahoo!’s mail service is still considered bigger than Hotmail and GMail. Yahoo! Buzz has users who have never heard of Digg (and likely never would have). Yahoo! has frankly done a lot to bring Web 2.0 services, including Flickr, to users who would never be exposed to them otherwise. Do we need a Digg clone? Probably not, but users who like Buzz might otherwise be drawn away to something similar, and new users will be attracted to Buzz by the convenience of having it tied into Yahoo!’s user system.

It makes sense that Microsoft wants Yahoo!. Yahoo! is an understated company, largely because they don’t make news the way that Google and Microsoft do. Are they as well off as they used to be? No, but I feel it would be amazingly imprudent to call them a failing company. The AdSense deal with Google, in my opinion, was only made to improve the company’s financial situation a tiny bit and make them look better to investors. Ultimately, it did look desperate, and probably won’t help their case in the long term.

However, I believe that this deal would be incredibly bad for Microsoft, Yahoo! and the industry in general. Microsoft and Yahoo! suffer some similar internal problems. There are lots of teams working on various projects, some of which are duplicating effort. Both have poor communication between teams to prevent such duplication. Money is often poured into several competing projects. These are not healthy things for any large company, though it’s not caused too many problems to date. However, the ways in which the companies are different are even more damaging. Yahoo!’s internal philosophy has usually been one in favor of openness. Yahoo! views themselves as a services company, so a fair amount of their source is Open, and they share their technologies in many cases. Microsoft is very, very slowly moving down this path, but a large number of their projects are still heavily based on proprietary data formats and protocols.

I really don’t see any way for these two companies to come together into anything resembling a healthy business, and trying could frankly lead to a really bad situation for Microsoft. Ignoring the problem of bringing the companies together, I think the bigger problem is that this would simply be bad for the industry. We’d end up with less competition. We’d end up with a lot of Yahoo!’s projects (like YUI) losing their corporate support. Who knows what would happen to services like Flickr, which aren’t based on Microsoft technologies? Admittedly, it took Hotmail years to switch from Unix servers to Windows, but it did happen. I suspect a lot of services that a lot of people use would be re-engineered onto Microsoft technologies for no better reason than that Microsoft would either recreate them or destroy them. Microsoft wants the search part of Yahoo!’s business, but I don’t think they would shy away from the opportunity to destroy the competition that Yahoo! represents, and snatch up any engineers who might work for them.

Yahoo! shareholders: please resist the urge to take a quick payout, which frankly is far from guaranteed at this point, and try to make decisions that are good for Yahoo! in the long term. If Icahn gets his way, there is no guarantee that Microsoft will actually make an offer, or that they’ll offer as much as they did before. Even if they do, you may make a bit of money today, but you’ll likely have hurt the industry for a long time.

Whole Food Adventures: Pickling

Pickling is the most traditional means of food preservation around; in modern times, however, most people have reserved pickling for cucumbers and relegated the practice to a mere handful of companies. Now, I enjoy a good kosher dill as much as anyone, but the fact of the matter is that pickling is easy, and by doing it yourself, you can make absolutely any kind of pickle you want. Plus, you can pickle most any vegetable, so if you want to pickle something other than cucumbers, DO IT. And it can be healthier than eating many of the vegetables straight, and certainly healthier than buying pickles at the store.

There are two methods of pickling. The traditional lacto-fermentation process, and a heat and vinegar process. Both taste good, but the best nutrition comes from the traditional fermentation.

The heat and vinegar process involves heating water, vinegar, and usually some sugar in a saucepan to a gentle boil, and pouring it over the vegetables you want to pickle in a sealable container. Refrigerate for a week, and enjoy. Baby carrots are really good done this way; add some crushed red pepper and dried chiles for flavor and you’ve got what Alton Brown calls Firecrackers. However, while this is not unhealthy, it doesn’t have the same health benefits that traditionally fermented pickles do.

One of the most classic fermented vegetables is cabbage. Fermented cabbage goes by many names: in Germany, it’s sauerkraut; in Korea, it’s kimchi. And it’s dead easy to make. You begin by coring a head of cabbage. My preferred method is to quarter the cabbage (cut it in half, then in half again) and make a diagonal cut to remove the tough stem. Then, cut the cabbage as thinly as you can. This may be easier with a food processor or a mandoline, but it’s far from impossible with a good sharp chef’s knife.

Put the cabbage in a large mixing bowl (I’d recommend stainless steel) and add a tablespoon of caraway seeds and two tablespoons of whey, which is very rich in lactobacilli. Then proceed to pound on the cabbage for ten minutes, preferably with a large wooden mallet (see why I suggested a steel mixing bowl?). Anything fairly heavy should work fine, though; I used a partially filled mason jar last time. After about ten minutes, the cabbage should be pretty well broken down, and there should be a fair amount of liquid in your bowl. Dump the cabbage and liquid into a jar, and make sure the liquid comes to just over the top of the cabbage. If you need to add a bit of water, that’s fine, but filter it first. The last thing our little bacterial friends need is chlorine.

Three days on the counter, and you’ll have a fine, tasty jar of sauerkraut. Oh, and don’t have the caraway seeds? Don’t fret too much; it’ll taste almost as good without them. And this sauerkraut will improve with age. Finally, there are the nutritional aspects. Not only is cabbage a healthy plant to begin with, fermenting it into sauerkraut increases the available Vitamin C by a significant amount (35% of the daily requirement, to be precise), not to mention the Vitamin K and iron that our bodies can’t quite get out of cabbage to begin with.

Pickling is easy, and healthy. Admittedly, proper lacto-fermentation does require a bit of whey, but that’s a free by-product of home-made cream cheese, and if you haven’t already discovered the wonders of that, then you really should get on it.

Bad Design: Microsoft Virtual Server

At work, we’ve been moving quickly in the direction of server virtualization. There have always been benefits to reducing the attack exposure of a server by minimizing the number of services running, but this has traditionally been immensely impractical: servers take up space and use quite a bit of electricity, and the fact is that most servers sit around mostly idle most of the time. So, given that many servers usually aren’t working very hard, why not combine many servers into one? This is exactly what server virtualization is all about.

This space has traditionally been dominated by VMWare, though several open source options, like Xen, have become available in recent years. On the desktop, I’ve been using VirtualBox OSE, which does a decent job, at least until I can justify the purchase of VMWare Workstation (which frankly isn’t too expensive at ~$200). Between these open source tools, and Microsoft’s Virtual Server and Virtual PC both being free to licensed Windows users, it will be interesting to see how long VMWare can keep their prices where they are.

To be fair, VMWare is still the superior product, but the cost savings, and our sysadmin’s tendency to prefer all things Microsoft, led us to use the Microsoft solution. And it’s worked out pretty well for us. Virtual servers are easy to create, back up, and redeploy. The only problem we’ve had is that these operations take a long time because we’re not using a NAS device, but the software has done pretty well for us despite this. 10+ hours to transfer a 150 GB image over Gigabit Ethernet pretty much sucked, but it did work.

So, why is this software being called out for bad design? Simple. Deploying a Virtual Server image out of the backup library deletes the backup, without any option to do otherwise. Our sysadmin has been in the process of rebuilding all of our servers with Windows Server 2008, and last weekend was his opportunity to rebuild the Virtual Server hosts (I’m not sure why we’re not using Hyper-V, don’t ask). The rebuild went fine, aside from the redeployment of some of the virtual servers being slow, but again, a NAS will fix this, and it’s an in-progress purchase.

Due to the staggering amount of time it took to do the redeployments, immediate backups were not performed. It was assumed that waiting until this weekend would be fine. Anyone who has done Systems Administration knows what happened next.

The Virtual Server host lost the largest virtual server, the only one that wasn't part of any standard backup scheme, because we had been told it was for temporary storage of image data while it was being cataloged. It was being used for more. Much more. Attempts were made to recover the images, including running several undelete tools on the server in question. The only thing not done was immediately taking the server offline and imaging the drive for analysis and possible recovery. The Sys Admin felt it wasn't necessary, and I lost a much-sought opportunity at forensic analysis. :(

Sure, we should have had that backup. However, if best practice dictates that you immediately back up a virtual server deployed from the vault, why does the software delete the version it's deploying? Who in the hell thought that was a good idea? Drives fail. Software fails. We back up to protect ourselves from that. It was entirely possible that the failure that lost the virtual machine could have happened in the window between the image being deployed and the backup being completed, and in that case, whose fault would the failure have been?

One of the first rules of writing software is that it must be resilient. Because the software moves an image out of the library when deploying it, rather than copying it, that server was lost. This is not resilient programming. This is not resilient design. This is not resilient software.
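To make the distinction concrete, here's a minimal sketch of the two behaviors, with made-up paths and function names; this is not Virtual Server's actual API, just an illustration of move-on-deploy versus copy-on-deploy:

    import shutil

    # Hypothetical locations; Virtual Server's real layout and API differ.
    LIBRARY_IMAGE = r"D:\VSLibrary\FileServer.vhd"      # the stored "vault" copy
    DEPLOYED_IMAGE = r"E:\VirtualServers\FileServer.vhd"

    def deploy_by_moving():
        # The behavior we ran into: after deployment the library copy is gone,
        # so the running image is the only copy until a fresh backup completes.
        shutil.move(LIBRARY_IMAGE, DEPLOYED_IMAGE)

    def deploy_by_copying():
        # What resilient software would do: the library copy survives as a
        # known-good fallback, at the cost of disk space and copy time.
        shutil.copy2(LIBRARY_IMAGE, DEPLOYED_IMAGE)

Yes, the copy costs disk space and time, but that is exactly the trade-off backups exist to make.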

For the most part, our experience with Microsoft's virtualization technology has been fine. I had a bit of trouble booting Ubuntu 8.04 inside it (tip: add the 'noreplace-paravirt' boot option), but it's mostly worked pretty well. Still, this particular design decision is so egregious that I can't imagine what the person who made it was thinking.

Don't be completely afraid of Microsoft's tech. But do be sure to do your backups religiously.

The Myth of Visual Studio JavaScript Debugging

When Visual Studio 2008 was just around the corner, I had the opportunity to attend a few sessions on the cool new features of the suite. One that everyone was keen to demonstrate was Visual Studio's ability to debug JavaScript. Start up an application with debugging symbols and it will fire up in your default browser; hit a breakpoint within some JavaScript (or let an error be raised from the JavaScript) and the Visual Studio debugger will take over, allowing you to view the values of variables and step through the JavaScript code. Plus, I've been able to use this from Firefox, so it's not an IE-only feature.

Not all is sunshine and roses, however. The way the technology apparently works is incredibly restrictive, and it locks you into a development pattern that I don't really endorse: it only works if the JavaScript is stored in the same output file as the HTML source for the page. Want to debug your JavaScript? Better put the JavaScript inline with the page. This ends up creating output that is larger than necessary, and it doesn't allow you to cache your scripts separately from the page source. Sure, you can refactor the scripts out when you go to publish, but then you have to perform the refactoring, re-test everything, and then publish.

In general, I think it was done this way because it fits very neatly with the development model that ASP.NET seems to promote. My biggest problem with ASP.NET is that it actually seems a lot harder than writing pure HTML, because it abstracts the form away in a way that doesn't make a lot of sense to me. In addition, because of the way User Controls are structured, it seems unnecessarily difficult to build sites that don't depend on JavaScript being enabled in the browser.

This is an interesting dilemma, one that I've discussed before. I firmly believe that, in most circumstances, a web application can and should operate without JavaScript and CSS. It won't look as pretty. It won't be as snappy or cool. But it will work, and sometimes that's enough. The only way to get there is to design with a proper abstraction between the web application and the JavaScript that can power it. This is, of course, the heart of what Yahoo!'s Graded Browser Support is all about: your pages should work not only with A-Grade browsers, but also with older, less capable C-Grade browsers. This can be a hard thing to test, but luckily, testing for C-Grade compatibility (essentially, turning off JavaScript and CSS and making sure the core functionality still works) is significantly easier than testing across the A-Grade list.

I've been trying to follow these guidelines for several years, though I've found that most developers simply don't. And it's not impossible to do this with ASP.NET; it's just that ASP.NET makes it harder, since it's easy to put a dependency on scripting into a page without realizing it. Frankly, even in the development I'm doing with the ASP.NET MVC Framework, I'm writing the HTML either by hand or with the HTML Helper object that ASP.NET MVC supplies. I haven't needed, nor really wanted, to use Web Forms.

So I think it's the assumption that JavaScript belongs inline on the page that leaves external scripts undebuggable. It's an unfortunate limitation, and one I hope will be resolved fairly soon. In the meantime, I guess I'll just continue to use Firebug.

Profiling SSH Tunnels

A set of researchers at the Universita degli Studi di Brescia in Italy recently published a paper detailing a method to fingerprint data being tunneled over SSH. This is a particularly relevant bit of research, as more and more organizations have been busily filtering the kinds of traffic allowed through their border routers. In some cases this has been to protect internal resources, in others to restrict unauthorized use. Whatever the reason, many of these systems can be overcome by the use of tunnels, which the paper begins by discussing.

A tunnel basically wraps one application's protocol inside a different protocol. The practice serves two purposes: first, to wrap an insecure protocol in a secure one (as when tunneling over SSH), and second, to sneak an unauthorized protocol out disguised as an authorized one. Of course, the second method, which usually uses HTTP as the carrier, is vulnerable to the increasingly common Deep Packet Inspection, which actually opens up packets to determine what is going on inside them, so those tunnels are easily thwarted. Wrap the illicit traffic in an encrypted SSH or SSL tunnel, however, and the average firewall configured to allow that protocol will pass the data right through.

Enter this research. It turns out that most protocols can be identified with reasonable certainty based on metrics such as the size of the packets and the time interval between them. It's really quite clever, and the mechanism used (Bayesian filtering) is increasingly finding uses in this sort of work. The basic theory is that you analyze the deltas (size and time) for a small set of packets, disregarding the first few (which are the SSH authentication packets), and within a small handful of packets you can have a pretty good idea of what people are doing, and in particular whether the behavior is allowed or not.
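To give a flavor of the approach, here's a minimal naive-Bayes sketch over (packet size, inter-arrival time) pairs. The class statistics are invented for illustration; they are not the researchers' actual features or model:

    import math

    # Hypothetical per-class (mean, std dev) statistics for packet size in bytes
    # and inter-arrival time in seconds. Real values would be learned from traffic.
    CLASSES = {
        "interactive":   {"size": (90.0, 40.0),     "delta_t": (0.30, 0.20)},
        "file transfer": {"size": (1400.0, 200.0),  "delta_t": (0.002, 0.002)},
        "tunnel":        {"size": (700.0, 450.0),   "delta_t": (0.05, 0.08)},
    }

    def gaussian_log_pdf(x, mean, std):
        return -math.log(std * math.sqrt(2 * math.pi)) - (x - mean) ** 2 / (2 * std ** 2)

    def classify(packets, skip=4):
        """Score each class over (size, delta_t) pairs, skipping the first few
        packets, which carry the SSH handshake and authentication."""
        scores = {name: 0.0 for name in CLASSES}
        for size, delta_t in packets[skip:]:
            for name, stats in CLASSES.items():
                scores[name] += gaussian_log_pdf(size, *stats["size"])
                scores[name] += gaussian_log_pdf(delta_t, *stats["delta_t"])
        return max(scores, key=scores.get)

    # A burst of large, closely spaced packets looks like a bulk transfer:
    print(classify([(1450, 0.001)] * 20))    # -> "file transfer"

Even a toy model like this shows why only a handful of packets are needed: once the handshake is out of the way, the per-packet evidence accumulates quickly.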

The general assumption espoused by the article is that the only legitimate uses for SSH are terminal sessions and file transfers. That may be true for SSH, but its cousin SSL is used for so much more. Luckily, if you implement their technique, you can filter whatever you want. In many ways this idea would be an incredibly useful addition to your typical Intrusion Prevention System, which already seeks to do something similar: it takes the model you've defined for acceptable behavior and disallows anything that doesn't fit. These systems are finicky, and (particularly early on) require constant monitoring and correction, but they're a far better solution for many networks than the existing methods of fingerprint-based Intrusion Detection.

Unfortunately, this system isn't yet perfect. It will tend to view mistyped passwords as unauthorized traffic. Not to mention, all it can currently tell us is whether an SSH connection is an interactive session, a file transfer, or a tunnel. Still, the research is interesting, and no doubt a lot of people in the security appliance business are watching it. I really think that the best application for this technology is as part of an IPS.

The research isn't ready for production use yet, but it's interesting and worth looking at. I suspect that by this time next year the mathematical models for analyzing encrypted traffic will have come a long way. Of course, it all goes to show that analyzing encrypted data, even data you can't decrypt, is always worthwhile. Watching who is communicating with whom, how often, and how much can tell you a lot about the nature of the communication, whether you're a general in a war zone or just trying to keep filesharing off your network. Encryption is a great tool, but encrypted traffic still tells much to those who know how to look.