Thursday, December 22, 2011

Introducing HTML5

The other day I was at the bookstore (my wife had organized a book fair for my daughter's school) and decided to check out the books about HTML5. I had read a few things in the past but wanted a deeper understanding of what changes it brings to writing web applications. I grabbed the five books I could find on the subject and started browsing through them. Introducing HTML5 (2nd Edition) hooked me from the introduction and I ended up buying it.

The introduction provides a history of the development of HTML5. I had thought of HTML5 as the evolutionary successor to HTML4 but was interested to find that, if not for a small group at Opera led by Ian Hickson, it almost wasn't a reality. The W3C was pushing XHTML 2.0 as the successor to HTML 4, which required XML syntax and broke backward compatibility, whereas the Opera group took an approach more focused on current browser behavior and looked to add items to simplify web development.

The first chapters deal with the changes to overall structure of an HTML document and text on the page. There are chapters on the new form elements, video, the new canvas tag and offline storage. There are a couple of chapters on geolocation, messaging, and sockets that I haven't gotten to yet. Finally there is a chapter on backward compatibility and using polyfills to make sure your pages work with earlier browsers.

Overall it's a great overview of the new stuff in HTML5, why the new tags were added and when to use them.

Saturday, October 22, 2011

If Statements

It's amazing how something simple like an if statement can have so many different permutations and options. Here are some of my general rules to follow:
  • Always put the true condition first. Some people prefer the shorter block first (or last) but I think if you consistently put the true condition first it leads to less confusion when looking at the code later.
if (trueCondition) {
    DoSomething();
} else {
    DoSomethingElse();
}
  • Always take advantage of short-circuit evaluation and put the cheapest conditions (already cached values or the simplest tests) first.
if (localVar && string.IsNullOrEmpty(s) && expensiveTest()) {
  • Don't explicitly return true or false
// Argh, don't do this.
if (condition) {
  return true;
} else {
  return false;
}
// Just return the result of the condition
return condition;

// Or even worse I've seen this:
bool value = condition ? true : false;
// Just set the value:
bool value = condition;
  • Unless it is a one-line if statement, always bracket the blocks. This protects against the case where you write the single-statement form and a later developer adds another statement to the branch without noticing the missing brackets, introducing a bug. For example, this is OK:
if (arg == null) return;
but not this:
if (condition) 
  x = 4;
Because too often I've seen another developer later add a line of code thinking it's part of the branch:
if (condition)
    x = 4;
    y = 6;
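
The short-circuit rule from the list above can be checked with a small sketch. The names ShortCircuitDemo, ExpensiveTest and Check are hypothetical and exist only for illustration; the counter shows that the costly call is skipped whenever a cheaper condition to its left is already false.

```csharp
using System;

public static class ShortCircuitDemo {
    // Counts how many times the (hypothetical) expensive test actually ran.
    public static int ExpensiveCalls = 0;

    // Stand-in for a slow check such as a database or file lookup.
    public static bool ExpensiveTest() {
        ExpensiveCalls++;
        return true;
    }

    // Cheapest conditions first; && stops at the first false operand.
    public static bool Check(bool localVar, string s) {
        return localVar && string.IsNullOrEmpty(s) && ExpensiveTest();
    }

    public static void Main() {
        Check(false, "");      // localVar is false: ExpensiveTest is skipped
        Check(true, "text");   // s is not empty: ExpensiveTest is skipped
        Check(true, "");       // both cheap checks pass: ExpensiveTest runs
        Console.WriteLine(ExpensiveCalls); // prints 1
    }
}
```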

Tuesday, June 21, 2011

Browser Session Management

First off, I admit I have a problem. I typically have 5-6 browser windows open with 10-15 tabs per window and that's just with one brand of browser. I sometimes have that many tabs/windows with two of the big three (IE, Firefox or Chrome) browsers going at the same time.

There are many reasons why I end up with so many open tabs:
  • I open a new window to search on something and open the links in tabs
  • I get easily distracted. I start reading a page then switch to something else and don't want to close the tab knowing I will come back to it.
  • There are some tabs that are semi-permanently open (like Gmail, Facebook, Twitter, Amazon Cloud Player).
  • I use open pages as a todo list.
Usually this isn't that big of a problem but it can become one. The biggest problem is memory usage. Over time all three browsers seem to leak memory (some worse than others - Firefox I'm looking at you) forcing restarts (and occasionally Windows wants to install updates and reboot). Having the browser re-open without losing all of my open tabs is a big selling point. 

Here is my take on the big three and session management.

Internet Explorer

IE is the worst of the three when it comes to session management. IE8 and 9 added a link to 'Reopen last session' (accessible on new tabs) but as far as I can tell it assumes all of your previously open tabs were in one window. If you do have multiple windows open and restart your computer, only the last window closed is restored (which always seems to be the one with the fewest tabs). If your computer hard crashes and you restart IE, it does ask if you would like to restore the previous session, and this does reopen all windows and tabs. So full session information does seem to be stored somewhere; it is just not surfaced to the end user.

Firefox

Early on Firefox itself did not have great session management but it had good add-ons like "Session Manager". If I noticed things running slow or saw Firefox memory usage spiking in Task Manager I could kill the browser and, with Session Manager, reload all of the tabs when it restarted. Firefox has now built this in as an option to restore all open tabs.

Chrome

With Chrome I ended up having one of those 'doh' moments. I knew Chrome tracked all of the opened windows and tabs because if I restarted from a hard shutdown it would offer to restore all of them. The problem was that if I noticed Chrome eating up a lot of memory there didn't seem to be a clean way to close all of the windows and completely restart the browser session. Since each tab spawns its own process there was not a single process I could kill in the Task Manager as I could with Firefox. And if you clicked on the X in each browser window, as in IE, the last window became the only one restored. Then, I don't remember where I read it, but I found out that if you use the 'Exit' menu option from the Wrench icon it will close and remember all of your open windows and tabs. It was one of those 'aha' moments that made life easier.

Friday, June 17, 2011

Scratch and Squeak

Some of my first programming experiences were typing BASIC programs into my TI-99/4A home computer, copied from computer magazines. The whole process was madly frustrating. There was no editor; you had to copy the lines exactly into the console and then you could save the program to an analog cassette. If you were lucky you could reload your program from the cassette at a later date. I remember trying to create a Star Trek game and giving up after taking 2 weeks just to create the logo screen. Despite all of this there was something about programming and I was hooked.

Now, many years later, I am a father of three and hoping to pass along some of that love of programming to my kids. Looking around, it's amazing how many options are out there to get them started. When I was starting out the options were either BASIC or Logo. Logo had some neat ideas and I played with it a bit, but moving the turtle around never got me hooked. Now this Wikipedia page lists 71 educational programming languages.

As a past Smalltalker, the most interesting to me of these appear to be Squeak and Scratch. Squeak is a modern, open source implementation of the Smalltalk programming language and environment. Scratch is built using Squeak under the covers but is a visual programming language where programs are built by snapping together blocks. I think my daughters will like the ability to quickly create an animated story and control the action. I also have a first generation Lego Mindstorms kit but I think Scratch will be a good starting point.

Thursday, June 16, 2011

Smalltalk Extensions

In a previous lifetime I worked for a Smalltalk vendor and my last two posts (here and here) got me thinking about all of the ways that you could hack the Smalltalk environment. In Smalltalk, except for a few primitives, all of the base libraries' source code is present and editable in your environment. You can edit the default implementations or easily add additional functionality to base classes. This can lead to all sorts of interesting side effects.

I remember working with one customer that had occasional lock-ups in their development environment which they were blaming on us. These lock-ups were intermittent and hard to reproduce. Well, one day while in their environment I was able to debug right after a lock-up. I started stepping through some code in the base collection class that I didn't recognize. It turned out they had overridden the inherited Sort function and implemented their own (it might have been a quicksort, I can't remember) along with a very subtle bug. Given the right list size and starting conditions the code would enter an infinite loop. Fixing their sort function fixed their lock-ups and contributed to my reluctance to ever touch base libraries.

Wednesday, June 15, 2011

Extension methods strike back

In my previous post I showed how static utility functions that extend base library functionality can be rewritten using extension methods. Now I will explain why I don't think it is a good idea to do so.

Extension methods were added primarily to allow LINQ functionality to extend base level classes without changing the spec of the base classes. You can see this in action by adding a using directive for the System.Linq namespace to your code; suddenly arrays have methods like OrderBy in the IntelliSense list.
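
As a minimal sketch of that effect (the class and method names here are made up for illustration), the OrderBy call below compiles only because the System.Linq using directive brings the Enumerable extension methods into scope:

```csharp
using System;
using System.Linq;

public static class LinqExtensionDemo {
    // OrderBy is not defined on int[]; it is an extension method from
    // System.Linq.Enumerable that becomes visible through the using directive.
    public static int[] SortAscending(int[] values) {
        return values.OrderBy(v => v).ToArray();
    }

    public static void Main() {
        int[] sorted = SortAscending(new[] { 3, 1, 2 });
        Console.WriteLine(string.Join(",", sorted)); // prints 1,2,3
    }
}
```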

This is great for Microsoft, which can coordinate between the LINQ and core library teams, but what happens when an extension method you've written conflicts with a change in the base library? Well, this is documented here. You lose:
You can use extension methods to extend a class or interface, but not to override them. An extension method with the same name and signature as an interface or class method will never be called.
For example, say you create an extension method named WordCount that looks for spaces to count words. Then in a new .NET release Microsoft also adds a WordCount method to the String class but their version counts hyphenated words as separate words. Your application may suddenly change its output or crash. Even good unit tests may not catch all of the edge cases.
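
The rule quoted above can be reproduced without waiting for a framework release. In this sketch Document is a hypothetical stand-in for a library class that has gained its own WordCount method, and DocumentExtensions is the older extension method with the same name and signature; both names are assumptions for illustration only.

```csharp
using System;

// Hypothetical library type that has gained its own WordCount method.
public class Document {
    private readonly string text;
    public Document(string text) { this.text = text; }

    public string Text { get { return text; } }

    // Instance method: treats hyphens as word separators too.
    public int WordCount() {
        return text.Split(new[] { ' ', '-' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class DocumentExtensions {
    // Extension method with the same name and signature: only splits on spaces.
    public static int WordCount(this Document doc) {
        return doc.Text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class PrecedenceDemo {
    public static void Main() {
        Document doc = new Document("a well-known fact");
        Console.WriteLine(doc.WordCount());                   // instance method wins: 4
        Console.WriteLine(DocumentExtensions.WordCount(doc)); // explicit static call: 3
    }
}
```

The extension method is still reachable through the explicit static call, which is effectively the static utility class style again.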

Granted, there is probably only a remote chance of this happening and you could guard against it by prefixing all of your extension methods with something non-standard. However, I just don't feel there is enough gain to be had from using extension methods over static utility classes.

Tuesday, June 14, 2011

Extension methods

In my previous post I talked about convenience functions and their usefulness in wrapping base level functionality. .NET 3.5 added a new feature called extension methods that allows methods to be added to an existing class and then called on instances of that class.

For example say you have an existing StringUtils class and method named OccurencesOf:

/// <summary>
/// StringUtils provides a collection of static functions for string manipulations.
/// </summary>
public static class StringUtils {

    /// <summary>
    /// Returns the number of occurrences of the match string in the source string.
    /// </summary>
    public static int OccurencesOf(string sourceString, string matchString) {
      return Regex.Matches(sourceString, EscapeRegEx(matchString)).Count;
    }
}

It is called like this:
    StringUtils.OccurencesOf("the quick brown fox jumped over the lazy dog", "the");

With extension methods this can be rewritten like:
public static class StringExtensions {
 
    /// <summary>
    /// Returns the number of occurrences of the match string in the source string.
    /// </summary>
    public static int OccurencesOf(this String sourceString, string matchString) {
        return Regex.Matches(sourceString, EscapeRegEx(matchString)).Count;
    }
}

And called like:
    "the quick brown fox jumped over the lazy dog".OccurencesOf("the");

Definitely a neat feature but in my next post I will explain why you shouldn't use extension methods.

Monday, June 13, 2011

Convenience functions and utility classes

Convenience functions are a favorite category of mine. The core libraries provide great functionality but (by necessity) operate at an atomic level. You might want to call A, B and then C, while I just want C then A. This flexibility is great but can lead to a lot of duplicated code in your application.

Enter a convenience function. This is what I call code that wraps several calls to base level functions that can be reused throughout the application. We group similar ones into a static class with a Utils suffix (e.g. StringUtils or XmlUtils).

Here is an example of one that we use:
/// <summary>
/// Convert either a predefined color name or an ARGB value into a Color object.
/// </summary>
/// <param name="colorString">Either a predefined color name or an ARGB number
/// value represented as a string.</param>
/// <returns>Color</returns>
public static Color FromColorString(string colorString) {
  // Color will throw an exception if the string is null.
  if (string.IsNullOrEmpty(colorString)) return Color.Empty;

  // Translate HTML color strings.
  if (colorString.StartsWith("#")) {
    return ColorTranslator.FromHtml(colorString);
  } 

  // See if the passed in value is a named color string.
  Color color = Color.FromName(colorString);
  if (color.IsKnownColor) return color;

  // If the string is not a named color and not a number return the empty color.
  if (!StringUtils.IsInteger(colorString)) return Color.Empty;

  // Otherwise convert the string to a number and return a color based on that.
  return Color.FromArgb(Convert.ToInt32(colorString));
}

It neatly wraps several ways to create a Color object from an HTML string, a named color or an ARGB value. Create a bunch of unit tests for this and you have a nice piece of reusable code.

Monday, June 6, 2011

DateTime serialization issue within a DataSet

For most of our SOAP-based web services we return either simple strings or manually created XmlNodes. We did this to try to make the web services as easy as possible to use with languages other than .NET, and because early on we had run into some issues with the built-in serialization code. The built-in serialization is nice when you control both endpoints, but if you don't, I've found it can have some unexpected behaviors like the one below.

For a handful of web services we have a huge, variable set of data that needs to be returned. Rather than write a blob of custom serialization code we took the easy way out and returned a .NET DataSet (letting the built-in serializer do the work). Things were good until we started to get reports of DateTimes being off when the client code was calling from a different timezone than the server's. There are two options for the times we return: they can be either in UTC or adjusted to a user-specified timezone. Neither option was using the server's local time, but when the DataSet was deserialized in the client the times were being adjusted by the delta between the client's and server's timezones.

For example a value in the DataSet might look like the following before serialization:
<date_time>12:19:38</date_time>
In the client it would arrive with an offset specified:
<date_time>12:19:38.0000000-04:00</date_time>

The client code would adjust this to its local time (much like Outlook when you schedule an appointment with someone in a different timezone).

This was a particularly tricky issue to deal with because the delta was between the two different timezones and not some known value like UTC. Since we didn't know what timezone the calling code was in, how were we to account for it? This Microsoft KB article suggests as a workaround creating a new web service that the client can call to adjust the times in the DataSet. This seemed to be a really bad idea and I knew there had to be a better way.

Thankfully one of the developers on the team found it. The DataColumn objects in the DataSet have a property called DateTimeMode which controls the serialization of times. UnspecifiedLocal is the default, which was causing our issues. Switching this to Unspecified allowed the serialization to happen without an offset being applied. Hooray!

foreach (DataColumn column in table.Columns) {
    if (column.DataType == typeof(DateTime)) {
        column.DateTimeMode = DataSetDateTime.Unspecified;
    }
}
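
To see the difference for yourself, here is a rough, self-contained sketch that serializes the same value under both modes. The table name, column name and the DateTimeModeDemo class are made up for illustration, and the exact XML text depends on the runtime and on the machine's timezone, so the sketch just prints both results for comparison.

```csharp
using System;
using System.Data;
using System.IO;

public static class DateTimeModeDemo {
    // Builds a one-row table and returns its XML using the given DateTimeMode.
    public static string Serialize(DataSetDateTime mode) {
        DataTable table = new DataTable("row");
        DataColumn column = table.Columns.Add("date_time", typeof(DateTime));
        column.DateTimeMode = mode;
        table.Rows.Add(new DateTime(2011, 6, 6, 12, 19, 38));

        using (StringWriter writer = new StringWriter()) {
            table.WriteXml(writer);
            return writer.ToString();
        }
    }

    public static void Main() {
        // Default mode: the value is written with the local timezone offset.
        Console.WriteLine(Serialize(DataSetDateTime.UnspecifiedLocal));
        // Unspecified: the value is written without any offset.
        Console.WriteLine(Serialize(DataSetDateTime.Unspecified));
    }
}
```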

Friday, May 27, 2011

Simpler TimeSpan construction

The TimeSpan structure in .NET is useful when you need to represent an interval of time. It has three different constructors that take varying combinations of days, hours, minutes, seconds and milliseconds to initialize the TimeSpan. This is great for its flexibility but all too often you just want a TimeSpan based on one value (like minutes or seconds) and all of the other values are zero. I've seen code like this to sleep a Thread for 10 seconds:
Thread.Sleep(new TimeSpan(0, 0, 0, 10, 0));

Thankfully there are a number of static convenience functions that take the single value and return a TimeSpan. They are all in the form of FromX(double) where X maps to the same days, hours, minutes, seconds and milliseconds as above. So instead of the code above we can write:
Thread.Sleep(TimeSpan.FromSeconds(10));

Nothing earth-shattering, but just a little bit clearer for the next person who reads the code.
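
A quick sketch (the TimeSpanDemo class is just for illustration) confirms that the two forms describe the same interval, and that the FromX methods also accept fractional values:

```csharp
using System;

public static class TimeSpanDemo {
    // Returns true when the verbose constructor and the factory method agree.
    public static bool SameTenSeconds() {
        TimeSpan verbose = new TimeSpan(0, 0, 0, 10, 0); // days, hours, minutes, seconds, ms
        TimeSpan concise = TimeSpan.FromSeconds(10);
        return verbose == concise;
    }

    public static void Main() {
        Console.WriteLine(SameTenSeconds());                                   // prints True
        Console.WriteLine(TimeSpan.FromMinutes(1.5) == new TimeSpan(0, 1, 30)); // prints True
    }
}
```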

Thursday, May 26, 2011

Measuring time with Stopwatches

.NET 2.0 added the Stopwatch class that should definitely be in your tool belt for performance testing.
From MSDN:

The Stopwatch measures elapsed time by counting timer ticks in the underlying timer mechanism. If the installed hardware and operating system support a high-resolution performance counter, then the Stopwatch class uses that counter to measure elapsed time. Otherwise, the Stopwatch class uses the system timer to measure elapsed time. Use the Frequency and IsHighResolution fields to determine the precision and resolution of the Stopwatch timing implementation.
Example:
Stopwatch stopWatch = Stopwatch.StartNew();
// Code to benchmark
stopWatch.Stop(); 
Console.WriteLine("{0} ms", stopWatch.Elapsed.TotalMilliseconds);

Also note that the ElapsedMilliseconds property returns a whole number of milliseconds, whereas the Elapsed.TotalMilliseconds property is a double that can report execution times down to a fraction of a millisecond.
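
Here is a small sketch that wraps the pattern in a reusable helper and prints both properties side by side. The StopwatchDemo class, the Time helper and the workload are assumptions for illustration, and the numbers will differ on every run.

```csharp
using System;
using System.Diagnostics;

public static class StopwatchDemo {
    // Times a single run of the given action and returns the stopped Stopwatch.
    public static Stopwatch Time(Action action) {
        Stopwatch stopWatch = Stopwatch.StartNew();
        action();
        stopWatch.Stop();
        return stopWatch;
    }

    public static void Main() {
        Stopwatch sw = Time(() => {
            // Arbitrary workload to give the timer something to measure.
            double sum = 0;
            for (int i = 0; i < 1000000; i++) sum += Math.Sqrt(i);
        });
        Console.WriteLine("High resolution: {0}", Stopwatch.IsHighResolution);
        Console.WriteLine("Whole ms:        {0}", sw.ElapsedMilliseconds);
        Console.WriteLine("Fractional ms:   {0}", sw.Elapsed.TotalMilliseconds);
    }
}
```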

Update: Eric Lippert has a new series on benchmarking and highlights this approach in his second post.

Wednesday, May 25, 2011

Optional Parameters

.NET 4.0 (with C# 4.0) added the often requested ability to specify a default value for a parameter, making it optional for callers. The optional parameters have to come at the end of the parameter list and their default values must be compile-time constants.
Generally they should be used where you would have created an overloaded method just to pass in a default value for one or two parameters. Don’t use them in place of good object initialization code or for every parameter a function takes.
For example the following overloaded methods:
/// <summary>
/// Return the contents of the text node associated with the passed element.
/// Return the empty string if there is no text node.
/// </summary>
public static string GetText(XmlElement element) {
     return GetText(element, string.Empty);
}


/// <summary>
/// Return the contents of the text node associated with the passed element.
/// Return the default value if there is no text node.
/// </summary>
public static string GetText(XmlElement element, string defaultValue) {

Can be simplified into one method by making the second parameter optional:
/// <summary>
/// Return the contents of the text node associated with the passed element.
/// Return the default value if there is no text node 
/// or empty string if not specified.
/// </summary>
public static string GetText(XmlElement element, string defaultValue = "") {

Note that while you can use constants like int.MinValue and long.MaxValue, string.Empty is a static read-only field on the String class rather than a compile-time constant, so it cannot be used as a default value. You have to use "" for empty strings.
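
As a self-contained sketch of the same idea, here is a hypothetical ValueOrDefault helper (not part of our code base) whose second parameter is optional. Callers can omit it, pass it positionally, or pass it by name:

```csharp
using System;

public static class OptionalParameterDemo {
    // Hypothetical helper: the second parameter is optional and defaults to "".
    public static string ValueOrDefault(string value, string defaultValue = "") {
        return string.IsNullOrEmpty(value) ? defaultValue : value;
    }

    public static void Main() {
        Console.WriteLine(ValueOrDefault(null) == "");                    // prints True
        Console.WriteLine(ValueOrDefault(null, "n/a"));                   // prints n/a
        Console.WriteLine(ValueOrDefault(null, defaultValue: "missing")); // prints missing
    }
}
```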

Tuesday, May 24, 2011

IEnumerable(T) != IEnumerable

String.Join(String, IEnumerable(of T)) was added in .NET 4.0 as a new overloaded method that takes a strongly-typed enumeration and returns a separated string. This is great and something we had previously done in a wrapper function. However, one thing to be careful of here is that passing in an ArrayList will not throw an exception (compile or runtime) but may not give you the results you are expecting. An ArrayList only implements the non-generic IEnumerable, so the call binds to the params object[] overload and the list is treated as a single object.
For example the following code:
ArrayList array = new ArrayList() {1, 2, 3};
string joinedString = String.Join(",", array);
Console.Write(joinedString);

Will write out:
System.Collections.ArrayList

Not:
1,2,3
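
One way to get the expected result is to convert the ArrayList into a generic sequence first, for example with LINQ's Cast. This is a minimal sketch, and the JoinDemo class and JoinItems helper are names made up for illustration:

```csharp
using System;
using System.Collections;
using System.Linq;

public static class JoinDemo {
    // Cast<object>() turns the non-generic ArrayList into an IEnumerable<object>,
    // so the call binds to the String.Join(String, IEnumerable<T>) overload.
    public static string JoinItems(ArrayList list) {
        return String.Join(",", list.Cast<object>());
    }

    public static void Main() {
        ArrayList array = new ArrayList() { 1, 2, 3 };
        Console.WriteLine(String.Join(",", array)); // prints System.Collections.ArrayList
        Console.WriteLine(JoinItems(array));        // prints 1,2,3
    }
}
```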