Thursday, December 5, 2013

Var-ious Thoughts

I started my programming career as a Smalltalker (before Java or C# even existed) and enjoyed its "everything is an object" philosophy and dynamically typed variables. In the late 90's, I worked on the VisualAge for Java product (which eventually became the Eclipse platform) using Java. Then I switched jobs and ended up using VB6. It wasn't as bad as I feared because VB6 at least had objects, classes and interfaces (but no inheritance). Still, I was happy when I convinced my bosses that we should switch to C#. It was nice to be back using a fully object-oriented language.

Coming from a Smalltalk background I had been reluctant to embrace a strongly-typed language (let alone strongly-typed collections). After a while I found I liked some of the benefits, like having the compiler catch type errors and having the type declaration serve as self-documentation. With the introduction of the var keyword in C# 3.0 I was concerned that we were taking a step back to the VB days. I understood the anonymous-type scenarios it was added for, but I didn't want to lose the documentation benefits. I've come to realize that when you declare a variable and immediately instantiate and assign an object to it, 'var' can save a bunch of typing. Just don't overuse it when the type is not readily apparent (i.e. don't make me think or have to search to figure out the type).

Here are my general guidelines for using 'var':

  1. Don't use 'var' when the type of the variable is not readily apparent.
       var s = SomeMethod();   // what type is s?
  2. Use 'var' when declaring a variable and assigning an obvious type to it.
       var collection = new List<string>();
  3. Use 'var' when required to (e.g. with anonymous types).
       var anon = new { Name = "Joe", Age = 34 };

Wednesday, December 4, 2013

Fast XML parsing with XmlReader and LINQ to XML

Using XmlDocument to parse large XML strings, as we know, can spike memory usage. The entire document is parsed and turned into an in-memory tree of objects. If we want to parse the document using less memory there are a couple of alternatives to XmlDocument. We could use an XmlReader, but the code can be messy and it's easy to accidentally read too much (see here). We could use XPath, but that's designed more for searching sections of XML than for parsing an entire document. Lastly we could use LINQ to XML, which offers the simplicity of XmlDocument along with LINQ queries but by default loads the entire document into memory.

This blog post offered an interesting alternative of combining LINQ to XML with XmlReaders. This hybrid approach seemed to offer the speed of forward parsing XmlReaders with the simplicity and functionality of LINQ objects.

The first step was creating a method in a utility class that abstracts away the reader and returns just the matching elements. The secret sauce is the 'yield return' keyword, which I explain below.

/// <summary>
/// Given an xml string and target element name return an enumerable for fast lightweight 
/// forward reading through the xml document. 
/// NOTE: This function uses an XmlReader to provide forward access to the xml document. 
/// It is meant for serial single-pass looping over the element collection. Calls to functions 
/// like ToList() will defeat the purpose of this function.
/// </summary>
public static IEnumerable<XElement> StreamElement(string xmlString, string elementName) {
    using (var reader = XmlReader.Create(new StringReader(xmlString))) {
        // XNode.ReadFrom leaves the reader positioned on the node after the element
        // just read, which may already be the next matching element.
        while ((reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                || reader.ReadToFollowing(elementName))
            yield return (XElement)XNode.ReadFrom(reader); 
    }
}
Say you have a large CD catalog to read in like:
<Catalog>
  <CD>
    <Title>Stop Making Sense</Title>
    <Band>Talking Heads</Band>
    <Year>1984</Year>
  </CD>
  ...
</Catalog>
If you were using an XmlDocument to read that from an XML string and process each element you might have code like:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(catalogXml);
XmlNodeList discs = xmlDoc.GetElementsByTagName("CD");
foreach (XmlElement discElement in discs) {
    //... Process each element
}
You can convert that to using the hybrid LINQ/XmlReader approach like the following:
IEnumerable<XElement> discs = from node in XmlUtils.StreamElement(catalogXml, "CD") select node;
foreach (XElement discElement in discs) {
    //... Process each element
}
The one big caveat is that you can’t call any functions on the discs collection that would require looping over all of the items to get the answer (e.g. ToList(), Count(), etc.). This is because we are relying on yield to return each element one at a time: we process it and then move on to the next one. This allows the memory associated with individual elements to be garbage collected as we go along rather than held in memory en masse. This approach works best when we have an XML document with a set of homogeneous elements that can be forward processed.

More on yield:
You consume an iterator method by using a foreach statement or LINQ query. Each iteration of the foreach loop calls the iterator method. When a yield return statement is reached in the iterator method, expression is returned, and the current location in code is retained. Execution is restarted from that location the next time that the iterator function is called.
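As a small illustration (my own example, not from the original post), here is a minimal iterator whose console output shows the producer and consumer interleaving. Each 'yield return' hands one value to the foreach loop and pauses; execution resumes right after it on the next iteration:

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo {
    // A minimal iterator: execution pauses at each 'yield return'
    // and resumes there on the next MoveNext().
    public static IEnumerable<int> CountTo(int max) {
        for (int i = 1; i <= max; i++) {
            Console.WriteLine("Producing " + i);
            yield return i;
        }
    }

    public static void Main() {
        foreach (int n in CountTo(3)) {
            Console.WriteLine("Consuming " + n);
        }
    }
}
```

Running this prints "Producing" and "Consuming" lines alternating, which is exactly the behavior StreamElement relies on: no element is produced until the consumer asks for it.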
One thing to stress when making any performance-related change: establish baseline numbers first, then verify that the change actually improves them. So for each method, record the time and memory use before and after the change. You can use something like the following to measure the baseline and any performance gains.
Stopwatch stopWatch = Stopwatch.StartNew();
long startMem = GC.GetTotalMemory(false);

// Code to benchmark

stopWatch.Stop();
long endMem = GC.GetTotalMemory(false);
Console.WriteLine("{0} ms", stopWatch.Elapsed.TotalMilliseconds);
Console.WriteLine("{0} bytes", endMem - startMem);

Sunday, December 1, 2013

iPhone photos and an empty DCIM folder

My iPhone was overflowing with photos. I've used iCloud to sync them to the computer in the past and it has been ok. The lack of online viewing/sharing is a real negative with iCloud. Recently I tried out the improved SkyDrive app. I like the online viewing of photos and the sharing options are great from SkyDrive. The app needs to be launched each time to start the upload process. Once started it will sync a little over 100 photos at a time but then it needs to be refreshed to continue. It really needs to use the location services to trigger an upload when I get home. The other downside is that it doesn't upload videos.

Once all of my photos were uploaded to SkyDrive I wanted to download the videos and remove the photos to clear up some space. I've avoided installing iTunes on my latest computer, but no problem, I figured I would just plug in the iPhone and directly access the pictures from the DCIM folder. After plugging in the phone the DCIM folder was empty and Windows photo import listed 0 pictures. Some panicking and searching led me to the solution (one of the more helpful links here). I needed to unlock my phone first and then plug it in. Security is often an annoyance, especially when there's no visible feedback about what to do.

Friday, November 22, 2013

User-Friendly URLs for HttpHandlers

I was interested in implementing a cleaner URL scheme for an HttpHandler in ASP.NET. Out of the box it's pretty easy to set up an HttpHandler: simply add a file with an ".ashx" extension and the @WebHandler directive, and IIS will use the Generic Web Handler to route traffic to your handler. The downside is that the URL has to end with ".ashx" and all parameters have to go on the querystring.

With the introduction of MVC, Microsoft built RESTful-style routing functionality. I had read that with ASP.NET 4.0 they exposed this functionality so it could be used outside of MVC. I found a number of articles (here and here) on how to set up custom routes to my handler, but the examples all had more complexity than I needed to get going. I was eventually able to get it working but wished for a simpler example.

So I created a stripped down example. In the example below you access the site like http://example.com/quote/MSFT. URLs with /quote are routed to the handler and the next item in the URL is used as a parameter.
  1. Create a new 4.0 (or higher) empty ASP.NET website.
  2. Add an App_Code directory and custom HttpHandler like the following:
    /// <summary>
    /// HTTP Handler to return raw HTML.
    /// </summary>
    public class CustomHttpHandler : IHttpHandler {
    
      // This holds the additional parameters from the URL.
      public RouteData RouteData { get; set; }
    
      public void ProcessRequest(HttpContext context) {
        string symbol = RouteData.Values["symbol"] as string;
        context.Response.ContentType = "text/html";
        context.Response.Write("Stock symbol is " + symbol);
      }
    
      public bool IsReusable {
        get { return false; }
      }
    }
  3. Add the Global.asax file and in Application_Start add a route to the RouteTable. This takes an instance of a class that implements IRouteHandler.
    void Application_Start(object sender, EventArgs e) {
      RouteTable.Routes.Add("quote", new Route("quote/{symbol}", new CustomRouteHandler()));
    }
    
  4. Add the route handler class that implements IRouteHandler. This must return an instance of the HttpHandler you want the URL directed to.
    public class CustomRouteHandler : IRouteHandler {
      public CustomRouteHandler() {
      }
    
      public IHttpHandler GetHttpHandler(RequestContext requestContext) {
        var handler = new CustomHttpHandler();
        handler.RouteData = requestContext.RouteData; // Include the reference to the RouteData
        return handler;
      }
    }
    

Monday, July 1, 2013

Finding the start of the week for a random DateTime

I've written before about extension methods here, here and here. There are pluses and minuses to using them, but the other day I was able to add what I thought was a neat little one.

Given a random date we needed to be able to figure out when the week started for that date. Depending on the culture this could be either the Sunday or Monday before the date. After some googling and StackOverflow answers I was able to come up with the following DateTime extension method:
/// <summary>
/// Extension method to return the start of the week for the current culture.
/// </summary>
public static DateTime StartOfWeek(this DateTime dt) {

  // Get the difference in days from the given date to the start of the week.
  // (Sunday is 0, Monday is 1, etc.)
  int diff = (int)dt.DayOfWeek - (int)CultureInfo.CurrentCulture.DateTimeFormat.FirstDayOfWeek;

  // Adjust back to the start of the week and return a date stripped of time.
  if (diff < 0) { diff += 7; }
  return dt.AddDays(-diff).Date;
}
Now for a given DateTime you can call StartOfWeek like this:
    DateTime.Now.StartOfWeek();
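If the culture-driven behavior isn't what you want, a common variation (my own sketch, not from the original post) is an overload that takes the first day of the week explicitly; the arithmetic is identical, only the reference day changes:

```csharp
using System;

public static class DateTimeExtensions {
    /// <summary>
    /// Overload sketch: same day-difference arithmetic as above, but the
    /// caller chooses the week's first day instead of relying on the
    /// current culture.
    /// </summary>
    public static DateTime StartOfWeek(this DateTime dt, DayOfWeek firstDay) {
        int diff = (int)dt.DayOfWeek - (int)firstDay;
        if (diff < 0) { diff += 7; }
        return dt.AddDays(-diff).Date;
    }
}
```

For example, July 1, 2013 is a Monday, so StartOfWeek(DayOfWeek.Sunday) returns June 30, 2013, while StartOfWeek(DayOfWeek.Monday) returns the date itself.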

Sunday, March 31, 2013

The SSD - Cheap Memory Paradox

I'm a big convert to using an SSD (solid-state drive) as the primary drive in a system. In terms of a speed boost, there is little that can make as large a difference in your computing experience as adding an SSD. Your OS and programs will load faster, and if you have multiple programs open, switching between them is smoother. SSD prices have been falling so that you can now find quality drives for under $1/gigabyte. I was able to get a Corsair Force GT 128GB drive for x.

System memory prices have also fallen since the last time I upgraded my computer, so on my last upgrade I was able to purchase 16GB for around 80 dollars. But here's the problem: the Windows swap file (pagefile.sys) wants to be the same size as your RAM, as does your hibernation file (hiberfil.sys). This eats up 32GB of an already shrunken drive (128GB actually becomes about 111GB formatted). While you could move the swap file to another hard drive, this reduces the swap speed and negates one of the benefits of the SSD. The hibernation file cannot be moved (see The File System Paradox) and has to be on the C drive.

Apparently this is improved in Windows 8, with the pagefile reduced to just 25% of RAM when hibernation is enabled (see Swapfile.sys, Hiberfil.sys and Pagefile.sys in Windows 8). If it weren't for the annoying Metro interface I might consider upgrading. Ah well, I may have to look for specials on 256GB drives.

Wednesday, February 13, 2013

Virtual Static


In Smalltalk a Class is a first-class object just like everything else in the system. Objects get instantiated using the Class as a template, but the Class itself can have its own properties and methods. In fact, constructors in Smalltalk are just class methods that return an instance.

Now as a C# programmer I've always thought of static properties and methods as the same thing as Smalltalk's class properties and methods. In the back of my head I knew that C# didn't have first-class Class objects, but the static paradigm seemed to fit. So when the other day I wanted to override a static method in a subclass, I figured no problem: just mark the one in the superclass 'virtual' and then add my new implementation. One problem: the compiler balked at 'virtual static'.

After some searching around I found this blog post explaining why this is not a valid combination of keywords. As Eric Lippert writes:
Related questions come up frequently, in various forms. Usually people phrase it by asking me why C# does not support “virtual static” methods. I am always at a loss to understand what they could possibly mean, since “virtual” and “static” are opposites! “virtual” means “determine the method to be called based on run time type information”, and “static” means “determine the method to be called solely based on compile time static analysis”.
Turns out 'static' is really not equal to 'class'. I must admit I almost shed a tear that day as this sunk in.
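To make the compile-time binding concrete, here's a small sketch (my own example, not from Lippert's post). The closest C# gets is hiding a base static with 'new'; which method runs is picked entirely by the type name written in the source, never by a runtime type:

```csharp
using System;

public class BaseReport {
    // 'public virtual static' here would not compile.
    public static string Title() { return "Base report"; }
}

public class SalesReport : BaseReport {
    // 'new' hides the base static. Resolution happens at compile time
    // from the type name used at the call site, not from any runtime type.
    public new static string Title() { return "Sales report"; }
}

public static class VirtualStaticDemo {
    public static void Main() {
        Console.WriteLine(BaseReport.Title());   // Base report
        Console.WriteLine(SalesReport.Title());  // Sales report
    }
}
```

There is no variable whose runtime type could select between the two Title methods, which is exactly why 'virtual' has nothing to work with here.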

Friday, January 18, 2013

Upgrading my Home Network

For 2013, I've set out a goal for myself to upgrade several key components of my home network. I have a household with PCs, Macs, iPads, Kindles, and smartphones (which makes me feel like a mini IT admin). Making sure those devices operate optimally and efficiently is my goal. Also with newer technology being designed to use less energy I think this effort can lower my overall power consumption. My three initial items are:
  1. Update wireless network
  2. Update file server and backup strategy
  3. Update home theater PC
Once I get those items done I can reevaluate the next steps.

First up is my wireless network. I currently use an ASUS (RT-N16) running DD-WRT as my main router connected to a cable modem. While speeds on the wired connections with the gigabit switch have been great, the wireless performance has been just OK. As I researched this more I found out that just having G devices on the same network as N devices forces the router to lower speeds. First I tried to figure out which devices were using which protocols on the network, but this proved to be surprisingly difficult. Basically the best I could find from googling were tools to scan the network for wireless devices (DD-WRT already has this list) and then going to each device to check its adapter and connection speed. Instead I set my router to N-only and found out which devices could no longer connect. This helped me isolate the G devices: the Kindles and some older iPods (nothing that was critical).

I looked at upgrading to this beast, the ASUS RT-AC66U Dual-Band Router (what SmallNetBuilder calls the Dark Knight), but at close to $200 it was an expensive option. My plan, if I got it, was to put the N devices on the 5GHz band and the G devices on the 2.4GHz band. I seem to remember reading that this was possible but might slow the N side a little. I don't have any AC devices yet, but a little future-proofing doesn't hurt.



Instead I decided to buy an old router off eBay and dedicate it to the G devices. I wanted something compatible with the Tomato firmware, as I've been interested in trying that out as an alternative to DD-WRT. There are a number of newer builds of Tomato (e.g. Shibby or Toastman) but the original polarcloud one had the best documentation to get me started.

I was able to find a Linksys WRT54G for $20 shipped and quickly went to work setting it up. I plugged my desktop directly into the router and was able to connect on 192.168.1.1 with the default username/password. Uploading the firmware went smoothly and soon I was running Tomato. I played around with some of the settings and then went to configure the router.

I made my only mistake when I configured both the router's IP address and its WAN address to be 192.168.1.2 (since the ASUS was still going to be the main router at 192.168.1.1). Connecting the Linksys WAN port to a port on the ASUS router didn't allow me to see the Linksys. After a quick reset I set just the router's IP address to 192.168.1.2, skipped the WAN port and connected through one of the switched ports instead, and all seemed good. I then configured the wireless on the Linksys with its own SSID. I set each router to serve only a specific protocol and put them on channels far apart.

So far everything seems good and connection speeds are higher on the N devices. Fairly inexpensive fix to speed up things. The next project is not going to be as cheap.

These articles were a great help in figuring out what I needed: