Monday, October 1, 2012

Simple Mechanical Turk example using C#


I started looking at the AWS API and specifically the Mechanical Turk Requester API the other day. After reading the overview and understanding the basics I wanted to test making a simple call to get a feel for it. I set up an account and got my access credentials (getting credentials is a whole other story as it involves two different websites and seems overly complicated but it's done for now). I figured the GetAccountBalance operation seemed like the best first option as I didn't want to create a HIT (Human Intelligence Task - their term for tasks you can create) and manage that. Since I hadn't added any money it should be easy for them to return a zero to me.
The web services are REST based and calling them seems pretty straightforward. You figure out the operation you want to perform, set some querystring parameters and make a web request. The complicated bit is calculating the signature (a SHA1 hash of the operation, service and current time using your secret access id as the key). Amazon doesn't have a nice simple example of constructing this in C# and even the other language examples don't look simple. There is a SDK for .NET it looks like it does a good job wrapping the Amazon endpoints but I wanted to figure out the basics for myself. So by starting with the examples from the SDK I was able to simplify the code down to the following:
using System;
using System.Globalization;
using System.IO;
using System.Net;
using System.Security.Cryptography;
using System.Text;

namespace TurkTest {
 class Program {

  static void Main(string[] args) {
   const string SERVICE_NAME = "AWSMechanicalTurkRequester";  // requester service for MTurk
   const string TIMESTAMP_FORMAT = "yyyy-MM-ddTHH:mm:ss.fffZ";

   // Modify these with your values.
   const string operation = "GetAccountBalance";
   const string accessKey = "[Your access key]";
   const string secretAccessKey = "[Your secret access key]";

   // Millisecond values in the timestamp string can result in intermittent BadClaimsSupplied errors.
   // Get the current UTC time and use that to create a new time with milliseconds set to zero to avoid this case.
   DateTime now = DateTime.UtcNow;
   now = new DateTime(now.Year, now.Month, now.Day, now.Hour, now.Minute, now.Second, 0, DateTimeKind.Utc);
   string timeStamp = now.ToString(TIMESTAMP_FORMAT, CultureInfo.InvariantCulture);

   // Create the hash-based messaged authentication algorithm (SHA1) using our secret access key as the key.
   var hmac = new HMACSHA1(Encoding.UTF8.GetBytes(secretAccessKey));

   // Combine the service name, operation and timestamp and then hash them to produce the signature.
   var dataBytes = Encoding.UTF8.GetBytes(SERVICE_NAME + operation + timeStamp);
   string signature = Convert.ToBase64String(hmac.ComputeHash(dataBytes));

   // Build the URL to send to Amazon
   string url =
    @"https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester&AWSAccessKeyId={0}&Version=2012-03-25&Operation={1}&Signature={2}&Timestamp={3}";
   url = string.Format(url, accessKey, operation, signature, timeStamp);

   // Send the request and write the respose to the console
   using (WebClient client = new WebClient()) {
    using (StreamReader reader = new StreamReader(client.OpenRead(url))) {
     Console.WriteLine(reader.ReadToEnd());
    }
   }

   Console.Read();
  }
 }
}
After a couple of false starts I was able to get it to return my zero balance. Hope this helps the next person get started.

Sunday, September 30, 2012

Is If Condition

Awhile back I wrote about some general rules I had for writing if conditions. So I found it amusing the other day to read a blog post on Coding Horror about new programming jargon and see the following:

Yoda Conditions

Yoda-conditions

Using if(constant == variable) instead of if(variable == constant), like if(4 == foo). Because it's like saying "if blue is the sky" or "if tall is the man".
I will have to add that to my guidelines.

Thursday, August 16, 2012

Return of the Extension Method

Recently a small team of us branched the code to start work on a new feature for a future release while work continued in the mainline for the current upcoming release. As I was working on our new feature I realized that I wanted to extend some common core classes with new functionality. The classes have their own hierarchy tree and normally I would have created a base method for the shared functionality and overridden it for some of the subclasses. However, since I knew that other developers were working on these same classes I looked for ways that I could minimize merging code later. I've written before about extension methods (here and here) and thought this might be a good time to use them.

I quickly ran into an issue where extension methods only work with the same class that the object is cast to. Say for example you have a Shape superclass and Circle subclass. Normally we would pass around Circle objects as Shapes and rely on virtual or abstract methods to provide a common interface. With extension methods there is no inheritance so you have to either explicitly cast our objects or do some type of switching in a common method.

I've created a simple program in LINQPad to demonstrate this behavior:

void Main() {
    // Create a new Circle but cast it as a Shape
    Shape obj = new Circle();
    // Call the virtual method
    Console.WriteLine(obj.WhoAmI());
    // Call the extension method
    Console.WriteLine(obj.WhoAmIEx());
}

public class Shape {
    public virtual string WhoAmI() {
        return "Virtual Shape";
    }
}
public class Circle : Shape {
    public override string WhoAmI() {
        return "Overriden Circle";
    }
}

public static class Extensions {
    public static string WhoAmIEx(this Shape obj) {
        return "Shape Extension";
    }
    public static string WhoAmIEx(this Circle obj) {
        return "Circle Extension";
    }
}

Running this produces:
 Overriden Circle
Shape Extension

Took me a little while to figure out this behavior (although I should have realized it from the start). I still ended up using an extension method to add one new call but after we merge the code I think I will change that back to a virtual method.

Wednesday, May 16, 2012

Fluent Smalltalk

The other week I wrote about creating HTML controls in ASP.NET MVC using the Fluent Interface pattern. I haven't been a Smalltalker since the 90s so it took awhile for it click where I had seen this pattern before. In Smalltalk the default return value from a message send (aka method) is the receiver (aka the object itself or 'self'). You are encouraged by the language to create readable chains of method sends.

I also had a thought about how the C# compiler could support this as a language level feature by allowing the 'this' keyword as the return type for a method.

So instead of having to write a method like:
public Image ImagePath(string path) {
  imagePath = path;
  return this;
}
You could just write:
public this ImagePath(string path) {
  imagePath = path;
}

Monday, May 7, 2012

Database table initialization

So how do you create, update and initialize the tables and data in your application's database. The product I currently work on supports both being installed locally by the customer or being hosted by us as a SaaS offerring. Additionally we support a customer using either Oracle or SQL Server as the backend database so we wanted our database initialization routines to be as database agnostic as possible (rather than using SQL scripts).

We've created classes and objects in C# that represent the tables and columns that the application uses. These classes generate the appropriate SQL needed to setup the database. The db setup code can be run to initialize a database from scratch or update an existing install to the lastest version. It also creates indexes and seeds the tables with the shipped data. To create a table we have code like:
    // Create a basic users table.
    DbTableBuilder table = new DbTableBuilder(dbType, "USERS", setupLog);
    table.AddKeyColumn("userid");
    table.AddCharColumn("username", 255);
    table.AddCharColumn("password", 50);
    table.AddCharColumn("email", 100);
    table.AddCharColumn("name", 100);
    table.CreateTable(connection);

    // Create an index on the username.
    index = new DbIndexStatement("IDX_USERS_NAME", "USERS", "username");
    index.CreateIndex(connection);

I've started experimenting with LINQ to SQL and the Entity Framework. It's nice how it will auto-create the tables (at least in SQL Server) but not sure the best way to seed the tables with shipped data or recreate both production and testing databases. I also ran across an open source project (but can't now find it again) that uses a JSON formatted file to specify the data table structure and used that to create the tables.

For now I think we will stick with the C# code but keep looking for other ideas.

Thursday, May 3, 2012

Fire and Forget Index Creation

Recently we added a new index that we realized was going to take a really long time to create on some of our existing tables with lots of data (it would of course be fine on new installs with no data). So we started to look for options to tell the database to create the index in the background and return control to the setup code (aka fire and forget). We didn't really care when the index creation finished as long as it did at some point.
using (SqlCommand cmd = sqlConnection.CreateCommand()) {
    cmd.CommandText = "CREATE INDEX index_name ON table_name (column_name)";
    connection.Open();
    cmd.BeginExecuteNonQuery();
}

This failed almost universally (a couple of indexes were created before the connection closed but basically it didn't work). As this similiar question in StackOverflow highlights, if the connection closes the query ends. We looked at implementing the ThreadPool option outlined in that question or just adding an EndExecuteNonQuery method to close the connection but both ran into a roadblock. Our Network Operations team runs the setup code from a Windows application. During testing we found that closing the Windows application also closed the connection meaning that the index creation fails. We could add a progress indicator to the application and have them wait to close the app but then we were essentially back to the blocking issue.

So somewhat relutcantly we decided to use SQL ServerAgent Service to create the indexes. The downside being if we wanted to do the same for Oracle we would have to write separate code (we work with both SQL Server and Oracle). It only works on SQL Server when the service is installed and running (which excludes SQL Server Express).

To kick off a job we added code similiar to the answer to this question. We also added the following test to see if SQL Agent is running otherwise we fallback to the old blocking code.

SELECT spid FROM MASTER.dbo.sysprocesses WHERE program_name = N'SQLAgent - Generic Refresher'

Ideally it would be nice if we could have just kicked this off from C# code but at least this allowed us to remove a roadblock and move on to the next issue.

Wednesday, May 2, 2012

Fluent HTML Wrappers

I've just started doing some small projects with ASP.NET MVC and the Razor View Engine. Coming from many years of Web Forms programming, there a number of things I really like about it but also some really confusing things. The HtmlHelpers seem nice for small examples but seem to break down when you want to configure a lot parameters or some of the non-standard properties.

    First Name: @Html.TextBox("firstname")<br />
    Last Name: @Html.TextBox("lastname", Request["lastname"])<br />
    Town: @Html.TextBox("town", Request["town"], new Dictionary<string, object>(){{ "class", "special" }})<br />
The text box for firstname is the simplest example but the parameter list can start to get really long or you have to resort to passing a dictionary of attributes. The dictionary is particularly troubling to me because the keys are the attributes you want to set and you lose Intellisense and compile time checking.

I've seen a number of posts decrying the lack of (or benefit of) server controls in MVC and while I can appreciate doing away with Viewstate and the various postbacks they were a nice way to encapsulate a bunch of properties and custom html generation. So looking at ways to accomplish this without adding a ton of HtmlHelper extensions I found the fluent pattern used by the Telerik and MvcContrib libraries. This seemed like a good compromise to wrapping the properties and complex html code.

So I set off to try and create a simple Image class just to understand how this might be applied to larger and more complex objects. Well something that should have been a lot simpler took me a day and a half to have a working example so hopefully this blog post might help someone avoid my same traps.

I wanted these controls to be reusable across projects so I created a solution with a C# class library. To test out the rendering I created a new web site by selecting "Add -> New Web Site..." and choosing "ASP.NET Web Site (Razor)" from the dialog (this would lead to one set of issues below). This creates a weird default web site with some basic account management features. I say weird because there are no 'Controllers' or 'Views' directories and everything is done inline on the pages. But anyway all I needed was some place to try out my code.

I started with the standard HtmlHelper extension example to verify that everything was working between the class library and the web site.
In the class library I added references to:
- System.Web
- System.Web.Mvc (3.0)
- System.Web.WebPages

using System.Web.Mvc;

namespace FluentTest {
    public static class ImageHelper {

        public static string RenderImage(this HtmlHelper html, string imagePath) {
            TagBuilder img = new TagBuilder("img");
            img.MergeAttribute("src", imagePath);
            return img.ToString(TagRenderMode.SelfClosing);
        }
    }
}

In the web site I added a reference to the class library and in default.cshtml I added:
@using FluentTest;
@Html.RenderImage("Resources/images/example.jpg")

Now I ran into my first issue. Trying to compile the code I got the following errors:
- Instance argument: cannot convert from 'System.Web.WebPages.Html.HtmlHelper' to 'System.Web.Mvc.HtmlHelper'
- 'System.Web.WebPages.Html.HtmlHelper' does not contain a definition for 'RenderImage' and the best extension method overload 'FluentTest.ImageHelper.RenderImage(System.Web.Mvc.HtmlHelper, string)' has some invalid arguments

I changed the HtmlHelper parameter to System.Web.WebPages.Html.HtmlHelper and the code compiled but when I viewed the output it was encoded as plain text:
<img src="Resources/images/example.jpg" />

After some more Googling I changed the return type from string to MvcHtmlString, added a reference to System.Web.Mvc in the web site and finally the images was correctly displayed.

Now I wanted to turn this into a class that I could use method chaining to set the properties. Both the Telerik and the MvcContrib libraries use the fluent pattern and are open source. Unfortunately, they are such large frameworks that make use of things like dependency injection, multiple base classes, factories and interfaces that it is hard to track where to start. This blog post ended up being a great starting point for a simple fluent example but I knew I wanted to follow the Telerik model where the only thing returned from the HtmlHelper extension was the object instance to start the chain. So I modified the code as follows:
namespace FluentTest {
    public static class ImageHelper {
        public static Image Image(this System.Web.WebPages.Html.HtmlHelper htmlHelper) {
            return new Image();
        }
    }

    public class Image {
        private string imagePath;
        public Image ImagePath(string path) {
            imagePath = path;
            return this;
        }

        public override string ToString() {
            TagBuilder builder = new TagBuilder("img");
            builder.MergeAttribute("src", imagePath);
            return builder.ToString(TagRenderMode.SelfClosing);
        }
    }
}

In the web page I changed the RenderImage call to:
@Html.Image().ImagePath("Resources/images/example.jpg")


This worked great in the sense that now I had an object I could use to set a bunch of properties (with Intellisense) and encapsulate the html rendering. The problem was the output was back to being encoded:
<img src="Resources/images/example.jpg" />

The Razor engine was automatically calling ToString() on my object which as we all know returns a string. And the Razor engine automatically encodes all strings to prevent cross-site scripting. That's well and good but I'm writing an Html control and want Html outputted. Lots of Googling and reading StackOverflow answers that all say return a MvcHtmlString or HtmlString but no examples of how to get the Razor engine to call something other than ToString() on my object.

Finally I stumbled across this StackOverflow question. It wasn't directly about my issue but I could see that he was trying to do same thing I was. The light bulb went off when I saw that he had implemented the IHtmlString interface on his class along with the required ToHtmlString() method. I knew this had to be it so I modified the Image class and added the IHtmlString interface and voilĂ  my image rendered.
    public class Image : IHtmlString {
        public string ToHtmlString() {
            return ToString();
        }
    }