CodeBetter.Com

Karl Seguin

.NET From Ottawa, Ontario - http://twitter.com/karlseguin/

May 2008 - Posts

  • Foundations of Programming - pt 8 - Back to Basics: Exceptions

    Exceptions are such powerful constructs that developers can get a little overwhelmed and far too defensive when dealing with them. This is unfortunate because exceptions actually represent a key opportunity for developers to make their system considerably more robust. In this chapter we'll look at three distinct aspects of exceptions: handling, creating and throwing them. Since exceptions are unavoidable you can neither run nor hide, so you might as well leverage them.

    Handling Exceptions

    Your strategy for handling exceptions should consist of two golden rules:
    1 - Only handle exceptions that you can actually do something about, and
    2 - You can't do anything about the vast majority of exceptions

    Most new developers do the exact opposite of the first rule, and fight hopelessly against the second. When your application does something deemed exceptionally outside of its normal operation, the best thing to do is fail right then and there. If you don't, you not only lose vital information about your mystery bug, but you also risk placing your application in an unknown state, which can result in far worse consequences.

    Whenever you find yourself writing a try/catch statement, ask yourself if you can actually do something about a raised exception. If your database goes down, can you actually write code to recover or are you better off displaying a friendly error message to the user and getting a notification about the problem? It's hard to accept at first, but sometimes it's just better to crash, log the error and move on. Even for mission critical systems, if you're making typical use of a database, what can you do if it goes down? This train of thought isn't limited to database issues or even just environmental failures; it also applies to your typical every-day runtime bug. If converting a configuration value to an integer throws a FormatException, does it make sense to continue as if everything's ok? Probably not.

    Of course, if you can handle an exception you absolutely ought to - but do make sure to catch only the type of exception you can handle. Catching exceptions and not actually handling them is called exception swallowing (I prefer to call it wishful thinking) and it's bad code. A common example I see has to do with input validation. For example, let's look at how not to handle a categoryId being passed from the QueryString of an ASP.NET page.

    int categoryId;
    try
    {
      categoryId = int.Parse(Request.QueryString["categoryId"]);
    }
    catch(Exception)
    {
      categoryId = 1;
    }

    The problem with the above code is that regardless of the type of exception thrown, it'll be handled the same way. But does setting the categoryId to a default value of 1 actually handle an OutOfMemoryException? Instead, the above code should catch a specific exception:

    int categoryId;
    try
    {
       categoryId = int.Parse(Request.QueryString["categoryId"]);
    }
    catch(FormatException)
    {
       categoryId = -1;
    }

    (an even better approach would be to use the int.TryParse method introduced in .NET 2.0 - especially considering that int.Parse can throw two other types of exceptions, ArgumentNullException and OverflowException, that we'd want to handle the same way, but that's beside the point).
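
    As a minimal sketch of the int.TryParse approach just mentioned: the CategoryParser class, the ParseCategoryId helper and the default value passed to it are illustrative names chosen for this example, not code from the original page.

```csharp
using System;

public static class CategoryParser
{
    // Hypothetical helper: returns defaultId when the raw value is
    // missing, malformed, or too large to fit in an int.
    public static int ParseCategoryId(string raw, int defaultId)
    {
        int categoryId;
        // int.TryParse reports failure through its return value
        // instead of throwing, so no try/catch is needed.
        return int.TryParse(raw, out categoryId) ? categoryId : defaultId;
    }

    public static void Main()
    {
        Console.WriteLine(ParseCategoryId("42", 1));          // 42
        Console.WriteLine(ParseCategoryId("abc", 1));         // 1
        Console.WriteLine(ParseCategoryId(null, 1));          // 1
        Console.WriteLine(ParseCategoryId("99999999999", 1)); // 1
    }
}
```

    In a page, the raw value would come from Request.QueryString["categoryId"]; anything that isn't a parsing problem still bubbles up untouched.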

    Logging

    Even though most exceptions are going to go unhandled, you should still log each and every one of them. Ideally you'll centralize your logging - an HttpModule that hooks the application's Error event is your best choice for an ASP.NET application or web service. I've often seen developers catch exceptions where they occur only to log and rethrow (more on rethrowing in a bit). This causes a lot of unnecessary and repetitive code - better to let exceptions bubble up through your code and log all exceptions at the outer edge of your system. Exactly which logging implementation you use is up to you and will depend on how critical your system is. Maybe you'll want to be notified by email as soon as exceptions occur, or maybe you can simply log them to a file or database and either review them daily or have another process send you a daily summary. Many developers leverage rich logging frameworks such as log4net or Microsoft's Logging Application Block.
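
    To illustrate the idea of logging only at the outer edge, here is a rough, framework-free sketch. The Boundary class, its Run method and the in-memory Log list are all assumptions made for the example; in an ASP.NET application the same role would be played by the module handling the Error event and by a real logging sink.

```csharp
using System;
using System.Collections.Generic;

public static class Boundary
{
    // Stand-in for a real sink such as a file, a database or log4net.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical outer edge of the system: the only place that logs.
    // Returns true on success, false once the failure has been recorded.
    public static bool Run(Action work)
    {
        try
        {
            work();
            return true;
        }
        catch (Exception ex)
        {
            Log.Add(ex.ToString()); // type, message and full stack trace
            return false;           // the caller can now show a friendly error
        }
    }

    public static void Main()
    {
        // The inner code has no try/catch at all; it simply fails.
        bool ok = Run(() => { int.Parse("not a number"); });
        Console.WriteLine(ok);        // False
        Console.WriteLine(Log.Count); // 1
    }
}
```

    The inner code stays free of log-and-rethrow blocks, which is the point of the paragraph above.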

    Cleaning Up

    In the previous chapter we talked about deterministic finalization with respect to the lazy nature of the garbage collector. Exceptions prove to be an added complexity as their abrupt nature can cause Dispose not to be called. A failed database call is a classic example:

    SqlConnection connection = new SqlConnection(FROM_CONFIGURATION);
    SqlCommand command = new SqlCommand("SomeSQL", connection);
    connection.Open();
    command.ExecuteNonQuery();
    command.Dispose();
    connection.Dispose();

    If ExecuteNonQuery throws an exception, neither our command nor our connection will get disposed of. The solution is to use Try/Finally:

    SqlConnection connection = null;
    SqlCommand command = null;
    try
    {
       connection = new SqlConnection(FROM_CONFIGURATION);
       command = new SqlCommand("SomeSQL", connection);
       connection.Open();
       command.ExecuteNonQuery();
    }
    finally
    {
       if (command != null) { command.Dispose(); }
       if (connection != null) { connection.Dispose(); }
    }

    or the syntactically nicer using statement (which gets compiled to the same try/finally above):

    using (SqlConnection connection = new SqlConnection(FROM_CONFIGURATION))
    using (SqlCommand command = new SqlCommand("SomeSQL", connection))
    {
       connection.Open();
       command.ExecuteNonQuery();
    }

    The point is that even though you often can't handle an exception, and even though you should centralize all your logging, you still need to be mindful of where exceptions can crop up - especially when it comes to classes that implement IDisposable.
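
    The guarantee that using provides can be seen without a database. In this small sketch, TrackedResource and UsingDemo are made-up names for the example; the exception is caught only so the demo can report what happened.

```csharp
using System;

// Hypothetical resource that records whether Dispose ran.
public class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

public static class UsingDemo
{
    // Throws from inside the using block, then hands the resource
    // back so the caller can inspect it.
    public static TrackedResource FailInsideUsing()
    {
        TrackedResource resource = new TrackedResource();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Caught here purely so the demo can continue.
        }
        return resource;
    }

    public static void Main()
    {
        Console.WriteLine(FailInsideUsing().Disposed); // True
    }
}
```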

    Throwing Exceptions

    There isn't one magic rule to throwing exceptions like there is for catching them (again, that rule is don't catch exceptions unless you can actually handle them). Nonetheless throwing exceptions, whether or not they be your own (which we'll cover next), is still pretty simple. First we'll look at the actual mechanics of throwing exceptions, which relies on the throw statement. Then we'll examine when and why you actually want to throw exceptions.

    Throwing Mechanics

    You can either throw a new exception, or rethrow a caught exception. To throw a new exception, simply create a new exception and throw it.

    throw new Exception("something bad happened!");
    //or
    Exception ex = new Exception("something bad happened");
    throw ex;

    I added the second example because some developers think exceptions are some special/unique case - but the truth is that they are just like any other object (except they inherit from System.Exception which in turn inherits from System.Object). In fact, just because you create a new exception doesn't mean you have to throw it - although you probably always would.

    On occasion you'll need to rethrow an exception because, while you can't handle the exception, you still need to execute some code when an exception occurs. The most common example is having to roll back a transaction on failure:

    ITransaction transaction = null;
    try
    {
      transaction = session.BeginTransaction();
      // do some work
      transaction.Commit();
    }
    catch
    {
      if (transaction != null) { transaction.Rollback(); }
      throw;
    }
    finally
    {
      //cleanup
    }

    In the above example our vanilla throw statement makes our catch transparent. That is, a handler up the chain of execution won't have any indication that we caught the exception. In most cases, this is what we want - rolling back our transaction really doesn't help anyone else handle the exception. However, there's a way to rethrow an exception which will make it look like the exception occurred within our code:

    catch (HibernateException ex)
    {
      if (transaction != null) { transaction.Rollback(); }
      throw ex;
    }

    By explicitly rethrowing the exception, the stack trace is modified so that the rethrowing line appears to be the source. This is almost always a bad idea, as vital information is lost. So be careful how you rethrow exceptions - the difference is subtle but important.
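
    The difference is easy to observe by comparing stack traces. The sketch below uses made-up method names (Fail, RethrowPreserving, RethrowResetting); NoInlining is there only to keep the frames visible in optimized builds.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Fail() { throw new InvalidOperationException("boom"); }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void RethrowPreserving()
    {
        try { Fail(); }
        catch { throw; }                    // original trace is kept
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void RethrowResetting()
    {
        try { Fail(); }
        catch (Exception ex) { throw ex; }  // trace restarts at this line
    }

    // Runs the action and returns the stack trace of whatever it throws.
    static string TraceOf(Action action)
    {
        try { action(); }
        catch (Exception ex) { return ex.StackTrace; }
        return "";
    }

    public static string PreservedTrace() { return TraceOf(RethrowPreserving); }
    public static string ResetTrace() { return TraceOf(RethrowResetting); }

    public static void Main()
    {
        Console.WriteLine(PreservedTrace().Contains("Fail()")); // True
        Console.WriteLine(ResetTrace().Contains("Fail()"));     // False
    }
}
```

    With throw ex, the frame for Fail, where the problem really happened, is gone from the trace.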

    If you find yourself in a situation where you think you want to rethrow an exception with your handler as the source, a better approach is to use a nested exception:

    catch (HibernateException ex)
    {
      if (transaction != null) { transaction.Rollback(); }
      throw new Exception("Email already in use", ex);
    }

    This way the original stack trace is still accessible via the InnerException property exposed by all exceptions.
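
    A short sketch of reading the nested exception back out. WrapDemo, SaveUser and Register are hypothetical names, and InvalidOperationException stands in for the HibernateException used above so the example has no external dependencies.

```csharp
using System;

public static class WrapDemo
{
    static void SaveUser()
    {
        // Stand-in for the low-level failure, e.g. a unique-key violation.
        throw new InvalidOperationException("duplicate key");
    }

    // Wraps the low-level failure and returns the wrapper for inspection.
    public static Exception Register()
    {
        try
        {
            try { SaveUser(); }
            catch (InvalidOperationException ex)
            {
                throw new Exception("Email already in use", ex);
            }
        }
        catch (Exception outer) { return outer; }
        return null;
    }

    public static void Main()
    {
        Exception e = Register();
        Console.WriteLine(e.Message);                // Email already in use
        Console.WriteLine(e.InnerException.Message); // duplicate key
    }
}
```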

    When To Throw Exceptions

    It's important to know how to throw exceptions. A far more interesting topic though is when and why you should throw them. Having someone else's unruly code bring down your application is one thing. Writing your own code that'll do the same thing just seems plain silly. However, a good developer isn't afraid to judiciously use exceptions.

    There are actually two levels of thought on how exceptions should be used. The first level, which is universally accepted, is that you shouldn't hesitate to raise an exception whenever a truly exceptional situation occurs. My favorite example is the parsing of configuration files. Many developers generously use default values for any invalid entries. This is ok in some cases, but in others it can put the system in an unreliable or unexpected state. Another example might be a Facebook application that gets an unexpected result from an API call. You could ignore the error, or you could raise an exception, log it (so that you can fix it, since the API might have changed) and present a helpful message to your users.

    The other belief is that exceptions shouldn't just be reserved for exceptional situations, but for any situation in which the expected behavior cannot be executed. This approach is related to the design by contract approach - a methodology that I'm adopting more and more every day. Essentially, if the SaveUser method isn't able to save the user, it should throw an exception.

    In languages such as C#, VB.NET and Java, which don't have a built-in design by contract mechanism, this approach can have mixed results. A Hashtable returns null when a key isn't found, but a Dictionary throws an exception - the unpredictable behavior sucks (if you're curious why they work differently check out Brad Abrams' blog post). There's also a line between what constitutes control flow and what's considered exceptional. Exceptions shouldn't be used to control if/else-like logic, but the bigger a part they play in a library, the more likely programmers will use them as such (the int.Parse method is a good example of this).
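
    The Hashtable and Dictionary difference looks like this in practice. LookupDemo and its method names are invented for the sketch; TryGetValue is shown as the non-throwing way to probe a Dictionary.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class LookupDemo
{
    static readonly Hashtable Table = new Hashtable();
    static readonly Dictionary<string, int> Map = new Dictionary<string, int>();

    static LookupDemo()
    {
        Table["a"] = 1;
        Map["a"] = 1;
    }

    // Hashtable's indexer quietly returns null for a missing key.
    public static object FromHashtable(string key) { return Table[key]; }

    // Dictionary's indexer throws KeyNotFoundException instead.
    public static bool DictionaryThrows(string key)
    {
        try { int ignored = Map[key]; return false; }
        catch (KeyNotFoundException) { return true; }
    }

    // TryGetValue avoids using an exception as control flow.
    public static bool TryGet(string key, out int value)
    {
        return Map.TryGetValue(key, out value);
    }

    public static void Main()
    {
        Console.WriteLine(FromHashtable("missing") == null); // True
        Console.WriteLine(DictionaryThrows("missing"));      // True
        int value;
        Console.WriteLine(TryGet("missing", out value));     // False
    }
}
```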

    Generally speaking, I find it easy to decide what should and shouldn't throw an exception. I generally ask myself questions like:
    1 - Is this exceptional,
    2 - Is this expected,
    3 - Can I continue doing something meaningful at this point and
    4 - Is this something I should be made aware of so I can fix it, or at least give it a second look

    Perhaps the most important thing to do when throwing exceptions, or dealing with exceptions in general, is to think about the user. The vast majority of users are naive compared to programmers and can easily panic when presented with error messages. Jeff Atwood recently blogged about the importance of crashing responsibly:
    1 - It is not the user's job to tell you about errors in your software!
    2 - Don't expose users to the default screen of death.
    3 - Have a detailed public record of your application's errors.

    It's probably safe to say that Windows' Blue Screen of Death is exactly the type of error message users shouldn't be exposed to (and don't think just because the bar has been set so low that it's ok to be as lazy).

    Creating Custom Exceptions

    One of the most overlooked aspects of domain driven design is custom exceptions. Exceptions play a serious part in any business domain, so any serious attempt at modeling a business domain in code must include custom exceptions. This is especially true if you believe that exceptions should be used whenever a method fails to do what it says it will. If a workflow state is invalid it makes sense to throw your own custom WorkflowException and even attach some specific information to it, which might not only help you identify a potential bug, but can also be used to present meaningful information to the user.

    Many of the exceptions I create are nothing more than marker exceptions - that is, they extend the base System.Exception class and don't provide further implementation. I liken this to marker interfaces (or marker attributes), such as the INamingContainer interface. These are particularly useful in allowing you to avoid swallowing exceptions. Take the following code as an example. If the Save() method doesn't throw a custom exception when validation fails, we really have little choice but to swallow all exceptions:

    try
    {
       user.Save();
    }
    catch
    {
       Error.Text = user.GetErrors();
       Error.Visible = true;
    }
    //versus
    try
    {
       user.Save();
    }
    catch(ValidationException ex)
    {
       Error.Text = ex.GetValidationMessage();
       Error.Visible = true;
    }

    The above example also shows how we can extend exceptions to provide further custom behavior specifically related to our exceptions. This can range from something as simple as an ErrorCode to more complex information, such as a PermissionException which exposes the user's permission and the missing required permission.

    Of course, not all exceptions are tied to the domain. It's common to see more operational-oriented exceptions. If you rely on a web service which returns an error code, you may very well wrap that into your own custom exception to halt execution (remember, fail fast) and leverage your logging infrastructure.

    Actually creating a custom exception is a two step process. First (and technically this is all you really need) create a class, with a meaningful name, which inherits from System.Exception.

    public class UpgradeException : Exception
    {
    }

    You should go the extra step and mark your class with the SerializableAttribute and always provide at least 4 constructors:
    1 - public YourException()
    2 - public YourException(string message)
    3 - public YourException(string message, Exception innerException)
    4 - protected YourException(SerializationInfo info, StreamingContext context)

    The first three allow your exception to be used in an expected manner. The fourth is used to support serialization in case .NET needs to serialize your exception - which means you should also implement the GetObjectData method. The purpose of supporting serialization is to let any custom properties you add survive being serialized and deserialized. Here's the complete example:

    using System;
    using System.Runtime.Serialization;

    [Serializable]
    public class UpgradeException : Exception
    {
      private int _upgradeId;
      public int UpgradeId { get { return _upgradeId; } }

      public UpgradeException(int upgradeId)
      {
        _upgradeId = upgradeId;
      }
      public UpgradeException(int upgradeId, string message, Exception inner) : base(message, inner)
      {
        _upgradeId = upgradeId;
      }
      public UpgradeException(int upgradeId, string message) : base(message)
      {
        _upgradeId = upgradeId;
      }
      // Deserialization constructor: restores the custom property.
      protected UpgradeException(SerializationInfo info, StreamingContext context) : base(info, context)
      {
        if (info != null)
        {
          _upgradeId = info.GetInt32("upgradeId");
        }
      }
      // Serialization: stores the custom property, then lets the base class add its own data.
      public override void GetObjectData(SerializationInfo info, StreamingContext context)
      {
        if (info != null)
        {
          info.AddValue("upgradeId", _upgradeId);
        }
        base.GetObjectData(info, context);
      }
    }

    Conclusion

    It can take quite a fundamental shift in perspective to appreciate everything exceptions have to offer. Exceptions aren't something to be feared or protected against, but rather vital information about the health of your system. Don't swallow exceptions. Don't catch exceptions unless you can actually handle them. Equally important is to make use of built-in, or your own exceptions when unexpected things happen within your code. You may even expand this pattern for any method that fails to do what it says it will. Finally, exceptions are a part of the business you are modeling. As such, exceptions aren't only useful for operational purposes but should also be part of your overall domain model.

    Posted May 29 2008, 08:02 PM by karl with 25 comment(s)
  • Signing Requests (or anything) with Hashes

    It's common practice to use hashes for signing purposes. For example, you can take a hash (MD5, SHA1, it doesn't really matter) of a file's byte contents and use that hash to ensure the file hasn't been modified. It's also quite common to hash a user's password - which is essentially creating a signature with a single variable.

    Hashes are ideal because they are relatively fast to compute, available for virtually every known language, secure (especially if you add a random value in order to prevent a dictionary attack), and easy to implement. Using a hash signature with a web service is a great and simple way to provide a certain degree of authentication. It's the approach Facebook, as well as other platforms, use as the basis of their security model for third party applications.

    At a high level, our strategy is to define a simple algorithm for creating a hash based on parameters and a previously agreed upon secret value. Then, every request a client makes includes a hash. The server takes the request and computes its own hash. If the server's hash and the client's hash match, the request is authentic and execution can proceed. Pretend we have a web service for a video game which accepts a score. Its simplest implementation might look something like:

    savescore.axd?sessionId=323393&score=500

    Of course, this isn't very secure. Anyone with basic knowledge of HTTP can easily resubmit the request with whatever values they want:

    savescore.axd?sessionId=323393&score=01123581321345589

    However, if we sign the request based on the two input values (sessionId and score) as well as a secret value, it is no longer possible to simply change the score:

    savescore.axd?sessionId=323393&score=500&h=albkk309sk

    So how exactly do we create this signature? Well, sticking with Facebook as our guide, we'll take each key=>value parameter, sort them alphabetically by key, and concatenate them together along with our secret value. So, given the above query and a secret value of "itsover9000", we'd create a hash of:

    score=500sessionId=323393itsover9000

    (notice how we aren't separating our pairs with & - that's just how Facebook does it; you can add an extra separator or anything else if you want)

    Here's a function that does just that:

    public static string CreateSignature(IDictionary<string, string> parameters, string secret)
    {
      SortedDictionary<string, string> sortedParameters = new SortedDictionary<string, string>(parameters);
      StringBuilder sb = new StringBuilder();
      foreach (KeyValuePair<string, string> kvp in sortedParameters)
      {
         sb.Append(string.Concat(kvp.Key, "=", kvp.Value));               
      }        
      sb.Append(secret);
      byte[] bytes = new MD5CryptoServiceProvider().ComputeHash(Encoding.UTF8.GetBytes(sb.ToString()));
      StringBuilder data = new StringBuilder();
      for (int i = 0; i < bytes.Length; ++i)
      {
         data.Append(bytes[i].ToString("x2"));
      }
      return data.ToString();
    }

    (The last part turns our hashed byte array into a hexadecimal format)
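
    To see the whole round trip, here is a self-contained sketch that signs the example request and then checks both a genuine and a tampered copy. The RequestSigner class and the IsAuthentic helper are names invented for the example, and MD5.Create() is used in place of MD5CryptoServiceProvider (an assumption made for portability; both compute the same MD5 digest).

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class RequestSigner
{
    // Same algorithm as above: sorted key=value pairs, then the secret, then MD5 as hex.
    public static string CreateSignature(IDictionary<string, string> parameters, string secret)
    {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<string, string> kvp in new SortedDictionary<string, string>(parameters))
        {
            sb.Append(kvp.Key).Append("=").Append(kvp.Value);
        }
        sb.Append(secret);
        byte[] bytes;
        using (MD5 md5 = MD5.Create())
        {
            bytes = md5.ComputeHash(Encoding.UTF8.GetBytes(sb.ToString()));
        }
        StringBuilder data = new StringBuilder();
        foreach (byte b in bytes) { data.Append(b.ToString("x2")); }
        return data.ToString();
    }

    // Hypothetical server-side check: recompute and compare.
    public static bool IsAuthentic(IDictionary<string, string> parameters, string clientHash, string secret)
    {
        return !string.IsNullOrEmpty(clientHash) && clientHash == CreateSignature(parameters, secret);
    }

    public static void Main()
    {
        Dictionary<string, string> request = new Dictionary<string, string>();
        request["sessionId"] = "323393";
        request["score"] = "500";
        string h = CreateSignature(request, "itsover9000");

        Console.WriteLine(IsAuthentic(request, h, "itsover9000")); // True
        request["score"] = "999999";                               // tampered value
        Console.WriteLine(IsAuthentic(request, h, "itsover9000")); // False
    }
}
```

    Because the parameters are sorted before hashing, the client and server don't need to agree on the order in which the query string was built.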

    Depending on exactly what it is you're trying to do, you can integrate that directly into your framework. For example, if you're using HttpHandlers, you could easily build a base class to handle all of this:

    public abstract class SignedRequestHandler : IHttpHandler
    {
       public void ProcessRequest(HttpContext context)
       {
          NameValueCollection parameters;
          if (string.Compare(context.Request.HttpMethod, "GET", true) == 0)
          {
             parameters = new NameValueCollection(context.Request.QueryString);
          }
          else
          {
             parameters = new NameValueCollection(context.Request.Form);
          }
          string clientHash = parameters["h"];         
          parameters.Remove("h");
          if (string.IsNullOrEmpty(clientHash) || clientHash != CreateSignature(parameters, "itsover9000"))
          {
             throw new RequestSignatureException();
          }         
       }
    
       public abstract bool IsReusable { get; }
    
       private static string CreateSignature(NameValueCollection parameters, string secret)
       {
          SortedDictionary<string, string> sortedParameters = new SortedDictionary<string, string>();
          foreach (string item in parameters.AllKeys)
          {
             sortedParameters.Add(item, parameters[item]);
          }
          return CreateSignature(sortedParameters, secret);
       }
       private static string CreateSignature(SortedDictionary<string, string> parameters, string secret)
       {         
          StringBuilder sb = new StringBuilder();
          foreach (KeyValuePair<string, string> kvp in parameters)
          {
             sb.Append(kvp.Key);
             sb.Append("=");
             sb.Append(kvp.Value);
          }
          sb.Append(secret);
          byte[] bytes = new MD5CryptoServiceProvider().ComputeHash(Encoding.UTF8.GetBytes(sb.ToString()));
          StringBuilder data = new StringBuilder();
          for (int i = 0; i < bytes.Length; ++i)
          {
             data.Append(bytes[i].ToString("x2"));
          }
          return data.ToString();
       }
    }

    Here we've just hardcoded the secret. If you're targeting multiple clients/users, you need to assign each of them an application id (most open web services call this an APIKey) and have it passed along with each query. You can then use it to look up the appropriate secret, something like:

    string apiKey = parameters["apikey"];
    string secret = Account.GetSecretFromApiKey(apiKey);

    And, that's it :)

  • Beyond Mocks - the Partial

    It's a little easy to get carried away with mocking when you first start doing it in conjunction with unit tests. Mocks are unbelievably flexible/powerful. However, to get the most out of them, you really need to be able to inject your mock into the class you're testing. Coincidentally, trying to leverage mocks in your code will lead you to the first tangible benefit of unit testing - you'll notice how tightly coupled your code is (i.e., impossible to test) and learn a wonderful arsenal of strategies to help you write better code.

    There comes a point though where injection just doesn't make a whole lot of sense for anything _but_ testing and even then the value might not be that evident. Let's see how we can use RhinoMocks' partials to solve the problem. Take for example a LoginHttpHandler that looks something like:

    public void ProcessRequest(NameValueCollection parameters)
    {
       string userName = parameters["username"];   
       string password = parameters["password"];
       User user = User.CreateFromCredentials(userName, password);
       //do something with our user
    }

    This code is impossible to unit test because the call to User.CreateFromCredentials can't be mocked/faked/anything. The best we can do is mock whatever dependencies User.CreateFromCredentials has - but that's just a nightmare not worth exposing ourselves to.

    What we can do is use a partial instead of a mock. A mock is a dumb object that only records expectations and actuals. A partial on the other hand exposes all of the behavior of the class we've created a partial for - except for the methods we've explicitly set expectations on. For example, if we rewrite the above code as:

    public void ProcessRequest(NameValueCollection parameters){
       string userName = parameters["username"];
       string password = parameters["password"];
       User user = GetUser(userName, password);
      //do something with our user
    }
    public virtual User GetUser(string userName, string password)
    {
       return User.CreateFromCredentials(userName, password);
    }

    We can use a partial to write a useful unit test (well, there isn't much to test for, but you get the idea):

    [Test]
    public void LoginHandlerInteractsWithUser()
    {
       MockRepository mocks = new MockRepository();
       LoginHttpHandler handler = mocks.PartialMock<LoginHttpHandler>();
       User fakeUser = new User();
       NameValueCollection nvc = new NameValueCollection();
       nvc.Add("userName", "un");
       nvc.Add("password", "pass");
       using (mocks.Record())
       {
          Expect.Call(handler.GetUser(nvc["userName"], nvc["password"])).Return(fakeUser);
       }
       using (mocks.Playback())
       {
          handler.ProcessRequest(nvc);
       }
       mocks.VerifyAll();
    }

    We create a partial of our LoginHttpHandler which behaves exactly like a new instance of it - except for those methods (like GetUser) which we've set an expectation on. Just like with a mock, if ProcessRequest called GetUser twice, we'd have to record that call twice. Unlike a mock however, if we didn't, the 2nd call would call the real implementation of GetUser.

    The only other thing to keep in mind is that the method you plan on mocking must be virtual - this allows RhinoMocks to override the method and do its own thing (implement the record/playback logic).
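
    To make the virtual requirement concrete, here is roughly what a partial amounts to when written by hand, without any mocking framework. This is only an illustration of the idea, not how RhinoMocks is implemented; StubbedLoginHttpHandler, LastUser and GetUserCalls are hypothetical names, and the User class is a bare stand-in.

```csharp
using System;
using System.Collections.Specialized;

public class User { }

public class LoginHttpHandler
{
    public User LastUser; // illustrative: lets the test observe the outcome

    public void ProcessRequest(NameValueCollection parameters)
    {
        LastUser = GetUser(parameters["username"], parameters["password"]);
    }

    public virtual User GetUser(string userName, string password)
    {
        // Stand-in for User.CreateFromCredentials, which would hit the database.
        throw new InvalidOperationException("no database available in a unit test");
    }
}

// Everything is inherited from the real handler except the one
// virtual method we replace, which is what a partial gives us.
public class StubbedLoginHttpHandler : LoginHttpHandler
{
    public int GetUserCalls;
    public User FakeUser = new User();

    public override User GetUser(string userName, string password)
    {
        GetUserCalls++;
        return FakeUser;
    }
}

public static class PartialDemo
{
    public static void Main()
    {
        NameValueCollection nvc = new NameValueCollection();
        nvc.Add("username", "un");
        nvc.Add("password", "pass");

        StubbedLoginHttpHandler handler = new StubbedLoginHttpHandler();
        handler.ProcessRequest(nvc); // real ProcessRequest, fake GetUser

        Console.WriteLine(handler.GetUserCalls);                           // 1
        Console.WriteLine(ReferenceEquals(handler.LastUser, handler.FakeUser)); // True
    }
}
```

    If GetUser were not virtual, the override would not compile, which is the same constraint the mocking framework runs into.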

    In my recent trip down Java land, I actually had a hell of a problem doing this seemingly simple thing. I tried a number of mocking frameworks, eventually settling on EasyMock, but the amount of work needed to set up partials was insane. If I missed something, I'd love to know about it.

  • What arrays and pointers have in common - or - Why arrays start at zero

    Every now and again someone asks why array indexes start at zero rather than one. It's a good question, because the 1st item in the array is, well, the 1st, not the 0th. Although the answer is easily found on the web, I figured it would tie in nicely with where we left off last time - looking at memory in .NET. Not surprisingly this stems from the good old C days (and perhaps even earlier). You see, back then, arrays and pointers were highly related. I actually remember thinking of them as identical things with different syntax. Once you see that arrays are just syntactic sugar for pointers, you'll understand why everything starts at zero. I'm not sure how much of this still applies in modern languages. I assume that arrays and pointers are still closely correlated - but I'm pretty sure zero now sticks around for convention. Everything below is in C.

    When you create an array, you cut out a contiguous chunk of memory which stores each value in its own block. So:

    int values[3] = {1, 20, 30};

    will cut out 96 bits of memory (3*32, assuming 32-bit ints) and place the value 1 in the first block, 20 in the second and 30 in the third.

    Since this (like everything else) is just memory, we can assign it to a pointer:

    int *p = &values[0]; //the & operator returns the address of a value

    The above code assigns the address of the first value to our pointer. So we can say that p points to the start of our array. From our pointer, we can get the 2nd element or 3rd element:

    printf("%d\r\n", *(p+1));
    printf("%d\r\n", *(p+2));

    As you can see, *(p+i) is the exact same thing as saying values[i]. So, if you look at it from the point of view of pointers, the zero index makes perfect sense because it signifies the memory offset from the start of the array. To prove how tightly coupled the two concepts are, we can actually change:

    int *p = &values[0];

    to 

    int *p = values;

    since, in an expression like this, the values array is treated as nothing more than a pointer to its first element.

    While we're here, let's look at a crazy example that shows the danger (and power) of C. Even though we've only allocated 3 spots for our array, we can easily peek at what's next with either of these statements:

    printf("%d\r\n", *(p+3));
    printf("%d\r\n", values[3]);

    Worse, we can write an arbitrary value there:

    *(p+3)  = 0;
  • Foundations of Programming - pt 7 - Addendum

    I've made two additions to Part 7. The first is based on a suggestion by Greg to talk about a common cause of memory leaks - events and delegates. The second is about deterministic finalization.

    Memory Leaks with Events
    There's one specific situation worth mentioning as a common cause of memory leaks: events. If, in a class, you register for an event, a reference is created to your class. Unless you de-register from the event, your object's lifecycle will ultimately be determined by the event source. In other words, if ClassA (the listener) registers for an event in ClassB (the event source), a reference is created from ClassB to ClassA. Two solutions exist: de-register from events when you're done (the IDisposable pattern is the ideal solution), or use the WeakEvent Pattern or a simplified version.
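
    The first solution, de-registering in Dispose, might look like the sketch below. Publisher plays the role of ClassB and Listener the role of ClassA; the HandlerCount property exists only so the example can show the reference appearing and disappearing.

```csharp
using System;

public class Publisher // ClassB, the event source
{
    public event EventHandler Changed;

    // Number of handlers (and so listener references) currently held.
    public int HandlerCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }
}

public class Listener : IDisposable // ClassA, the listener
{
    private readonly Publisher _source;

    public Listener(Publisher source)
    {
        _source = source;
        _source.Changed += OnChanged; // the source now references this listener
    }

    private void OnChanged(object sender, EventArgs e) { }

    public void Dispose()
    {
        _source.Changed -= OnChanged; // drop the reference held by the source
    }
}

public static class EventDemo
{
    public static void Main()
    {
        Publisher source = new Publisher();
        using (Listener listener = new Listener(source))
        {
            Console.WriteLine(source.HandlerCount); // 1
        }
        Console.WriteLine(source.HandlerCount);     // 0
    }
}
```

    Once the handler is removed, the publisher no longer keeps the listener alive and the garbage collector is free to reclaim it.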


    Deterministic Finalization

    Despite the presence of the garbage collector, developers must still take care of managing some of their references. That's because some objects hold on to vital or limited resources, such as file handles or database connections, which should be released as soon as possible. This is problematic since we don't know when the garbage collector will actually run - by nature the garbage collector only runs when memory is in short supply. To compensate, classes which hold on to such resources should make use of the Dispose pattern. All .NET developers are likely familiar with this pattern, along with its actual implementation (the IDisposable interface), so we won't rehash what you already know. With respect to this chapter, it's simply important that you understand the role deterministic finalization takes. It doesn't free the memory used by the object. It releases resources. In the case of database connections for example, it releases the connection back to the pool in order to be reused.

    If you forget to call Dispose on an object which implements IDisposable, the garbage collector will eventually run the object's finalizer (assuming the class has one), which typically releases the same resources. You shouldn't rely on this behavior however, as the problem of limited resources is very real (it's relatively trivial to try it out with a loop that opens connections to a database). You may be wondering why some objects expose both a Close and Dispose method, and which you should call. In all the cases I've seen the two are generally equivalent - so it's really a matter of taste. I would suggest that you take advantage of the using statement and forget about Close. Personally I find it frustrating (and inconsistent) that both are exposed.


    Finally, if you're building a class that would benefit from deterministic finalization you'll find that implementing the IDisposable pattern is simple. A straightforward guide is available on MSDN.

    Posted May 04 2008, 04:28 PM by karl with 5 comment(s)