CodeBetter.Com

Ian Cooper [MVP]

December 2007 - Posts

  • Registration for altnetconf opens January 2nd

    A short message to let you know that we plan on opening registration for the UK altnetconf (to be held on February 1st and 2nd) on January 2nd. So you will have time to recover from your New Year hangovers before needing to hover over your computer to try and sign up. We will do registration in two batches, so if you cannot sign in on the 2nd do not panic, we will announce a second sign-up date later.

    The total number of attendees is limited (75 on the day), both for logistics and to preserve the 'big conversation' feel of the event. We will endeavor to record what happens, so if you cannot make it, you should still be able to pick up on what transpired after the event.

    Because we are getting our feet wet with this one, I wanted to point out that the Friday is likely to be a late-afternoon/early-evening activity where we build the agenda and go out for drinks/eats, with the conference proper taking place on Saturday. One item we can discuss at this event is extending the time period to be closer to the length of the original Austin conference.

     

     

     

  • Mocks and the Dangers of Overspecified Software

    I'll be back on LINQ architecture after the holidays, but in the meantime, I wanted to share some of the bad, some of the places where we have had bitter experiences.

    When NMock first appeared we embraced the behaviour verification style that it supported. We liked the idea that for 'unit tests' we should not interact with other concrete classes. We liked the way that mocks had to derive from interfaces, abstract classes, or override virtual methods. We wanted to depend upon abstractions, not details, and we liked the way that mocks gave us an emergent design that exhibited this quality.

    One of our first pushes was against slow tests that talked to the Db. Writing our tests against any kind of shared fixture was painful (note that in NUnit, MbUnit, MSTest et al. class variables are shared state for all tests in the fixture; xUnit tries to fix this), but writing tests against shared state in the Db was especially painful. Either tests influenced each other, or we wrote complex setup and tear-down code. Even using tricks like OleTx transactions we still had to pre-populate the Db or set up everything each time. And the tests were slow...

    So we mocked out our DataMappers and, freed from the dependency on the Db, testing our Domain proceeded like a dream.

    The project inherited a byzantine legacy Db schema that was not amenable to mapping via an ORM tool (at least at the state of the art for that time), so we had to roll our own DataMappers. Due to the limited re-use options outside this context, instead of rolling our own reflection and generics approach, with its attendant complexity, we opted to aim for a solution that we could eventually just code-gen. Keen to build by hand first, then code-generate once we were sure it worked, we wanted to drive development of our DataMappers via TDD. Inferring from our wins with mocks in the domain, we expected to be able to gain similar benefits by mocking out our DataMappers' interaction with the Db (mea culpa).

    So, to persist, we created an abstraction of the Db, an IDatabase. We then mocked that IDatabase in our unit tests and created expectations of the behaviour of our mappers, by expecting calls to the Db to execute SQL. This enabled us to check that we created the stored procedure calls we would expect. To create parameter lists for those procedures we created classes (which we intended to eventually generate) that mapped domain objects into SQL parameters.

    To materialize objects back out we created classes (which we would again auto-generate) that gave us the ordinals needed to read the fields from the row corresponding to the class. We used a dependent mapping strategy, so that our domain worked with a DataMapper for a root class, and that mapper loaded any child entities and value objects.

    Of course, we still needed integration tests to see if they would actually work, and of course, we found that some of the SQL code we were generating passed unit tests, but failed when run against the Db, but overall we were pretty proud of how we were testing the Db. There were some warnings (and one member of the team expressed doubts) but it seemed to go pretty well. As an aside, as we were only mapping out a few classes, the pressure to code generate never hit us. Believing that we needed to build 2 or 3 mappers before we would see our mapper design emerge (by refactoring to remove duplication) we never reached the point where we needed to push on from there to code generation to complete our persistence requirements. Simply put, we did not need to persist enough items that the cost of writing code generation became less than the cost of implementing the remaining mappers by hand.

    The problems began to hit us in maintenance (and that can hit quite early on an agile project with frequent releases). We had a number of issues:

    Mocking tools were not strongly-typed

    At that time the mocking tools just used strings; there was none of the record-and-replay style seen within tools like Rhino Mocks. This is important not only because the compiler can no longer help you find errors, but because refactoring tools stop helping you make changes when the method call is a string. So Rename Method, to express intent, became search-and-replace. Unit tests would pass, but integration tests failed, because the names had changed.

    The test code was over-coupled
    A lot of our test code contained a dozen lines of set up for mocking along the lines of:

    database.Expect("AddInParameter", dbCommand.MockInstance, "@Username", DbType.String, person.UserName);
    database.Expect("AddInParameter", dbCommand.MockInstance, "@FirstName", DbType.String, person.FirstName);
    database.Expect("AddInParameter", dbCommand.MockInstance, "@MiddleInitial", DbType.String, person.MiddleInitial);
    database.Expect("AddInParameter", dbCommand.MockInstance, "@Surname", DbType.String, person.Surname);

    The tests were coupled not only to the domain model but also to the schema, and represented a point of resistance to change for us.

    Tests predicted the implementation rather than letting it evolve through refactoring
    The tests specified the implementation and as such writing the test constrained the implementation. This broke the more normal TDD pattern of make it pass then refactor. Writing the specification for the implementation in the tests is expensive, and error prone.

    Changes to implementation were Shotgun Surgery
    When you change the implementation of a method under test, mocks can break because you now make additional or different calls to the dependent component that is being mocked. For us, if you needed to add an extra parameter to a domain class under our model for example, you had to create the expectation for that parameter in the test. After a while the process of adding a new field became expensive, and the number of changes required to add a new method began to smell of shotgun surgery. The trouble had become that our tests not only specified the inputs and outputs but also how the method under test was implemented: the order and number of calls.

    The mocks began to make our software more resistant to change, more sluggish, and this increased the cost of refactoring. As change became more expensive, we risked becoming resistant to making it, and we risked starting to build technical debt. A couple of times the tests broke, as developers changed the domain, or changed how we were doing persistence, without changing the test first, because they were frustrated at how it slowed their development. The mocks became an impediment to progress.

    Mocks had become, for us, fragile tests.

    Red, Green, Refactor
    Agile methodologies allow a just-in-time design approach because you can refactor existing code at low cost and risk. Unit tests enable this scenario, because they protect against changes in behaviour of the system under test. You can change the implementation, provided the behaviour remains the same. However, fragile mocks can break even when the behaviour remains consistent, so they increase the cost of refactoring and risk becoming an obstacle to change.

    Maybe we should have just done integration testing?
    By contrast the effort to check the classes using integration tests turned out to be quite small in this instance, because we only needed to check our ability to insert, update, and delete on each mapper. We had gained a lot by removing our dependency on the Db from the domain with the DataMapper, but our desire to mock out the Db in the implementation of the DataMapper looked as though it cost us more than it saved. It was a bridge too far.

    Fragility and Mocks
    When I look around now, I see a lot of people using mocks to replace all their dependencies. My concern is that they will begin to hit the Fragile Test issues that mocks present. Gerard Meszaros identifies the issues we hit as two specific smells: Overspecified Software and Behaviour Sensitivity.

    Test Doubles
    Gerard Meszaros classifies any object we use to stand in for another object during a test as a Test Double. It is worth reading what Gerard has to say either on the web site or in his book. The key is to understand that you are replacing a dependency to isolate the object under test from either Indirect Inputs or Indirect Outputs. Mocks are really only a sweet spot for testing indirect outputs. If you have indirect inputs, a Test Stub or Fake Object may be a more maintainable approach than a mock. Even for indirect outputs it is worth considering a Test Spy (we find the Self-Shunt variation particularly simple to use) or Test-Specific Subclass before looking at a mock.
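
    As a rough illustration of the Self-Shunt variation of a Test Spy, here is a minimal sketch. The names IAuditLog, Account, and AccountTests are hypothetical and invented for this example; they are not from the project described above. The test class implements the dependency itself and simply records what it is told, so the assertions check the outcome afterwards rather than scripting the order and number of calls up front.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction the domain depends on (invented for this sketch).
public interface IAuditLog
{
    void Record(string message);
}

// Hypothetical domain class whose indirect output we want to observe.
public class Account
{
    private readonly IAuditLog log;
    public decimal Balance { get; private set; }

    public Account(IAuditLog log)
    {
        this.log = log;
    }

    public void Deposit(decimal amount)
    {
        Balance += amount;
        log.Record("Deposited " + amount);
    }
}

// Self-Shunt: the test class itself stands in for the dependency and
// records the calls it receives, so we can inspect them after the fact.
public class AccountTests : IAuditLog
{
    public readonly List<string> Recorded = new List<string>();

    public void Record(string message)
    {
        Recorded.Add(message);
    }

    public void DepositIsAudited()
    {
        Account account = new Account(this);
        account.Deposit(50m);

        if (account.Balance != 50m) throw new Exception("balance not updated");
        if (Recorded.Count != 1) throw new Exception("deposit not audited");
    }
}
```

    Because the spy only records, Account is free to change how it talks to the log (for instance, formatting the message differently) without the test breaking, as long as a deposit is still audited.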

    Switching to fakes and stubs
    Since that project we have weaned ourselves from our mock dependency and try to use what Fowler calls a classicist approach to TDD more. Where we do replace a depended-upon component, we try to use the appropriate technique, depending on whether our concern is an indirect input or an indirect output in the dependency. In addition, when talking to the outside world we weigh up the point at which the 'last mile' should be checked with an integration test rather than a unit test. So while I want to isolate my domain, I may make different judgements in the service layer. Mocking frameworks are powerful, but 'with great power comes great responsibility'.


  • Alt.Net.UK Jeremy wins fastest finger

    Wow, Jeremy is on the mark, he beat me to an announcement about Alt.Net.UK. You can check out Ben's blog, where Jeremy saw it.

    Following Bill and Jeff's call to arms, we are going to put on an altnetconf in London on Friday 1st and Saturday 2nd February 2008. The conference will be Open Spaces, just like the original Austin altnetconf, and focused on the theme Bill identifies: "AltNetConf's are open spaces conferences where DotNetters get together to discuss how to build better .Net software." Conchango will be hosting the event. We do not have registration yet, but it will be coming. Expect the numbers to be of the same order as altnetconf in Austin, about 100 people. We understand there may be more demand than that, much as there always is for BarCamp, but this number looks manageable to us, this time around. We may follow others' lead and release tickets in phases, rather than all at once.

    As well as Ben and myself, Alan Dean and Robert Grigg are helping put this together, ably assisted by Michelle Flynn. Feel free to get in touch if you have any questions. We will try to answer them as best we can.

  • Architecting LINQ to SQL applications, part 4

    Dynamic Queries

    One question that arises from time to time is: how do I do dynamic queries in LINQ? The problem usually stems from allowing users to generate search criteria, perhaps through a filter for a list. Because we cannot predict the combinations of values that the user will choose, LINQ, where we define our statements at compile time, cannot really help us. So while we can search for all the customers in London, we don’t have any syntax that lets us evaluate an expression supplied at run time. LINQ helps us with compile-time expressions, but it is less help with run-time expressions.

    So how do we deal with dynamic expressions?

    First, LINQ to SQL does have a get-out-of-jail card: it’s called ExecuteQuery, and it is provided by DataContext to let you pass SQL straight through to the Db. LINQ also helps alleviate some pain here: object materialization is supported, so we can still get LINQ to pull back objects for us.

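    TestDataContext context = new TestDataContext();

    // Build the WHERE clause from the user's criteria (column name -> value).
    // Column names must come from a trusted list; values are passed as
    // {0}, {1}, ... placeholders, which ExecuteQuery turns into SQL parameters.
    StringBuilder buffer = new StringBuilder(
     "select * from Customer where ");
    List<object> parameters = new List<object>();

    for (int i = 0; i < listDictionary.Count; i++)
    {
           if (i > 0) buffer.Append(" and ");
           buffer.AppendFormat("{0} = {{{1}}}", listDictionary.Keys[ i ], i);
           parameters.Add(listDictionary.Values[ i ]);
    }
    IEnumerable<Customer> customers = context.ExecuteQuery<Customer>(buffer.ToString(), parameters.ToArray());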
     

    Of course the danger here is that we now have some SQL, which is expressed not in terms of the domain, but of the relational schema. So if we do this, we need to wrap our interaction with the DataContext within a Data Mapper, or perhaps in this sub-case a service, and push it all the way out into the infrastructure layer. Whatever I say next, a lot of people will take this pragmatic way out. Go for your life.

    Specification

    NHibernate gives us a little language, HQL, which allows us to query our objects. The advantage is that HQL is expressed as a string, so we can compose it, and thus use it to provide dynamic query support. This gives it an advantage in this instance over LINQ. The downside is that it is never checked by the compiler, so errors are more expensive to find.

    There is a way to gain the benefits of composition, with the comfort of type checking. One solution is the Specification pattern, which I have blogged about before. To summarize that lengthier post: a specification is a pattern for expressing a rule which you want to use to test an object. A specification is ultimately used to separate two orthogonal concerns: testing objects and the objects themselves.

    What we want is something like below, where we combine criteria into an expression that we use to test items in a repository to produce a filtered result set.

    ISpecification<Customer> customersInLondon = new Specification<Customer>(c => c.City == "London");
    ISpecification<Customer> customersInUK = new Specification<Customer>(c => c.Country == "UK");
    ISpecification<Customer> customersInLondonUK = customersInLondon.And(customersInUK);
    List<Customer> matchingCustomers = customerRepository.FindBySpecification(customersInLondonUK);

    We construct individual queries, or specifications, in a strongly typed fashion by using generics and lambda expressions. We make them composable, via the composite pattern, so we can and/or/not expressions as we build them.

    public interface ISpecification<T>
    {
        Expression<Func<T, bool>> Predicate {get;set;}
        ISpecification<T> And(ISpecification<T> other);
        bool IsSatisfiedBy(T customer);
        ISpecification<T> Not();
        ISpecification<T> Or(ISpecification<T> other);
    }

    We define an abstract base class that handles the work of combining specifications. The hard work of combining the lambda expressions depends on some expression tree magic over at my previous blog post:

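    abstract public class AbstractSpecification<T> : ISpecification<T>
    {
        protected Expression<Func<T, bool>> predicate;
        // Exposes the expression tree so a repository can pass it to LINQ to SQL.
        public Expression<Func<T, bool>> Predicate
        {
            get { return predicate; }
            set { predicate = value; }
        }
        public abstract bool IsSatisfiedBy(T value);
        public ISpecification<T> And(ISpecification<T> other)
        {
            return new AndSpecification<T>(this, other);
        }
        // Or and Not follow the same shape as And; they are omitted here for brevity.
    }
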
    This combined expression can replace the predicate in the where clause of our LINQ expression, because we are just passing an expression tree through to LINQ to SQL for evaluation:

    public IEnumerable<Customer> FindBySpecification(ISpecification<Customer> specification)
    {           
        IQueryable<Customer> customerQuery = from c in Customers select c;
        IQueryable<Customer> restrictedCustomerQuery  = customerQuery.Where<Customer>(specification.Predicate);
        return restrictedCustomerQuery.ToList();
    }
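
    For readers who do not want to chase the earlier post, here is one possible way to combine two predicates into a single expression tree. This is a sketch and not necessarily the approach taken in that post: ParameterReplacer and PredicateCombiner are names invented for this example, and it relies on ExpressionVisitor, which is only public from .NET 4 onwards. The idea is to rewrite the right-hand lambda so it uses the left-hand lambda's parameter, then join the two bodies with AndAlso.

```csharp
using System;
using System.Linq.Expressions;

// Swaps one lambda parameter for another throughout an expression tree.
// (Hypothetical helper; ExpressionVisitor is public from .NET 4 onwards.)
public class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression source;
    private readonly ParameterExpression target;

    public ParameterReplacer(ParameterExpression source, ParameterExpression target)
    {
        this.source = source;
        this.target = target;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        return node == source ? target : base.VisitParameter(node);
    }
}

public static class PredicateCombiner
{
    // Builds x => left(x) && right(x) as one lambda over a single parameter.
    public static Expression<Func<T, bool>> And<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        ParameterExpression parameter = left.Parameters[0];
        Expression rightBody =
            new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);

        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, rightBody), parameter);
    }
}
```

    An AndSpecification could then set its predicate to something like PredicateCombiner.And(left.Predicate, right.Predicate), and implement IsSatisfiedBy by compiling that predicate and invoking it.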

    But, like I say, a lot of folks will prefer the ExecuteQuery approach over messing with expression trees.

    But my second piece of advice would be:

    You can use ExecuteQuery to exercise your dynamic queries, but the call should be encapsulated in the infrastructure layer to keep SQL out of the domain. Where you have a significant requirement for dynamic queries, consider implementing the Specification pattern and using Expressions to integrate with LINQ.

    Next up, we'll talk about how to build from the domain, rather than from the data, and how to markup your domain objects for persistence.

     

  • Architecting LINQ to SQL applications, part 3

    DAOs and Repositories

    One of the concerns we want to separate our domain from is how we persist the domain model. The domain should not need to know where (file, database, service etc.) or how. In the specific context of an RDBMS, because we do not want the domain to be coupled to the relational database or its schema, we do not want any SQL statements or ADO.NET objects within our domain. It is not just that the external schema may change but also that there may be a mismatch when mapping between the two: the impedance mismatch problem.

    DAO

    A Data Access Object (DAO) is a pattern for encapsulating data access. A DAO is an abstraction that provides services for retrieving, inserting, updating, and deleting objects from a persistent store.  Fowler calls a DAO a Data Mapper. We will use this term from now on, because the term DAO means different things to different people.

    Data Mapper

    Because the Data Mapper knows the details of RDBMS, how to connect to a DB, relational schemas, SQL etc. it cannot live in the domain, instead it is an infrastructure service. Because we do not want our domain to depend on concrete classes, we need to provide an interface for the Data Mapper, so that we can depend on an abstraction from the domain. We want to be insulated from change in either the persistence mechanism or medium.

    Test-first development will drive out this abstraction, because we want to replace the dependency on the DB when running tests (to prevent issues with both shared fixtures and slow tests). The domain, or an application service, uses the Data Mapper for persistence.

    Most hand-rolled or code-generated Data Mappers take the simplest form, using hard-coding, configuration files, or reflection to map between a class in the domain and a table in the RDBMS. Within .NET, a Data Mapper often contains ADO.NET code, using dynamic SQL or stored procedures. Internally a Data Mapper tends to use a DataReader for performance. Of course the Data Mapper does not have to be abstracting a DB, and we might be using XmlDocument etc. under the hood.

    The biggest advantages to the Data Mapper pattern are  
    • Simplicity: it leaves us with a clean domain model where the orthogonal concern of persistence is separated out from the domain classes.
    • Extensibility: By depending on an abstraction, instead of a concrete type, we can swap implementations (through a factory or IoC container), which allows us the flexibility to meet new needs, such as support for multiple vendors' SQL implementations.
    • Maintainability: Having all of the data access code for a class in one place reduces duplication and shotgun surgery when we need to update our data access code.

    There are a number of complementary patterns we can add when we use a Data Mapper:
    • Identity Map: When we load an entity we want future loads of that same entity to return the same reference. This is not for performance (that is a happy side effect) but ensures that we don’t lose any pending changes next time we reload, and allows us to compare for equality by address.
    • Unit of Work: Records the deltas between an object when loaded and when persisted, so that we only update with changes, not everything. The unit of work also allows us to bundle up changes and make them together, so as to improve performance.
    • Query Object: We don’t want to pollute our domain space with SQL when we want to perform dynamic queries, so we need an alternative way of representing a query.

    Instead of rolling our own Data Mapper we can adopt the expedient of buying an off-the-shelf product in the RDBMS space, called an Object Relational Mapping tool.
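
    To make the Identity Map idea concrete, here is a minimal sketch. IdentityMap is a name invented for this example, and the loader delegate is an assumption standing in for whatever Data Mapper call actually fetches the entity. A real ORM's map does considerably more (change tracking, scoping to a unit of work), so treat this only as an illustration of the "same key, same reference" guarantee.

```csharp
using System;
using System.Collections.Generic;

// A minimal Identity Map: one object instance per key for the lifetime
// of the map. The loader delegate stands in for the real Data Mapper call.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> loaded = new Dictionary<TKey, TEntity>();
    private readonly Func<TKey, TEntity> loader;

    public IdentityMap(Func<TKey, TEntity> loader)
    {
        this.loader = loader;
    }

    public TEntity Get(TKey key)
    {
        TEntity entity;
        if (!loaded.TryGetValue(key, out entity))
        {
            // Only go to the underlying store the first time a key is requested.
            entity = loader(key);
            loaded[key] = entity;
        }
        return entity;
    }
}
```

    Two calls to Get with the same key return the same reference, so pending changes made through one reference are visible through the other, and the store is only queried once.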

    Repository

    If we have a hand rolled Data Mapper, or ORM, although we can make calls to it from the domain, we still want to encapsulate that interaction.

    There are a number of reasons for this but the most important are:

    • DRY: We don’t want repeated code, setting up a Query object to do a common query for example, so by encapsulating into a single method we prevent that. 
    • Shotgun Surgery: We don’t want to have to modify our calls to the Data Mapper in many places, so by adding the code to a single class we don’t have to go searching for it within our application.
    • Gateway: We might want to change our ORM at some point in the future. Encapsulating the calls to the ORM allows us to limit the modifications we have to make to the classes that are implemented using the ORM.

    The Repository pattern looks and feels like a collection to the domain. This makes it clear and simple to work with. The unit of work often remains outside of the repository, because we may want to update multiple elements in a repository or multiple repositories at once. The repository is implemented by calls to the Data Mapper. Building a repository with LINQ is almost trivial:

    public class NorthwindRepository
    {
        private IQueryable<Customer> customers;
        public NorthwindRepository(DataContext context)
        {           
            customers = context.GetTable<Customer>();       
        }

        public Customer FindCustomer(string customerId)
        {           
            return (from c in customers
                where c.CustomerID == customerId
                select c).Single<Customer>();
        }
    }

    There are two approaches to testability with a repository. The simplest is to have the repository implement an abstraction (i.e. an interface), for which you then provide an in-memory stub for testing purposes. The other is to provide an abstraction for the Data Mapper that your repository depends upon and provide a test stub for that. The advantage of the second approach is that you can confirm the querying logic within your repository.

    public class CustomerRepository
    {
        private IUnitofWork workspace;
        public CustomerRepository(IUnitofWork workspace)
        {
            this.workspace = workspace;
        }
       
        public Customer FindCustomer(string customerId)
        {
            return (from c in workspace.Customers
                where c.CustomerID == customerId
                select c).Single<Customer>();       
        }
    }

    Jimmy Nilsson shows an example of this in his book, Applying Domain-Driven Design and Patterns, and I talk more about testability in Being Ignorant with LINQ to SQL.
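
    The first of the two approaches, stubbing the repository itself, might look something like the sketch below. ICustomerRepository and InMemoryCustomerRepository are names assumed for this example, and the Customer class here is a cut-down stand-in for the mapped entity. Because the stub uses LINQ to Objects over a list, tests that depend on it never touch the Db.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the mapped entity (only the fields this sketch needs).
public class Customer
{
    public string CustomerID { get; set; }
    public string City { get; set; }
}

// Hypothetical abstraction over the repository; the name is an assumption.
public interface ICustomerRepository
{
    Customer FindCustomer(string customerId);
}

// In-memory stub used in tests in place of the LINQ to SQL backed repository.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> customers;

    public InMemoryCustomerRepository(IEnumerable<Customer> customers)
    {
        this.customers = customers.ToList();
    }

    public Customer FindCustomer(string customerId)
    {
        // Same query shape as the real repository, but over LINQ to Objects.
        return (from c in customers
                where c.CustomerID == customerId
                select c).Single();
    }
}
```

    Domain code that takes an ICustomerRepository can then be tested against a handful of Customer objects built in the test itself, with the real repository covered separately by a small number of integration tests.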

    Roles in LINQ to SQL

    LINQ to SQL provides an ORM tool for use with .NET. Components within LINQ to SQL map to ORM patterns: LINQ to SQL’s DataContext contains an Identity Map that holds objects already loaded from the database. It also acts as a unit of work: you call the DataContext to submit your changes and it figures out the SQL statements necessary to persist what has changed from the version last loaded. Note that LINQ to SQL optimizes queries that are against the primary key, by returning directly from the map rather than re-querying. For anything else, it cannot know what the returned set will be in advance, so it has to query the database.

    My first piece of advice around LINQ to SQL, based on existing ORM practice, would be: LINQ to SQL is an ORM and may be safely called from within the domain layer. Calls to LINQ to SQL do not need to be placed in the infrastructure layer (sometimes called the data access layer in this role). LINQ to SQL fulfils the role of a Data Mapper within the infrastructure layer already, so this would just be a repetition of the abstraction. However, do consider wrapping interaction with the LINQ to SQL ORM within a Repository, instead of using it throughout the domain, to simplify testing and maintenance.

    Next time we'll talk about dynamic queries.

     

     

    Posted Dec 02 2007, 11:31 PM by Ian Cooper with 16 comment(s)