CodeBetter.Com

Jeremy D. Miller -- The Shade Tree Developer

Under the hood and working with .Net, TDD, Software Design, and Agile Stuff

October 2005 - Posts

  • Balancing Technical Improvements versus New Business Features

    At the last Agile Austin lunch we were talking a little bit about the infrastructure improvements we’d all like to be making, if only there was time.  We talked about improving our continuous integration builds, retrofitting legacy code with more automated tests, and getting rid of VSS.   One of my friends commented that there were so many things he wanted to improve, but there wasn’t enough time.  I feel the same. 

     

    We want to improve our code and architecture for better reliability and improved efficiency.  It’s not that simple though, because we really need to be creating new business features.  The marketplace is competitive and we’ve got to keep ahead or at least abreast of our competition.  Many of our clients are very large corporations that inevitably have unique requirements, and they need them now!  All the same though, my team is going to have to spend some time dealing with configuration management and access to a shared database or we’re going to create a support monster.  It doesn’t create any apparent business value, but we’re jeopardizing our ability to continue making changes to the code if we don’t get it under control.  The point being, we’ve got some technical debt that needs to be paid down because it decreases our efficiency while increasing the risk of any modification to the code – and the code is always changing.

     

    So we’ve got a common dilemma.  How do we make improvements in the existing code to make us more efficient and reliable while delivering new stuff?  Sometimes you really have to stop and clean things up first.  How exactly do you communicate this to senior management?  How in the world do you convince management that it’s a necessary task?  How do you quantify the risk of not doing some sort of infrastructure improvement so management can make an informed decision when weighing it against new functionality? 

     

    In a prior life as an enterprise architect I failed to convince management to create a proposed naming repository for vendors across our manufacturing, inventory, and supply chain systems.  Without that structural improvement the security model continued to be haphazard and gave the wrong people access to very sensitive data.  The problem was pretty obvious to me and my peers, but we didn't manage to communicate the impact of their decision to forgo the new naming repository.

     

    Life’s a lot easier when there is a good trust relationship between the development team and the program management.  If that trust isn’t there you might be forced to make the situation worse by ignoring problems or by doing the right things under the cover of darkness (leading to people burnout).  It seems to be pretty natural for project/program managers to mistrust software developers, maybe because they’ve known developers to waste time on unnecessary “goldplating.”  My experience has been that most project managers really don’t have a very good understanding of the technical situation, and I’ve found this to be every bit as true for project managers who are ex-developers.  We’ve got to be feeding them the right information in a way that they can comprehend and they’ve got to be able to have some sort of faith in what we’re saying.  You really need to guard your own credibility by being honest and at least somewhat accurate with management.  Spending months on a failed attempt to rearchitect some subsystem is going to result in a shorter leash.  You might also inherit your management’s mistrust of your predecessors.  That's a bit unfair, but it's happened so many times to me that I almost expect it.

     

    Sometimes it’s obvious that you’ve got to clean something up or that doing a refactoring first will make the new functionality easier.  If the situation is really clear, I say you just do it.  You really shouldn’t need to ask permission from a project manager in these cases.  It’s simply a matter of you coming up with the best mechanical path to your end goal.  Anything that is big enough to affect velocity probably needs to be justified to a project manager and tracked as a user story.  That doesn’t have to stop the idea in its tracks, but it makes it harder to get it done.

     

    Other times you just have to bide your time and look for opportunities to make the refactoring work as you’re building new functionality in the same area.  You take the attitude that any time you make changes in an area of code, you try to leave it in a better state when you’re finished.  Kind of a more active version of the Hippocratic Oath for code.  You make incremental changes to break coupling, increase cohesion, speed up the build time, and increase the unit test coverage whenever the opportunity is presented.  We spend a lot of time being frustrated because we have many more ideas of how to improve things than we really have time to accomplish.  I say that it’s still important to think and talk about the improvements that you’d like to do.  You’ll never make any kind of improvement if you aren’t creating a vision of some kind, and you never know when an opportunity will present itself.

     

    Of course the real answer is not to allow technical debt to build up in the first place.  You simply cannot cut corners in design, coding, or build management (damnit!) and sustain a rapid pace – not even in the short term.  Slacking off on good development practices will make you slower over time.  Another way to say this in MBA language for the suits is “Opportunity Cost.”  When you write badly factored code or skimp on test or build automation you’re making future endeavors more difficult and costly.  Agile development includes the practice of Refactoring.  I’ve often seen an apt analogy comparing refactoring work to washing the dishes in your kitchen (I think it originates with Ron Jeffries, but don’t quote me on that).  It gets a lot more difficult to cook in the kitchen if you didn’t wash the dishes.  Washing the dishes is a lot easier if you do it right after using them. 

     

    It does help if you can just be omniscient and never make any mistakes ;)  Thank you for listening, I feel better now.

  • The Worst Songs Ever (way off topic)

    Not even remotely software related.  In the past day or two I swear I've heard all of the worst songs ever recorded. 

    • "Super Bowl Shuffle" - Chicago Bears 1985 team.  The only stain on the best single season football team in my memory.  I'm pretty sure we did some kind of act around this at my 6th grade talent show.
    • "Convoy" - Go check out the lyrics if you're brave.  This loser was an actual hit in the 70's.  That, shag carpet, and puke green and brown decor give the 70's my vote as the worst American decade
    • "Escape (The Pina Colada Song)" - Ouch is all I can say.  Cheesiest song of all time.

     

  • Overthrowing the Tyranny of the Shared Database

    My team is making some structural improvements to plug some holes in our B2B messaging services.  The problem we’re starting to run into is a single database structure that is shared by at least a half-dozen applications, administrative applets, and assorted web services.  It’s not just the database structure itself; the stored procedures are also shared somewhat between applications (we think).  There does seem to be a bit of a naming convention to identify the stored procedures by application, but I know that some of the stored procedures are shared by different applications.  It’s to the point where we’re somewhat afraid to make changes to the code or data in one place because of the risk of ripple effects on other pieces of code.  Houston, we have a problem (literally, our main office is in Houston).  Actually, I’d say that we have two major problems to deal with in order to make the family of systems easier and safer to change:

     

    1. Establishing some sort of configuration management of the database
    2. Eliminating code duplication and tight coupling

     

    Configuration Management

     

    So here’s our dilemma – where and how do you store the custom T-SQL code and DDL scripts?  How can you propagate database changes to other copies of the database?  How do you test database changes against the other applications to keep from breaking something else?  How do you know if you’re using the most current version of the database?

     

    As far as we know there isn’t a “master” copy of the database structure somewhere.  To some degree we’ve relied on copying the database with SQL Server Enterprise Manager.  That’s not good enough in the long term.  One of the enablers of Continuous Integration is the “Single Source Point.”  There should be one and only one well known place where you go to get the latest, greatest version of the database structure.  Hopefully, there is also a reliable way to build the master database quickly before you do any serious development against the database structure.  I’ve been burned badly when doing integration to a shared database that had far too many structural differences between development, testing and production versions of the database.  I’m never going through that again.

     

    My colleague is investigating creating a single Subversion repository for only the database structure and code.  It’ll also include a NAnt script to automate the setup of the database structure with a modicum of data.  I think we’re going to unilaterally declare this new repository to be the master copy of the database.  We’re hoping to use the multi-source configuration block in CC.Net to get the database setup and build integrated into all the projects that touch the shared database.  Any change to the shared database should result in an automated build of all the dependent parts.  Of course, we’ve got the little problem of not knowing all of the little one-off tools that rely on the shared database.  We also don’t have any kind of CI or automated testing infrastructure for any of the older projects.  One problem at a time is going to have to be the mantra for a while.

     

    Configuration management of databases is not easy, and I doubt that very many shops do it well.  I’ve got a slide for database configuration management in our Continuous Integration presentation that basically says “Database CI is really important, but I don’t know the best way to do it -- good luck to you.”  You simply cannot shirk the database configuration.  The database is part of the code and really needs to be versioned *with* the rest of the code.  I’ve often seen a drag on project efficiency as we dealt with “false” bugs that were just a result of the database version being out of synch with the middle tier version.  That problem goes away fast when the database is updated and rebuilt with the same automated build that produces the middle tier and user interface build products. 

     

    It’s easy enough to have every single piece of Data Definition Language code run as part of the CI build, but how best to handle large amounts of test data?  You certainly can’t drop and recreate a production database, so how do you deal with making delta scripts to apply new changes to a shared database that is already in production?  These aren’t easy questions, and I really don’t think anybody has a definitive answer.  Pramod Sadalage has a good paper on evolutionary database techniques, but I think the database is one of the remaining frontiers in Agile development.
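
    One common way to approach the delta-script question is to number each change script and record in the database itself which ones have already been applied.  The sketch below only illustrates that idea.  It is written in Python against an in-memory SQLite database rather than the SQL Server and NAnt setup described above, and the `schema_version` table, the `vendor` table, and the three scripts are all invented for the example.

```python
import sqlite3

# Hypothetical delta scripts, numbered in the order they must be applied.
# In a real project these would be files kept in the source repository.
MIGRATIONS = [
    (1, "CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE vendor ADD COLUMN region TEXT"),
    (3, "CREATE INDEX ix_vendor_name ON vendor (name)"),
]


def current_version(conn):
    """Return the highest delta already applied, or 0 for a new database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, migrations=MIGRATIONS):
    """Apply only the deltas newer than the database's recorded version."""
    applied = []
    start = current_version(conn)
    for version, sql in sorted(migrations):
        if version > start:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
            applied.append(version)
    conn.commit()
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(migrate(conn))  # a fresh database gets every delta
    print(migrate(conn))  # a second run finds nothing left to apply
```

    Because the database remembers its own version, the same set of scripts can be pointed at a fresh developer copy or at a copy that is several changes behind, and only the missing deltas are run.  It does nothing for the large test data problem, which is still open.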

     

    Shared Service Yes, Shared Database No

     

    This is a perfect situation to apply Service Oriented Architecture (not sure about it being a web service yet though).  The longer term solution is to get an honest to goodness service layer in front of the database.  Any application, applet, or little administrative tool has to go through the service layer.  As soon as we establish the common service we can greatly reduce the surface area of the code that is intertwined with the database structure.  Add a comprehensive battery of automated FitNesse and unit tests against the new service layer, and we’ll be able to modify the behavior of the system with confidence.  Exposing the business workflow that is represented by the shared database will lower the cost of creating new client functionality because we won’t have to recreate the database access code.

     

     I strongly believe that business logic should be mated to data access code.  Interpreting the business meaning of data is business logic.  SQL “WHERE” clauses often contain meaningful business logic.  Consolidating the business logic that interprets this data and the data access code that reads the shared database will allow us to eliminate a lot of duplication.  Eliminating duplication makes a system easier and safer to maintain and modify.  At a minimum we need to get the administrative tools using the application service layer instead of the end around to the database so we can keep the two pieces synchronized.
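
    To make the point about “WHERE” clauses concrete, here is a minimal sketch of a service class that owns both the query and the business meaning behind it.  It is an illustration in Python with SQLite, not the team's actual service layer; the `MessageService` class, the `message` table, and its columns are hypothetical names chosen for the example.

```python
import sqlite3


class MessageService:
    """Hypothetical service layer: the only code that knows the table layout."""

    def __init__(self, conn):
        self._conn = conn

    def undelivered_messages(self, partner):
        # The WHERE clause is business logic: it defines what "undelivered"
        # means. Keeping it here gives every client the same definition.
        rows = self._conn.execute(
            "SELECT id, body FROM message "
            "WHERE partner = ? AND delivered_at IS NULL AND cancelled = 0 "
            "ORDER BY id",
            (partner,),
        )
        return [{"id": r[0], "body": r[1]} for r in rows]

    def mark_delivered(self, message_id, when):
        self._conn.execute(
            "UPDATE message SET delivered_at = ? WHERE id = ?",
            (when, message_id),
        )
        self._conn.commit()


def build_demo_database():
    """Create an in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message (id INTEGER PRIMARY KEY, partner TEXT, "
        "body TEXT, delivered_at TEXT, cancelled INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO message (partner, body, delivered_at, cancelled) "
        "VALUES (?, ?, ?, ?)",
        [
            ("acme", "order 1", None, 0),
            ("acme", "order 2", "2005-10-01", 0),
            ("acme", "order 3", None, 1),
            ("globex", "order 4", None, 0),
        ],
    )
    conn.commit()
    return conn
```

    If an admin tool and a web service each wrote their own version of that query, a later change to what “undelivered” means (say, excluding cancelled messages) would have to be found and repeated in every copy.  With the service in front, there is one place to change and one place to test.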

     

    How did we get here?

     

    We’re certainly not unique.  I’ve seen several IT shops that were effectively held hostage by a shared database.  Having zero encapsulation around raw database data makes all of the applications sharing the database very strongly coupled.  Changes in one system will ripple into other applications far too frequently.  That’s an avoidable mess. 

     

    The problem is that application integration has historically been difficult.  Many, if not most, applications don’t have a publicly accessible service strategy for easy integration.  Almost every enterprise software developer is comfortable writing low level data access code, so the temptation is always there to just go around an application into its underlying database for integration.  Opening up a database connection and grabbing or writing data is mechanically easy.  Like so many things in life, doing the easy thing is very often the wrong thing.

     

    I’m very frequently skeptical of much of the SOA and web service hype, but there is definitely some solid fundamental thought behind SOA.

     

     

    How are you handling database configuration management?  I’d love to hear suggestions.

  • My Last Rant about Stored Procedure Abuse

    At the request of one of my colleagues --

    "There should be more classes than stored procedures"

            -- Gary Williams

    EDIT 10/28/2005 - Just to make this more specific, let's say there should really be more LOC in C#/Java/VB.Net than in T-SQL or PL/SQL

  • Deconstructing “The Simplest Thing That Could Possibly Work”

    I was reading Steve Eichert’s post on “The Simplest Thing != The Easiest Thing” and it struck a nerve with me.  The old KISS philosophy is formalized in Extreme Programming and its ilk as “The Simplest Thing That Could Possibly Work,” closely followed by “You Aren’t Gonna Need It.”  It’s a great philosophy, but it’s awfully hard to follow when the simplest thing isn’t obvious and the definition of “work” is subject to debate.  Following a literal or simplistic interpretation of these principles can lead to bad things.  I struggle quite a bit with these principles, and expect to for the rest of my career.

     

    Simple isn’t a License to Hack Away


    One of the biggest questions with “The Simplest Thing” is what exactly is simple to you?  I like Martin Fowler’s discussion in “Is Design Dead?”  Kent Beck says a simple system:

    • Runs all the Tests
    • Reveals all the intention
    • No duplication
    • Fewest number of classes or methods

    An anti-pattern that I’ve commonly seen in Agile development is to focus solely on the first bullet point and ignore the rest.  Duplicating code or just writing more code than is necessary is a cause of project inefficiency.  Spotting duplication isn’t always obvious.  It’s pretty easy to spot copy/paste code, but harder to see the patterns that are emerging in your code.  I strongly feel that judicious usage of abstraction and design patterns can drastically reduce the amount of code and make the code more intention-revealing.  My opinion is that copious amounts of “IF/THEN/ELSE” constructs in code are nasty.  It’s easy to write this code (at least for a little while), but not simple to read and extend.
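
    As a small illustration of the “IF/THEN/ELSE” complaint, the sketch below shows the same calculation written twice: once as a chain of branches and once with each rule pulled into its own small object.  The shipping-cost domain, the carrier names, and the rates are all made up for the example, and it is in Python rather than C#.

```python
# Branching version: every new carrier means editing this function.
def shipping_cost_branching(carrier, weight_kg):
    if carrier == "ground":
        return 5.0 + 0.5 * weight_kg
    elif carrier == "air":
        return 12.0 + 1.25 * weight_kg
    elif carrier == "pickup":
        return 0.0
    else:
        raise ValueError(f"unknown carrier: {carrier}")


# Abstraction version: each pricing rule is a small object with one job.
class FlatPlusWeight:
    def __init__(self, base, per_kg):
        self.base = base
        self.per_kg = per_kg

    def cost(self, weight_kg):
        return self.base + self.per_kg * weight_kg


class Free:
    def cost(self, weight_kg):
        return 0.0


# The table of carriers states the intent directly: which rule applies where.
CARRIERS = {
    "ground": FlatPlusWeight(5.0, 0.5),
    "air": FlatPlusWeight(12.0, 1.25),
    "pickup": Free(),
}


def shipping_cost(carrier, weight_kg):
    rule = CARRIERS.get(carrier)
    if rule is None:
        raise ValueError(f"unknown carrier: {carrier}")
    return rule.cost(weight_kg)
```

    With three carriers the branching version is arguably fine.  The second form starts to pay off when the same `if carrier == ...` chain begins showing up in other functions, which is exactly the kind of emerging duplication that is easy to miss.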

     

    One attitude is to very purposely write code as crudely as possible.  Only after a pattern is obvious do you make any kind of refactoring to introduce an abstraction to eliminate rework.  The thinking is that any kind of premature abstraction is more harmful than writing a lot of simple code.  I think that this ends up being inefficient because you end up writing more code than you need to.  Introducing an abstraction late leads to rework that could have been avoided.  Something to keep in mind is that there really isn’t any such thing as a large refactoring task.  That’s just redesign or rework.  I say you should watch your code for cases where an abstraction can help you and apply one as soon as you’re sure that it’s necessary.  I guess it’s just a judgment call, but I tend to be more aggressive than some.  The automated refactoring tools are changing this equation by making it simpler mechanically to create abstract superclasses or interfaces after the fact and making it cheaper to do continuous design.

     

    In case you haven’t caught this, I hate it when people equate simple code with writing butt ugly code that doesn’t use anything more complicated than “IF/THEN” constructs.

     

    What do you think is Simple?

     

    An issue that every developer will run into during their career is that a technique or trick that’s simple to you is incomprehensible to your peers, or vice versa.  Case in point, I don’t have any significant background in Perl or Unix/Linux command line utilities like grep.  I never think to reach for regular expressions to solve string problems because they seem like black magic to me.  I’ll go a longer way around to solve the problem.

     

    I’ve worked with plenty of developers that were either ignorant of design patterns or just extremely hesitant to use any kind of OO abstraction.  On a project last year I used a Composite* pattern to solve a related series of stories.  I structured the database tables so I could easily use a built in Oracle function for retrieving the “n-level” object tree by searching for any leaf object.  I thought the solution was fairly elegant and certainly didn’t require very much code.  Another developer coming off a vacation threw a fit over the design and refused to have any part of it.  He even went so far as to say we should recode the whole thing, even if it meant writing more code.  In the end it turned out that a lot of his objection was because he had no familiarity with the Composite pattern and not much comfort level with any kind of database coding.

     

    I frequently criticize what I consider to be foolish or wrong-headed usages of stored procedures for business logic.  On the other hand, when you need some sort of complicated data correlation I’d much rather use the database engine that’s been optimized for 20+ years to do just that than try to write my own code (LINQ might change this equation).  I’ve seen developers write several hundred lines of procedural code in a middle tier language to avoid writing a dozen lines of stored procedure code. 

     

    Having a limited box of tricks, be they language idioms, technologies, or design patterns is a lot like my tennis game.  My backhand swing is decidedly lacking, so I’ll try to run farther just to try to hit a forehand instead.  Hitting the forehand is the “simplest” solution for me, but if my backhand didn’t suck I could make the shots with less work.  It’s obviously not possible to know everything, but it’s an awfully good idea to not be too specialized.

     

    The question I’ve wrestled with over the years is whether or not you should change your solution to accommodate other developers’ comfort and knowledge level when it means writing more code.  I think the answer is the dreaded “It Depends.”  Personally, I’m usually pretty happy to have another developer show me an easier solution to a task.

     

    What exactly does “Work” mean?

     

    Here are three examples from an XP project last year where I wrestled with the “Simplest Thing” philosophy.  I definitely think I was right in one of these cases, wrong in another, and I’m still not sure about the third.  I’ll leave it to you to decide which one is which.  The project team had a couple of vociferous XP zealots that cared a great deal about doing things the XP way – so I was bludgeoned with XP rhetoric and theory every single day.

     

    I had a requirement in the system for a pessimistic offline lock on a business entity in a workflow system to prevent one user from editing an issue that was being addressed by another user.  One of the technical issues with this pattern for concurrency checking is what to do when the locks aren’t released in a timely fashion.  In this case a user could lock the issue and then walk out the door to go on a week long vacation.  When I suggested that we build some sort of mechanism to release aged locks, the PM and analysts cried YAGNI and told me that the customer hadn’t asked for that functionality.  I felt that you most certainly did need some mechanism to release the lock.  Yes, the pessimistic locking code would have passed all of the acceptance tests, but in production it could have easily caused problems.
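
    For readers who haven't built one, here is a rough sketch of a pessimistic offline lock that treats locks older than a maximum age as released.  It is not the code from that project: it is Python, it keeps locks in memory where a real system would store them in the database, and `OfflineLockManager`, `LockHeldError`, and the injected `clock` are hypothetical names for the example.

```python
import time


class LockHeldError(Exception):
    """Raised when another user holds an unexpired lock."""


class OfflineLockManager:
    def __init__(self, max_age_seconds, clock=time.time):
        self._max_age = max_age_seconds
        self._clock = clock   # injectable so tests can control time
        self._locks = {}      # entity id -> (owner, acquired_at)

    def _is_expired(self, acquired_at):
        return self._clock() - acquired_at >= self._max_age

    def acquire(self, entity_id, user):
        held = self._locks.get(entity_id)
        if held is not None:
            owner, acquired_at = held
            if owner != user and not self._is_expired(acquired_at):
                raise LockHeldError(f"{entity_id} is locked by {owner}")
        # Free, expired, or already ours: take (or refresh) the lock.
        self._locks[entity_id] = (user, self._clock())

    def release(self, entity_id, user):
        held = self._locks.get(entity_id)
        if held is not None and held[0] == user:
            del self._locks[entity_id]

    def owner(self, entity_id):
        """Return the current lock owner, or None if free or expired."""
        held = self._locks.get(entity_id)
        if held is None or self._is_expired(held[1]):
            return None
        return held[0]
```

    The expiry check is a handful of lines.  Without it, the only way to free an issue locked by someone on vacation is for a person with database access to go delete the lock by hand.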

     

    Early on we worked under the theory that we would just pull user roles and role memberships out of Active Directory.  We’re always told that AD is more efficient for reads and it seemed silly to duplicate the user information and administrative screens somewhere else.  The code did work, but it was causing developer pain and environmental dependencies on AD that were inconvenient.  At some point we finally bit the bullet and created a little background task to write the role membership information to a database table where it was easier to deal with.  In an interview once I was challenged on why I didn’t use AD instead of the database for user permissions on a project.  I responded with something like “Yeah it might be faster, but every developer understands SQL.  How many developers know how to query AD?”  By the end of our AD adventure, I was right back into the “just use a database” camp.  I’m not sure I really could have known upfront that using a database would be simpler than AD turned out to be, but I bet I could have guessed.

     

    The users used CRUD screens to edit a business entity in a WinForms client that called web services to do the actual persistence.  I made the statement that we needed to perform validation on the server side.  An analyst jumped down my throat screaming that the user experience would suffer if we did validation on the server.  A couple days later I managed to get a few words in edgewise to explain that the validation could and probably should be done on the client also, but that we absolutely had to do the validation at the server.  The web services were only going to be used by that particular WinForms client, but they were publicly exposed web services throughout the LAN.  Maybe you call YAGNI on that, but I still think that you shouldn’t leave web services like those exposed without validation and security authorization logic.  In our case we were able to do the validation both client side and server side with shared code, so no big deal.  The simplest thing to do would have been to make the web services dumb, but does that really work?  See why I don’t think following the “Simplest Thing…” is all that simple?
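
    The “shared code” arrangement can be sketched in a few lines.  This is an illustration in Python, not the WinForms and web service code from the project; the `validate_issue` rules and the `client_save` and `server_save` functions are invented for the example.

```python
def validate_issue(issue):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    if not (issue.get("title") or "").strip():
        errors.append("title is required")
    if issue.get("priority") not in (1, 2, 3):
        errors.append("priority must be 1, 2, or 3")
    return errors


def client_save(issue, send):
    """Client side: validate first so the user gets immediate feedback."""
    errors = validate_issue(issue)
    if errors:
        return errors
    return send(issue)


def server_save(issue, store):
    """Server side: validate again, because not every caller is our client."""
    errors = validate_issue(issue)
    if errors:
        return errors
    store.append(issue)
    return []
```

    The client never makes a round trip for bad input, so the user experience objection goes away, and the server still rejects the same bad input from any other caller on the network.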

     

    I think the key is to focus just as much on what “work” means as we spend on “simple.”  It’s easy to get tunnel vision around just making the tests pass.  The issue for me is that the code can’t necessarily be said to “work” just because it passed all of the tests that you thought to write.  You always have to be thinking about non-functional requirements and the impact of your code.  The requirements are typically defined by the customer or requirements analysts, but it’s our job as developers and architects to keep the rest of the team aware of the technical and production ramifications of the requirements.  Things like supportability, maintainability, and scalability are our responsibility.  The customer isn’t going to be aware of these issues.  It might be that the answer is just to make sure that the production support folks are included in the projects as “customers.” 

     

     

    The Hippy Corollary

     

    I don’t know the exact origin of this one, but I had a chance to pair a bit last year with a very experienced XP developer named “Hippy.”  He told me that you mentally add another clause to make the rule “the simplest thing that could possibly work - that you can stand.”  It’s one thing to speculate on what you’ll need later and add things or hooks to the code that turn out to be wasted effort.  It’s another thing to create code that you know will cause problems later.  I’m starting a little user story this afternoon that might be most easily coded by adding an “after update” trigger to a database table.  Technically I’d say that that’s the “Simplest Thing” to do (at least mechanically), but it feels so wrong that I’m not going to do that.  I know from experience that database triggers are hidden maintenance traps for later developers.  Whatever path I do take won’t be the simplest thing, but it won’t keep me up at night feeling bad about the code.  I like the way Bob Koss put it in Refrigerator Code.  If you wouldn’t want another developer to see your code then the code is wrong.  The most effective judgment about the quality of your code may be how easy it is for the next guy to work with your code.

     

     

    * The “Composite” pattern is a bit complex, but it occurs so often in software development that you almost have to be familiar with it.  The Xml DOM model, the DHTML DOM model, the file system, any TreeView control, and WinForms controls come to mind right off the bat.  You might not build your own composite class structure very often, but when you do need it, the composite can make otherwise difficult coding simpler.  Either way, you will have to work with a composite pattern of some kind.
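
    For anyone who hasn't met it, a minimal Composite looks something like the sketch below, using the file system case mentioned above.  The `Leaf` and `Folder` classes are hypothetical names for the example, and this is Python rather than the C# and Oracle solution from the project.

```python
class Leaf:
    """A single item, such as a file."""

    def __init__(self, name, size):
        self.name = name
        self.size = size

    def total_size(self):
        return self.size

    def find(self, name):
        return [self] if self.name == name else []


class Folder:
    """A container that answers the same questions by asking its children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def total_size(self):
        return sum(child.total_size() for child in self.children)

    def find(self, name):
        found = [self] if self.name == name else []
        for child in self.children:
            found.extend(child.find(name))
        return found


if __name__ == "__main__":
    tree = Folder("root", [
        Leaf("a.txt", 10),
        Folder("src", [Leaf("b.py", 5), Leaf("a.txt", 7)]),
    ])
    print(tree.total_size())
    print([node.size for node in tree.find("a.txt")])
```

    The calling code never checks whether it is holding a leaf or a folder.  It asks the same question of either one, and the recursion through any number of levels falls out of the structure.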

  • Haacked on TDD and Jeremy's First Rule of TDD

    Unit Testing Loves Beta Testing And Vice Versa

    Phil Haack has a great post up in response to some folks unhappy with their unit testing experience.  I think he makes several good points, including:

    1. Unit testing is only one layer of testing (duh)
    2. Automated unit tests are much, much simpler to write when you're writing the tests upfront and purposely creating testability

    As many people in the blog trail are saying, there are some types of code that are just too difficult for automatic tests.  User interface code is often an example, both heavy clients and web applications.  Testing the actual presentation code is difficult (but certainly not impossible).  Before you throw the baby out with the bath water and abandon TDD altogether for UI projects to rush in new features, consider this:

    Jeremy's First Rule of TDD - "Isolate the ugly stuff"

    Take the things that are truly difficult to test automatically and wrap them behind abstracted interfaces and separate any other code and functionality away from the difficult to test code.  Typically I think this would be access to external systems, user interface windowing, or anything involving a web server.  A picture perfect vision of my Grandmother boiling turkey bones after Thanksgiving comes to mind.  The bones aren't very edible, but you can do something to get the rest of the meat off the bones for endless post-Thanksgiving sandwiches. 

    Circling back to user interfaces, the Model View Presenter pattern is a perfect example.  Actual user presentation code (System.Web.UI and WinForms) is harder to test, so make that code smaller.  Slice the "ugly" View classes as thin as possible.  Move the user interface flow logic into "plain old object" classes (Presenter) where it's much easier to test.  It probably is more code, but it'll make the UI code much easier to test and therefore easier to make work.  Since UI flow control code can become very complex and scary to change, it's awfully nice to be backed up with the automated unit tests.  You might decide to forgo unit testing the actual UI view code and call it a tolerable risk, but at least the controller logic will be unit tested.
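
    Here is a rough sketch of that split, using a login screen as the example.  It is Python, not WinForms or ASP.NET code, and `LoginPresenter`, `FakeLoginView`, and the `authenticate` callback are hypothetical names chosen for the illustration.  The real view would be a form that implements the same four methods and does nothing else.

```python
class LoginPresenter:
    """Screen flow logic as a plain object; knows nothing about widgets."""

    def __init__(self, view, authenticate):
        self._view = view
        self._authenticate = authenticate  # callable(user, password) -> bool

    def submit(self):
        user = self._view.get_username().strip()
        password = self._view.get_password()
        if not user or not password:
            self._view.show_error("User name and password are required.")
        elif self._authenticate(user, password):
            self._view.close()
        else:
            self._view.show_error("Invalid user name or password.")


class FakeLoginView:
    """Stand-in for the real form; records what the presenter asked for."""

    def __init__(self, username, password):
        self._username = username
        self._password = password
        self.errors = []
        self.closed = False

    def get_username(self):
        return self._username

    def get_password(self):
        return self._password

    def show_error(self, message):
        self.errors.append(message)

    def close(self):
        self.closed = True
```

    A unit test builds a `FakeLoginView`, calls `submit()`, and asserts on `errors` and `closed`.  No window ever opens, and the only untested code left is the thin view that copies values in and out of controls.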

    The other example that comes to mind is a system I did last year that had some custom Active Directory queries in the authorization code.  We punted pretty early on trying to write automated unit tests for the actual Active Directory access code.  Instead, we put the AD access code into a thin Gateway class and interface so we could test the rest of the authorization logic without stepping into the Active Directory queries.  The only place that had any actual interaction with AD was the Gateway class itself. 
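
    The Gateway arrangement might look something like the following.  This is a sketch in Python under invented names (`DirectoryGateway`, `Authorizer`, `StubDirectoryGateway`), not the original .NET code, and the production gateway that would actually query Active Directory is deliberately left out.

```python
class DirectoryGateway:
    """Interface for the hard-to-test part: looking up group membership."""

    def groups_for(self, username):
        raise NotImplementedError


class Authorizer:
    """Authorization rules; testable because the directory is behind a gateway."""

    def __init__(self, gateway, required_groups):
        self._gateway = gateway
        self._required = required_groups  # action name -> required group

    def can(self, username, action):
        group = self._required.get(action)
        if group is None:
            return False  # unknown actions are denied
        return group in self._gateway.groups_for(username)


class StubDirectoryGateway(DirectoryGateway):
    """Test double that answers from a dictionary instead of a directory."""

    def __init__(self, memberships):
        self._memberships = memberships

    def groups_for(self, username):
        return self._memberships.get(username, set())
```

    Every rule in `Authorizer` can be unit tested with the stub.  The real gateway class stays small enough that a few manual or integration checks against the directory are a tolerable risk.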

    I would say that a lot of doing TDD is learning how to divide the class responsibilities up in a way to make the code easier to test (3 years so far and counting).  The fact that you often write code differently just for easy unit testing doesn't bother me a whole lot because it contributes to finishing the code sooner.  You can't just write code without regard to testability and expect to be able to retrofit unit tests later.  I made the classic newbie mistake of trying to just write the code, then write the unit tests later, only to find that my code didn't allow for easy unit tests.  TDD goes a lot smoother as soon as you realize that you need to change the way you build and structure code to take advantage of TDD.  Learning to write the unit tests first helps a tremendous amount too.  I think the book on best TDD practices is still being written, but there's a lot of experience out there already.  Before you throw your hands up in frustration over TDD, take a look at how other people are writing code for testability. 

    P.S. - I don't know about Swing, but you can efficiently TDD a great deal of WinForms client code.  Maybe not everything is worth an automated unit test, but anything you can do helps.  Even without using something like NUnitForms a lot of WinForms classes can be treated as just "POO" and tested in NUnit/MbUnit without too much hassle.

  • Universal Truths in Software Design, but We’re All Different

    There’s been a lot of blog traffic lately about the differences and merits of both upfront and evolutionary design.  There are a large number of seemingly contradictory design philosophies, techniques and tools out there.  You have to assume that most of these techniques have worked in some situation for somebody or they’d never have been published in the first place.  I think that there aren’t very many absolutes in techniques for software design because the effectiveness of any technique is governed by the project, technology, and learning style of the designer.  The few absolutes that I’d list are so general that they’re not really useful in crafting a specific approach to design.  It’s probably worth reminding ourselves that there is still no Silver Bullet technology or methodology that magically solves every design problem.  Here’s what I think though:

     

    • Design incrementally.  Eat the elephant one bite at a time.  This isn’t to say that thinking ahead in the design is a bad thing.  Like many people I think that it’s best to design in detail and code in smaller chunks.
    • Challenge the coding and design every day.  Complacency is evil.  Pay attention to what you’re coding.  Recognize a design that isn’t working and spot opportunities to improve the design as you work.
    • Talk about the design with your coworkers.  Socializing a design is a major part of doing design.
    • Reflect on your design during and after the project
    • If something about your design feels wrong it probably is (“Let go your conscious self and listen to your feelings”)
    • The more design tools, knowledge, and approaches that you’re familiar with the more successful you’ll be.  I think many Agilists hamstring themselves by unnecessarily dismissing any technique from outside the Agile canon.  I routinely supplement TDD with UML modeling or Responsibility Driven Design concepts (but I always feel silly when I try to use CRC cards).  Every so often I even draw up a database structure first as an exploratory activity. 
    • Architecture, design, and coding are too intrinsically intertwined to be done by separate people or teams (non-coding architects + spec coders = bad results).  You can’t design what you can’t code
    • Every developer is a more effective coder with design knowledge.  “Spec” coding sucks.  Coders that design too can make the small adjustments that lead to good systems.  A coder with no design skills just muddles through or perfectly follows the designs created by the all too human architect.  I don’t know if the 10-1 productivity ratio between good developers and bad developers is true, but I agree with Joel that bad developers just don’t produce good code.

     

     

    How and When I do Design Might Not Work for You

     

    I promise to never ever write a book about my personal design methodology because it won’t work as well for you unless you have a nearly identical learning style and personality type as me.  I’m a big fan of “UmlAsSketch” at a whiteboard because my learning style is clearly visual and kinesthetic (explaining my predilection for arm-waving in front of a whiteboard).  The act of drawing UML diagrams on a whiteboard or using CRC-like notation is really just a concentration aid for me, much like school children learning to count on their fingers. 

     

    I also like to keep a design notebook as a scratchpad for design ideas.  It’s an interesting exercise to go back through older notes at the end of a project or even years later to see how your design (thinking) has evolved.  It also leads to quite a few “what was I thinking?” moments.  I can easily trace my evolution from a database-centric VB6 guy to an OO-centric with TDD .Net guy.  It’s also been a way to recycle design ideas later.  I’ve managed to use concepts that were discarded on one project successfully on another.

     

    I’ve learned the hard way that many perfectly capable and productive developers simply don’t work the same way I do.  I’ve been around strong developers that cannot derive any meaning from even the simplest UML drawing.  It might as well be hieroglyphics to them, even when you’re talking about code that they’re familiar with.  On the other hand, I usually can’t just sit down in front of a computer screen and start banging out code.  I’m usually much faster when I do a little bit of notepad sketching or even just make a list of things to do for my coding task.  I’ve been around other developers who do best when they’re looking at code, so they want to jump right into the IDE, write some code, and then start figuring out which way to go.  Not all of their code is bad; they just have a different way to get started.  Pair programming brings out another contrast in personal style.  Extroverted people are energized and think better when they are talking things through with someone else.  I’m the exact (introverted) opposite, and it makes pairing difficult for me at times.  I like to keep a notepad near me when I’m pairing to jot down notes or share off-the-cuff drawings with my pairing partner.

     

    The true masters of TDD can sit right down and quickly write that first unit test to get them going.  That first unit test on a task is usually the hardest and it’s what I see people struggling with.  I have to go to some other activity first to break the ice.  One thing I tell TDD newcomers is to just make a list of unit tests you think you’ll have to write, and start with the easiest unit test first.

     

    Another huge difference between developers is how well they can visualize and think through a problem in advance.  Some folks can easily handle and create abstract mental models.  Other people need to keep things more concrete.  My mechanical engineering background actually helps me with abstractions because engineering is often an exercise in creating a simplified model of reality for easier calculation.  Pattern matching is a huge variance across developers.  Some developers can anticipate a repeated function in the code and create an abstraction or reusable class collaboration earlier.  Other people need to see the pattern in code a couple of times first.

     

    My only real point here, assuming I have one, is that because we’re all different in our learning styles it’s worth your time to investigate multiple techniques (and never assume that Joe Bob's nirvana design technique will work for you).  If you’re aware of your own learning style you can fine tune your personal approach to play to your strengths.  It’s also smart to be cognizant of your coworkers’ learning and communication style so that you can adjust to them.  It’s absolutely useless to draw UML models around folks that aren’t visual.  That was a painful lesson I learned the last time I was a technical lead.

     

     

     

  • Repressed Memory

    Josh Flanagan (very smart guy and good basketball player who should start updating his blog again) took me to task a bit for my last two posts.  Yesterday I slipped and made a bit of a repeat post about one of those gnarly this calls this which calls that situations.  It jogged loose a memory this morning about an equally horrible mess I created on a project several years ago (which according to karma was rewritten by Josh this year).

    Phase 2 of the project called for my newly constructed system to integrate to some B2B infrastructure via a pair of MQSeries queues.  I had assumed all along that I would use a Java Stored Procedure inside Oracle 8i to access the MQSeries code with JMS.  The code to do this was trivial and took about 15 minutes to do.  Easy money.  Then somebody thought to try the code inside the Oracle database and found out really quick that it wouldn't run inside that environment.  Some later research suggested that it would work as long as we configured the database engine differently.  Since the system was in production we decided not to make the configuration change.

    The obvious answer is to just write a Windows service that polls the MQSeries queue for incoming messages and relays them to the system.  My problem was that we were all VB6 and PL/SQL developers without enough C++ experience, so nobody really had the skillset to write a custom service.  Plus I had an unnatural fear of custom Windows services from my support days.  The next best, and probably easiest, approach would have been just to create a batch program and have it kicked off by our existing scheduling infrastructure.  I shied away from this too because the infrastructure team was stretched way too thin to support us in the timeframe we had available and I really wanted to avoid interacting with this infrastructure.  Again, because we were all VB6 coders, we struggled with the MQSeries access (at the time it was several dozen Win32 calls for a single read).  We had problems with the way the infrastructure guys deployed MQSeries.  Because MQSeries wasn't very VB friendly, we had to use a component that our central architects had written for standard access to MQSeries from middle tier code.  That component had plenty of its own problems and could only function on the same box as the MQSeries queue.  As I recall, the resulting architecture looked something like this:

    • PL/SQL stored procedure running at intervals from the internal Oracle scheduler doing an
    • HTTP GET request to an ASP page hosted from the MQSeries machine that,
    • Called a VB6 DLL that called into MQSeries, pulled out the XML messages, and pushed them right back to the very same database for processing

    I know that there was another link (HTTP POST?) in the chain somewhere, but I don't remember what it was.  It mostly worked and stayed in production in some form or another for at least a couple years until we started using webMethods A2A for communication.  You can probably make the correct guess that it was fragile and caused no small amount of production support problems.

    I learned a couple of hard lessons from the experience.

    • That whole "keep it simple, stupid" thing (work in progress)
    • Watch out for technical risks.  One of the things that scares me about XP/Scrum iteration planning is scheduling stories strictly by business value instead of technical risk.  I think there's a little room to negotiate scheduling for cases like this as long as the team, the PM, and the customer trust each other.
    • Challenge your assumptions early.  Earlier prototyping or spiking would have exposed or eliminated the problem.
    • Be very cautious about new technology that you have not used before.  Understand a technology first instead of just copying example code from a book somewhere.
    • Either avoid technology that is completely new to your team or organization when there's an existing alternative, or prepare to invest some time in familiarizing the team with the technology
    • Don't design what you can't code.
    • Don't be afraid to ask for outside help.  I was very uncomfortable in asking for external help on that project.  Partly because I knew the organizational sluggishness wouldn't give us the help we needed in a timely manner anyway, and partly because I was afraid the project would be taken away from me.  In a large organization you have to do more looking ahead to identify where you'll need help from shared functional teams (DBAs, integration teams, etc.) because they're always overbooked.  You simply can't just walk up to them and get something done right then and there.  We also had a bit of an issue where a much larger, higher priority project was hogging all of the shared resources.  Since they were sleeping at the office in cots in the training room, I can't complain too much about that one.
    • When your business model derives much of its profitability from an efficient supply chain, don't hand a mission critical supply chain automation project to an inexperienced development team with a green technical lead (me).

    Of course, after this was all over an Oracle DBA told me that the configuration changes to the database server would have been no big deal.  If I'd been able to do it my original way the chain of components would have been much, much simpler.

  • Really, really tight coupling

    It's time for another episode in the continuing saga of "doing stupid things in stored procedures." 

    My colleague is looking at some code that we need to modify with a temporary fix.  The legacy system calls into a stored procedure which then takes the arguments coming in and passes them to a COM object (via DCOM) to send an email (it also decides whether to send the email).  The COM object promptly opens up a database connection to go back to the very same database to get more information in order to create an email.  And of course there's the issue that the stored procedure is called from middle tier code to begin with.  WTF.  There are so many things wrong with this scenario that I don't even know where to begin.

     

  • My Strawman Can Beat Up Your Strawman

    There's a good discussion going on right now on the Yahoo XP board about Jim Shore's post on design

     

    The pro-Agile and anti-Agile arguments can get pretty silly because they’re mostly using very stupid implementations of either XP or heavy processes as the examples.  Just like politics I'd have to imagine that most of us live somewhere in the middle.  On one end you have the argument that Agile development is nothing but cowboy coding that doesn’t have *any* design or architecture activity.  After all, there’s no line item on an Agile project schedule that explicitly says “do design” or “publish the finalized Software Architecture Document.”  I’ll freely admit that developers often call their process “Agile” as cover for a “code’n fix” method.  We’re trying to hire a QA analyst for my team and most of the candidates (suck) make faces when they hear that we’re an Agile shop because their experiences have been negative. 

     

    There’s also a great deal of reasonable doubt that refactoring is cost effective.  My view is a bit mixed.  I think that refactoring is smart because you're acting upon real knowledge from the code.  On the other hand I think you better be aggressively keeping up with the refactoring or you might just be building up a bunch of crud code.  I think there's a healthy mix of upfront and reflective design that's probably a bit different for every situation.  Experienced Agilists will do “Design All The Time.”  They learn as they go and apply these lessons.  They opportunistically create abstractions to eliminate duplication only after they understand where the abstractions should be.  They constantly use refactoring to make small adjustments as they go.  They do not allow technical debt to build up.

     

    The other extreme is the “Big Design Upfront” stereotype that is the boogeyman Agile advocates use to scare people away from traditional processes.  BDUF is described by Agilists as an attempt to specify all of the design to an excruciating detail before allowing any coding.  The coding is then a mindless implementation of this big bang design with no room for adaptation.  The fear is that the design will be unnecessarily complicated and bad designs can’t be corrected midstream because developers don’t have the power to challenge the design or change the design specifications.  There’s also the fear that BDUF leads to teams that will be complacent towards the workability of the design and the code.

     

    It’s an awfully good thing that almost nobody really does BDUF.  Some pointy hair types might think that their development teams are doing BDUF, but under the covers the technical team is working incrementally and adapting as they learn because that’s the only way to succeed.  I really don't think many teams will keep walking down an obvious path to failure, unless management makes them.  As one of my colleagues likes to say, “Agile is just telling ourselves the truth” and "waterfall is just a reporting tool for management."  I’ve been officially responsible for designing software for about five years now.  I put a lot more effort into designing for testability and less on big pluggable frameworks than I used to, but otherwise I’d say that my approach to design isn’t that much different today inside a Scrum process than it was 5 years ago at an MSF waterfall shop.  I still do a little bit of design before a little bit of coding and constantly keep notes on longer term design strategies.  The only real difference is that now my design activities don’t clash with the officially prescribed process.  Arguably I get to spend more energy on design today in an Agile process because I’m not wasting as much time complying with non-value added process activities.

     

    So What Are You Afraid Of?

     

    I think that the arguments between continuous design and traditional upfront design are largely a waste of oxygen because we’re all talking past one another.  I think our assumptions and beliefs about software development are fundamentally different.  The differences between advocates of heavy, prescriptive design processes and advocates of continuous design might be partially explained by our fears and assumptions. 

     

    Predictive Waterfall Mentality

    • Coding rework is expensive.  It’s dangerous and labor intensive to modify existing code. 
    • Upfront design is necessary and efficient because it eliminates extensive rework.  The design should address all the requirements before coding.
    • The best way to mitigate risks is to create and document the design approach early.  If we can adequately nail down the requirements and design early, the rest of the project is just mechanics.
    • Coordinating a team requires documentation.
    • As soon as the analysis and design phases are over, scope and design changes should be minimized.  Requirements change after the initial phases will jeopardize the project.
    • Developers can’t be trusted and need structure and process to do things well.

     

    Agile Mentality

    • Simplicity in design is important.  Overly complex designs lead to coding inefficiency and delay the delivery of code.  The requirements of the project are assumed to be changing, so don’t spend any effort that does not directly relate to the immediate requirements within a project iteration.
    • Heavy design before coding leads to unnecessarily complex designs.
    • Technical and project risks can only be mitigated by proving a design in working code and getting customer feedback.
    • It’s more efficient to get and apply feedback from coding to evolve a design than predictive design.  Predictive design is too difficult.
    • Technical work is best coordinated by collaboration and communication
    • Scope change is inevitable.  Maximize your ability to handle change by working iteratively and incrementally.

     

    You could just boil the Agile vs. everything else argument down to their particular views on the “Cost of Change Curve.”  Personally, I do believe that the change curve can be flattened through a rigorous combination of Test Driven Development, Continuous Integration, and an investment in testing automation in general.  Part of that is my experience that TDD leads you to create code that is high quality in terms of cohesion and coupling.  Well written code with good architectural qualities can be modified much more easily than code that is poorly structured with low cohesion and tight coupling. 

     

    For enterprise systems it's basically a guarantee that the system will always be changed after the initial project is finished.  You should probably build a system that can adapt to changes anyway.  There are two ways to go about that.  You can speculatively build in lots of plugin spots and create metadata driven code (been there, got the tee shirt).  Some SOA enthusiasts/architects force application teams to expose web services on systems just in case they're valuable later.  I think this is a stupid waste of resources that doesn't result in any immediate business value.  The better approach in my mind is to keep the application code limber by rigorously adhering to good coding practices (separation of concerns, high cohesion, low coupling, blah, blah, blah) and back it up with TDD and CI.  If you need to expose a web service later, you'll be able to do it -- when and exactly how you actually need to do it.

     

    The choice of development platform and tools will greatly impact your view of the change curve as well.  I think that it’s no coincidence that the Agile movement is originally a product of the Smalltalk community, a language and environment apparently well-suited to evolutionary development.  If I were still developing server side code in VB6 with all the COM versioning and deployment baggage I’d still believe more in heavier upfront design.  .Net is much more forgiving than COM for evolutionary development and we’ve got tools like TestDriven.Net, CruiseControl.Net, and NUnit to give us more rapid feedback.  You also cannot discount the advantages of an automated refactoring tool like ReSharper.  Being able to safely do refactorings quickly takes a great deal of the risk and overhead out of evolutionary approaches.

     

    I’ve also observed a marked difference in attitudes toward the technical staff between Agile and non-Agile development organizations.  Agile shops make an implicit assumption that the development team as a whole should be capable of thinking while doing.  Agile processes explicitly and correctly call out the importance of having talented individuals on a project team.  I’ve seen a lot of criticism of Agile based on the dependence on strong developers that have a very good design sense.  There is certainly some truth to that.  On the other hand, have you ever seen a big upfront specification from an architect or technical lead with weak design skills?  I have and it’s not pretty (think of a UML class diagram with one box that has one method called “Execute” and you’re not very far off).

     

    In my waterfall days my management truly believed that all you needed was one or two strong people who could create all of the direction for the rest of the developers.  Preferably the direction would be created and formalized in specification documents so that the stronger developers could be quickly allocated to another project.  There seems to be a very strong desire to treat developers like a commodity (JustAProgrammer).  I generally assign nefarious thinking to management in this case (outsourcing, low pay, interchangeable parts, keep the rabble down), but it’s probably an unfortunate reality for many big IT shops.  I certainly feel that Agile processes appeal to stronger developers and it’s easier to hire and retain strong developers in a good Agile shop.  It’s also impossible for a bad developer to hide in an Agile team.

  • Joel On Software Discovers Agile Planning

    Joel Spolsky needs no particular introduction.  He's effectively "the" blogger on software development and definitely one of my inspirations for doing the Shade Tree Developer.  He's also occasionally full of crap, especially when he's making uninformed pronouncements on the effectiveness of agile processes (the appallingly ignorant "BDUF is good*" post comes to mind) or the superiority of shrink-wrapped developers over anybody else.  The amusing post in the office today is http://www.joelonsoftware.com/articles/SetYourPriorities.html where Joel describes their release planning activities.  Their process sounds fine to me, and it should, because it's pretty close to how we do release planning in our Scrum process.  The implication that they have a separate testing phase at the end of coding (waterfall testing is even dumber than waterfall coding) is a little scary though.

     

     

    * Like many others, my criticism isn't with most of his contentions in that post.  If he thinks his high level requirements document helped him then it did, but that's not BDUF by any means.  My criticism is that he clearly misunderstands the meaning of the term BDUF (Big Design Upfront).  All Joel describes with his spec is the equivalent of an Agile story backlog.  BDUF is the stereotypical "let's create tons of detailed UML diagrams before any coding is allowed because coding is hard and we can make perfect UML."  We thought that our user stories were easily as detailed as his spec, and that's really not saying much.  Jon Tirsen had a good take on the blogstorm over Joel's BDUF post a couple months back.

    There's also the issue that BDUF is mostly a straw man argument against people doing really stupid versions of a waterfall.  Almost nobody does a pure waterfall.  Arguably one of the biggest problems with a waterfall is that your reality doesn't really match up with what the waterfall schedule says you're doing.

  • I can't believe I had to log a bug for this

    In the course of rewriting a new application to do *exactly* what the old system does I've hit an odd bug today.  We're running messages through both the old and new systems to compare the results.  We had a message go through that had a date of 1/1/1899.  The old system let it through but the new system treated the date as a null, causing a validation message that rejected the message.  Yes it's a bug but come on.  We apparently don't handle junk input the same way.  I logged the bug this afternoon with the description "<System XYZ> doesn't party like it's 1899."

    Sigh.

  • Unit Testing Business Logic without Tripping Over the Database

    A fairly common topic with TDD practitioners, both newbie and experienced, is how the heck to unit test business logic with a database hanging around.  I’ve had several conversations lately about this so I thought I’d write a post about it.  Before I try to talk about some strategies for this, here are a couple good reasons to avoid data access calls inside unit tests for the business logic:

     

    • Tests with database calls will execute significantly slower than tests that stay within an AppDomain.  Fast feedback cycles are an absolute necessity for effective TDD.  Slow tests directly impact the velocity of a development team in a negative manner.  When I was first out of college I was doing large engineering calculations on an old 486.  Each calculation run could take 5 minutes plus.  When we moved up to P5-100’s our productivity jumped significantly.  Slow unit tests are like working with 486’s.  I do miss napping in my cube though.
    • Tests with database calls are more labor intensive to write.  You have to create SQL statements to create a known database state before you run your tests.  Checking the test results probably involves scraping data back out of the database and coercing it into .Net types just to do the assertions.  If you can stay inside the world of strongly typed .Net or Java code then “Intellisense” and code completion can speed you along.
    • Damned if you do, damned if you don’t. 
      • Embedding a bunch of SQL statements into the test fixture classes can make the tests harder to read and understand (trust me on this one)
      • Putting the database setup in another location can make the unit tests harder to understand and troubleshoot because the test data is external to the test fixture class.  “ALT-TAB Hell.”
    • The intent of the unit tests can be unclear.  Verifying a successful unit test by checking a status column in the database isn’t always the height of clarity.

     

    You might not agree with the above, but if you do, the rest of the post is about the various ways I’ve used or observed to isolate the testing of the business logic from persistence infrastructure.

     

    Mock the Database, but Where?

     

    One of the best ways to get the database out of your way is to hide data access behind abstracted interfaces that can be mocked in business logic testing.  All you’re doing here is treating data access as yet another service that is invoked from your application.  Simply take all of your data access code and put it into some sort of Gateway pattern class.  Use some sort of Dependency Injection with the business class to substitute the data access implementation with a mock.  In the case below, you would create a mock object for IDataAccessGateway.

     

          public class BusinessClass
          {
                private IDataAccessGateway _gateway;

                public BusinessClass(IDataAccessGateway gateway)
                {
                      _gateway = gateway;
                }

                public void PerformSomeSortOfAction(DataSet dataSet)
                {
                      // Manipulate the DataSet in some way
                      _gateway.SaveSomething(dataSet);
                }
          }

          public interface IDataAccessGateway
          {
                void SaveSomething(DataSet dataSet);
          }
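
    A hand-rolled mock is often all it takes.  What follows is only a minimal sketch under a few assumptions: RecordingGateway and the "Processed" marker are hypothetical names invented for this illustration, and the body of PerformSomeSortOfAction is a stand-in for whatever the real business logic would do.

```csharp
using System.Data;

public interface IDataAccessGateway
{
    void SaveSomething(DataSet dataSet);
}

public class BusinessClass
{
    private readonly IDataAccessGateway _gateway;

    public BusinessClass(IDataAccessGateway gateway)
    {
        _gateway = gateway;
    }

    public void PerformSomeSortOfAction(DataSet dataSet)
    {
        // Stand-in for real business logic (assumption): flag the DataSet
        dataSet.ExtendedProperties["Processed"] = true;
        _gateway.SaveSomething(dataSet);
    }
}

// Hypothetical hand-rolled mock: it only records what it was asked to save,
// so no connection string or database is needed to run the test
public class RecordingGateway : IDataAccessGateway
{
    public DataSet Saved;
    public int SaveCount;

    public void SaveSomething(DataSet dataSet)
    {
        Saved = dataSet;
        SaveCount++;
    }
}
```

    In an NUnit fixture you would hand a RecordingGateway to the BusinessClass constructor, call PerformSomeSortOfAction, and then assert that SaveCount is 1 and that Saved is the very same DataSet instance.  A dynamic mock library would do the same job with less typing.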

     

    Do not mock happy, fun ADO.Net.  By necessity ADO.Net is a low-level API.  As a general rule I would advise anyone to avoid mocking a low-level API under almost any circumstances.  Mocking even a simple ADO.Net call would involve several steps and objects for getting a connection, creating a command object, attaching said command to connection, creating a bunch of parameters, etc.  Just don’t go there.  I noticed a junior-junior pair having some trouble with a coding task last year.  When I finally looked over their shoulder I discovered that they were trying to write a unit test by using NMock to create dynamic mocks for the IDb* interfaces.  The unit test code was about 9 parts NMock.Expect() calls and 1 part performing the actual test.  They changed testing strategies and their work started to move again.

     

    This same stricture applies to any version of Microsoft’s Data Access Application Block (it’s all static methods anyway) or Enterprise Library.  These tools are still just thin veneers over ADO.Net and suffer from the same sort of mocking overhead that raw ADO.Net does.  Our internal analogue to EntLib has a dedicated static mock mechanism for unit testing low level data access code.  We’ve barely used it because it’s still not that convenient.

     

    To put this bluntly in a rule of thumb, a business or even service layer class should never have any reference to any of the System.Data.* namespaces.  I would allow an obvious exception for DataSets with the caveat that DataSets aren’t the best choice for business entities and not particularly suitable for Data Transfer Objects either  (go ahead and argue, I’ll just sic Bellware on you).  I’d be really uncomfortable about referencing an IDataReader in business or service code too.  That smells awfully wrong to me.

     

    Invert the Control

     

    The section above talks about mocking data access if your business logic follows a Transaction Script pattern for organizing business logic.  If you’re starting from scratch on a system that is business logic intensive you’re probably better off organizing your business logic as a Domain Model anyway.  In this case your domain classes (business logic layer) are completely independent of any kind of persistence mechanism.  For persistence I like to use the Data Mapper pattern to load and persist the business objects (this is what tools like NHibernate do behind the covers).  The mapper classes are aware of the business domain classes instead of the business classes calling the data access classes (Inversion of Control).  A Service Layer class of some sort would be responsible for coordinating the calls to the mapper and the domain objects.  Here’s a sample of what I mean:

     

          public class OrderServiceClass
          {
                public void SendPendingOrder(SendOrderMessage sendOrderMessage)
                {
                      OrderMapper mapper = new OrderMapper();

                      // Find the correct Order object and call Send(Destination)
                      Order order = mapper.FindOrder(sendOrderMessage.OrderId);
                      order.Send(sendOrderMessage.Destination);

                      // Persist the new Order state
                      mapper.Save(order);
                }
          }

     

    The reason why this is advantageous for unit testing is that you can test the business logic behind sending an order without any database interaction.  The business logic can be verified by checking the state of the Order object and its children before and after the call to the Send(Destination) method.
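
    As a rough sketch of what such a state-based test exercises, assume a hypothetical Order with a Status and a string Destination.  None of these members appear in the sample above, and the rule that an order can only be sent once is invented purely for illustration.

```csharp
using System;

public enum OrderStatus { Pending, Sent }

// Hypothetical domain class: it has no reference to any persistence code
public class Order
{
    private OrderStatus _status = OrderStatus.Pending;
    private string _destination;

    public OrderStatus Status { get { return _status; } }
    public string Destination { get { return _destination; } }

    public void Send(string destination)
    {
        // Invented business rule: an order can only be sent once
        if (_status != OrderStatus.Pending)
        {
            throw new InvalidOperationException("Order has already been sent");
        }

        _destination = destination;
        _status = OrderStatus.Sent;
    }
}
```

    A unit test simply news up an Order, calls Send(), and asserts on Status and Destination.  The mapper and the database never enter the picture, so the test stays inside the AppDomain and runs fast.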

     

    I Do Not Like the Active Record Approach

     

    Another way to handle persistence of business domain classes is the Active Record pattern.  In this case each domain class is responsible for its own persistence.  Each domain class will have a signature like this:

     

          public class Order
          {
                public void Save(){}
                public void Delete(){}
                public void Insert(){}
                public void Load(long id){}
          }

     

    This came up recently at the Austin Agile lunches because one of our members was evaluating Rocky Lhotka’s CSLA.Net framework for one of his projects.  I’ll admit that I’m very biased against CSLA because the single worst codebase I’ve ever seen was a VB6 monster that used CSLA.  I could be wrong, but I still would not recommend something like CSLA.Net for TDD projects because I think the Active Record style of domain objects binds the business logic too tightly to the database.  I think that this style of data access makes unit testing without the database harder. 

     

    Several of the O/R tools for .Net are really implementations of the Active Record pattern (usually with code generation).  Testability has to be a major concern when you choose a persistence strategy.  I definitely prefer a Domain Model approach with external mapping, but I’ve spoken with people who swear by Active Record classes with internal mapping.
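
    To make the external mapping option a bit more concrete, here is a minimal sketch of one way to put the mapper behind an interface so the service layer can also be exercised without a database.  The IOrderMapper interface, the InMemoryOrderMapper fake, the constructor injection, and the simplified Order class are all hypothetical; the OrderServiceClass shown earlier news up its own OrderMapper, so treat this as a variation and not as the code from that sample.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical domain class with no persistence code in it.
public class Order
{
    public long Id { get; }
    public bool IsSent { get; private set; }
    public string Destination { get; private set; }

    public Order(long id) { Id = id; }

    public void Send(string destination)
    {
        Destination = destination;
        IsSent = true;
    }
}

// Hypothetical seam: the mapper knows about Order, never the reverse.
public interface IOrderMapper
{
    Order FindOrder(long orderId);
    void Save(Order order);
}

// Test double that keeps orders in a dictionary instead of a database.
public class InMemoryOrderMapper : IOrderMapper
{
    private readonly Dictionary<long, Order> _orders = new Dictionary<long, Order>();

    public int SaveCount { get; private set; }

    public void Add(Order order) { _orders[order.Id] = order; }
    public Order FindOrder(long orderId) { return _orders[orderId]; }

    public void Save(Order order)
    {
        _orders[order.Id] = order;
        SaveCount++;
    }
}

// Service layer that receives its mapper instead of creating it.
public class OrderService
{
    private readonly IOrderMapper _mapper;

    public OrderService(IOrderMapper mapper) { _mapper = mapper; }

    public void SendPendingOrder(long orderId, string destination)
    {
        Order order = _mapper.FindOrder(orderId);
        order.Send(destination);
        _mapper.Save(order);
    }
}

public static class Program
{
    public static void Main()
    {
        var mapper = new InMemoryOrderMapper();
        mapper.Add(new Order(42));

        new OrderService(mapper).SendPendingOrder(42, "Austin warehouse");

        Order saved = mapper.FindOrder(42);
        Console.WriteLine(saved.IsSent);      // True
        Console.WriteLine(mapper.SaveCount);  // 1
    }
}
```

    With an Active Record class there is no equivalent seam to swap out, because Save() and Load() live on the domain object itself.  That is the coupling I’m worried about.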

     

    Point of View – the Database is the Application vs. the Database is Merely the Persistence Mechanism

     

    How you personally answer this question largely determines how you lay out software systems.  At one extreme, database-centric folks obsess over relational models and treat the middle tier code as merely a conduit to get data in and out of the database.  This point of view seems to be much more prevalent in the Microsoft world and definitely among older developers and requirement analysts (flame away but you know it’s true).  I’ve often seen requirement specifications from business analysts that amounted to “get data from this table and go insert it over here in a different way.”  This kind of thing leads to some nasty design and architecture smells like gross duplication of code, zero encapsulation, scalability issues, and nightly batch jobs that run for 30 hours at a time locking transactional tables left and right (true story).  In one instance it also led to a PM making a waterfall schedule that allotted 75% of the development man-hours to logical and physical data modeling on a system that probably had only 4-5 database entities but a complex user interface and a bunch of integrations to legacy systems (thar be dragons).

     

    The other issue with a database-centric development philosophy is that automated unit tests are less effective.  Toss in largish stored procedures and you’ve got a mess.  Procedural code can be more difficult to unit test than well-factored Object Oriented code because it’s more difficult to isolate pieces of functionality.  Embedding this procedural logic within a database just compounds the problem.  Yes, there are xUnit toolsets for database access now but they’ll never be as easy to use as NUnit or JUnit.

     

    No Business Logic in Stored Procedures!

    <