CodeBetter.Com

Raymond Lewallen

Professional Learner

Introduction to Refactoring

Evolution.  It is inevitable.  Software succumbs to evolution like everything else.  Almost all software goes through revisions and changes between the time it is born as a wee little prototype and its inevitable death.  Object-oriented programming is, IMO, by far the easiest code design to deal with when you have to make changes during the life of a software product.  However, just because you are using OOP doesn’t mean you have an optimal design.  Last week I talked about analyzing code metrics, which play a big part in determining how good, usable and maintainable your design is.

This is where refactoring comes in.  Refactoring can be defined as a process in which developers examine existing code and improve its design by modifying it, without changing what the code does.  While there is no specific model for the process of refactoring, there are certainly some common areas where most design problems show up, and those are where refactoring takes place.

Refactoring is a very in-depth subject, and I’m only going to skim the surface of it for you beginners out there.  I’m not going to get into test driven development or patterns of software design, but simply explain four common tasks that fall under refactoring.

Identify methods that can be moved.

This means looking for methods that are encapsulated in the wrong class.  A method should perform a task relevant to the class it belongs to.  If a method makes frequent calls to another class, consider moving it to that class.  Look at the following example:

Namespace Refactoring

    Public Class Person

        Private _dateOfBirth As Date

        Public Property DateOfBirth() As Date
            Get
                Return _dateOfBirth
            End Get
            Set(ByVal Value As Date)
                _dateOfBirth = Value
            End Set
        End Property

    End Class

    Public Class VoterRegistration

        ' Used to determine if the person is old enough to vote
        Public Function CalculateAge(ByVal voter As Person) As Int32
            Dim years As Int32 = DateTime.Now.Year - voter.DateOfBirth.Year

            If DateTime.Now.Month < voter.DateOfBirth.Month OrElse (DateTime.Now.Month = voter.DateOfBirth.Month AndAlso _
                DateTime.Now.Day < voter.DateOfBirth.Day) Then
                years = years - 1
            End If

            Return years
        End Function

    End Class

End Namespace

Hopefully it is clear that “CalculateAge” is in the wrong class.  It takes a single argument of type “Person” and acts entirely upon that argument.  This is a clear case where a method should be moved, here from the “VoterRegistration” class to the “Person” class, so that the “Person” class looks like this:

    Public Class Person

        Private _dateOfBirth As Date

        Public Property DateOfBirth() As Date
            Get
                Return _dateOfBirth
            End Get
            Set(ByVal Value As Date)
                _dateOfBirth = Value
            End Set
        End Property

        Public Function Age() As Int32
            Dim years As Int32 = DateTime.Now.Year - _dateOfBirth.Year

            If DateTime.Now.Month < _dateOfBirth.Month OrElse (DateTime.Now.Month = _dateOfBirth.Month AndAlso _
                DateTime.Now.Day < _dateOfBirth.Day) Then
                years = years - 1
            End If

            Return years
        End Function

    End Class
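
Once “Age” lives on “Person”, the “VoterRegistration” class only has to ask for it.  Here is a minimal sketch of what the calling side might look like.  The “IsOldEnoughToVote” function and the minimum age of 18 are my own assumptions for illustration; they are not part of the original example.

```vbnet
Imports System

Namespace Refactoring

    ' Person with the age calculation moved onto it.
    Public Class Person

        Private _dateOfBirth As Date

        Public Property DateOfBirth() As Date
            Get
                Return _dateOfBirth
            End Get
            Set(ByVal Value As Date)
                _dateOfBirth = Value
            End Set
        End Property

        Public Function Age() As Int32
            Dim years As Int32 = DateTime.Now.Year - _dateOfBirth.Year

            ' Subtract a year if this year's birthday has not happened yet.
            If DateTime.Now.Month < _dateOfBirth.Month OrElse (DateTime.Now.Month = _dateOfBirth.Month AndAlso _
                DateTime.Now.Day < _dateOfBirth.Day) Then
                years = years - 1
            End If

            Return years
        End Function

    End Class

    Public Class VoterRegistration

        ' Assumption for illustration only: a minimum voting age of 18.
        Private Const MinimumVotingAge As Int32 = 18

        ' Hypothetical helper: asks the Person for its age instead of
        ' working it out from the Person's date of birth.
        Public Function IsOldEnoughToVote(ByVal voter As Person) As Boolean
            Return voter.Age() >= MinimumVotingAge
        End Function

    End Class

End Namespace
```

“VoterRegistration” no longer needs to know how an age is calculated, so a change to that rule only touches “Person”.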

 

Identify new methods.

A common mistake among developers is to create methods that are too complex and try to accomplish too much by themselves.  This leads to methods that are hard to build upon, maintain and debug.  Cyclomatic complexity is a common code metric for judging whether a method is too complex.  Dividing methods into smaller, more easily managed pieces keeps the design simple and improves clarity.  Repetitive code is another clear indicator of where to create new methods.

Take a look at the following code:

A method that is reusing code within itself

 

Imports System.Data

Namespace Refactoring

    Public Class Foo

        Public Function GetData() As DataTable

            Dim dt As New DataTable

            ' The same six steps are repeated for every column.
            Dim dcFirstName As New DataColumn
            dcFirstName.DataType = Type.GetType("System.String")
            dcFirstName.AllowDBNull = False
            dcFirstName.Caption = "FirstName"
            dcFirstName.ColumnName = "FirstName"
            dcFirstName.DefaultValue = Nothing
            dt.Columns.Add(dcFirstName)

            Dim dcLastName As New DataColumn
            dcLastName.DataType = Type.GetType("System.String")
            dcLastName.AllowDBNull = False
            dcLastName.Caption = "LastName"
            dcLastName.ColumnName = "LastName"
            dcLastName.DefaultValue = Nothing
            dt.Columns.Add(dcLastName)

            Dim dcDateOfBirth As New DataColumn
            dcDateOfBirth.DataType = Type.GetType("System.DateTime")
            dcDateOfBirth.AllowDBNull = False
            dcDateOfBirth.Caption = "DateOfBirth"
            dcDateOfBirth.ColumnName = "DateOfBirth"
            dcDateOfBirth.DefaultValue = Nothing
            dt.Columns.Add(dcDateOfBirth)

            Return dt

        End Function

    End Class

End Namespace

Obviously, it’s staring you right in the face that there is reusable code here. This is where you create a new method.

Refactored to create a new method

Imports System.Data

Namespace Refactoring

    Public Class Foo

        Public Function GetData() As DataTable

            Dim dt As New DataTable

            dt.Columns.Add(BuildColumn("FirstName", Type.GetType("System.String")))
            dt.Columns.Add(BuildColumn("LastName", Type.GetType("System.String")))
            dt.Columns.Add(BuildColumn("DateOfBirth", Type.GetType("System.DateTime")))

            Return dt

        End Function

        ' Builds one non-nullable column; the name doubles as the caption.
        Private Function BuildColumn(ByVal columnName As String, ByVal columnType As Type) As DataColumn
            Dim dc As New DataColumn
            dc.DataType = columnType
            dc.AllowDBNull = False
            dc.Caption = columnName
            dc.ColumnName = columnName
            dc.DefaultValue = Nothing
            Return dc
        End Function

    End Class

End Namespace
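
Repetition is only one of the two signals mentioned in this section; the other is a method that makes too many decisions by itself.  Here is a rough sketch of splitting such a method into smaller named pieces.  Everything in it (the “RegistrationCheck” class, its methods and the age limit of 18) is made up for illustration and is not from a real system.

```vbnet
Imports System

Namespace Refactoring

    ' Hypothetical example: every name here is invented for illustration.
    Public Class RegistrationCheck

        ' Before: one method with every branch written inline.
        Public Function CanRegisterBefore(ByVal firstName As String, ByVal lastName As String, _
            ByVal age As Int32) As Boolean
            If firstName Is Nothing OrElse firstName.Trim().Length = 0 Then
                Return False
            End If
            If lastName Is Nothing OrElse lastName.Trim().Length = 0 Then
                Return False
            End If
            If age < 18 Then
                Return False
            End If
            Return True
        End Function

        ' After: each decision has a name, and this method reads as a summary.
        Public Function CanRegister(ByVal firstName As String, ByVal lastName As String, _
            ByVal age As Int32) As Boolean
            Return HasText(firstName) AndAlso HasText(lastName) AndAlso IsOfAge(age)
        End Function

        ' True when the value contains something other than whitespace.
        Private Function HasText(ByVal value As String) As Boolean
            Return Not value Is Nothing AndAlso value.Trim().Length > 0
        End Function

        ' Assumed age limit of 18, for illustration only.
        Private Function IsOfAge(ByVal age As Int32) As Boolean
            Return age >= 18
        End Function

    End Class

End Namespace
```

Both versions return the same answers.  The second one also removes the duplicated name check, and each small method can be read and debugged by itself.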

 

Identify inheritance.

When you see “Select Case” or “Switch” statements, it is often a strong indicator that the code should be refactored into an inheritance design.  Look at the following code:

Part of a carnival ride program.

 

Namespace Refactoring

    Public Class Foo

        Private baseTokenAmount As Int32 = 1

        Public Function GetNumberOfRequiredTokens(ByVal typeOfPerson As PersonType) As Int32
            Select Case typeOfPerson
                Case PersonType.Infant
                    Throw New Exception("Too young to ride")
                Case PersonType.Child
                    Return baseTokenAmount
                Case PersonType.Adolescent
                    Return baseTokenAmount * 2
                Case PersonType.Adult
                    Return baseTokenAmount * 3
                Case PersonType.Senior
                    Throw New Exception("Too old to ride")
                Case Else
                    ' Without this branch an unknown value would silently return 0.
                    Throw New ArgumentOutOfRangeException("typeOfPerson")
            End Select
        End Function

        Public Enum PersonType
            Infant
            Child
            Adolescent
            Adult
            Senior
        End Enum

    End Class

End Namespace

The code is pretty clean and simple, but it is hard to build on and add other person types to the system.  If you have code like this everywhere, you’d have to go into a lot of different places in the program and change code to allow for an added PersonType.  This is an example of code that should be refactored into inheritance.  The result would be the following code:

Conditional refactored to inheritance and polymorphism.

Public Class Foo

    Public Shared Function GetNumberOfRequiredTokens(ByVal person As IPerson) As Int32
        Return person.GetNumberOfRequiredTokens()
    End Function

End Class

Public Interface IPerson
    Function GetNumberOfRequiredTokens() As Int32
End Interface

Public MustInherit Class Person : Implements IPerson
    Private baseTokenAmount As Int32 = 1

    Protected ReadOnly Property Tokens() As Int32
        Get
            Return baseTokenAmount
        End Get
    End Property

    Public MustOverride Function GetNumberOfRequiredTokens() As Int32 Implements IPerson.GetNumberOfRequiredTokens
End Class

Public Class Child : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Return MyBase.Tokens
    End Function
End Class

Public Class Adult : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Return MyBase.Tokens * 3
    End Function
End Class

Public Class Infant : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Throw New TooYoungException
    End Function
End Class

Public Class TooYoungException : Inherits Exception

End Class

Now, to add a new person type, we just add a new class for that person type that inherits from the class “Person”.  We don’t have to change any existing code; we only add new code.
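
The refactored listing leaves out the adolescent and senior cases from the original “Select Case”, so they make a handy test of that claim.  Here is a sketch of adding them as new classes.  “IPerson” and “Person” are repeated only so the snippet compiles without the rest of the listing, and “TooOldException” is a name I made up by analogy with “TooYoungException”.

```vbnet
Imports System

Public Interface IPerson
    Function GetNumberOfRequiredTokens() As Int32
End Interface

' Base types repeated from the refactored code so the sketch compiles on its own.
Public MustInherit Class Person : Implements IPerson
    Private baseTokenAmount As Int32 = 1

    Protected ReadOnly Property Tokens() As Int32
        Get
            Return baseTokenAmount
        End Get
    End Property

    Public MustOverride Function GetNumberOfRequiredTokens() As Int32 Implements IPerson.GetNumberOfRequiredTokens
End Class

' New code only: an adolescent pays double, as in the original Select Case.
Public Class Adolescent : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Return MyBase.Tokens * 2
    End Function
End Class

' New code only: a senior may not ride, as in the original Select Case.
Public Class Senior : Inherits Person
    Public Overrides Function GetNumberOfRequiredTokens() As Int32
        Throw New TooOldException
    End Function
End Class

' Hypothetical exception type, named by analogy with TooYoungException.
Public Class TooOldException : Inherits Exception
End Class
```

Nothing in “Foo”, “Child”, “Adult” or “Infant” had to be touched to support the two new types.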

 

Fix your variable names.

What do you think td stands for?  You can probably come up with dozens of different things it could possibly be.  What if I told you it stands for todaysDate?  This is a big problem in code: meaningless variable names.  In Visual Studio, sure, Intellisense tells me it’s a date, but other than that, what the heck is it for?  Give your variables meaningful names, and not names where you just leave out all the vowels; tdysDt is not very helpful either.  You are going to save somebody a lot of time and headache in the future if you use meaningful names.  I’ve even seen people go back to their own code and have to decipher what the heck td stood for by searching back through the code.
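
To see the difference side by side, here is a small sketch of the same logic written twice.  The class, both methods and the parameter names are invented for illustration; only td and todaysDate come from the paragraph above.

```vbnet
Imports System

' Hypothetical example: the class and method names are invented.
Public Class NamingExample

    ' Hard to read: the reader has to guess what d and td mean.
    Public Function Check(ByVal d As Date) As Boolean
        Dim td As Date = Date.Today
        Return d < td
    End Function

    ' Same logic, with names that say what the values are.
    Public Function IsInThePast(ByVal appointmentDate As Date) As Boolean
        Dim todaysDate As Date = Date.Today
        Return appointmentDate < todaysDate
    End Function

End Class
```

The compiler treats both methods the same; only the second tells the next reader what is being compared.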

 

I have explained four common design issues to look for when beginning your code refactoring.  Once you have done it for a little while, it becomes easier, and you’ll be able to write code that avoids these pitfalls rather than having to fix them afterwards by refactoring.




About Raymond Lewallen

Working primarily in the public sector during his career, Raymond has designed and built several high profile enterprise level applications for all levels of the government. Raymond now works as a solutions architect for EMC. Raymond is an agile coach, Microsoft MVP C# and also president of the Oklahoma City Developers Group and Oklahoma Agile Developers Group. Raymond spends a lot of his time learning and teaching such things as Test Driven Development, Domain Driven Design, Design Patterns and Extreme Programming practices and principles, to name a few. Raymond is also an advocate of Alt.Net. Raymond is primarily a framework guy, so don't ask him anything about UI :) Check out Devlicio.us!
