CodeBetter.Com

Super Models, Part 2: Avoid Mutators

A quick disclaimer: we're entering religious territory here. I feel strongly about this issue, but it's certainly my opinion. If you want to get the full sense of how passionate people are about this issue, check out this article at JavaWorld.

I've come to the point of view that Entities should not use set mutators ("setters"). Anything you represent as a setter can usually be better represented as a plain old method. Why?

Let's remember one of the fundamental guidelines of DDD: intention revealing interfaces. Which is more intention revealing?

Customer c = new Customers().Find(42);

// This...
c.Address = aNewAddressValueObject;

// or this?
c.ChangeAddress(aNewAddressValueObject);

That's a subtle point and it's admittedly open for debate. To firm up my argument, let's consider the situation where I'm crunching knowledge into the model and address change becomes a temporal concept: I now need to track what an address is as of a certain date and time. Surely a setter will do the trick, but I think the command "change an address" is better expressed with a method. I can fiddle with the internals of that method and, because we're practicing command-query separation, accept that commands change the state of aggregate roots and entities.
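To make the temporal case concrete, here is a minimal sketch. The Customer and Address types, the effectiveFrom parameter and the AddressAsOf query are all hypothetical names invented for illustration, not part of any real library. The point is only that the command keeps its name while its internals start recording when each address took effect:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value object; a real model would carry street, city and so on.
public sealed class Address
{
    public string Line { get; private set; }
    public Address(string line) { Line = line; }
}

// Hypothetical entity that exposes a command method instead of a setter.
public class Customer
{
    // Each entry pairs an address with the moment it took effect.
    private readonly List<KeyValuePair<DateTime, Address>> _addresses =
        new List<KeyValuePair<DateTime, Address>>();

    // Command: changes state and returns nothing.
    public void ChangeAddress(Address newAddress, DateTime effectiveFrom)
    {
        if (newAddress == null) throw new ArgumentNullException("newAddress");
        _addresses.Add(new KeyValuePair<DateTime, Address>(effectiveFrom, newAddress));
    }

    // Query: returns the address in effect at the given moment, or null if none.
    public Address AddressAsOf(DateTime moment)
    {
        Address result = null;
        DateTime latest = DateTime.MinValue;
        foreach (var entry in _addresses)
        {
            if (entry.Key <= moment && entry.Key >= latest)
            {
                latest = entry.Key;
                result = entry.Value;
            }
        }
        return result;
    }
}
```

A property setter could hide the same bookkeeping, but the caller would have no hint that assigning c.Address does more than overwrite a field.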

For me, the biggest argument for avoiding setters is that setters promote thinking in terms of data in a paradigm (DDD) where data-first thinking can be harmful and counterproductive. I cannot stress this enough: entities are behavioral objects and, for my money, Action<T>, where T is some value object, fits the notion of a command rather elegantly.

An Amazing Introduction to NDepend

Andre Loker just published an amazing introduction to NDepend on his blog. Most features are introduced with a short reminder of the methodology behind them and why they are useful. Very nice job!

Such an introduction is welcome. Indeed, one difficulty in promoting a tool such as NDepend is educating people about what it can bring to a development shop in terms of agility. NDepend comes with a set of innovative features currently not supported by any other .NET tool. I like to think that what tools such as ReSharper or CodeRush do for your code at the micro level (i.e., structuring method bodies), NDepend does at the macro level (i.e., structuring classes, namespaces and assemblies). Hence, as a developer, I personally use both kinds of tools to automatically control every aspect of the code base I am working on.

It seems that the direction taken by NDepend is a promising one, since the future Microsoft Oslo will support some similar features, such as Architecture Explorer / Dependencies Matrix. Although this could be considered a threat to the future of NDepend, my opinion is that it is a blessing, both because the Oslo schedule leaves enough time to continue innovating and to choose complementary directions, and because it will de facto educate a great many developers and architects about the usefulness of such tooling.

Scale Cheaply - Memcached

I generally subscribe to the attitude that premature optimizations are evil, but I strongly believe that a robust caching strategy should evolve alongside the rest of the system. Waiting too long makes it hard to cleanly and thoughtfully add caching. Besides, in my experience, a considered caching strategy generally means I worry less about performance in other areas - especially data access and data modelling. In other words, I can build those complex parts for maintainability, as opposed to having to worry about the cost of each individual query.

.NET developers are pretty cache-savvy, thanks in large part to the powerful System.Web.Caching namespace and ASP.NET's simple-to-use OutputCaching capabilities. For that reason, and because it tends to be very application-specific, I don't want to go over how to decide what to cache, how to deal with synchronization issues, updates and so on. Instead, I specifically want to talk about Memcached.

You're probably already familiar with Memcached - it's a highly efficient distributed caching system. It's used heavily by all the big web 2.0 players (in May 2007 it was revealed that Facebook relies on 200 dedicated quad-core Memcached servers with 16GB each). Interest in Memcached from the .NET community has been relatively low (although over the last year more and more people have been talking about it). Frankly, if you're doing anything that requires horizontal scaling, you're seriously shooting yourself in the foot by overlooking it. It runs on Windows - although we run it on Linux, and there's really no reason for you not to learn that too!

Fundamentally, there are two problems with the built-in cache. First, it's limited to the memory of a single system, which happens to be shared with the rest of your application domain. Second, if you have two servers, each with its own in-memory cache, users are likely to see very weird synchronization issues. Memcached isn't as fast as in-memory caching, but it will scale to a virtually unlimited amount of memory. There isn't any redundancy or failover, simply memory spread across multiple servers.

The best part is that it literally takes seconds to get it up and running. First, download a Windows build onto your development machine here (look for the win32 binary of memcached). Unzip the package somewhere; I put mine in c:\program files\memcached\. Next, from the command line, run memcached -d install. This will install memcached as a service. You can run memcached -h for more command-line options. You'll need to start the service (I also changed my startup type to manual, but that's completely up to you).

The next step is to install the client library. I suggest Enyim Memcached from CodePlex. The project comes with a sample configuration file, which you should be able to easily incorporate into your web.config or app.config. While developing, configure only one server: 127.0.0.1 on port 11211 (which is the default). You also need to add a reference to the two dlls.

Aside from that, you basically program against a simple API. You create an instance of MemcachedClient (it's thread-safe so you can use a singleton, or re-create it since it's inexpensive to create), and call Store, Get or Remove (or a few other useful methods) like you would the normal cache object. As I've blogged about before (here and here), I'm a fan of hiding all of this behind an interface to ease mocking and swapping.

Here's an example:

MemcachedClient client = new MemcachedClient();
client.Store(StoreMode.Set, "Startup", DateTime.Now, DateTime.Now.AddMinutes(20));
DateTime startup = client.Get<DateTime>("Startup");
client.Remove("Startup");
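As for hiding the cache behind an interface, the following is a minimal sketch of what that could look like. ICache and InMemoryCache are hypothetical names made up for this illustration and are not part of Enyim. A Memcached-backed class would implement the same interface by forwarding to the Store, Get and Remove calls shown above, while tests use the in-memory stand-in:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction; the names are illustrative, not from Enyim.
public interface ICache
{
    void Store(string key, object value, DateTime expiresAt);
    T Get<T>(string key);
    void Remove(string key);
}

// In-memory stand-in for tests and for swapping implementations.
public class InMemoryCache : ICache
{
    private class Entry
    {
        public object Value;
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<string, Entry> _items = new Dictionary<string, Entry>();
    private readonly object _lock = new object();

    public void Store(string key, object value, DateTime expiresAt)
    {
        lock (_lock)
        {
            _items[key] = new Entry { Value = value, ExpiresAt = expiresAt };
        }
    }

    // Returns default(T) when the key is missing or has expired.
    public T Get<T>(string key)
    {
        lock (_lock)
        {
            Entry entry;
            if (!_items.TryGetValue(key, out entry)) return default(T);
            if (entry.ExpiresAt <= DateTime.Now)
            {
                _items.Remove(key); // drop expired entries lazily
                return default(T);
            }
            return (T)entry.Value;
        }
    }

    public void Remove(string key)
    {
        lock (_lock) { _items.Remove(key); }
    }
}
```

Code that depends only on ICache does not need a running memcached service in its unit tests, and switching the backing store later touches a single class.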

Trying to access Windows Search from SQL Server: An Appeal

Had some time to review my search dilemma today and three billable hours later, I'm no further along in my task.

There is no shortage of documentation on the new Windows Search and I do so desperately want to try it out. Just not as a client.

To sum up my requirements, I have a search screen that combines meta data in a SQL database with a contents search. That is, the user types in a search word or phrase and/or selects meta data about the documents (e.g. country, document type, date, and so on and so forth).

At present, using Indexing Services, I am able to perform such a query within SQL Server with a single SQL Statement. That is what I would like to do with Windows Search.

Alas, all efforts to connect to Windows Search via SQL Server have failed. The nearest I've come is to find a couple of lost souls to commiserate with who are having the same problem. I've tried every possible combination of provider, datasource, and provstr I can fathom with the connection string, Provider=Search.CollatorDSO;Extended Properties=\"Application=Windows\", and have come up empty.

The issue appears to be with SQL Server as I am able to run pretty much every sample app under the sun and get search results back. I've even overcome my fear of C++ to read through some samples in the Search SDK.

So I'm appealing to you, generous reader(s), for some help. I know I've broken the unwritten rule of finding at least one workable solution, however hideous, before asking for help but the nature of Windows Search is such that any workable solution would be the one I glom onto.

Again, the criteria are relatively straightforward: I want to be able to search for documents (Word, PowerPoint, PDF, and Excel only for the moment) based on their contents as well as metadata.

I can do that with Windows Search now but it would involve retrieving a result set based on metadata, retrieving a second result set based on contents, then merging the two. Given the size of the repository (about 2600 documents), this is do-able but it's the kind of bastardized union that perpetuates the hillbilly stereotype. And I'm trying to be more PC.

I would consider SharePoint only if someone can convince me it is an elegant solution that specifically meets these requirements. I am also open to a third-party component that indexes files and provides an API. But again, I can do this already with Windows Search. It would have to be something that can combine with SQL Server.

And if I don't get a decent answer, then I'm gonna...well, swear a bit maybe but that's probably it.

Kyle the Unthreatening

Some RichTextBox tricks

I have recently been responsible for refactoring the Code Query Language query editor in NDepend to fix some imperfections.

The CQL query editor implementation is based on a class derived from the System.Windows.Forms.RichTextBox class. It was an opportunity to learn some tricks that I would like to share in this post.

Text Coloring

If you google how to color the text displayed in a RichTextBox, you’ll almost certainly end up with the selection coloring trick, which uses the RichTextBox method Select() and the properties SelectionColor and SelectionBackColor.

The search results are very misleading here. We came to the conclusion that this approach performs extremely badly, even on short text with just a few dozen words to color.

A much better way we found is to use the Rtf (Rich Text Format) capabilities of the RichTextBox: you just need to format an Rtf string and assign it to the control.

I won’t detail the Rtf format here.
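As a rough idea of what such a string involves, here is a minimal sketch that builds one. The RtfSketch class, its method names and the choice of blue for keywords are assumptions made for illustration; they are not NDepend's actual code. The color table declares blue as entry 1, and \cf1 / \cf0 switch between that entry and the default color:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not NDepend's implementation.
public static class RtfSketch
{
    // Escapes the three characters that have special meaning in Rtf.
    // The backslash must be handled first so later escapes are not doubled.
    public static string Escape(string text)
    {
        return text.Replace("\\", "\\\\").Replace("{", "\\{").Replace("}", "\\}");
    }

    // Builds an Rtf document in which every word found in 'keywords' is drawn
    // in blue (color table entry 1) and everything else in the default color.
    public static string Colorize(string query, ICollection<string> keywords)
    {
        var rtf = new StringBuilder();
        rtf.Append(@"{\rtf1\ansi{\colortbl;\red0\green0\blue255;}");
        string[] words = query.Split(' ');
        for (int i = 0; i < words.Length; i++)
        {
            if (i > 0) rtf.Append(' ');
            if (keywords.Contains(words[i]))
                rtf.Append(@"\cf1 ").Append(Escape(words[i])).Append(@"\cf0 ");
            else
                rtf.Append(Escape(words[i]));
        }
        rtf.Append('}');
        return rtf.ToString();
    }
}
```

In a Windows Forms application the returned string would typically be assigned to the control's Rtf property in one go, which avoids the per-word selection changes of the Select() approach.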

 

Note that using the SelectedRtf property is also a good way to guard against improperly formatted text copied and pasted from Microsoft Word or a browser, for example.

Avoid flickering problem

When you update the content of your RichTextBox, you’ll certainly notice some pesky flickering. Fortunately, an efficient solution to this problem can be found here. Basically, the solution consists of disabling text redrawing by calling some Win32 APIs.

Testing for ScrollBars’ visibility

After looking for a way to test whether the RichTextBox’s ScrollBars are visible, the only way I found is to infer this information from the difference between this.ClientRectangle and this.Size. This is certainly not the cleanest way, but it has worked well in every context I tried.
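The inference can be sketched as a pair of pure functions. This is only an illustration under assumptions: the ScrollBarSketch name and its parameters are hypothetical, and the comparison assumes the control's borders are thinner than a scroll bar, so that only a visible scroll bar can account for a gap of that size:

```csharp
// Hypothetical helper working on plain numbers so it can be tried anywhere.
public static class ScrollBarSketch
{
    // outerWidth:     the control's Size.Width
    // clientWidth:    the control's ClientRectangle.Width
    // scrollBarWidth: thickness of a vertical scroll bar; in Windows Forms this
    //                 is available as SystemInformation.VerticalScrollBarWidth
    public static bool IsVerticalScrollBarVisible(int outerWidth, int clientWidth, int scrollBarWidth)
    {
        // Borders alone leave a gap of a few pixels; a visible scroll bar
        // widens the gap by at least its own thickness.
        return outerWidth - clientWidth >= scrollBarWidth;
    }

    // Same reasoning for the horizontal scroll bar, using heights
    // (SystemInformation.HorizontalScrollBarHeight in Windows Forms).
    public static bool IsHorizontalScrollBarVisible(int outerHeight, int clientHeight, int scrollBarHeight)
    {
        return outerHeight - clientHeight >= scrollBarHeight;
    }
}
```

Inside a class derived from RichTextBox, the arguments would come from this.Size and this.ClientRectangle.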

 

Get/Set the ScrollBars’ positions

To achieve this, I came to the conclusion that it must be done through good old Win32. Being able to get and set the ScrollBars’ positions is especially useful to avoid some pesky automatic relocation of the RichTextBox content that I noticed in some circumstances, such as when inserting or modifying a long text.

Url Detection

Something I wasn’t aware of: if you want to display clickable Urls in your text box, just set the RichTextBox.DetectUrls property to true. You can then use the RichTextBox.LinkClicked event to handle the click on a url.

Recursing into Linear, Tail and Binary Recursion

In my previous post, I talked about some of the basics of recursion and why you might want to use it to your advantage.  Today, let's dive a little deeper into the different kinds of recursion, including linear recursion, tail recursion and finally binary recursion.  This is part of a back-to-basics series covering recursion in some depth.

Starting Off

Where we left off, we had taken a simple imperative statement and made it recursive, and noted that we could hypothetically make it tail recursive as well.  Let's start off with the simple, yet overused, factorial example written in a very imperative way using a loop:

C#
public static int Factorial(int n)
{
    var fact = 1;
    var i = n;

    while (i > 0)
    {
        fact = fact * i;
        i--;
    }

    return fact;
}

F#
let factorial_imperative n =
  let mutable fact = 1
  let mutable i = n
 
  while i > 0 do
    fact <- fact * i
    i <- i - 1

  fact

So, what we're left with is mutation galore.  That's perfectly ok in the imperative world, but it doesn't make sense to me in the functional world since we have such things as recursion.  Functional languages make you go out of your way to make values mutable, because by default they aren't mutable at all.

Linear Recursion

Linear recursion is by far the most common form of recursion.  In this style of recursion, the function calls itself repeatedly until it hits the termination condition.  After hitting the termination condition, it simply returns the result to the caller through a process called unwinding.  Most of the samples in my previous post followed this style of recursion.

C#
public static int Factorial(int n)
{
    if (n <= 1)
        return 1;
    else
        return n * Factorial(n - 1);
}

F#
let rec factorial_linear n =
  if n <= 1 then 1 else n * factorial_linear (n - 1)

As you may have noticed, the last operation here is a multiplication, not the recursive call, so every frame has to stay on the stack until the call beneath it returns.  The n <= 1 check also guards against negative input, which would otherwise recurse without end, which would be pretty bad.  But with large input, the depth of the call stack could become a problem.

Tail Calls

As I've mentioned before, it's pretty important to think about the stack when you do recursion, because of the risk of stack overflows.  When you call an F# function, stack space is allocated and then freed when the function returns or when a tail call is performed.  We have to be aware of this, because a very deep set of nested function calls will cause a StackOverflowException to be thrown.  Below is a simple example of how this can occur.

#light

let rec recursiveFunc i : unit =
  if i >= 1000000 then
    ()
  else
    if i % 1000 = 0 then
      printfn "Recursing at %i" i
    recursiveFunc (i + 1)
    printfn "Just called the function with %i" i

recursiveFunc 100

The above function recurses, but the recursive call is not the last thing the function does: after the call returns, it calls printfn to display the result.  If you let this run, you'll get a nice StackOverflowException and it will take down your F# Interactive (fsi) session.  How do you fix this situation?  The answer is tail recursion, covered next.

Tail Recursion

Tail recursion is a specialized form of the linear recursion where the last operation of the function happens to be a recursive call.  The difference here is that in my previous samples, I've been calling functions which perform a calculation on the result of the recursive call.  That could lead to stack overflows should the recursion get too deep.  Instead, I don't want to do any work during the unwinding phase, just return the value from the function.

Let's take the above sample and make it tail recursive:

#light

let rec recursiveFunc i : unit =
  if i >= 1000000 then
    ()
  else
    if i % 1000 = 0 then
      printfn "Recursing at %i" i
    recursiveFunc (i + 1)

recursiveFunc 100

All I had to do was get rid of the last statement, so that the last thing the function does is simply call itself again with no further work performed.  If I run it through F# Interactive again, I get no problems at all, and it runs through smoothly.  Let's refactor our factorial to be tail recursive in F#, and then look at a C# solution.

F#
let factorial_tail n =
  // acc carries the running product, so nothing is left to do after the recursive call
  let rec factorial_inner acc x =
    if x <= 0I then acc
    else factorial_inner (x * acc) (x - 1I)
  factorial_inner 1I n

As you can see, the last call in my recursive inner function is indeed a tail call, as no work is done on the result of the call.  From there, my outer function simply calls the inner one, passing in an accumulator of 1 and the number we were given.  Since I'm using BigInt, I don't have a problem with arithmetic overflow either, even for really large input.  This is something the .NET BCL team was working on, but it seems they kept it internal to System.Core.dll.  But F# was nice enough to give us BigNum and BigInt implementations, as F# tends to be rather math-centric.

The Problem with C# and Tail Calls

Now turning our attention to the C# counterpart to this, let's try our implementation of the above functionality in it.  Instead of just another function, I'll inline the accumulator function inside as an anonymous function:

C#
public static ulong Factorial(ulong n)
{
    Func<ulong, ulong, ulong> factorial_acc = null;
    factorial_acc = (acc, x) =>
    {
        if (x <= 0) return acc;
        return factorial_acc(x * acc, x - 1UL);
    };

    return factorial_acc(1UL, n);
}

But the problem is that this can overflow the stack on a 32-bit machine.  Why, you might ask?  Well, as Jomo Fisher, an F# team member, points out in his post about recursion in three languages, the C# compiler does not do tail call optimization.  Instead, all managed languages have a second opportunity to optimize through either NGEN or the JIT compiler.  As it turns out, the x64 version of the JIT will optimize the tail call right now, whereas the 32-bit JIT will not.  By default, C# projects target AnyCPU, which in my case, since I'm running on a VPC, runs as x86 and will cause the overflow; had I been running on my base machine, it would have been ok.
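When the runtime cannot be relied on to eliminate the tail call, one common workaround is to perform the rewrite by hand: an accumulator-style tail-recursive function maps directly onto a loop. Here is a minimal sketch, assuming a hypothetical FactorialSketch class; it follows the same acc and x bookkeeping as the lambda above:

```csharp
// Hypothetical class name; the logic mirrors factorial_acc from the post.
public static class FactorialSketch
{
    // Each loop iteration corresponds to one recursive call of factorial_acc,
    // so the stack depth stays constant no matter how large n is.
    public static ulong Factorial(ulong n)
    {
        ulong acc = 1UL;
        for (ulong x = n; x > 0; x--)
        {
            acc = x * acc;
        }
        return acc;
    }
}
```

Like the recursive version, this silently wraps around by default once the result no longer fits in a ulong, which happens beyond 20!.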

This is another in the line of reasons why F# is the better language for recursion as well as for functional programming fundamentals.  C# will get you partially there, but there are things such as these which will trip developers up and leave them nowhere to go.  To which I say, pick the best language for the job at hand, whether it be F#, C#, Ruby, Python, Erlang, Haskell, JavaScript, etc.

Binary Recursion

Another form of recursion is binary recursion.  This form of recursion has the potential for calling itself twice instead of once as before.  This is pretty useful in such scenarios as binary trees as well as the trite and overused Fibonacci sequence.  Such an example is below:

let rec fibonacci n =
  if n <= 2 then 1
  else fibonacci (n - 1) + fibonacci (n - 2)

But what's interesting is that we could do this better with tail recursion, using an inner auxiliary function such as this:

let fib n =
  let rec fib_acc a b x =
    if x <= 1 then b
    else fib_acc b (a + b) (x - 1)
  fib_acc 0 1 n
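For comparison, here is a C# sketch of the same accumulator idea, written as a loop because C# offers no guarantee about tail calls. FibonacciSketch is a hypothetical name; a and b play the same roles as in fib_acc. Where the binary version recomputes the same values over and over, this one does a single pass:

```csharp
// Hypothetical class name; the loop mirrors fib_acc a b x from the F# sample.
public static class FibonacciSketch
{
    public static long Fib(int n)
    {
        long a = 0, b = 1;              // two consecutive Fibonacci numbers
        for (int x = n; x > 1; x--)     // stops where fib_acc returns b
        {
            long next = a + b;
            a = b;
            b = next;
        }
        return b;
    }
}
```

With these starting values Fib(1) and Fib(2) are both 1, matching the binary version above.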

But, back to the point, here's another use of binary recursion to print out the values in a tree.

type Tree<'a> = | Leaf of 'a | Node of Tree<'a> * Tree<'a>

let rec printBinaryTreeValues t =
     match t with
     | Leaf x -> printfn "%A" x
     | Node (l, r) ->
         printBinaryTreeValues l // first recursive call
         printBinaryTreeValues r // second recursive call
 
printBinaryTreeValues (Node ((Node (Leaf "jeden", Leaf "dwa")), (Node (Leaf "trzy", Leaf "cztery"))))

As you can see from this example, I printed out the values from the binary tree, leaf by leaf using binary recursion. 
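The same traversal can be sketched in C#. Since C# has no discriminated unions, this version assumes a small hypothetical class hierarchy (Tree, Leaf, Node) standing in for the F# type, and it collects the leaf values into a list instead of printing them so the result is easy to inspect:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the F# Tree<'a> union.
public abstract class Tree<T> { }

public sealed class Leaf<T> : Tree<T>
{
    public readonly T Value;
    public Leaf(T value) { Value = value; }
}

public sealed class Node<T> : Tree<T>
{
    public readonly Tree<T> Left;
    public readonly Tree<T> Right;
    public Node(Tree<T> left, Tree<T> right) { Left = left; Right = right; }
}

public static class TreeSketch
{
    // Visits the leaves left to right, appending each value to 'output'.
    public static void Collect<T>(Tree<T> tree, List<T> output)
    {
        var leaf = tree as Leaf<T>;
        if (leaf != null)
        {
            output.Add(leaf.Value);   // termination condition: a leaf
            return;
        }
        var node = (Node<T>)tree;
        Collect(node.Left, output);   // first recursive call
        Collect(node.Right, output);  // second recursive call
    }
}
```

Neither recursive call is in tail position here, so the stack depth follows the height of the tree.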

Wrapping It Up

As you can see here, I've covered a bit more ground on recursive algorithms.  There is still more to be covered, including processing of lists, unbalanced trees, continuations and so on.  More on that soon!  In the meantime, I hope you rediscover recursion and what it can do for you in terms of condensing code.  But it's also important to see where it makes sense and where it doesn't.

I'm opening a new Palermo blog - subscribe now!

That's right, I'm opening a second blog.  I'm also keeping this one.  So, here's the rundown. 



