CodeBetter.Com
CodeBetter.Com
RSS 2.0 via Feedburner
           Do you Twitter? Follow us @CodeBetter

Patrick Smacchia [MVP C#]


Iterators, LINQ deferred execution and prime numbers computation

There is something that I am proud of. In my book Practical .NET2 and C#2 I presented the concept of deferred execution and deferred execution is a central concept in LINQ in-memory querying. The C# team made deferred execution easy to implement thanks to the concept of iterators introduced with C#2.

 

At that time of C#2, 2 years ago, the community was wondering why the hell the C# team invested so much in such a tricky compiler artifact, that is basically only useful to save you a dozen lines of code when you implement your own collection classes. But thanks to the Wesner Moise thoughts in its blog, I realized that iterators are a good way to implement some funny stuff such as some pipeline patterns. This is illustrated by the example 47 of the Chapter 14

 

Here is this example:

 

using System.Collections.Generic;

class Program {

   static public IEnumerable<int> PipelineIntRange(int begin, int end) {

      System.Diagnostics.Debug.Assert(begin < end);

      for (int i = begin; i <= end; i++) {

         System.Console.WriteLine("Yield:" + i);

         yield return i;

      }

   }

   static public IEnumerable<int> PipelineMultiply(int factor,

                                               IEnumerable<int> input) {

      foreach (int i in input)

         yield return i * factor;

   }

   static public IEnumerable<int> PipelineFilterModulo(int modulo,

                                               IEnumerable<int> input) {

      foreach (int i in input)

         if (i % modulo == 0)

            yield return i;

   }

   static public IEnumerable<int> PipelineJoin(IEnumerable<int> input1,

                                               IEnumerable<int> input2) {

      foreach (int i in input1)

         yield return i;

      foreach (int i in input2)

         yield return i;

   }

   static void Main() {

      IEnumerable<int> pipeline = PipelineJoin(

         PipelineIntRange(-4, -2), PipelineFilterModulo(3,

                                      PipelineMultiply(2,

                                      PipelineIntRange(1, 10))));

      System.Console.WriteLine("Begin foreach:" + i);

      foreach (int i in pipeline)

         System.Console.WriteLine(i);

   }

}

 

This program outputs:

 

Begin foreach:


Yield:1
Yield:2
Yield:3
6
Yield:4
Yield:5
Yield:6
12
Yield:7
Yield:8
Yield:9
18
Yield:10
Yield:-4

-4
Yield:-3
-3
Yield:-2
-2 

 

Here is a small explanation why the pipeline output the sequence  6 12 18 -4 -3 -2:

 

PipelineIntRange(1,10)    1  2  3  4  5  6  7  8  9  10

PipelineMultiply(2)       2  4  6  8  10 12 14 16 18 20

PipelineFilterModulo(3)         6        12       18

PipelineIntRange(-4, -2)                                -4 -3 -2  

PipelineJoin                    6        12       18    -4 -3 -2

 

What is funny is the way we obtain the sequence. The order the Console.WriteLine() we put into the program are executed shows 2 tricky things:

 

  • First, no computation happen before the foreach() instruction. The pipeline object actually represents a tree of pipeline node objects.

 

  • Second, each integer goes through all chain pipeline nodes (except if a node discard it) before the next integer is treated.

 

This is deferred execution!

 

Now, it is easy to rewrite this example with C#3 extension methods to make the code more LINQ compliant:

 

 

using System.Collections.Generic;

static class Program {

   static public IEnumerable<int> PipelineIntRange(int begin, int end) {

      System.Diagnostics.Debug.Assert(begin < end);

      for (int i = begin; i <= end; i++) {

         System.Console.WriteLine("Yield:" + i);

         yield return i;

      }

   }

   static public IEnumerable<int> PipelineMultiply(

                                               this IEnumerable<int> input,

                                               int factor) {

      foreach (int i in input)

         yield return i * factor;

   }

   static public IEnumerable<int> PipelineFilterModulo(

                                               this IEnumerable<int> input,

                                               int modulo) {

      foreach (int i in input)

         if (i % modulo == 0)

            yield return i;

   }

   static public IEnumerable<int> PipelineJoinWith(

                                               this IEnumerable<int> input1,

                                               IEnumerable<int> input2) {

      foreach (int i in input1)

         yield return i;

      foreach (int i in input2)

         yield return i;

   }

   static void Main() {

      IEnumerable<int> pipeline =

         PipelineIntRange(1, 10)

         .PipelineMultiply(2)

         .PipelineFilterModulo(3)

         .PipelineJoinWith(PipelineIntRange(-4, -2));

 

      foreach (int i in pipeline)

         System.Console.WriteLine(i);

   }

}

 

This time, the way the pipeline object is defined makes it much more look alike a LINQ query.

 

 

 

More fun with prime number computation!

 

The example 50 of the Chapter 14 is even more fun.

 

This time, the deal is to chain iterators to compute prime numbers. This is possible thanks to an easy algorithm to compute prime numbers named Sieve of Erathostène. The following schema shows how the algorithm works:

 

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |

  3   5   7   9    11    13    15    17 | 3 is prime, 

                                        | 4,6,8,10,12,14,16,18 and

                                        | 20 are multiples of 2.

                                        |

      5   7        11    13          17 | 5 is prime, 9 and 21

                                        | are multiples of 3.

                                       

          7        11    13          17 | 7 is prime, the list doesn’t

                                        | contain any multiple of 5.

                                       

                   11    13          17 | 11 is prime, the list doesn’t

                                        | contain any multiple of 7.

                         13          17 | 13 is prime, the list doesn’t

                                        | contain any multiple of 11.

                                       

                                     17 | 17 is prime, the list doesn’t

                                        | contain any multiple of 13.

Here is the code:

 

using System;

using System.Collections.Generic;

class Program {

   static public IEnumerable<int> PipelineIntRange( int begin, int end ) {

      System.Diagnostics.Debug.Assert( begin < end );

      for ( int i = begin; i <= end; i++ )

         yield return i;

   }

   static public IEnumerable<int> PipelinePrime( IEnumerable<int> input ) {

      using ( IEnumerator<int> e = input.GetEnumerator() ) {

         e.MoveNext();

         int prime = e.Current;

         // The first number of the list is a prime.

         Console.WriteLine( prime );

         if ( prime != 0 ) {

            while ( e.MoveNext() ) {

               // Remove all multiple of the found prime.

               if ( e.Current % prime != 0 )

                  yield return e.Current;

            }

         }

      }

   }

   const int N = 100;

   static void Main() {

      // Apply the approximation of Gauss/de la Vallée Poussin

      // to get the number of iterators.

      int N_PRIME = (int)Math.Floor( ((double)N)/Math.Log(N) );

 

      // Build a pipeline of N_PRIME PipelineIntegerRange chained with

      // a PipelineIntegerRange. Each call to PipelinePrime yield an

      // iterator object that we store.

      List<IEnumerable<int>> list = new List<IEnumerable<int>>();

      list.Add(PipelinePrime( PipelineIntRange( 2, N ) ) );

      for( int i=1 ; i<N_PRIME ; i++ )

         list.Add( PipelinePrime(list[i-1]) );

 

      // Cascade the computation among iterators by yielding every

      // numbers between 2 and N.

      foreach ( int i in list[N_PRIME-1] );

   }

}     

 

 

Here also it is possible to make the example more clear thanks to C#3 extension methods. This time we don’t need a list anymore and we are directly chaining filters:

 

using System;

using System.Collections.Generic;

static class Program {

   static public IEnumerable<int> PipelineIntRange( int begin, int end ) {

      System.Diagnostics.Debug.Assert( begin < end );

      for ( int i = begin; i <= end; i++ )

         yield return i;

   }

   static public IEnumerable<int> PipelinePrime(

       this IEnumerable<int> input ) {

      using ( IEnumerator<int> e = input.GetEnumerator() ) {

         e.MoveNext();

         int prime = e.Current;

         // The first number of the list is a prime.

         Console.WriteLine( prime );

         if ( prime != 0 ) {

            while ( e.MoveNext() ) {

               // Remove all multiple of the found prime.

               if ( e.Current % prime != 0 )

                  yield return e.Current;

            }

         }

      }

   }

   const int N = 100;

   static void Main() {

      // Apply the approximation of Gauss/de la Vallée Poussin

      // to get the number of iterators.

      int N_PRIME = (int)Math.Floor( ((double)N)/Math.Log(N) );

 

      // Build a pipeline of N_PRIME PipelineIntegerRange chained with

      // a PipelineIntegerRange. Each call to PipelinePrime yield an

      // iterator object that we store.

      IEnumerable<int> pipeline = PipelineIntRange(2,N);

 

      for( int i=1 ; i<N_PRIME ; i++ )

         pipeline = pipeline.PipelinePrime();

 

      // Cascade the computation among iterators by yielding every

      // numbers between 2 and N.

      foreach ( int i in pipeline);

   }

}     

 

This example is a good start for understanding how to build LINQ queries dynamically.

 

In case you want to dig more in iterators under the hood, the entire section of my book I wrote about iterators is available as an article here. I explain what the C# compiler is doing under the hood to mimic the notion of co-routine.  There is also another tricky example that shows how iterators can be used to model the notion of continuation.

 

Apart dealing with NDepend development I enjoy being a trainer on .NET, C# and also doin design and quality consultancy with NDepend. It is an excellent way to keep contact with real developers that have real problems that can be solved with tools such as NDepend. I am working closely with a bank with thousands of developers that is migrating most of its legacy to .NET. Hence, students I cop with are motivated developers with a solid background in C++ and Java. Thanks to LINQ I am now happy to show that things like iterators, co-routine or closures and more are not just advanced geeky things but are in fact a dramatic weapon to make code more expressive. This makes the Java guys, generally reluctant to learning .NET, more open minded about what a multi-language platform bring to the mainstream development. Some could say that Ruby was the initial starter but nobody can deny that LINQ is really really elegant.

 

Also I would like to precise that I had the chance to proof-read a great book about LINQ  (LINQ in Action, Manning Publications) co-written by my friend Fabrice Marguerie. I plan to review the book soon. For now I can’t say much more than Go buy it! Fabrice and its 2 colleagues just made an awesome work at explaining new C#3 stuff, LINQ concepts, LINQ use cases (in-memory querying, LINQ2SQL, LINQ2XML…), LINQ under the hood tricky things, the multiple ways to extend LINQ for your own needs and best practices to use LINQ.

 

 



Comments

DotNetKicks.com said:

You've been kicked (a good thing) - Trackback from DotNetKicks.com

# February 22, 2008 5:23 PM

Fabrice said:

Thanks for the reference and for the work you did during proof-reading.

I'd like to point out that the chapter that covers deferred query execution is available as a free download. You can find it at http://manning.com/marguerie where you'll be able to learn more about the book and find two more free chapters!

# February 22, 2008 5:29 PM

Christopher Steen said:

ASP.NET Group Enabled Web Form Control Extensions [Via: andrewrea ] Sharepoint An easier way to Create...

# February 23, 2008 12:32 AM

Christopher Steen said:

Link Listing - February 22, 2008

# February 23, 2008 12:32 AM

Daily Bits - February 23, 2008 | Alvin Ashcraft's Daily Geek Bits said:

Pingback from  Daily Bits - February 23, 2008 | Alvin Ashcraft's Daily Geek Bits

# February 23, 2008 12:06 PM

Top Bookstores Online said:

It can many times become difficile to sort the insightful black book stores evidence from the inadequate.

# April 14, 2008 11:49 AM

Voyforums work at home moms. said:

Soy tarts work at home moms. Moms work from home. Wahm com the online magazine for work at home moms.

# May 20, 2008 12:15 PM

subhash said:

Dear

  sir  i want to become a programer with  the help of you. thanks sir

                             subhash

# September 8, 2008 1:43 AM

Leave a Comment

(required)  
(optional)
(required)  

Enter the numbers above:
Add

About Patrick Smacchia

Patrick Smacchia is a Visual C# MVP involved in software development for over 15 years. After graduating in mathematics and computer science, he has worked on software in a variety of fields including stock exchange, airline ticket reservation system as well as a satellite base station at Alcatel. He's currently a software consultant and trainer on .NET technologies as well as the lead developer of the tool NDepend which provides numerous metrics and caveats on any compiled .NET application. He is the author of Practical .NET2 and C#2, a .NET book conceived from real world experience with 647 compilable code listings. Check out Devlicio.us!

Our Sponsors