CodeBetter.Com

Patrick Smacchia [MVP C#]

July 2007 - Posts

  • I want non-nullable types in C#4

     
    There is no doubt that the nullable types of C#2 are a cool feature. However, I regret that C# doesn't support the other half of the paradigm: non-nullable types.

     

    In the same way that nullable types allow null values for value types, non-nullable types forbid null references for reference types. In the following example, the method AcceptNonNullString(string!) takes a non-nullable string as a parameter.

     

    static void AcceptNullString(string s) {

       AcceptNonNullString(null); // <- Here the compiler emits an error.

       AcceptNonNullString(s);    // <- Here the compiler emits an error

                                  //    because 's' might be null.

    }

     

    static void AcceptNonNullString(string! s) {

       // Here we have the guarantee that the reference 's' is not null.

       int length = s.Length;

    }

     

    The ! syntax comes from an extension of C# named Spec#. I think it is an elegant and concise syntax that gets rid of a problem programmers face every day: NullReferenceException. This power comes at a cost. There are some cases where the compiler would have to be tweaked in order to enforce the non-nullable condition, for example:

    • Non-nullable instance fields are a problem to check:

          public class Foo {

              string! m_String;

              public Foo(string! s) {

                  // Here the compiler must understand that m_String

                  // hasn't been assigned yet and is null.

                  int length = m_String.Length;

                  m_String = s;

              }

          }

    • Non-nullable static fields are a problem to check:

          public class Foo {

              static string! s_String;

              static Foo() {

                  // Here the compiler must understand that s_String

                  // hasn't been assigned yet and is null.

                  int length = s_String.Length;

                  s_String = "hello";

              }

          }

    • Arrays of non-nullable elements are a problem to check:

          void Method() {

              string![] array = new string![2];

              // Here array[0] and array[1] are null references.

              array[0] = "hello";

              array[1] = "hello";

          }

     

    More information on these problems can be found in these excellent blog posts [1] [2] [3] [4] by Cyrus Najmabadi, a software design engineer on the C# team.

     

    During the last MVP summit in March 2007, I had the chance to talk about non-nullable types with the C# team. It seems that the biggest hindrance to the adoption of non-nullable types is that the feature is so powerful that it would disturb programmers’ habits and existing code bases. Indeed, we agreed that something like 70% of the references in C# programs are likely to end up as non-nullable ones. This underlines the fact that null references are the exception and not the rule. Null references are a trick inherited from the good old days of C++. Nowadays, programmers are used to relying on null references to avoid writing too much code for simple cases such as an uninitialized field or an optional parameter. The NullReferenceException problem is a high price to pay for this facility.

     

    In this context, adding the ! syntax to C# is a bit awkward since the vast majority of references would use this extra syntax. Personally, I could live with that. For now, we don’t have a choice: we spend our time inserting numerous non-null checks and asserts in our methods, and we pray that it is enough to spare our users the pesky NullReferenceException experience. It is never too late to do things well, and non-nullable types should be added to C#4 (it is clearly too late to add such a feature to C#3).
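
    Without such a feature, the non-null contract has to be enforced by hand at run time. Here is a minimal sketch of the kind of check mentioned above. The names Guard, NotNull and Sample are hypothetical helpers written for this illustration; they are not part of the .NET Framework.

```csharp
using System;

public static class Guard {
   // Hypothetical helper: fails early and names the faulty parameter,
   // instead of letting a NullReferenceException surface later.
   public static T NotNull<T>(T value, string paramName) where T : class {
      if (value == null) { throw new ArgumentNullException(paramName); }
      return value;
   }
}

public static class Sample {
   public static int AcceptNonNullString(string s) {
      Guard.NotNull(s, "s");
      // From here 's' is known to be non-null, but only at run time:
      // the compiler still accepts a call to AcceptNonNullString(null).
      return s.Length;
   }
}
```

    The check documents the intention and fails close to the faulty caller, but unlike string! it is only verified when the code actually runs.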

     

  • Code defensively: Continuously check for corrupted installation

    A common problem when an application goes to production comes from deployment issues. Concretely, your client’s admin, your installation script or any other actor that can access the production machine can potentially mess up and corrupt the installation. If you are lucky, your code will crash by raising an explicit exception such as FileNotFoundException (because some assemblies are missing or have the wrong version) or MissingMethodException (because there is a versioning issue). If you are unlucky, your code will crash because of a versioning issue with a random exception that has nothing to do with deployment. In that case, you are unlikely to think of checking for a deployment versioning issue, and you will spend precious time in vain trying to reproduce a bug that doesn’t exist.

     

    The first defensive step to prevent such problems is to sign your assemblies. When an assembly is signed, it gets a strong name that contains the version of the assembly. The version of an assembly is set by tagging your assembly with the System.Reflection.AssemblyVersionAttribute, such as:

     

    [assembly: AssemblyVersion("2.3.0.1092")]

     

    Let’s say that A is a signed and versioned assembly. If the assembly B references A, the strong name of A is hard-coded in the assembly manifest of B (no matter whether B is signed or not). At runtime, when the execution of B needs A, the CLR will look for A taking the strong name of A into account. In other words, if B references version 1 of A, the CLR won’t load any version of A other than version 1, even if version 2 of A can be found. If version 1 of A cannot be found, the CLR will then raise a FileNotFoundException. This safe behavior strongly argues for signing your assemblies to detect a corrupted installation. Unfortunately, I often see teams signing only the assemblies that must be installed in the GAC, because the GAC doesn’t accept unsigned assemblies.
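
    To see what is actually recorded in a manifest, the references of an assembly can be listed with System.Reflection. This is a small sketch for illustration; the class name ManifestReferences and its method Describe are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ManifestReferences {
   // Returns the display names of the assemblies referenced by 'assembly',
   // as they are recorded in its manifest.
   public static List<string> Describe(Assembly assembly) {
      List<string> lines = new List<string>();
      foreach (AssemblyName reference in assembly.GetReferencedAssemblies()) {
         // FullName includes the simple name, Version, Culture and PublicKeyToken.
         lines.Add(reference.FullName);
      }
      return lines;
   }
}
```

    Calling ManifestReferences.Describe(Assembly.GetExecutingAssembly()) and printing each line shows the version of every referenced assembly that the CLR will ask for at runtime.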

     

    While developing and supporting NDepend, we learned the hard way that this cool CLR behavior is not enough to thoroughly anticipate the problem of corrupted installations.

    • First, not all of our executable assemblies use all of our library assemblies, meaning that a corrupted installation can run seamlessly as long as the wrongly versioned library assemblies are not involved.
    • Second, as the CLR loads referenced assemblies on demand, the FileNotFoundException can be raised at any time. Because of Murphy’s law, any time often means the moment when it will disturb the user the most, provoking a loss of data.
    • Third, our installation is spread over 2 folders, one for the executable assemblies and one for the library assemblies. It allows our users to focus only on the executable assemblies. We implemented a mechanism with the AppDomain.AssemblyResolve event and the Assembly.LoadFrom() method to manually load our library assemblies. The method Assembly.LoadFrom() takes a file path as a parameter and not a strong name. Thus, a wrongly versioned library assembly could be loaded even if it is signed.
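
    The mechanism of the third point can be sketched as follows. This is not the NDepend code: the class LibraryResolver, its methods and the libFolder parameter are hypothetical, and the sketch assumes that every library is a .dll file named after its assembly.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LibraryResolver {
   // Maps a requested assembly display name to a file path in 'libFolder'.
   public static string GetCandidatePath(string libFolder, string assemblyFullName) {
      string simpleName = new AssemblyName(assemblyFullName).Name;
      return Path.Combine(libFolder, simpleName + ".dll");
   }

   // Registers a handler that is called when the CLR fails to resolve an assembly.
   public static void Register(string libFolder) {
      AppDomain.CurrentDomain.AssemblyResolve += delegate(object sender, ResolveEventArgs args) {
         string path = GetCandidatePath(libFolder, args.Name);
         // LoadFrom() only looks at the path: the version and the public key
         // token present in args.Name are not compared with the file here.
         return File.Exists(path) ? Assembly.LoadFrom(path) : null;
      };
   }
}
```

    Only the simple name survives the mapping to a file path, which is why a library with the wrong version can slip through.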

    We then wrote some code that checks that all assemblies are present in their respective folders with the correct version. This check is triggered every time an executable assembly is started. If the check fails, a popup window appears, describing the corrupted installation problem in plain English and advising the user to download the latest available version.
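
    Such a check could look like the sketch below. It is an illustration under assumptions, not the NDepend implementation: the class InstallationChecker is hypothetical, and the way a version is read from a file is left to the caller through the readVersion delegate.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class InstallationChecker {
   // 'expected' maps a file path relative to 'root' to the version that must be installed.
   // 'readVersion' extracts the version of an assembly file; it is a parameter so that
   // any manifest reader can be plugged in.
   public static List<string> FindProblems(
         string root,
         IDictionary<string, Version> expected,
         Func<string, Version> readVersion) {
      List<string> problems = new List<string>();
      foreach (KeyValuePair<string, Version> pair in expected) {
         string path = Path.Combine(root, pair.Key);
         if (!File.Exists(path)) {
            problems.Add(pair.Key + " is missing");
            continue;
         }
         Version actual = readVersion(path);
         if (!pair.Value.Equals(actual)) {
            problems.Add(pair.Key + " has version " + actual + ", expected " + pair.Value);
         }
      }
      return problems;
   }
}
```

    An empty list means the installation looks sane; otherwise each entry can be shown to the user as is.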

     

    We didn’t use System.Reflection to check the assemblies’ versions because System.Reflection forces us to load the entire assembly in memory in order to get its version, and once loaded into an AppDomain, an assembly cannot be unloaded. That’s quite a high price to pay just to get the version, especially taking into account the fact that not all exes depend on all assemblies. Instead, we rely on the open-source Mono.Cecil framework (developed by Jb Evain). Here is the piece of code that just loads an assembly manifest and gets its version:

     

     

    Mono.Cecil.AssemblyDefinition assemblyCecil = Mono.Cecil.AssemblyFactory.GetAssemblyManifest(@"C:\MyDeploymentPath\MyAssembly.dll");

    System.Version version = assemblyCecil.Name.Version;

     

     

    Internally, this code triggers the load of the entire image of each of the assembly’s modules in memory, because Cecil checks the assembly for correctness. However, there is no JIT compilation and there are no fusion checks. More importantly, the raw data of the module images is garbage collected once the corresponding AssemblyDefinition object gets garbage collected. In practice, the loading time remains acceptable and there is no wasted memory.

     

    As a bonus, we also continuously check that the assembly strong names hard-coded in the assembly manifests are correct. Indeed, we also experienced a buggy build process that referenced the wrong version of referenced assemblies (like the Professional version referencing the Community version, sound familiar?). Here is the code skeleton to get assembly references:

     

    Mono.Cecil.AssemblyNameReferenceCollection assembliesReferenced =

                assemblyCecil.MainModule.AssemblyReferences;

     

    foreach (Mono.Cecil.AssemblyNameReference assemblyNameReference in assembliesReferenced) {             

       System.Version version = assemblyNameReference.Version;

       // ...

    }

  • An unexpected CLR behavior: Loading the same assembly twice in an AppDomain

    While developing a feature of NDepend, we found out that the CLR can load the same assembly twice in an AppDomain.

     

    The feature consists of including CQL constraints directly in the source code thanks to the attribute NDepend.CQL.CQLConstraintAttribute found in NDepend.CQL.dll. This possibility is a self-describing way to express architectural intentions and to make sure that your design remains clean. For example, in the following piece of code, we make sure that we will be warned if the class NamedPipeHelper is used outside the classes ServerHost and ClientBase:


    namespace NDepend.AddIn.Common.InterProcessCommunication {

       [NDepend.CQL.CQLConstraint(@"// <Name>NamedPipeHelper restricted use</Name>

    WARN IF Count > 0 IN SELECT TYPES WHERE

    IsDirectlyUsing ""NDepend.AddIn.Common.InterProcessCommunication.NamedPipeHelper""

    AND

    !FullNameIs ""NDepend.AddIn.Common.InterProcessCommunication.ServerHost""

    AND

    !FullNameIs ""NDepend.AddIn.Common.InterProcessCommunication.ClientBase"" ")]

       public static class NamedPipeHelper {  }

    }

     

    We use both System.Reflection and Mono.Cecil to analyze assemblies. Here is the code we used to extract the CQLConstraint attributes of the type referenced by the variable typeReflection and then to get an XML representation of the constraints:


    System.Type typeReflection = null; // placeholder: stands for a type of an analyzed assembly

    object[] listOfCQLConstraints = typeReflection.GetCustomAttributes(typeof(NDepend.CQL.CQLConstraintAttribute), false);

    string stringXml = (listOfCQLConstraints[0] as NDepend.CQL.CQLConstraintAttribute).ToXml();

    // … here use stringXml

     

    This code can lead to loading the assembly NDepend.CQL.dll twice in the current AppDomain. Indeed, the JIT compiler triggers the load of $NDependInstallationPath$\Lib\NDepend.CQL.dll the first time it figures out that the type NDepend.CQL.CQLConstraintAttribute is used by a method. Then the call to the method Type.GetCustomAttributes(Type,bool) forces the load of the assembly $assembliesAnalyzedPath$\NDepend.CQL.dll because it is the one that is referenced by the analyzed assembly (the one that contains typeReflection). Here is a screenshot of the corresponding Visual Studio > Debug > Modules window:

     

     

     

    While this behavior seems reasonable because the 2 versions of NDepend.CQL.dll can be different (which is not the case here, by the way), it provokes a subtle bug in our code. There are 2 versions of the type CQLConstraintAttribute living in the AppDomain, and the version we pass to the method Type.GetCustomAttributes(Type,bool) is not the same as the version that has been used to tag the types of the analyzed assemblies. Hence, the method Type.GetCustomAttributes(Type,bool) won’t return any CQLConstraintAttribute object.

     

    Fortunately, we were able to correct this bug by using reflection. The idea is to fetch all attributes tagging an analyzed type with the method Type.GetCustomAttributes(bool) and then to find which attribute has a type named “NDepend.CQL.CQLConstraintAttribute”. Then we just try to obtain a method named “ToXml” and invoke it. The extra bonus we get with this code is that it works even if the version of NDepend.CQL.dll that is referenced by the analyzed assembly is different from the version of NDepend.CQL.dll loaded from the NDepend installation assemblies.

     

    object[] listOfCustomAttributes = typeReflection.GetCustomAttributes(false);

     

    foreach (object obj in listOfCustomAttributes) {

       Type type = obj.GetType();

       if (type.FullName != "NDepend.CQL.CQLConstraintAttribute") { continue; }

     

       MethodInfo methodToXml = type.GetMethod("ToXml");

       if (methodToXml == null) { continue; }

     

       object objectStringXml = methodToXml.Invoke(obj, new object[] { });

       if (objectStringXml == null) { continue; }

     

       string stringXml = objectStringXml as string;

       if (stringXml == null) { continue; }

       // … here use stringXml

    }

     

     

     

       

  • A simple trick to rationalize your code environment and build process

     

    My consulting job mainly consists of auditing the structure of real-world projects. To properly analyze a third-party code base, I prefer to install and recompile it on my laptop instead of working on a machine that might not have the tools I need installed. I figured out that this first step is a good way to give advice on rationalizing the code environment and build process.

     

    How long does it take to gather the whole code base into a single zip file (including source code, resources, debug/release compiled assemblies, pdb files, test code and resources, factory files, third-party assemblies, xml config files…)? Is there any file residing outside the root path that needs to be manually copied?

     

    What is the weight of the zip file? Unless the project has especially big resources (videos, large bitmaps, sounds…), the size of the zip file should be limited. For example, we spent time rationalizing the NDepend code base. It currently weighs 38MB once zipped for around 44K Lines Of Code. Thus 1 KB per LOC seems to be a decent value, easy to remember. Are there multiple occurrences of the same file that artificially bloat the zip file? For example, using the Copy Local option of Visual Studio for referenced assemblies can lead to a huge zip file since most assemblies are unnecessarily duplicated.
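
    A quick way to spot such duplicated files is to group the files of the code base by name and size. The sketch below is only a heuristic written for this illustration (the class DuplicateFinder is hypothetical): two files with the same name and the same size are likely copies, but their contents are not compared.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DuplicateFinder {
   // Groups the files under 'root' by name and size, and returns
   // only the groups that contain more than one file.
   public static Dictionary<string, List<string>> Find(string root) {
      Dictionary<string, List<string>> groups = new Dictionary<string, List<string>>();
      foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories)) {
         FileInfo info = new FileInfo(path);
         string key = info.Name + "|" + info.Length;
         List<string> paths;
         if (!groups.TryGetValue(key, out paths)) {
            paths = new List<string>();
            groups.Add(key, paths);
         }
         paths.Add(path);
      }
      Dictionary<string, List<string>> duplicates = new Dictionary<string, List<string>>();
      foreach (KeyValuePair<string, List<string>> pair in groups) {
         if (pair.Value.Count > 1) { duplicates.Add(pair.Key, pair.Value); }
      }
      return duplicates;
   }
}
```

    Running it on the unzipped root folder typically points at the bin folders filled by the Copy Local option.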

     

    How long does it take to successfully recompile the code base on my machine once the files are unzipped? This duration can be quite long (>1h) if I encounter unexpected problems to solve manually, such as hard-coded absolute paths that need to be relative; broken Visual Studio assembly/project references; missing third-party assemblies or resources; outdated (or absent) build scripts; build steps that require manual work; build steps that rely on third-party tools that need to be installed (obfuscator, code generator…); build steps that require admin rights; build tools that require Visual Studio or another special process to be running; build scripts that don’t immediately check whether all output files can be created, deleted or overwritten; special build tasks such as delay-signing that require external resources…

     

    The icing on the cake would be to readily run all automatic tests (or even better, to run them automatically after the build) and get a comprehensive report on the number of tests passed and the code coverage.

     

    We found out that making sure that this simple test works as seamlessly as possible on our code base makes us de facto more productive.

     
