Monday, January 16, 2012

Measuring Cyclomatic Complexity

This post is about the difference between the way Visual Studio 2010 calculates Cyclomatic Complexity and the way NDepend does it. The two tools report different values for the same metric, which is one of the standard code quality metrics. I will share my experience of using both tools.

If you have followed my previous post, you will have realised that I like to pay a bit more attention to code quality. In my experience, good quality code is not just code that is easy to read and self-explanatory, but also code that is structured correctly. I once worked on a project that had been in development and enhancement for more than 8 years. You can imagine how difficult it is to maintain such a codebase. I would say our team was fortunate to have some of the best developers, who ensured that the code was of a high standard. My definition of quality code is code that is easy to maintain over a long period of time.

On a recent project our team was using Cyclomatic Complexity to ensure that the code did not become too complex to maintain. We had defined thresholds for the complexity value. After running Visual Studio Code Analysis, we found some methods that breached those thresholds. We discussed this within the team and found that the developers who wrote these methods had used NDepend to measure Cyclomatic Complexity. According to the NDepend analysis, the values were within the defined range. So why do tools that measure the same metric (Cyclomatic Complexity) report different values for the same piece of code?

An Example

Let's look at an example. Let's build a small program which displays a person's details. We have a list of persons and we can filter them based on multiple criteria. The Person class is as simple as it gets and has properties for storing FirstName, LastName and Age. We have a helper or utility class which creates a static list of persons. We can name this PersonDataFactory.

    internal static class PersonDataFactory
    {
        public static IList<Person> CreatePersons()
        {
            return new List<Person>
                {
                    new Person { FirstName = "James", LastName = "Bond", Age = 50 },
                    new Person { FirstName = "Harry", LastName = "Potter", Age = 20 },
                    new Person { FirstName = "Bill", LastName = "Gates", Age = 70 },
                };
        }
    }
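The Person class itself is not listed in the post. A minimal sketch, assuming it holds nothing more than the three properties described above as auto-properties, could look like this:

```csharp
// Hypothetical sketch: the original Person class is not shown in the post.
// It is assumed here to be a plain class with three auto-properties.
public class Person
{
    public string FirstName { get; set; }

    public string LastName { get; set; }

    public int Age { get; set; }
}
```

With a class of this shape, the object initializers used in PersonDataFactory compile as written.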

In our main program we do the magic of filtering and displaying the person details.

        public static void Main(string[] args)
        {
            IList<Person> persons = PersonDataFactory.CreatePersons();

            WriteDetails(persons.Where(x => x.Age > 35));
        }

        private static void WriteDetails(IEnumerable<Person> persons)
        {
            foreach (Person person in persons)
            {
                Console.WriteLine("Name : {0} {1}", person.FirstName, person.LastName);
                Console.WriteLine("Age : {0}", person.Age);
                Console.WriteLine();
            }
        }

In the above code, we use a lambda to filter the persons above the age of 35. In the WriteDetails method we just enumerate the persons and print their details to the console. At this point let's run the Visual Studio 2010 Code Metrics and see the results.

image

At the level of the Program class we get a Cyclomatic Complexity (CC) of 7. The Main method has a CC of 3 and the WriteDetails method also has a CC of 3. Let's run the NDepend analysis tool on the same code and compare the results.

image

At the class level, NDepend shows a CC of 4, with Main at 1 and WriteDetails at 2.

image

image

As we can see, Visual Studio and NDepend differ by 3 points at the class level, and they differ at the method level as well. Which of the two is correct in this case? We can’t really judge this without understanding how the two tools calculate the complexity.

How is Cyclomatic Complexity Calculated?

Let's look at how Visual Studio 2010 calculates it first. The MSDN article on code metrics values defines it as follows:

  • Cyclomatic Complexity – Measures the structural complexity of the code. It is created by calculating the number of different code paths in the flow of the program. A program that has complex control flow will require more tests to achieve good code coverage and will be less maintainable.

Here is another Code Project article which describes in more detail which factors make up the measure of Cyclomatic Complexity. Reading it through, we learn that the method itself starts at 1 and every decision point such as if, for, while etc. adds to the complexity.

Now let's look at how NDepend calculates the same metric. NDepend’s definition of Cyclomatic Complexity states how it is calculated:

Concretely, in C# the CC of a method is 1 + {the number of following expressions found in the body of the method}:
if | while | for | foreach | case | default | continue | goto | && | || | catch | ternary operator ?: | ??
Following expressions are not counted for CC computation:
else | do | switch | try | using | throw | finally | return | object creation | method call | field access
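
To make the rule above concrete, here is a rough sketch of a counter that applies it to a method body held as a string. It is only a text-based approximation for illustration: NaiveComplexityCounter and its Count method are made-up names, this is not how NDepend works internally, and it will miscount keywords that appear inside string literals or comments, as well as a ? used in a nullable type.

```csharp
using System.Text.RegularExpressions;

// Illustration only: a naive, text-based approximation of the counting
// rule quoted above. The class and method names are hypothetical and
// this is not NDepend's implementation.
public static class NaiveComplexityCounter
{
    // Each match adds one point. "??" is listed before "?" so that the
    // null-coalescing operator is counted once, not as two ternaries.
    private static readonly Regex DecisionPoints = new Regex(
        @"\b(if|while|foreach|for|case|default|continue|goto|catch)\b|&&|\|\||\?\?|\?");

    // CC = 1 (for the method itself) + the number of decision points found.
    public static int Count(string methodBody)
    {
        return 1 + DecisionPoints.Matches(methodBody).Count;
    }
}
```

Fed the body of the first WriteDetails method, which contains a single foreach and nothing else from the list, Count returns 2, the value NDepend reported for that method.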

NDepend also goes into a little detail about the things which do not contribute to the complexity. But in general both tools use similar methods to calculate the complexity. So let's add a few of these decision points to our method and see what impact they have.

We refactor the WriteDetails method to add a message if no persons are found after filtering. Here is the refactored code which displays a different message if the count of elements is zero.

        private static void WriteDetails(IEnumerable<Person> persons)
        {
            if (persons.Any())
            {
                foreach (Person person in persons)
                {
                    Console.WriteLine("Name : {0} {1}", person.FirstName, person.LastName);
                    Console.WriteLine("Age : {0}", person.Age);
                    Console.WriteLine();
                }
            }
            else
            {
                Console.WriteLine("No matching records found.");
            }
        }

Let's run the analysis again and compare the values. This is the Visual Studio 2010 output:

image

The Cyclomatic Complexity for WriteDetails method is re-evaluated to 4 points.

And this is NDepend output

image

NDepend re-evaluates the complexity to 3 points.

So adding the if condition added 1 point in both Visual Studio and NDepend. Both tools are therefore consistent when it comes to evaluating the complexity of an if condition. That still leaves us with the initial question of why the values differed before we added the if condition. NDepend is ignoring something which Visual Studio takes into account in its calculation. So let's refactor the WriteDetails method a little more and see the impact.

I extracted the 3 lines of code that write the details into a separate method and compared the results again.

        private static void WriteDetails(IEnumerable<Person> persons)
        {
            if (persons.Any())
            {
                foreach (Person person in persons)
                {
                    WritePersonDetails(person);
                }
            }
            else
            {
                Console.WriteLine("No matching records found.");
            }
        }
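
The extracted WritePersonDetails method is not listed in the post. Presumably it contains just the three Console.WriteLine calls moved out of the loop; a sketch under that assumption follows. The Person class and the PersonWriter container are stand-ins added only so that the snippet compiles on its own.

```csharp
using System;

// Hypothetical stand-in for the Person class described earlier in the post.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

// Hypothetical container; in the post the method would live in Program.
public static class PersonWriter
{
    // Assumed body: the three lines that were previously inside the foreach.
    // It contains no decision points, so by either tool's rule it scores 1.
    public static void WritePersonDetails(Person person)
    {
        Console.WriteLine("Name : {0} {1}", person.FirstName, person.LastName);
        Console.WriteLine("Age : {0}", person.Age);
        Console.WriteLine();
    }
}
```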

There is no change in the values. I have not shown the screenshots here, but you can run the analysis again to compare the results. Neither tool counts the method call. Let me now refactor the Main method and add one more condition to the filter criteria. Along with the Age, I also want to filter the persons whose first names begin with the letter “B”. So here is the refactored code:

        public static void Main(string[] args)
        {
            IList<Person> persons = PersonDataFactory.CreatePersons();

            //WriteDetails(persons.Where(x => x.Age > 35));

            WriteDetails(persons.Where(x => x.Age > 35 && x.FirstName.StartsWith("B")));
        }

Running the analysis in Visual Studio now shows a complexity of 4 for the Main method, while NDepend still shows 1 for the same method. This means that Visual Studio takes the lambda expression and the complexity associated with it into account, whereas NDepend does not count it towards the complexity of the method.

image

Look carefully at the above screenshot: a separate method is highlighted. It is named <Main>b__0(Person). This is the anonymous method that the compiler generates automatically for the lambda expression. NDepend calculates its complexity as 2, but does not add it to the complexity of the parent method. This is why Visual Studio reports a different value for the Cyclomatic Complexity.
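
To see where the 2 comes from, it helps to write out by hand roughly what the compiler produces for the lambda. The sketch below is an approximation, not the exact generated code: the real name <Main>b__0 is not a legal C# identifier, so a hypothetical name Main_b__0 is used, and the Person and LambdaLowering classes are stand-ins so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Person class described earlier in the post.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class LambdaLowering
{
    // Hand-written equivalent of the generated method for the lambda.
    // By the NDepend rule: 1 for the method itself + 1 for && = 2.
    public static bool Main_b__0(Person x)
    {
        return x.Age > 35 && x.FirstName.StartsWith("B");
    }

    // Equivalent of persons.Where(x => ...) in Main. The caller now holds
    // no decision keywords or operators of its own, so it scores 1.
    public static IEnumerable<Person> Filter(IEnumerable<Person> persons)
    {
        return persons.Where(new Func<Person, bool>(Main_b__0));
    }
}
```

Seen this way, the && sits in a separate method. NDepend reports that method on its own line, while Visual Studio attributes the lambda's complexity to Main.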

Conclusion

In comparison to NDepend, Visual Studio gives a very limited set of code metrics. NDepend can give up to 82 different code metrics as of this writing. But since Cyclomatic Complexity is so commonly used, it would have been better if the metric were consistent between the tools. Going back to the classic definition of Cyclomatic Complexity, I believe a tool should take into account all the decision points of a method. From the experiments in this post, I can say that Visual Studio does so and NDepend doesn’t. The NDepend value of the Cyclomatic Complexity for a method does not give me the complete picture. In a real-world scenario, it would be really difficult to maintain and monitor all the anonymous methods and delegates used as lambdas. From the point of view of maintainability, I would say that Visual Studio does a decent job. It helps me see the impact of my refactoring on the complexity of a method which uses lambdas. I can’t do that with the same ease using NDepend.

As always when it comes to standards and best practices, people have their own preferences. I leave it to developers to decide whether lambdas should count towards their complexity points or not. Personally, I prefer to include them, because a lambda can contain decision points of its own. Imagine a method with 10-15 lines of code. If those lines consist mostly of complex lambdas, you can easily imagine the actual complexity of the method. As an exercise, try running both Visual Studio and NDepend on the same code. You’ll see the value of what I am suggesting here.

Visual Studio Code Analysis has the limitation that it cannot be run as part of the continuous build. This is where NDepend scores over Visual Studio: it integrates nicely with the automated build process. If that is the reason you or your team use NDepend, you may see a difference between the values reported by Visual Studio and NDepend. You can then choose one of the two as your basis.

Apart from build integration, NDepend can also be run as a separate tool. You can choose to integrate it with the Visual Studio IDE or run it as a standalone application. I personally think that NDepend has lots to offer with its complete set of metrics. It is always better to have multiple options to choose from.

As always the complete working solution is available for download at Dropbox.

Until next time, happy programming!

4 comments:

  1. On a side note, NDepend also offers the IL Cyclomatic Complexity metric, which computes CC even when the source code is not available.

    Also, NDepend lets you write rules that couple CC with, for example, code coverage:

    Here we want to detect complex methods not covered 100% by tests.
    WARNIF Count > 0 IN SELECT METHODS WHERE
    CyclomaticComplexity > 15 AND PercentageCoverage < 100

    or here we want to detect complex methods added or refactored since a particular point in the past (the baseline):
    WARNIF Count > 0 IN SELECT METHODS WHERE
    CyclomaticComplexity > 15 AND
    (WasAdded OR CodeWasChanged)

    Replies
    1. Patrick Smacchia I fully agree with you on the features provided by NDepend and also regarding the IL Cyclomatic Complexity. I have personally used NDepend metrics like "method too complex" and others which are really helpful.

      In the context of calculating Cyclomatic complexity for a method I feel Visual Studio is more relevant.

    2. Anonymous, 4:45 AM

      Hi Nilesh,
      Should we look into Class level complexity or method complexity for any action?

  2. Anonymous, 4:44 AM

    Should we use class complexity or method complexity to analyse the code?

