A Scrum take on "Metrics and Analysis in Companies at Different Maturity Levels of the CMMI"

April 23, 2009 Eric Pugh
Category: Conference

Last month Scott, Arin, and I road-tripped to the Richmond SPIN meeting, where Kris Puthucode of the Software Quality Center gave a talk on “Metrics and Analysis in Companies at Different Maturity Levels of the CMMI Model”. The focus of the talk was what results you can expect out of metrics if you do the work of performing the analysis required.

I attended the presentation with a certain sense of trepidation… I consider myself a hard-core developer (despite my “Test Obsessed” armband) who doesn't have time or patience for pointy-headed manager paperwork. But I am also someone who focuses on process improvement and honing my craft of software development, so where can metrics inspired by CMMI be useful to a fast-moving Agile team cranking out functioning software every three weeks? So I listened to Kris and tried to think about what he said in the context of Scrum.

The first slide pointed out that there are three kinds of lies: “lies, damn lies, and statistics”. You need to be careful about what your data says. When you are measuring velocity in Scrum, you need a couple of sprints, at least 3, to have any sense of your progress. If you say “we get X done” based on the first sprint, well, the first one is often a very conservative sprint. And, as you look at your average burndown, you need a few days of information before you can get a sense of it, because in the first days you often find as many tasks as you complete. And, over the 15 days of your iteration, progress can be pretty spiky… Many teams have a pretty flat burndown at the beginning, and then some steep drops… Ideally you are looking to see progress per day become more even and less spiky, which would indicate that your estimating is improving, or that your team is being less affected by external factors that might hamper its productivity.
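To make the “at least 3 sprints” rule of thumb concrete, here is a minimal sketch. The function name, the `MIN_SPRINTS` constant, and the sample numbers are my own illustrations, not anything from Kris's slides.

```python
from statistics import mean

# Rule of thumb from the text: don't quote a velocity until
# at least 3 sprints are finished. The constant name is made up.
MIN_SPRINTS = 3


def velocity(completed_per_sprint):
    """Average completed work per sprint, or None if there is too little history."""
    if len(completed_per_sprint) < MIN_SPRINTS:
        return None  # one or two sprints is not enough to trust
    return mean(completed_per_sprint)


# Hypothetical numbers: a conservative first sprint, then two fuller ones.
print(velocity([18]))          # not enough history yet
print(velocity([18, 26, 25]))  # now there is something to talk about
```

Returning `None` instead of a number is a deliberate choice here: it forces whoever is reading the metric to notice that the data isn't ready yet.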

He talked about whether to use average or median numbers when looking at a series of numbers, such as your burndown. If you have a lot of outliers, then use the median to get a better view; otherwise use the average. So if you burn down 20, 25, 22, 40, 25 then using the average is good. But if you have 8, 20, 25, 22, 40, 12 then maybe the median would be better.
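If you want to check both numbers for yourself, a quick sketch with Python's standard `statistics` module is enough. The two lists are the ones from the paragraph above; the `summarize` helper and the variable names are just mine.

```python
from statistics import mean, median


def summarize(daily_burndown):
    """Return both the average and the median so they can be compared."""
    return {"mean": mean(daily_burndown), "median": median(daily_burndown)}


# The two example series from the text.
first_series = [20, 25, 22, 40, 25]
second_series = [8, 20, 25, 22, 40, 12]

for name, series in (("first", first_series), ("second", second_series)):
    stats = summarize(series)
    gap = abs(stats["mean"] - stats["median"])
    print(f"{name}: mean={stats['mean']:.1f} median={stats['median']:.1f} gap={gap:.1f}")
```

Printing the gap between the two is a cheap way to see how much the outliers are pulling the average around in any given series.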

Kris talked about the cultural challenge of convincing people to provide the data required to build metrics. People need to know why the numbers matter, and even better, why it will help them. It's why developers hate filling out TPS reports! And why I like Scrum and its low overhead, as well as more passive measurement tools like HackyStat and 6thSense (now part of RallyDev). I feel like mandating one metric, say your basic time tracking, is viable, but if you add more you start getting more pushback or gaming of the metrics.

He talked about getting metrics like Project Start Date and Project End Date. A local company splits up its year into 3-week iterations, and then apportions iterations across competing projects. These iterations then feed the project start and end dates.

He stressed that you need to have a shared vision on metrics, and a shared vision of what “on time, in budget, with quality” really means. I know that the other day we had a Scrum team debate how to track found tasks. And this was a set of people who had worked together previously on a different project, yet they had different visions of tracking found tasks and how they should affect the “ideal burndown” line! Periodically revisiting what that shared vision is, and ensuring your team and stakeholders are all on the same page, is very valuable.

He stressed that your metrics need to be actual “measurable” things. You need to be able to quantify the metrics you are using on a shared basis, so that when you compare two things using the same metric you are doing an apples-to-apples comparison, not an apples-to-kiwi comparison! For Scrum, it means you need to use the same time frame for iterations, and you can compare one sprint to another for a specific team on a specific project, but not across teams or projects.

Scrum for us provides a very standardized set of metrics across the OSC organization, regardless of client or specific technology, as long as we are sharing the same vision for our metrics!

When we do a retrospective and look back at our burndowns, we are looking at “Progress Indicators” that are lagging indicators. One of the things the speaker advocated was to look at forward-looking indicators that predict where we will be in the future. But of course, identifying and seeing a leading indicator is difficult, and takes a lot more analysis. In Scrum, we would have to tie our tasks to sprint goals that are appropriately estimated, so that those estimates can provide our velocity.

Kris did highlight that you need multiple projects happening to gather enough variety of data points to compare them. Put just two Scrum teams side by side and it's tough to compare them, because you can't see the outliers. But if you have 10 Scrum teams who have a shared vision of the metrics, then you can start comparing them. You may have to normalize across the teams, but with enough iterations you can compare and predict.

So some things to show would be a histogram of how much the team burns down per day, highlighting what kind of deviation we have in our progress per day. Over multiple sprints you can maybe see what the first third, middle third, and final third look like. Can we characterize it as “At OSC, we typically see this kind of result in the thirds of the project”? Hey, does this feed into our Waterfall projects in 3 weeks?
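Here is a rough sketch of what that histogram and the thirds breakdown could look like. Everything in it is an assumption for illustration: the remaining-hours figures are made up, the bucket width of 5 is arbitrary, and the function names are mine.

```python
from collections import Counter
from statistics import pstdev

BUCKET = 5  # arbitrary bucket width, in hours burned per day


def daily_burn(remaining):
    """Turn end-of-day remaining work into work burned per day.

    A negative value means more work was found than completed that day.
    """
    return [before - after for before, after in zip(remaining, remaining[1:])]


def burn_histogram(burn, bucket_width=BUCKET):
    """Count how many days fall into each bucket of daily burn."""
    return Counter((b // bucket_width) * bucket_width for b in burn)


def burn_by_thirds(burn):
    """Total burn in the first, middle, and final third of the sprint."""
    third = len(burn) // 3
    return [sum(burn[:third]), sum(burn[third:2 * third]), sum(burn[2 * third:])]


# Made-up remaining hours for one 15-day sprint (day 0 through day 15).
remaining = [120, 122, 120, 118, 117, 115, 105, 96, 90, 80, 70, 62, 50, 35, 15, 0]
burn = daily_burn(remaining)

for bucket, days in sorted(burn_histogram(burn).items()):
    print(f"{bucket:>4} to {bucket + BUCKET - 1:>3}: {'#' * days}")

print("deviation per day:", round(pstdev(burn), 1))
print("first / middle / final third:", burn_by_thirds(burn))
```

Run this over several sprints and the thirds start to tell you whether the flat-start, steep-finish shape is a one-off or just how the team works.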

First week is requirements solidification. Second week is development. Third week is testing and polish.

Can we figure out how to predict the results for sprint 3? We could measure sprints 1 and 2 this way and use them to predict sprint 3.

We measure progress per day as a ratio, and then sum it over a week. With that progress per week, we can then see what sprint 3 would do.
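One way to read that idea, sketched in code. This is only my interpretation: I am assuming the “ratio” is each day's burn divided by the total burned in the sprint, that a week is 5 working days, and that sprint 3 follows the average weekly shape of sprints 1 and 2. The numbers and function names are invented.

```python
def weekly_ratios(daily_burn, days_per_week=5):
    """Share of the sprint's total work burned in each week."""
    total = sum(daily_burn)
    ratios = [b / total for b in daily_burn]  # progress per day as a ratio
    return [sum(ratios[i:i + days_per_week])
            for i in range(0, len(ratios), days_per_week)]


def predict_weeks(past_sprints, planned_total):
    """Spread the planned work across weeks using the average past weekly shape."""
    profiles = [weekly_ratios(sprint) for sprint in past_sprints]
    average_profile = [sum(week) / len(week) for week in zip(*profiles)]
    return [share * planned_total for share in average_profile]


# Hypothetical daily burn for two finished 15-day sprints.
sprint_1 = [2] * 5 + [6] * 5 + [2] * 5
sprint_2 = [1] * 5 + [2] * 5 + [1] * 5

# Expected burn per week for a sprint 3 with 100 hours planned.
print(predict_weeks([sprint_1, sprint_2], planned_total=100))
```

With only two sprints behind it this is a guess more than a prediction, which is the same caution about small samples that applies to velocity.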
