Estimation: who is your enemy?

What is an estimate? For me, an estimate is effort: the amount of work, maybe lines of code or modules. It is never time, and it cannot be converted into time. Why? Because I’m an engineer, and I know that engineering work is largely unpredictable. To anybody working outside software engineering, this statement may sound wrong. Product managers and product owners usually think differently: for them, an estimate means time and money.

IMHO: if that were true and engineering work could be estimated reliably, we would still be using Waterfall.

Yes, there are estimation best practices that help to map work onto time. Everyone working in Scrum knows the magic word “velocity”, but how many of us really measure it? My guess is 10% or even less. Why? Maybe because of the growing complexity of projects, cross-functional and overseas teams, or a lack of tools and knowledge. It becomes messy and hard to measure, and harder still to adapt the process to the data you collect. You will probably need someone like a Scrum master working on it full time, and even that doesn’t guarantee results.
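To make the velocity idea concrete, here is a minimal Python sketch of how work is usually mapped onto sprints. The sprint history and backlog size are invented numbers, and the function names are mine, not part of any Scrum tool.

```python
from math import ceil
from statistics import mean


def velocity(completed_points):
    """Average number of story points the team finished per sprint."""
    return mean(completed_points)


def sprints_needed(backlog_points, completed_points):
    """Round up: a partially used sprint still occupies the calendar."""
    return ceil(backlog_points / velocity(completed_points))


# Hypothetical history: points finished in the last four sprints.
history = [21, 18, 24, 17]

print("velocity:", velocity(history))
print("sprints for a 40-point backlog:", sprints_needed(40, history))
```

Note how much the result depends on the history being measured honestly: without those numbers, the "2 sprints" answer is only a guess.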

Moreover, many companies no longer use Scrum, at least not in its canonical form. For example, you can check this podcast with Ryan Singer of Basecamp (spoiler alert: 6-week cycles. Cycle != Sprint).

I’ll describe the most common case, one that has probably happened to all of us. We receive a new specification, review it together with all stakeholders, clarify all questions, map the dependencies (backend, frontend, QA), and get a commitment from all teams. Finally, we proceed with estimation: we break the specification down into tasks, play planning poker, and estimate the effort using a three-point estimation technique, maybe even PERT/CPM. Mapping the result onto team velocity gives us 2 sprints of work. After that, the PMO creates a beautiful Gantt chart, discusses possible risks with the team, and adds a safety buffer if required. As a result, the PM gets a specific deadline: 3 sprints. What is interesting is that even with one extra sprint we may still fail to deliver on time :).

There is definitely something wrong. I decided to investigate what actually causes this gap between estimated and actual results. To paraphrase Uncle Bob: you should not ask an engineer for estimates. My research led me to many possible issues, but I picked a few that, in my opinion, relate to me as a software engineer:

  • Murphy’s law
  • Parkinson’s law
  • Pareto principle
  • Student syndrome
  • Overtime
  • Monkey of responsibility or proper delegation
  • Long-tail time management
  • Lack of leadership

Murphy’s law

Murphy’s law is the first thing you learn in a PM MBA. Literally a law of the universe, it states: “Anything that can go wrong will go wrong”. The latest example from my experience is the coronavirus (COVID-19): before the outbreak, when our Beijing team went on vacation, nobody expected that they would be locked down in their cities for months. This is Murphy’s strike at its best. All our plans went down the drain.

So what can we do about it? We definitely cannot prepare for every possible problem, and we cannot put every possible safety measure in place: it is hard and expensive. What is the chance that a solar flare will destroy the power grid in your subcontractor’s city? Almost zero, so why would you spend money on power generators? One company I worked for had rare but real problems with electricity, so it kept its own generator that could supply power for almost 1000 employees. In theory, we were ready for Murphy’s strike. And when it actually happened, the generator was not working properly.

What would most managers do to avoid Murphy’s strike? Multiply the estimates by X. But if the estimates are heavily padded and you finish your work much earlier than expected, will the teams or engineers who pick up the work after you start earlier? Probably not. They planned their work around the original estimate, and they have their own backlog to finish first… unless they also used Murphy’s estimation.

Parkinson’s law

Parkinson’s law: work expands so as to fill the time available for its completion. Work estimated at one week will take one week even if it could be finished in one day. The most common example is an engineer who finishes early and then spends a few more days on refactoring, improving documentation, and more thorough testing. In other words, the time buffer we added against Murphy’s strike will be used up even when it is not needed.

Pareto principle

Pareto principle, or the 80/20 rule: 80% of the effects come from 20% of the causes. In other words, you will spend 20% of your effort delivering 80% of the functionality, and the other 80% of the effort on small updates that make up the remaining 20%. You can apply this to many areas of engineering; the root cause is probably that we don’t know what is waiting for us until we start working. Say I have written 80% of the code, which provides the major part of the functionality, and then I start dealing with bugs, edge cases, race conditions, deadlocks, and other non-obvious cases. Now I need 80% of the effort while having only 20% of the time left. This usually happens to newcomers, who are not yet aware of all the project’s nuances.

Student syndrome

Student syndrome: maybe not all of you have heard the name, but you have probably already guessed what it means. A student only starts working on an assignment at the last moment before the deadline. Of course, adults are more responsible and will not wait until the very last moment. Still, from my observations, we waste around 10% of the available time before actually starting to work. And that is on top of the time we spend on meetings we don’t need to attend, YouTube, social networks, and arguing about politics.

Overtime

Encouraging overtime is the worst thing a company can do to itself. It creates a shared belief among workers that they can procrastinate during working hours because they can do the work later or take it home. As a result, it kills productivity and motivation during what are potentially the most productive hours. The best thing a good leader can do is turn off the lights in the office after working hours, as described in “The Deadline”. This will not only improve productivity but may also eliminate student syndrome. Some companies in Japan already practice this and turn off the lights after working hours. Similarly, a company in Singapore turns off the lights during lunch hours. Moreover, several experiments suggest that a 6-hour working day can be even more productive than an 8-hour one.

Monkey of responsibility or proper delegation

“Monkey of responsibility”, published in 1974, is still very relevant. The idea is that engineers, often implicitly, hand their troubles over to their teammates or manager. For example, an engineer comes to their team lead with a problem and says: “Hey, I’m blocked by team A. We still haven’t agreed on the API, and I’m still waiting for them to send me the design”. The lead replies: “I’m busy, but I’ll try to check with them tomorrow”. That’s it: the blocker is no longer the engineer’s problem but the team lead’s, and hence the engineer is not responsible for any resulting delay. In my experience, this problem is widespread in companies that practice micromanagement: when somebody is constantly controlling you, you tend to lose the feeling of responsibility. From my perspective, the correct way to handle it is to practice responsibility and discipline. The lead should ask the engineer to take action and loop the lead in for support: set up a meeting, send a follow-up email, etc. Or, at the very least, ask the engineer to come back with the issue later.

Long-tail time management

When you break new features down into tickets and prioritize them, you usually don’t pay much attention to the small tasks. On the way to the release, you also create more and more small tickets: update a string, change the color of a button, etc. Some of them are blocked, some need a small adjustment, and some just have low priority. Whatever the reason, you decide not to work on them right now. The only problem is that in the end you have a long tail of small tasks. We tend to forget that it takes time to raise an MR, run the tests, update the ticket, get approval from the design team, and eventually have it validated by QA. All of that can take more time than the coding itself. A Ukrainian saying goes: “Never put off till tomorrow what can be done today”. I like to put it another way: “Never put off till tomorrow what can be skipped”. Maybe if you don’t need to do it today, you don’t need to do it at all.

Lack of leadership

The same team of software engineers may show very different results under the guidance of different leaders. A leader is a role model, someone who is responsible for what I call team mental health: motivation, sense of purpose, happiness. Back in 1946, Viktor Frankl noted the importance of a high aim in his book “Man’s Search for Meaning”. For him, the aim was the only reason to survive: it helped him get through the concentration camps and eventually write the book.

Now, can you imagine a leader who is capable of setting high aims for the team and showing by his or her own example how to reach them? Such a person existed: Ernest Shackleton. He led an expedition to the South Pole that failed miserably, and yet it is one of the greatest examples of leadership we know. In 1915, Shackleton’s ship got trapped in the ice off the coast of Antarctica, and it took twenty months for the crew of 27 people to get back home. What is amazing is that, compared to other attempts to reach the South Pole at that time, nobody on his team died.

As soon as he understood that they were trapped, he set a new goal: bring everyone back home alive. Although he had a chance to complete his expedition and reach the South Pole, he decided to stay with his team. For long months he kept the team motivated, even though he had serious doubts about the success of the mission (as we know from his diary). He held town halls with the team, talking about the future and discussing the next steps. He also encouraged them to stay active, play sports, and keep doing their daily duties on the ship, which kept them busy.

It is very hard to imagine what these people went through in those long 20 months. I could not imagine anyone ever wanting to go back to that place. But in 1920, he put the call out to his old crew: he wanted to make one more expedition to reach the South Pole. And 12 out of 25 members (those alive at that time) agreed. This is an unbelievable level of loyalty and trust in their leader. They failed again, but that is a whole other story. Years after the first expedition, the BBC interviewed all the survivors. They asked them, “How did you do this?” And all of them said: “the boss”.

A team guided by a motivated leader will accomplish its tasks effectively and on time.

Kyiv-Mohyla Business School (KMBS) has a “Management and Leadership” program, and one of its courses is dedicated to “Leadership in the face of uncertainty”. This part of the program is based on the real-life story of the explorer Ernest Shackleton. I had a chance to attend a 1-day training on this topic, organized by GlobalLogic and held by KMBS, and it is amazing how modern leaders are still learning from this experience.

Summary

With my relatively small experience in leading and managing teams, I haven’t found a single solution to the estimation problem. But I have some rules that I follow with each and every team, whatever role I play there, to help us deliver effectively and on time:

  1. Don’t take the process as given: adapt it to your team’s needs. Scrum is not a panacea, not anymore. Many companies have already managed to build a much better process tailored to their needs.
  2. How you delegate a task matters a lot. For senior engineers, for example, it is crucial to understand why they are doing the task and to see the aim in it, because they value their time. Junior engineers, on the other hand, are usually very motivated, and all they need is proper technical guidance. So make sure you pay attention to each individual’s needs. What’s important for both is clear expectations about the result.
  3. “Seek first to understand, then to be understood” is a golden rule for planning and grooming/refinement sessions. Confidence in an estimate is often built on wrong assumptions caused by a lack of understanding. Listen to your teammates.
  4. Be an example: motivation is contagious.
  5. Give/delegate the best opportunities to your best people; their success will motivate their colleagues.
  6. Spend time with your team. As stated in “Never Eat Alone”: “secret of success = people you meet * what you do together”.
  7. Make meetings productive. Start a meeting with a list of questions and finish it with a list of answers. Keep discussions on topic. Keep everybody involved. During meetings, address questions to people personally, but let decisions be made by the team, not by individuals. Encourage discussion and make it an essential part of every meeting. Be aware of “meeting recovery syndrome”.
  8. Planning poker is a very effective estimation tool, but with a big team it can take an enormous amount of time. In my experience, a team of 12 spent 2 hours per week on poker. I recommend keeping the team size within 3–5 people.
  9. Learn from failures and praise success. All members of the team should get credit for their achievements.

“If you know the enemy and know yourself, you need not fear the result of a hundred battles. If you know yourself but not the enemy, for every victory gained you will also suffer a defeat. If you know neither the enemy nor yourself, you will succumb in every battle.” ― Sun Tzu, The Art of War

PS: for me, the biggest example of an estimation failure in history is the Channel Tunnel, which had an initial budget of £5.5 billion, actually cost £9 billion, and was finished a few years after the initial deadline.

Critical chain

This is an interesting approach, published by Eliyahu M. Goldratt in 1997 in the novel of the same name. I would summarise the book in three statements: a production chain is only as strong as its weakest link; the time buffer should be shared across the production chain (Sprint) rather than added to each link (Story); and the time estimate should be the most likely completion time. The author suggests creating a time buffer that is shared among team members. For example, if a team has a 2-week sprint, you plan tasks for 80% of the sprint time, and the other 20% is shared among team members for the cases when they fail to deliver on time.

Tasks should be estimated with the duration that has the highest chance of being the actual one. For example, if a task will definitely be done in 5 days but, if everything goes fine, will most likely be done in 3 days, then 3 days is the correct estimate.
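The difference between padding every task and sharing one buffer can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the tasks run one after another, each task has an invented pair of (likely, safe) estimates in days, the 20% ratio is the split described above, and the function names are hypothetical.

```python
def padded_plan(tasks):
    """Per-task padding: every task is planned at its 'safe' estimate."""
    return sum(safe for _likely, safe in tasks)


def shared_buffer_plan(tasks, buffer_ratio=0.2):
    """Plan every task at its 'likely' estimate and add one shared buffer.

    Returns (work_days, buffer_days, total_days), where the buffer takes
    `buffer_ratio` of the total plan.
    """
    work = sum(likely for likely, _safe in tasks)
    total = work / (1 - buffer_ratio)
    return work, total - work, total


def buffer_left(tasks, actual_days, buffer_days):
    """Shared buffer remaining after overruns.

    Simplification: an early finish gives nothing back to the buffer.
    """
    overrun = sum(max(0, actual - likely)
                  for (likely, _safe), actual in zip(tasks, actual_days))
    return buffer_days - overrun


# Hypothetical stories: (likely, safe) estimates in days.
tasks = [(3, 5), (2, 4), (3, 5)]

work, buffer_days, total = shared_buffer_plan(tasks)
print("per-task padding:", padded_plan(tasks), "days")
print("shared buffer plan:", work, "days of work +", buffer_days, "days of buffer")
print("buffer left after actuals [3, 3, 4]:", buffer_left(tasks, [3, 3, 4], buffer_days))
```

With these numbers the padded plan commits to 14 days, while the shared-buffer plan commits to about 10, and the buffer is spent only on the tasks that actually slip.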

Production chain: in modern terms, we can map this to reactive streams. Imagine a network request that produces a stream. The stream goes through 6 different layers of logic, each layer adds new processing, and in the end the stream is consumed by the UI. In the second layer, we process responses from the server and update data in the DB. Let’s assume this is the slowest layer, with 2x the latency of any other. This means that, no matter how fast the other layers are, processing is at least as slow as the slowest layer. So you need to put more resources into the slowest layer to balance performance.
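A rough Python model of this bottleneck effect, assuming the six layers work concurrently as a pipeline and that the slow layer's work can be split across several workers. The latencies are invented, and `pipeline_throughput` is a hypothetical helper, not a reactive-streams API.

```python
def pipeline_throughput(latencies_ms, workers=None):
    """Items per second for a pipeline whose stages run concurrently.

    A stage with latency L ms and W workers finishes one item every L / W ms
    on average, so the slowest effective stage caps the whole pipeline.
    """
    workers = workers or [1] * len(latencies_ms)
    effective = [latency / count for latency, count in zip(latencies_ms, workers)]
    return 1000 / max(effective)


# Six layers; the second one (DB update) has 2x the latency of the others.
layers = [10, 20, 10, 10, 10, 10]

baseline = pipeline_throughput(layers)
# Making every other layer twice as fast changes nothing.
faster_others = pipeline_throughput([5, 20, 5, 5, 5, 5])
# Giving the slow layer a second worker removes the bottleneck.
balanced = pipeline_throughput(layers, workers=[1, 2, 1, 1, 1, 1])

print(baseline, faster_others, balanced)
```

The same reasoning applies to a team: speeding up anyone except the slowest link does not make the sprint finish sooner.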

I think this method could really work, but only in outsourcing or hardware production.
