Mastering Predictable Feature Delivery on Live Service Games: Part 2

By systematically collecting and analysing relevant metrics, you can gain valuable insights into your process performance and identify areas for improvement.

Dec 09, 2024

Deep Dive podcast created with Notebook LM.

Collecting and analysing the right metrics to measure performance, spot issues, and keep your updates on track.
Learn the how and why of analysing Work in Progress, Cycle Time, Work Item Age and Throughput.
Unlock Cumulative Flow Diagrams to highlight potential bottlenecks and imbalances that could disrupt delivery.

Welcome back to part two!

In part one, we looked at building a structured workflow to improve flow and predictability. By defining start and finish points, mapping workflows, setting WIP limits, establishing exit criteria, adopting pull policies, and managing blockers, you set the foundation for smoother operations. This series is based on the work of Daniel S. Vacanti, author of Actionable Agile Metrics for Predictability: An Introduction and Actionable Agile Metrics for Predictability Volume 2, and co-author of Flow Metrics for Scrum Teams. It walks you through designing and managing an effective workflow to enhance your team’s ability to deliver value consistently.

In part two, we’ll discuss the next step: collecting and analysing the right metrics to measure performance, spot issues, and keep your updates on track. Let’s get started!

We will break this down into four parts:

Process Design and Policies
Data Collection and Analysis (this edition)
Using Service Level Expectations
Continuous Improvement

Data Collection and Analysis

Data-driven decision-making is at the heart of optimising workflows. By systematically collecting and analysing relevant metrics, you can gain valuable insights into your process performance and identify areas for improvement. If you are using Jira or something similar, you are likely already collecting the data you need to make data-driven decisions.

Below, we will explore four key metrics that help identify issues that can impact predictable delivery:

Work in Progress (WIP)
Cycle Time
Work Item Age
Throughput

We will show example charts and graphs, break down the anatomy of these artefacts, and explore how they can help you deliver more predictable LiveOps. We will also explore the Cumulative Flow Diagram, a data visualisation tool that helps identify process issues.

For this type of analysis, I prefer a tool called ActionableAgile by 55 Degrees; the examples below were created in that tool. It is available as a plugin for Jira or Azure DevOps, or as a stand-alone SaaS product (which is what I use, at about £20 a month for one seat). It is compatible with any project management system that can collect some key data and export it to CSV format.

Track the Basic Metrics of Flow

Consistently Capture Key Data Points

When working with Agile or Kanban systems, it’s vital to understand the flow of work. Four key metrics help you do that: Work In Progress (WIP), Cycle Time, Work Item Age, and Throughput. These metrics are the foundation of managing work effectively and improving processes. Let’s break down each one.

Work In Progress (WIP)

Work In Progress (WIP) refers to all the units of player value that have entered your workflow but haven’t yet been completed. These units—often called work items—could be user stories, features, projects, or any other work that provides value to the player. Simply put, WIP is everything that’s started but not finished.

Tracking WIP is crucial because it gives insights into system performance. If you notice work building up in queues, it’s a signal that flow is getting sluggish. Too much WIP means longer Cycle Times and less predictability.

Not to be confused with WIP limits: In Kanban, WIP limits are caps you set to manage the number of items in progress, but the total WIP is not just the sum of those limits.
WIP doesn’t include backlog items: Only the work started counts.

By managing WIP, you can maintain a balance: enough work to keep everyone busy, but not so much that progress grinds to a halt.

A Work In Progress (WIP) Run Chart is a straightforward but powerful tool for tracking WIP levels over time. It helps you visualise trends, identify patterns, and spot unusual changes in your workflow. Here’s how to construct and interpret one to improve your process performance.

Understanding the Anatomy of a WIP Run Chart

Data Collection
- Track the number of work items started but not completed at the end of each day.
- Regularly capturing this data ensures accuracy.
Time Axis (Horizontal)
- Represents time intervals (e.g., days, weeks, or sprints).
- Each tick marks a specific point in time.
WIP Axis (Vertical)
- Displays the WIP count at each time interval.
Plotting Data
- For each period, plot a dot representing the WIP count.
- Connect the dots to show the trend over time.
Adding Context
- Enhance the chart with helpful details:
  - Baseline Period: Highlight stable periods for comparison.
  - Average WIP: Add a horizontal line showing the average WIP during the baseline.
  - WIP Limits: Show your defined limits to check compliance.
  - Annotations: Note events or changes that may affect WIP.
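If your tool does not draw this chart for you, the daily counts are easy to derive from exported start and finish dates. The following is a minimal Python sketch rather than how any particular tool works: the item list, the dates and the `daily_wip` helper are made up for illustration, and it assumes an item finished on a given day no longer counts as WIP at the end of that day.

```python
from datetime import date, timedelta

# Hypothetical work items as (started, finished); finished is None while in progress.
ITEMS = [
    (date(2024, 12, 2), date(2024, 12, 4)),
    (date(2024, 12, 2), None),
    (date(2024, 12, 3), date(2024, 12, 6)),
    (date(2024, 12, 5), None),
]


def daily_wip(items, first_day, last_day):
    """Count items started but not yet finished at the end of each day."""
    counts = {}
    day = first_day
    while day <= last_day:
        counts[day] = sum(
            1
            for started, finished in items
            if started <= day and (finished is None or finished > day)
        )
        day += timedelta(days=1)
    return counts


wip = daily_wip(ITEMS, date(2024, 12, 2), date(2024, 12, 6))
average_wip = sum(wip.values()) / len(wip)  # candidate for the "Average WIP" line

for day, count in wip.items():
    print(day.isoformat(), count)
print("average WIP:", average_wip)
```

For this sample data the daily counts are 2, 3, 2, 3, 2, giving an average WIP of 2.4. Each (day, count) pair is one dot on the run chart.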

What can a WIP Run Chart tell you?

1. Trends

Increasing WIP: This could signal bottlenecks or a system overwhelmed by demand.
Decreasing WIP: This may indicate improved flow or insufficient work entering the system.
Stable WIP: Suggests a balanced system, though further checks are needed to ensure efficiency.

2. Patterns

Recurring Spikes: This could stem from batching, resource issues, or external dependencies.
Recurring Dips: This might result from policies encouraging empty queues before events.

3. Signals of Unusual Variation

Points Above Average: May point to process slowdowns or increased demand.
Runs: Consecutive points above or below average indicate potential shifts in the process.

4. Contextual Information

Impact of Events: Use annotations to link changes in WIP to specific events.
WIP Limit Adherence: Check how often your process stays within the defined capacity.
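Runs are easy to miss by eye, so it can help to detect them in code. Here is a rough Python sketch that scans a series of daily WIP counts for consecutive points on one side of the average line. The `find_runs` helper and the sample data are hypothetical, and the default threshold of eight points is a common rule of thumb from process behaviour charts rather than anything prescribed in this article.

```python
def find_runs(values, centre, min_length=8):
    """Return (start_index, length, side) for each run of consecutive
    points strictly above or strictly below the centre line."""
    runs = []
    start, side = None, None
    # A sentinel equal to the centre line closes any run still open at the end.
    for i, value in enumerate(list(values) + [centre]):
        if value > centre:
            current = "above"
        elif value < centre:
            current = "below"
        else:
            current = None  # a point on the line breaks the run
        if current != side:
            if side is not None and i - start >= min_length:
                runs.append((start, i - start, side))
            start, side = i, current
    return runs


# Hypothetical daily WIP counts and a hypothetical baseline average of 5.
DAILY_WIP = [3, 4, 3, 6, 7, 6, 8, 7, 6, 7, 9, 4]
runs = find_runs(DAILY_WIP, centre=5)
print(runs)
```

With this sample data the function reports a single run of eight points above the average, starting at index 3. That is the kind of signal worth cross-checking against your annotations.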

Cycle Time

Cycle Time measures how long it takes for a work item to move from start to finish. It begins when a work item enters the ‘in progress’ stage and ends when it’s completed.

Why is Cycle Time important?

Predictability: It helps answer the critical question: “When will it be done?” Shorter, consistent Cycle Times mean you can give more reliable delivery forecasts.
Efficiency: Lower Cycle Times indicate that work flows smoothly without unnecessary delays or bottlenecks.
Feedback: Faster Cycle Times mean quicker input from players, which is crucial in an Agile environment.

Understanding the Anatomy of a Scatterplot

X-Axis (Horizontal)
- Tracks the timeline, usually in days, weeks, or months.
- Shows when each work item was completed.
Y-Axis (Vertical)
- Measures Cycle Time for individual tasks, often in days or weeks.
- The higher the dot, the longer the task took to complete.
Dots
- Each dot represents a completed task.
- Its position is based on the completion date (X-axis) and the task’s Cycle Time (Y-axis).
- Stacked Dots: When multiple tasks are finished on the same day with the same Cycle Time, dots stack, often with a number showing the quantity.
Percentile Lines
- Added to clarify the distribution of Cycle Times.
- Common options include the 50th, 70th, 85th, and 95th percentiles.
- Example: The 85th percentile, shown as a red line on the chart, indicates that 85% of tasks were completed within that time or less.
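To make the dots and percentile lines concrete, here is a small Python sketch that turns start and finish dates into Cycle Times and percentile values. Everything in it is illustrative: the dates are invented, and the helper names `cycle_time_days` and `percentile` are my own. It assumes two conventions that your tool may not share: both the start day and the finish day are counted (so a same-day item has a Cycle Time of 1), and percentiles use the simple nearest-rank method.

```python
import math
from datetime import date

# Hypothetical completed items as (started, finished).
COMPLETED = [
    (date(2024, 11, 4), date(2024, 11, 4)),
    (date(2024, 11, 4), date(2024, 11, 5)),
    (date(2024, 11, 5), date(2024, 11, 6)),
    (date(2024, 11, 5), date(2024, 11, 7)),
    (date(2024, 11, 6), date(2024, 11, 8)),
    (date(2024, 11, 8), date(2024, 11, 11)),
    (date(2024, 11, 11), date(2024, 11, 15)),
    (date(2024, 11, 11), date(2024, 11, 16)),
    (date(2024, 11, 12), date(2024, 11, 19)),
    (date(2024, 11, 13), date(2024, 11, 25)),
]


def cycle_time_days(started, finished):
    """Elapsed calendar days, counting both the start and finish day (assumption)."""
    return (finished - started).days + 1


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


cycle_times = [cycle_time_days(s, f) for s, f in COMPLETED]
# Each scatterplot dot is (finish date, cycle time).
dots = [(f, cycle_time_days(s, f)) for s, f in COMPLETED]

for pct in (50, 70, 85, 95):
    print(f"{pct}th percentile: {percentile(cycle_times, pct)} days")
```

For this sample, the 50th, 70th, 85th and 95th percentile lines sit at 3, 5, 8 and 13 days respectively. The single 13-day item is the kind of outlier worth investigating.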

What Can a Scatterplot Tell You?

1. Spotting Trends and Patterns

Look for shifts in Cycle Times over time.
For instance, a steady increase in Cycle Times might signal issues like rising Work in Progress (WIP) or bottlenecks.

2. Clusters

Dots grouped at specific Cycle Times or dates highlight changes.
These could indicate improvements or challenges, like a new tool speeding up tasks or a crunch period increasing delays.

3. Gaps

Empty spaces on the chart represent periods when no tasks were completed.
Possible causes include holidays, delays, or batching (e.g., Scrum teams at sprint boundaries).

4. Outliers

Dots far above the others are tasks that took significantly longer than usual.
Investigating these helps uncover delays caused by blockers, large task sizes, or process inefficiencies.

Work Item Age

Work Item Age is the time elapsed since a work item entered the workflow. Unlike Cycle Time, which looks back at completed items, Work Item Age focuses on items still in progress.

Identify stagnation: If an item has been in progress for a long time, it could indicate a blocker or inefficiency.
Prioritise stuck items: Understanding Work Item Age helps teams decide if older items need immediate attention.
Visualise ageing items: An Aging Work In Progress chart can be a helpful visual tool, highlighting the age of items in each workflow stage.

Work Item Age only applies while an item is in progress. Once it’s finished, its ‘age’ becomes its Cycle Time.

An Aging Work in Progress (WIP) chart, often called an "Aging Chart," tracks how long tasks have been active in your workflow without completion. Focusing on Work Item Age provides real-time insights into delays, bottlenecks, and opportunities for improvement.

Understanding Work Item Age vs. Cycle Time

Before diving into the Aging Chart, it’s important to distinguish these two metrics:

Work Item Age:
- The time a task has been in progress, from its start to the present moment.
- Applies only to incomplete tasks.
Cycle Time:
- The total time a task takes to complete, from start to finish.
- Calculated only after the task is finished.

The key difference is that Age reflects real-time data for ongoing work, while Cycle Time is a historical measure. The Aging Chart focuses on Age, helping teams address delays before they extend Cycle Times.

Anatomy of an Aging Chart

Visualising the Workflow

Columns: Match the stages of your workflow (e.g., "Analysis Active," "Development," "Testing").
This layout mirrors your Kanban board for easy alignment.

Tracking Work Item Age

Vertical Axis: Displays item age, typically in days. Adjust the scale based on typical workflow durations.
Dots: Represent individual tasks. The higher the dot, the older the task.
Horizontal Position: Indicates the task’s current workflow stage (e.g., "Testing").

Contextual Overlays

Percentile Lines: Borrowed from the Cycle Time Scatterplot, these lines show historical benchmarks, such as the 50th, 70th, or 85th percentiles.
Interpretation: A dot above the 50th percentile line means the task has already taken longer than half of previously completed tasks.
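The comparison between a dot and the percentile lines can be sketched in a few lines of Python. The item keys, stages, dates and percentile values below are all hypothetical, as are the helper names. Age is counted inclusive of the start day, which is an assumption you should match to however you calculate Cycle Time.

```python
from datetime import date

# Hypothetical percentile lines (in days) taken from a Cycle Time Scatterplot.
PERCENTILE_LINES = {50: 3, 70: 5, 85: 8}

# Hypothetical in-progress items as (key, current stage, started).
IN_PROGRESS = [
    ("LIVE-101", "Development", date(2024, 12, 6)),
    ("LIVE-097", "Testing", date(2024, 12, 2)),
    ("LIVE-088", "Testing", date(2024, 11, 27)),
]


def work_item_age(started, today):
    """Days in progress so far, counting the start day (assumption)."""
    return (today - started).days + 1


def highest_line_passed(age, lines):
    """Return the highest percentile whose line the item has gone above, or None."""
    passed = [pct for pct, days in lines.items() if age > days]
    return max(passed) if passed else None


TODAY = date(2024, 12, 9)  # fixed so the example is reproducible
report = [
    (key, stage, work_item_age(started, TODAY),
     highest_line_passed(work_item_age(started, TODAY), PERCENTILE_LINES))
    for key, stage, started in IN_PROGRESS
]

for key, stage, age, line in report:
    print(f"{key} in {stage}: {age} days old, above the {line}th percentile line")
```

Here the three items are 4, 8 and 13 days old, sitting above the 50th, 70th and 85th percentile lines respectively. The oldest one is the obvious candidate for the next stand-up conversation.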

What Can an Aging Chart Tell You?

1. Spot Delays Early

Quickly identify tasks that are taking longer than usual.
Investigate causes like bottlenecks, blockers, or dependencies before delays escalate.

2. Manage Risks Proactively

Compare task age to Service Level Expectations (SLEs).
Take timely actions like prioritising tasks, breaking them down, or removing blockers.

3. Guide Stand-Up Discussions

Use the chart in daily scrums to focus on tasks needing attention.
Encourage team collaboration on resolving impediments.

4. Drive Process Improvements

Spot recurring delays in specific workflow stages.
Use these insights to adjust WIP limits, streamline steps, or prioritise critical tasks.

5. Promote Flow Thinking

Emphasise reducing delays and improving the steady progression of work.
Shift focus from individual task completion to overall system efficiency.

Throughput

Throughput is the count of work items completed per unit of time. It tells you how many items leave the workflow and reflects productivity.

Productivity: Higher Throughput means more work is completed, often indicating a healthy, efficient system.
Forecasting: Throughput data can be used to predict future performance, for example, by applying Monte Carlo simulations.
Spotting bottlenecks: Analysing Throughput across different stages of your process helps identify where work is slowing down.

Throughput differs from Scrum Velocity because it measures the actual count of items completed, not ‘story points’ or other effort estimates.
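Because Throughput is just a count, you can calculate it from nothing more than a list of completion dates. A minimal Python sketch follows, with invented dates and a hypothetical `daily_throughput` helper. Note that days with no completions are kept as zeros rather than skipped, since those zeros are part of the picture.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical completion dates exported from your tracking tool.
FINISHED_ON = [
    date(2024, 12, 2), date(2024, 12, 2),
    date(2024, 12, 4),
    date(2024, 12, 6), date(2024, 12, 6), date(2024, 12, 6),
]


def daily_throughput(finish_dates, first_day, last_day):
    """Count of items finished on each day in the window, zero days included."""
    done = Counter(finish_dates)  # a Counter returns 0 for days with no completions
    total_days = (last_day - first_day).days + 1
    return [done[first_day + timedelta(days=i)] for i in range(total_days)]


throughput = daily_throughput(FINISHED_ON, date(2024, 12, 2), date(2024, 12, 6))
print(throughput)
```

For the sample dates this gives 2, 0, 1, 0, 3 across the five days: six items counted, regardless of how many story points anyone estimated for them.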

A Throughput Run Chart tracks the number of tasks completed over a specific time period, offering a clear view of your team’s completion rate and how it changes over time. This visual tool is handy for identifying trends, patterns, and unusual variations in Throughput, enabling teams to make data-driven improvements.

Anatomy of a Throughput Run Chart

1. Reporting Interval

Represents the time unit used to measure Throughput, such as days, weeks, or sprints.
Daily Throughput is often preferred for greater granularity and flexibility, especially for advanced analyses like Monte Carlo simulations.

2. Horizontal Axis (Time)

Displays evenly spaced time intervals, such as consecutive days or weeks.
Acts as the timeline for tracking completed work.

3. Vertical Axis (Throughput)

Shows the number of completed tasks for each reporting interval.
The scale can be adjusted to fit the range of observed Throughput values.

4. Data Points and Trends

Each dot represents the Throughput for a specific interval.
Connecting the dots with a line visualises trends such as increases, decreases, or stability over time.

5. Contextual Overlays

Baseline Period: Highlights a stable reference period for comparison.
Average Throughput: A horizontal line representing the average completion rate during the baseline.
Natural Process Limits (NPLs): Upper and Lower Limits (UNPL and LNPL) are calculated from average throughput and variability. These lines separate routine variation from exceptional changes.
Annotations: Notes to mark significant events like holidays, staffing changes, or demand spikes that may explain variations in Throughput.
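One common way to calculate Natural Process Limits is the XmR chart method: take the baseline average, then add and subtract 2.66 times the average moving range (the average absolute difference between consecutive points). The Python sketch below assumes that method; your tool may calculate its limits differently. The weekly Throughput numbers and helper name are made up, and the lower limit is clamped at zero because a negative count is not meaningful.

```python
def natural_process_limits(baseline):
    """Return (lower, mean, upper) using the XmR formula:
    mean +/- 2.66 * average moving range."""
    mean = sum(baseline) / len(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    average_moving_range = sum(moving_ranges) / len(moving_ranges)
    upper = mean + 2.66 * average_moving_range
    lower = max(0.0, mean - 2.66 * average_moving_range)  # counts cannot go below zero
    return lower, mean, upper


# Hypothetical weekly Throughput during a stable baseline period.
BASELINE = [5, 7, 6, 4, 8, 6, 5, 7]
lower, mean, upper = natural_process_limits(BASELINE)

# Hypothetical later weeks, checked against the baseline limits.
NEW_WEEKS = [6, 12, 0]
signals = [value for value in NEW_WEEKS if value > upper or value < lower]

print(f"LNPL={lower:.2f}  average={mean:.2f}  UNPL={upper:.2f}")
print("points outside the limits:", signals)
```

With this baseline the average is 6 and the limits come out at roughly 0.68 and 11.32, so weeks with 12 and 0 completed items would both be flagged as exceptional, while a week with 6 is routine variation.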

What can a Throughput Run Chart tell you?

1. Analysing Trends

Increasing Trend: Indicates improved efficiency or capacity.
Decreasing Trend: Suggests potential problems like bottlenecks, resource constraints, or increased task complexity.
Stable Trend: Reflects consistent performance, but further analysis is needed to confirm optimal stability.

2. Identifying Patterns and Signals

Flat Lines: Zero Throughput periods suggest no tasks were completed. Prolonged flat lines could indicate blockers, holidays, or workflow disruptions, which harm predictability.
Stair Steps: Flat lines followed by sudden Throughput spikes suggest batch processing. While not always bad, batching can affect predictability and Cycle Time.
Points Outside NPLs: Dots above or below Natural Process Limits signal exceptional variations that warrant further investigation.
Runs: A series of points consistently above or below the average line may indicate process shifts or sustained external influences.

3. Leveraging Contextual Insights

Impact of Events: Correlate Throughput changes with annotations to understand their causes.
System Dynamics: Analyse Throughput patterns alongside WIP limits, workflow policies, and other system factors for a complete picture.

Throughput Run Charts in Scrum

Throughput Run Charts are particularly valuable for Scrum teams:

Daily Throughput Offers Better Insights
- Scrum often emphasises Sprint-level Velocity, but tracking daily Throughput provides a clearer view of flow and variability.
Handling Zero Throughput Days
- Scrum teams completing few tasks daily may encounter frequent zero Throughput values. In such cases, consider measuring the Time Between Throughput (time between task completions) and plotting it on an XmR chart for more sensitivity to variability.
Forecasting with Throughput
- Use Throughput data to forecast backlog completion or release timelines. Monte Carlo simulations can model completion probabilities based on historical data, offering robust, data-driven predictions.
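To give a feel for how a Monte Carlo forecast works, here is a deliberately simple Python sketch. It answers "how many days until this backlog is done?" by repeatedly replaying randomly chosen days from your Throughput history. The history, backlog size, trial count and helper names are all hypothetical, and dedicated tools apply more care than this (for example in choosing which history to sample from), so treat it as an illustration of the idea only.

```python
import math
import random

# Hypothetical daily Throughput history, zero days included.
HISTORY = [0, 1, 0, 2, 1, 0, 3, 1, 0, 2]


def simulate_days_to_finish(history, backlog_size, trials=10_000, seed=None):
    """Run many trials; each one samples past days at random until the backlog is empty."""
    if max(history) <= 0:
        raise ValueError("history needs at least one day with completed items")
    rng = random.Random(seed)
    outcomes = []
    for _ in range(trials):
        remaining, days = backlog_size, 0
        while remaining > 0:
            remaining -= rng.choice(history)  # pretend today looks like a random past day
            days += 1
        outcomes.append(days)
    return sorted(outcomes)


def nearest_rank(sorted_values, pct):
    """Smallest outcome with at least pct% of trials at or below it."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


outcomes = simulate_days_to_finish(HISTORY, backlog_size=20, seed=7)
for pct in (50, 85, 95):
    print(f"{pct}% of simulated runs finished within {nearest_rank(outcomes, pct)} days")
```

The output is a range of outcomes with probabilities attached, not a single date. That is the point: "85% of runs finished within N days" is a more honest forecast than one number.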

Putting It All Together

These metrics—WIP, Cycle Time, Work Item Age, and Throughput—are essential tools for improving workflows in Agile and Kanban systems. By tracking and analysing them, teams can better manage capacity, enhance predictability, and drive continuous improvement.

Understanding your flow metrics isn’t just about numbers—it’s about creating a more predictable, efficient process that delivers value to players faster. Start measuring these metrics, and you’ll soon spot the bottlenecks, celebrate the wins, and keep improving.

Visualise Your Flow Data

Leverage Flow Analytics for Insights

Visualisation tools are essential for turning raw data into clear, actionable insights. They make it easier to spot patterns, track progress, and identify workflow improvement areas.

One of the most effective tools for this purpose is the Cumulative Flow Diagram (CFD). This chart visually represents how work flows through each process stage, highlighting potential bottlenecks and imbalances that could disrupt progress.

CFDs are particularly valuable for understanding flow dynamics and improving predictability. By studying the chart’s structure and patterns, teams can identify issues, make informed decisions, and optimise their processes. CFDs offer a comprehensive view of process performance when combined with other flow metrics, helping teams achieve better outcomes.

Anatomy of a Cumulative Flow Diagram

1. Horizontal Axis (Time)

Represents the progression of time, typically divided into days, weeks, or months.
The reporting interval depends on the level of detail required for analysis.

2. Vertical Axis (Cumulative Count)

Displays the cumulative count of work items that have entered the process up to each point in time.
The scale adapts to accommodate the range of work item counts.

3. Bands

Coloured bands represent workflow stages, such as "To Do," "In Progress," and "Done."
Top Line of a Band: Tracks cumulative arrivals into the stage.
Bottom Line of a Band: Tracks cumulative departures from the stage.

What can a CFD tell you?

Top Line = Cumulative Arrivals
- Shows the total number of work items entering the process.
Bottom Line = Cumulative Departures
- Represents the total number of work items that have exited the process.
No Decreasing Lines
- Lines should never slope downwards due to the cumulative nature of the data.
- A downward slope signals errors in data collection or calculation.
Vertical Distance = WIP
- The vertical distance between two lines at a given point in time is the Work In Progress (WIP) between those two workflow stages at that moment.
Horizontal Distance = Approximate Cycle Time
- Reflects the average time taken to complete work items, though this is an approximation.
Slope = Arrival/Departure Rate
- Indicates the average rate of arrivals or completions for a specific workflow stage.

Interpreting CFDs

Quantitative Insights

WIP: Shown by the vertical distance between two lines.
Cycle Time: Estimated by the horizontal distance between the top and bottom lines.
Throughput: Calculated from the slope of the bottom line.
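These three readings can be reproduced from raw dates, which is a useful way to sanity-check a chart. The Python sketch below builds the top (arrivals) and bottom (departures) lines for a simplified single-band process, then reads WIP, average Throughput and approximate Cycle Time from them. The items, dates and function names are all invented for illustration, and a real CFD would have one line per workflow stage.

```python
from datetime import date, timedelta

# Hypothetical items as (started, finished); finished is None while in progress.
ITEMS = [
    (date(2024, 12, 2), date(2024, 12, 4)),
    (date(2024, 12, 2), date(2024, 12, 5)),
    (date(2024, 12, 3), date(2024, 12, 6)),
    (date(2024, 12, 4), None),
    (date(2024, 12, 5), None),
]


def cfd_rows(items, first_day, last_day):
    """One (day, cumulative arrivals, cumulative departures) row per day."""
    rows = []
    day = first_day
    while day <= last_day:
        arrivals = sum(1 for started, _ in items if started <= day)
        departures = sum(1 for _, done in items if done is not None and done <= day)
        rows.append((day, arrivals, departures))
        day += timedelta(days=1)
    return rows


def never_decreases(rows):
    """Cumulative lines must not slope downwards; False points to a data problem."""
    return all(a2 >= a1 and d2 >= d1
               for (_, a1, d1), (_, a2, d2) in zip(rows, rows[1:]))


def approx_cycle_time(rows, index):
    """Horizontal distance: days back from the bottom line to where the top line
    first reached the same cumulative count. An approximation only."""
    target = rows[index][2]
    if target == 0:
        return None
    for j, (_, arrivals, _) in enumerate(rows):
        if arrivals >= target:
            return index - j
    return None


rows = cfd_rows(ITEMS, date(2024, 12, 2), date(2024, 12, 6))
wip = [arrivals - departures for _, arrivals, departures in rows]   # vertical distance
avg_throughput = (rows[-1][2] - rows[0][2]) / (len(rows) - 1)      # slope of bottom line
latest_cycle_time = approx_cycle_time(rows, len(rows) - 1)         # horizontal distance

print("WIP per day:", wip)
print("average Throughput per day:", avg_throughput)
print("approximate Cycle Time (days):", latest_cycle_time)
```

For this sample, WIP per day is 2, 3, 3, 3, 2, average Throughput is 0.75 items per day, and the horizontal distance on the last day is 3 days.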

Qualitative Patterns

Widening Bands: Signal increasing WIP in a stage, suggesting a bottleneck.
Flat Lines: Indicate zero Throughput or arrivals, often due to disruptions or holidays.
Stair Steps: Suggest batch processing, where items are completed in chunks.
S-Curves: These are seen in workflows that start and end with zero WIP, reflecting challenges in balancing arrivals and departures.

Common Misconceptions about CFDs

Using Counts Only
- Relying solely on item counts can be misleading if items move backwards or leave the process before completion.
Exact Cycle Time
- Horizontal distance provides an approximate average, not the precise Cycle Time, because the items that started at one end of the line are not necessarily the same items that finished at the other end.
Including Backlogs
- Adding backlog stages can confuse interpretations if Cycle Time calculations don’t account for time spent in the backlog.
Ignoring Context
- Understanding the workflow’s policies, external factors, and data context is essential to avoid superficial conclusions.

Conclusion of Part Two

By laying the groundwork with a structured workflow and focusing on data collection and analysis, your team is on the right path towards improving predictability and delivering consistent value. Metrics like Cycle Time, WIP, and Throughput offer powerful insights when used correctly, helping you identify patterns, address inefficiencies, and refine your process.

In part three, we’ll explore the concept of Service Level Expectations (SLEs). We’ll discuss how to set realistic goals for task completion and use flow metrics to maintain accountability while enhancing your team’s performance. Don’t miss it!

Thanks for reading Game Production Alchemist! This post is public, so feel free to share it.