Spreadsheets die hard
How a stealthy incremental refactor saved us when we eventually missed a target

There was a time when Product decided to dump on my team the responsibility of producing forecasts for a specific engagement metric that was part of the key corporate metrics. My team was responsible for standardizing some widely used metrics (acquisition, engagement, retention, monetization) across a finite set of standard segments (geo, product, user type, etc.). However, my team did not own any downstream processing of this data; we just made it available to different functions at the company, with certain data quality guarantees. Product used the actuals to generate forecasts in Excel manually, once a year, before the start of the next fiscal year (FY).
Product wanted to offload that process to us because the Director of that group was leaving the company; until then she had personally been building these forecasts for the Finance team at the beginning of each fiscal year. Among the key corporate metrics, this one was the most “tame”, especially compared to the financial ones. Although it was a factor in how execs' compensation was calculated at the end of the year, the company had been exceeding that target year after year. I am quite sure Product was introducing a decent amount of sandbagging: it was a safer metric to manipulate downward than revenue or profit, which have far bigger implications because they are communicated to Wall Street.
The fiscal year was starting in a few months when I got the mandate to own this process, so there was no time to change anything. I also did not want to assign this to someone on my team, so I cleaned up the Excel file myself and kept the same approach. After a few rounds of back and forth with Finance, the forecasts were locked.
I immediately wanted to automate this and get rid of the Excel-based manual process, but I knew it was not going to be easy. Finance (not surprisingly) used Excel a lot. I also did not assign this to someone on my team; I kept it as my own side gig. That is why I decided to automate it stealthily, without anybody noticing.
In the first quarter of the FY, I replaced the manual copy-pasting of historical data. The sheet containing the raw historical data was still there, but it now pulled the data directly from Snowflake. In the second quarter, I implemented a Jenkins job running an R script that replicated the simple methodology used in the calc sheet to generate the forecasts. Now I had something running in the background, producing and persisting forecasts on a weekly basis, along with all the deltas between forecasts and actuals. In the third quarter of the FY, I changed the calc sheet to pull from Snowflake the pre-calculated forecasts generated by the R script. At this stage, the Excel file was just a UI displaying data from a DB. I even left the sheet with the historical data in place so as not to raise too many questions, although by then it was not used at all. When we entered the fourth quarter and were getting ready for the next FY forecasting exercise, I pretended that everything was business as usual, sharing the Excel file back and forth by email.
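In spirit, the weekly job was not much more than the sketch below. The DSN, table names, and the naive year-over-year formula are placeholders of my own, not the actual methodology from the calc sheet; the point is the shape of the pipeline, not the math.

```r
# Sketch of the weekly Jenkins job (illustrative schema and formula).
library(DBI)
library(dplyr)

con <- dbConnect(odbc::odbc(), dsn = "snowflake")  # assumed ODBC DSN

# Pull monthly actuals by standard segment (the same data the Excel sheet mirrored).
actuals <- dbGetQuery(con, "
  SELECT month, geo, product, user_type, engagement
  FROM ANALYTICS.ENGAGEMENT_ACTUALS
  WHERE month >= DATEADD(year, -3, CURRENT_DATE())
")

# Replicate a 'simple methodology': last observed value scaled by the trailing
# year-over-year growth, per segment. Placeholder logic, not the real formula.
forecasts <- actuals %>%
  group_by(geo, product, user_type) %>%
  arrange(month, .by_group = TRUE) %>%
  summarise(
    forecast = last(engagement) * last(engagement / lag(engagement, 12)),
    .groups = "drop"
  ) %>%
  mutate(run_date = Sys.Date())

# Persist every run, so forecast-vs-actual deltas accumulate week over week.
dbWriteTable(con, "ENGAGEMENT_FORECASTS", forecasts, append = TRUE)
dbDisconnect(con)
```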
It was a mirror-first, innovate-later strategy. A very quiet, almost invisible migration, which was going to remain a small story … until we started falling behind the target. For years nobody sweated the forecast because the target got crushed every FY, so the Excel workbook was good enough. No need to inspect the methodology or the composition of the forecast. Once growth started to have some hiccups and the engagement metric suddenly printed -5% vs target, execs became much more interested in how these forecasts were made and where the miss came from.
If we had still been fully Excel-based beyond the thin UI layer, it would have been an absolute nightmare to diagnose the miss and explain where it came from. Instead, thanks to all the deltas stored on a weekly basis, it was easy to detect which geo or product was responsible, or at which point in time the forecast started to diverge from the target. This was not the whole explanation and we had to investigate a bit more, but it was a very good starting point. This level of observability, which was basically unnecessary for two years, suddenly paid off, turning a potential witch hunt into a 30-minute root-cause session.
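With every weekly forecast and delta persisted, the drill-down amounted to a couple of group-bys. Something along these lines, with illustrative table and column names (not the production schema):

```r
# Sketch of the root-cause drill-down against the persisted weekly deltas.
library(DBI)
library(dplyr)

con <- dbConnect(odbc::odbc(), dsn = "snowflake")
deltas <- dbGetQuery(con, "
  SELECT run_date, geo, product, forecast, actual
  FROM ANALYTICS.ENGAGEMENT_FORECAST_DELTAS
") %>%
  mutate(delta = actual - forecast,
         rel_delta = delta / forecast)

# 1. Which geo/product combinations contribute most to the miss?
deltas %>%
  group_by(geo, product) %>%
  summarise(total_gap = sum(delta), .groups = "drop") %>%
  arrange(total_gap) %>%
  head(10)

# 2. When did each segment start to diverge? First week the gap exceeded 2%.
deltas %>%
  filter(rel_delta < -0.02) %>%
  group_by(geo, product) %>%
  summarise(first_divergence = min(run_date), .groups = "drop")

dbDisconnect(con)
```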
If your metric is important but not urgent, that is your window of opportunity. Automate slowly and stealthily while nobody is watching. Because the day it does matter, you will not regret it.
Do you like this post? Of course you do. Share it on Twitter/X, LinkedIn and HackerNews