forecasting - Stringfest Analytics
https://stringfestanalytics.com

How to identify hidden risks in Excel-based decisionmaking
https://stringfestanalytics.com/how-to-identify-hidden-risks-in-excel-based-decisionmaking/
Fri, 26 Dec 2025


For many of the teams I work with, Excel isn’t an occasional tool. It’s the foundation of how decisions get made. Forecasts, budgets, scenario plans, capacity models, pricing analysis, investment cases… they all run through spreadsheets at some point.

That’s fine. Excel is flexible, transparent, and accessible. The risk isn’t that Excel is a bad tool. The risk is that it can quietly produce numbers that feel trustworthy even when their decision value has deteriorated.

You won’t see this risk show up in a green error indicator. You won’t get a circular reference warning. That’s why it’s hard to spot until it actually hurts you.

In this post I want to unpack an idea I’ve been thinking about a lot lately: the difference between accuracy and what I call decision-readiness. Understanding that difference will help you spot the places where Excel models can mislead even experienced analysts and leaders.

Accuracy vs. decision-readiness

When people talk about analytics quality, they almost always start with accuracy. It makes sense: if the math is wrong, everything built on it is suspect. So we check formulas, reconcile totals, compare outputs to expectations.

But accuracy is only part of the story.

What really matters for decisions is whether the model is appropriate to use in the current decision context. That’s what I mean by decision-readiness. A model can be accurate in a technical sense but still be inappropriate or risky to use for a particular decision.

Here’s a simple way to think about it:

Accuracy vs. Decision-Readiness Matrix

                 High Decision-Readiness            Low Decision-Readiness
High Accuracy    Decision-Ready Analysis:           False Confidence:
                 sound math and clear context       correct numbers, unclear fit
                                                    for the decision
Low Accuracy     Fragile Insight:                   Known Brokenness:
                 good intent, weak implementation   errors are obvious

Decision-ready analysis is what you want: outputs that are correct and appropriate for the decision at hand.

False confidence is the quadrant that gets people in trouble. The math looks fine, so the numbers carry weight. Meanwhile, assumptions, ownership, and context aren’t clear, and the model is being asked to do more than it was designed for.

What decision-readiness looks like in real Excel work

Decision-readiness isn’t abstract. It shows up in very practical, very Excel-specific ways.

It starts with assumptions. In many spreadsheets, key assumptions live inside formulas, helper tabs, or hard-coded values that haven’t been touched in months. Growth rates, thresholds, exclusions, seasonality adjustments… they’re technically visible if you dig, but they’re not top of mind. When assumptions are explicit and easy to point to, they can be revisited when conditions change. When they’re buried, they quietly shape outcomes long after their original rationale has faded.

Ownership matters just as much. I don’t mean ownership of the file in the sense of “who last saved it,” but ownership of the logic. Someone should be able to explain why the model is structured the way it is and why that structure makes sense for the decision it supports. If the only people who really understand the logic have moved teams, left the company, or simply stopped thinking about the model day to day, accuracy alone doesn’t buy you much safety.

Context is another common failure point. A forecasting model built to support monthly planning gradually becomes the number everyone quotes in exec meetings. A scenario analysis meant to explore ranges turns into a single point estimate that anchors decisions. The spreadsheet hasn’t changed, but the decision stakes have. Decision-ready analysis makes those boundaries clear instead of letting usage drift silently.

Review also has to go beyond “does it still calculate.” In Excel terms, that means more than refreshing queries and checking that formulas didn’t break. It means stepping back and asking whether the logic still fits the decision being made. Accuracy can be tested with reconciliations and spot checks. Decision-readiness is tested by asking whether the model still reflects current reality.

How risk actually sneaks in

When people talk about “Excel risk,” they often picture a typo in a formula. In my experience, that is rarely the biggest issue. The more serious problems usually come from patterns like these.

Inputs get overridden at the last minute to “make the numbers work,” and those overrides become permanent. Logic gets copied forward month after month, even as the business changes. Assumptions that were once discussed explicitly are now just “how the model works.” A file that lived on one analyst’s desktop becomes a shared reference point across teams.

In each case, Excel continues to do exactly what it is told to do. There is no error message. The outputs still look reasonable. That is what makes this kind of risk so persistent.

Area              How risk enters                                  Why it's hard to detect
Inputs            Manual overrides, external data,                 Refresh cycles and totals still work
                  ad-hoc adjustments
Transformations   Pasted logic, layers of calculation              Output still "balances"
Assumptions       Hard-coded thresholds, implicit business rules   Nobody reviews them anymore
Usage             Reuse beyond original purpose                    The model "seems fine" in a new context


Why confidence erodes quietly

When decision-readiness declines, it rarely collapses in a dramatic way. Instead, confidence drifts.

You might hear:

  • “It’s directionally right.”
  • “This has always worked before.”
  • “We just need something to anchor discussions.”

Those phrases are subtle signals that decision-readiness has weakened.

Teams often compensate with process (reviews, reconciliations, parallel models, etc.) but that doesn’t fix the root issue. It just adds friction.

The real opportunity is to make judgment explicit again: what assumptions really matter, who is accountable for them, and when a model needs reevaluation.

This is not primarily a tooling problem

One of the common reactions when confidence erodes is to blame the tool: Excel is the culprit, or so the thinking goes. But the same patterns I describe here show up in BI dashboards, planning tools, and custom software.

Changing tools does not address the underlying issue: the way judgment and ownership are handled in analysis workflows.

Accuracy is easy to automate and test. Decision-readiness is social and contextual.

What decision-ready Excel analysis looks like

Excel workflows that reliably support decisions share a handful of characteristics:

Dimension      What it looks like
Assumptions    Documented, revisited regularly
Ownership      Clear responsibility for logic and context
Context        Defined use case and limits
Transparency   Anyone can explain why the model guides this decision
Review         Focuses on relevance, not just math

When these elements are present, leadership can challenge assumptions, adapt to new conditions, and trust the outputs where appropriate. When they’re absent, decisions rely on numbers that feel authoritative but lack grounding.

A more effective starting point

If you want to reduce Excel-related decision risk, the first question shouldn’t be “what’s the next tool we adopt?” or “how do we tidy this spreadsheet?” Instead, start with questions like:

  • What decisions depend on this analysis?
  • What assumptions drive the outputs?
  • When were those assumptions last checked?
  • Who can explain the logic today, not six months ago?

This shift in focus from accuracy alone to decision-readiness helps organizations treat models as tools for informed judgment rather than comforting precision.

Closing thoughts

Excel is a durable, capable platform. The real challenge is not the calculations. It is making sure the numbers we produce are still appropriate to guide the decisions we are using them for.

If this post is ringing a bell, and you are wondering whether hidden risk might be building up in your own Excel-based workflows, get in touch for a discovery call.

I would be glad to talk through how this shows up for your team and where it might make sense to dig deeper.

The post How to identify hidden risks in Excel-based decisionmaking first appeared on Stringfest Analytics.

How to evaluate forecast accuracy in Excel, Part 1: Using Excel formulas and Forecast Sheet
https://stringfestanalytics.com/how-to-evaluate-forecast-accuracy-in-excel-part-1-using-excel-formulas-and-forecast-sheet/
Mon, 22 Sep 2025

If you work in finance, accounting, or a related field, chances are you’ve had to put together a forecast… and then explain why you built it the way you did. Excel makes that process feel deceptively simple. With functions like FORECAST.LINEAR() or FORECAST.ETS(), or even the Forecast Sheet wizard, you can generate a projection in just a few clicks. But these tools don’t really help you answer the follow-up questions: How accurate is this forecast? Does another approach perform better?

That’s the missing piece. To judge whether a forecast is reliable, you need to measure it against actual results. In this post, we’ll walk through how to calculate common evaluation metrics like MAE, RMSE, and MAPE directly in Excel, and we’ll see what they reveal about the strengths and weaknesses of a forecast. Along the way, we’ll also look at why Excel’s legacy tools make serious model-to-model evaluation difficult.

This is Part 1 of a two-part series. Here we’ll focus on what you can do with built-in Excel functions and features. In Part 2, we’ll raise the bar with Copilot and Python in Excel to build smarter forecasts and evaluate them more effectively.

You can follow along with a synthetic retail sales dataset by downloading the exercise file below:

 


Splitting the data into training and testing sets

To properly evaluate a forecast we need to give it a fair test. That means splitting our dataset into two parts. Most of the history will act as the training data, which is what the model uses to build its predictions. The most recent 12 months will be set aside as testing data, which the model does not see during training.

This step is important because if you measure accuracy on the same data used to build the forecast, the results will look much better than they really are. It is like grading a student on an exam after giving them all the answers. By holding back the last year of actual sales, we can compare the forecast against numbers the model did not have access to. That gives us a much clearer view of how the forecast would perform when faced with the unknown future.

In Excel, we can use simple formulas to mark which rows belong to training and which belong to testing. This keeps the process transparent and makes it easy to follow along.

There are lots of ways to split this dataset, but I’m going to use dynamic array functions to create the train and test datasets.

For the train set:

=FILTER(
retail_sales,
retail_sales[Date] < EDATE(MAX(retail_sales[Date]), -11)
)

For the test set:

=FILTER(
retail_sales,
(retail_sales[Date] >= EDATE(MAX(retail_sales[Date]), -11)) *
(retail_sales[Date] <= MAX(retail_sales[Date]))
)

These formulas work by taking the maximum date in the dataset, stepping back 11 months with EDATE, and then filtering rows based on those date ranges. The train set pulls everything before that cutoff, while the test set captures the most recent 12 months. Because these are formulas, the split will update automatically if new rows get added.
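For readers who want to sanity-check the cutoff logic outside of Excel, here is a minimal Python sketch of the same split, using only the standard library. The 36-row monthly dataset is a made-up stand-in for the retail_sales table, not the post's actual data:

```python
from datetime import date

def edate(d, months):
    """Shift a date by whole months, like Excel's EDATE.
    Assumes the day-of-month exists in the target month,
    which holds for month-start data like this."""
    m = d.month - 1 + months
    return date(d.year + m // 12, m % 12 + 1, d.day)

# Hypothetical stand-in for the retail_sales table: 36 month-start rows
rows = [(edate(date(2022, 1, 1), i), 100 + i) for i in range(36)]

# Same logic as the FILTER formulas: cutoff = max date minus 11 months
cutoff = edate(max(d for d, _ in rows), -11)
train = [(d, v) for d, v in rows if d < cutoff]
test = [(d, v) for d, v in rows if d >= cutoff]  # most recent 12 months
```

As with the FILTER version, adding new rows to `rows` moves the cutoff forward automatically, so the test set is always the latest 12 months.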

Your worksheet should now show the data divided into training and test sets. Next, add three extra rows at the top reserved for forecast evaluation using MAE, RMSE, and MAPE. These metrics give us a consistent way to judge whether one forecast is performing better than another.

Excel forecast thus far

MAE, or Mean Absolute Error, shows the average size of the errors. RMSE, or Root Mean Squared Error, does the same but penalizes larger misses more heavily, which can be important in business settings. MAPE, or Mean Absolute Percentage Error, expresses the error as a percentage, which makes it easy to communicate results to others.

In each case, we calculate the metric by comparing the forecasted values to the actual values in the test set and then averaging the results. This gives us a fair, side-by-side way to evaluate different forecasting methods.

I’ll share the formulas for this in just a moment when we set up the linear forecast.

Building a linear forecast with FORECAST.LINEAR()

We’ll start by creating a basic forecast using the FORECAST.LINEAR() function:

=FORECAST.LINEAR(I9, retail_sales[Retail_Sales], retail_sales[Date])
Forecast linear results

This is Excel’s simplest way to create a forecast. It fits a straight line through the historical data and then extends that line forward to predict future values. In the screenshot, the function is being used on the test set dates to generate a “linear forecast” for each of the last 12 months.

This gives us two sets of numbers side by side: the actual sales and the forecasted values. At this stage, even before calculating accuracy metrics, a good first step is to simply compare the forecasts to the actuals visually or in a table. By lining them up in two columns, actuals in one and forecasts in the other, we can quickly spot whether the forecast is generally too high, too low, or missing obvious patterns.

There are more automated ways to do this kind of evaluation, but since this is a one-off example, we are keeping it manual and straightforward. For now, the main goal is just to see what the forecast looks like and start building intuition before we dive into the formal accuracy measures.

Linear forecast viz

The previous chart shows our simple linear forecast compared to the actual sales. If you want to avoid a gap between the two lines, one quick trick is to include the last actual value as the first point of the forecast series. That way the forecast picks up exactly where the actuals leave off.

Even so, what we see here is a straight line projection. It does capture the overall upward trend in sales, but it completely ignores the seasonal ups and downs that are so obvious in the historical data. That lack of seasonality is a real limitation: if your business depends on cyclical patterns, a linear forecast like this can be misleading.

As we start to compare different models, it will be important to have good benchmarks for measuring performance. To that end, let’s calculate three key metrics in Excel. In our worksheet, the actuals are in column J and the linear forecasts are in column K for the test set. We’ll now use simple formulas to generate MAE, RMSE, and MAPE at the top of the sheet.

Linear forecast metrics

MAE, or Mean Absolute Error, is calculated with =AVERAGE(ABS(J9:J20 - K9:K20)). Excel subtracts each forecast from the actual, takes the absolute value so negatives do not cancel positives, and then averages those differences. This shows us, on average, how far off the forecast is.

RMSE, or Root Mean Squared Error, uses =SQRT(AVERAGE((J9:J20 - K9:K20)^2)). This works in almost the same way, but instead of absolute values, the differences are squared before averaging, and then the square root is applied at the end. Squaring emphasizes larger errors, so RMSE gives more weight to big misses.

MAPE, or Mean Absolute Percentage Error, is calculated with =AVERAGE(ABS((J9:J20 - K9:K20) / J9:J20)). This divides each error by the actual value, takes the absolute percentage difference, and then averages them. The result is a percentage that is easy to interpret: on average, the forecast was off by a certain percent.

All three formulas rely only on basic Excel functions like ABS(), AVERAGE(), and SQRT(), which makes them quick to set up. Together, they give us a rounded view of forecast accuracy: MAE for a simple average error, RMSE for error severity, and MAPE for a percentage that is easy to communicate.
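If you ever want to cross-check the worksheet results outside of Excel, the same three metrics take only a few lines of plain Python. The numbers below are toy values, not the post's dataset:

```python
import math

def forecast_metrics(actuals, forecasts):
    """MAE, RMSE, and MAPE, mirroring the Excel formulas:
    MAE  = AVERAGE(ABS(actual - forecast))
    RMSE = SQRT(AVERAGE((actual - forecast)^2))
    MAPE = AVERAGE(ABS((actual - forecast) / actual))
    """
    errors = [a - f for a, f in zip(actuals, forecasts)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    mape = sum(abs(e / a) for e, a in zip(errors, actuals)) / n
    return mae, rmse, mape

# Toy example (not the retail data)
mae, rmse, mape = forecast_metrics([100, 110, 120, 130], [98, 115, 118, 135])
```

Note that MAPE divides by the actuals, so it breaks down if any actual value is zero; the retail series here is safely positive.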

With our forecast visuals and accuracy metrics in place, we can now start exploring alternative models.

Building an exponential smoothing forecast with FORECAST.ETS()

Using FORECAST.LINEAR() gives us a forecast, but it assumes the data follows a straight-line trend. Most business datasets, like retail sales, don’t behave this way because of seasonality and other repeating patterns. That is where FORECAST.ETS() comes in. ETS stands for Exponential Triple Smoothing, and it is designed to capture both trend and seasonality rather than forcing everything into a simple line.

The real test is whether ETS actually performs better than linear forecasting. To find out, we will measure both approaches on our holdout data using metrics like MAE, RMSE, and MAPE. That way we can directly compare accuracy and see how much the seasonal adjustment helps.

Our next step, then, is to use the FORECAST.ETS() function to generate forecasts for the test set:

=FORECAST.ETS(I9, retail_sales[Retail_Sales], retail_sales[Date])
Forecast ETS formulas

This forecast looks very similar to the linear version, but under the hood it is much more complex. FORECAST.ETS() applies Exponential Triple Smoothing, which can account for both trend and seasonality in the data. Excel does allow you to pass additional arguments to fine-tune seasonality, confidence intervals, and other options, but for our purposes we can leave those arguments blank and rely on the defaults. That keeps the function easy to use while still giving us a stronger model than a straight line.
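The "smoothing" at the heart of ETS is a simple recursion, which you can sketch in a few lines of Python. This shows single exponential smoothing only; FORECAST.ETS layers two more recursions (for trend and seasonality) on top of this idea, so treat this as intuition rather than a reimplementation:

```python
def exponential_smooth(values, alpha=0.3):
    """Single exponential smoothing: each smoothed level blends the
    newest observation with the previous level. A smaller alpha means
    the series reacts more slowly to new data."""
    level = values[0]
    levels = [level]
    for y in values[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels

# With alpha=0.5 the level moves halfway toward each new observation
smoothed = exponential_smooth([10, 20, 20, 20], alpha=0.5)
```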

On the right, you can see how the ETS forecast values (column L) are now lined up with the actuals (column J). We have effectively filled in the test set with a new set of predictions, giving us something to directly compare to the linear forecast. Great work. This is exactly what we need to start evaluating how ETS performs.

ETS forecast

Go ahead and recreate the formulas, this time comparing the actuals against the ETS forecast values in column L. That way we are measuring accuracy for this new model. The formulas are the same as before, just swapping in the ETS forecast range.

Looking at the results, the ETS forecast shows a MAE of about 31, RMSE of about 35, and MAPE of about 2%. These numbers are much lower than what we saw with the linear forecast, where MAE was about 81, RMSE about 91, and MAPE about 4%.

So how do we interpret this? Lower error values mean the ETS model is doing a better job at capturing the actual sales. MAE tells us the typical error dropped from roughly 80 units to about 30 units. RMSE shows that big misses are also smaller with ETS, falling from over 90 units to around 35. And MAPE shows the forecast is now off by only about 2 percent on average, compared to 4 percent before.

The takeaway is clear: FORECAST.ETS() is giving us a more accurate forecast than FORECAST.LINEAR() because it can account for the seasonality in the data.

Building an exponential smoothing forecast with Forecast Sheet

Even though our accuracy metrics show that ETS is performing better, it is still a good idea to build a visual. Numbers can confirm performance, but charts often reveal patterns and gaps that metrics alone might miss. We could build this chart manually, but Excel gives us an easier option with the Forecast Sheet wizard:

This tool automatically generates an ETS forecast chart and includes confidence intervals, giving us both the forecast and a sense of its uncertainty. Let’s run Forecast Sheet on our dataset and see what it produces.

To create the forecast, start by selecting the training dataset (i.e., not including the 12 months we set aside) in your worksheet. Then go to Data > Forecast Sheet. Just like the FORECAST.ETS() function, the wizard includes quite a few customization options, and you'll even see a preview of the chart before you commit. For now, I'll leave the default settings and click Create to generate the forecast.

Forecast Sheet settings

When you use Forecast Sheet, Excel is really just running FORECAST.ETS() behind the scenes. In that sense it’s a glorified version of what we already built manually, organized in Excel Table format. The difference is that Forecast Sheet gives you more structure: it outputs a full table with the forecasted values, plus the upper and lower confidence bounds, and it automatically generates a chart that ties it all together.

The included chart ties it together: the blue line represents the historical sales data, the red line shows the forecast generated by ETS, and the shaded area around it (bounded by the upper and lower confidence lines) represents a range of possible outcomes. This is Excel's way of showing uncertainty: instead of a single forecasted value, you get a band that communicates that future values are never known with certainty.

This combination makes Forecast Sheet a helpful tool. You still get the core of FORECAST.ETS(), but with added detail and visuals that highlight both the continuation of seasonal patterns and the uncertainty that comes with forecasting. It's a more polished package that makes your results easier to analyze and explain.

This visual is especially important because even though our accuracy metrics already told us ETS was performing better than a simple linear model, the chart makes the results much easier to interpret. You can see both the continuation of the seasonal pattern and the spread of potential outcomes, which helps build intuition and confidence in the forecast.

Conclusion

In this post we walked through how to evaluate forecast accuracy in Excel using both FORECAST.LINEAR() and FORECAST.ETS(). We started by splitting the data into training and testing sets, then built forecasts, and finally compared them using accuracy metrics like MAE, RMSE, and MAPE. The results showed that ETS does a much better job than a simple linear model because it can capture the seasonality in the data.

We also saw how Forecast Sheet is essentially a more polished version of FORECAST.ETS(). It outputs a full table of forecasts with confidence bounds and creates a chart that makes it easier to interpret results. But as helpful as these features are, working with forecasts in Excel can still feel like a patchwork of different tools. Splitting datasets, calculating metrics, and visualizing results all take extra effort, and it’s not always straightforward to keep everything consistent.

The key takeaway is that seasonality matters, and while Excel has some useful forecasting features, it can be difficult to evaluate models rigorously and present results clearly using the built-in tools alone.

That’s why in the next post we’ll take this further by using Python inside Excel. With Python we can slice through the process more consistently and powerfully, and with the help of Copilot we’ll be able to generate the Python code we need through simple prompts. This combination will give us both the flexibility of code and the accessibility of Excel, making forecasting more accurate and easier to explain.

Python in Excel: How to understand the random walk with Copilot
https://stringfestanalytics.com/python-in-excel-how-to-understand-the-random-walk/
Mon, 15 Sep 2025

As an Excel analyst, you’ve probably had moments where your data seemed to show clear trends: profits rising, sales dropping, or costs climbing steadily. But sometimes, these apparent patterns might actually be random, tricking you into thinking there’s more control and predictability than there really is.

That’s the essence of the “random walk.” It illustrates how purely random processes can look structured, meaningful, and even predictable. Why should you care? Because knowing the difference between true signals and random noise helps you make better, smarter decisions in your analysis, whether it’s forecasting sales, tracking performance, or spotting real trends.

Fortunately, exploring random walks doesn’t require complicated setups or fancy software: just Excel, Python, and Copilot. Open a blank Excel worksheet, launch Copilot, start Advanced Analysis, and you’re ready to follow along.

Simulating a random walk with coin flips

Let’s take a moment to explore randomness using a simple example. Imagine flipping a coin 100 times. Each heads moves you one step forward (+1), each tails one step back (-1). After every flip, you track your position to see how far you’ve drifted from the start.

“Make a Python script that simulates flipping a coin 100 times. Each flip moves you up (+1) or down (–1). Keep a running total so we can see how far we drift. Put the results in a DataFrame with columns Step and Position, and make a simple line plot of the position over time. Explain in comments what’s happening in plain English.”

Copilot responds by creating a neatly organized DataFrame, listing each step alongside your position. It also produces a line chart visualizing your journey through these random flips:

Random walk coin toss results

When you examine this chart, you’ll notice the position line moves unpredictably, sometimes drifting upward, other times downward. Each flip was random, yet it’s natural to try to spot trends or meaningful patterns here. Randomness can trick us into seeing patterns that aren’t really there. By practicing simulations like this, you’ll develop intuition about what’s meaningful and what’s simply noise in real-world datasets.
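If you would rather read the code than prompt for it, a stdlib-only version of the coin-flip walk looks roughly like what Copilot generates, minus the DataFrame and chart (which need pandas and matplotlib):

```python
import random

random.seed(42)  # fixed seed so the walk is reproducible

position = 0
positions = []  # running position after each flip
for step in range(100):
    heads = random.random() < 0.5   # fair coin
    position += 1 if heads else -1  # heads +1, tails -1
    positions.append(position)
```

Re-run it with a different seed (or no seed) and you get a completely different "trend," which is exactly the point.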

Simulating stock prices with a random walk

Instead of flipping a coin, let’s start our simulation this time with a stock price of 100, then add a small, random increase or decrease each trading day for one year (252 days). This subtle daily randomness resembles the fluctuations you’d see in actual stock data. Here’s exactly what you can ask Copilot to do:

“Now change the random walk so it looks like a stock price. Start at 100 and add a little random up or down each day for 1 year (252 days). Plot the stock price. Add comments explaining why this looks kind of like a real stock price (sometimes up, sometimes down, but trending).”

Simulated stock prices random walk

Notice how, like the coin toss simulation, the stock price doesn’t follow a straight line. Instead, it fluctuates over time, occasionally dipping downward, sometimes climbing higher again. Even though each individual day’s change is random, the result looks strikingly similar to actual stock market behavior.
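A bare-bones version of that simulation looks like this; the 1% daily volatility is an assumption for illustration, and Copilot may choose different parameters:

```python
import random

random.seed(7)  # reproducible run

price = 100.0       # starting price
prices = [price]
for day in range(252):  # one trading year
    # small random percentage move each day (1% volatility is assumed)
    price *= 1 + random.gauss(0, 0.01)
    prices.append(price)
```

Multiplying by `1 + move` rather than adding a fixed step keeps moves proportional to the current price, which is closer to how stock returns behave.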

But here’s the thing: One single forecast can only show one possible future. In real life, many outcomes are possible.

Think of the typical forecasts you see in business like projected revenues, sales goals, or stock prices. They’re often presented as one neat line or a single estimate. But relying solely on one forecast can give a false sense of certainty. It’s easy to overlook that the real world rarely matches that neatly drawn line.

Instead, it’s crucial to embrace uncertainty. By visualizing many different scenarios, you can better understand the full range of possible outcomes. This helps you plan for the unexpected, identify potential risks, and avoid overconfidence in a single prediction.

Creating and visualizing multiple random walk simulations

To truly appreciate this idea, let’s extend our simulation and visualize multiple scenarios at once.

“Simulate 100 random stock price paths instead of just one. Plot them all on the same chart so it looks messy. Add comments explaining why one single forecast isn’t enough — in real life we could end up with any of these.”

100 simulated stock prices over 1 year

At first glance, this chart looks chaotic, and that’s precisely the point. Each line represents one possible scenario that might unfold. Some scenarios end positively, some negatively, and others hover in the middle.

Here’s the key insight for you as an Excel analyst: Relying on a single forecast can be risky, because real-life outcomes can vary widely. When you visualize multiple possible futures, you’re reminded of this uncertainty. This helps you make decisions more cautiously and realistically, preparing you for multiple possibilities instead of placing too much confidence in one narrow prediction.
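Under the hood, "simulate 100 paths" is just the single-path loop wrapped in a function and called repeatedly. A sketch, again with assumed parameters:

```python
import random

random.seed(0)

def simulate_path(start=100.0, days=252, daily_vol=0.01):
    """One possible price path: start at `start` and apply a small
    random percentage move each day (parameter values are assumptions)."""
    price = start
    path = [price]
    for _ in range(days):
        price *= 1 + random.gauss(0, daily_vol)
        path.append(price)
    return path

paths = [simulate_path() for _ in range(100)]
```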

Visualizing the “cone of uncertainty”

The previous chart showed us 100 possible outcomes. But realistically, presenting hundreds of scenarios isn’t always practical or helpful. Instead, let’s summarize all these possibilities by showing a clear range of likely outcomes.

Specifically, we’ll calculate the 5th, 50th (median), and 95th percentiles from these 100 scenarios for every day of our simulated year. This creates a visual “uncertainty cone,” clearly communicating not only what’s possible, but also what’s probable.

Here’s how you’d prompt Copilot:

“From the 100 random paths, calculate the 5th, 50th, and 95th percentiles for each day. Plot a shaded cone (5–95%) with the median line in the middle. Add comments explaining how this cone shows a range of likely futures, not just one line.”

Uncertainty cone random walk

Instead of placing too much confidence in a single prediction, this visualization helps you recognize the inherent uncertainty of forecasts. As an analyst, presenting data in this way helps decision-makers understand risk, plan more realistically, and prepare for multiple potential futures—not just the one we hope or expect to see.
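The per-day percentiles behind the cone are straightforward to compute by hand. The sketch below uses a linear-interpolation percentile; Copilot's generated code would more likely call numpy.percentile, and the four tiny paths here are toy data standing in for the 100 simulated ones:

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    k = (len(s) - 1) * q / 100
    f = int(k)
    c = min(f + 1, len(s) - 1)
    return s[f] + (s[c] - s[f]) * (k - f)

# Toy paths for illustration; in practice these come from the simulation
paths = [[100, 101, 99], [100, 103, 102], [100, 98, 104], [100, 100, 100]]

days = len(paths[0])
lower = [percentile([p[d] for p in paths], 5) for d in range(days)]
median = [percentile([p[d] for p in paths], 50) for d in range(days)]
upper = [percentile([p[d] for p in paths], 95) for d in range(days)]
```

Plotting `lower` and `upper` as a shaded band with `median` as a line reproduces the cone.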

Visualizing the distribution of final outcomes

The uncertainty cone showed a broad range of likely outcomes. But let’s focus specifically on the end of our forecast period. If we simulate 500 random stock price paths, where do prices usually end up after one year?

You can prompt Copilot with the following instruction:

“After simulating 500 paths, make a histogram of the final prices after 1 year. Mark the average, the low end (5%), and the high end (95%). Write a short caption in plain English: ‘Most of the time the stock ends up around X, but there’s a chance it could be much lower or higher.'”

Distribution of stock prices random walk

This histogram summarizes clearly how our 500 simulations played out. The average final stock price lands around 114, shown by the solid vertical line. But notice the two dashed lines: at the lower end (5%), the stock price could drop to around 87, while at the higher end (95%), it could climb to around 146.
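Here is a sketch of how the histogram and its markers might be built, using the same illustrative return assumptions as before (so the exact numbers will differ from the figure above):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative assumptions: 500 paths, same daily-return setup as earlier
rng = np.random.default_rng(42)
returns = rng.normal(0.0005, 0.01, size=(500, 252))
finals = 100 * np.cumprod(1 + returns, axis=1)[:, -1]  # price after 1 year

mean_price = finals.mean()
low, high = np.percentile(finals, [5, 95])

fig, ax = plt.subplots(figsize=(8, 4))
ax.hist(finals, bins=30)
ax.axvline(mean_price, linestyle="-", label=f"Mean: {mean_price:.0f}")
ax.axvline(low, linestyle="--", label=f"5%: {low:.0f}")
ax.axvline(high, linestyle="--", label=f"95%: {high:.0f}")
ax.legend()
```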

Conclusion

As explored in this post, randomness plays a larger role in your data than you might initially expect. A single forecast can feel clear and precise, but it rarely captures the true uncertainty of real-world outcomes. By visualizing multiple scenarios, using uncertainty cones, and examining distributions of potential outcomes, you communicate clearly about risk and uncertainty, helping decision-makers set more realistic expectations.

These simulations have limitations, however. Real-life data often includes specific trends, seasonal variations, or sudden shocks that our simplified random walk model doesn’t fully address. To extend your analysis further, consider incorporating historical patterns into your simulations or exploring more sophisticated forecasting techniques like Monte Carlo analysis.

Embracing uncertainty makes you a more effective analyst by improving your judgment, clarifying your communication, and better preparing you and your team for a variety of potential futures.

If you’d like to explore these concepts directly, download the complete Excel sample solution file below:

 

The post Python in Excel: How to understand the random walk with Copilot first appeared on Stringfest Analytics.

Copilot in Excel: How to build ARIMA forecasts with Python https://stringfestanalytics.com/copilot-in-excel-how-to-build-arima-forecasts-with-python/ Tue, 26 Aug 2025 23:46:53 +0000
ARIMA (short for AutoRegressive Integrated Moving Average) is a classic statistical model for time series forecasting. It works by combining three elements: autoregression (using past values to predict future ones), integration (removing trends to make data stationary), and moving averages (capturing patterns in forecast errors).

For analysts, it’s valuable because it turns historical data into actionable forecasts, helping answer questions like “What’s likely to happen next month?” or “How will sales look next quarter?”

Excel does have built-in forecasting models, such as exponential smoothing, but it does not include ARIMA. Implementing ARIMA in traditional Excel requires complex workarounds or external tools, making it impractical for most users.

With Python and Copilot in Excel, this changes. Instead of wrestling with formulas or switching to another application, you can describe your goal in plain language and let Copilot generate the Python code for you. Python handles the statistical heavy lifting while Copilot explains each step. This makes ARIMA forecasting far more accessible and transparent than it’s ever been for Excel users.

In this post, we’ll walk through an ARIMA forecast on the famous airline passengers dataset, which tracks monthly passenger totals from 1949 to 1960. Follow along with the exercise file below.

 

Explanation of ARIMA

First, we’ll have Copilot clearly explain the ARIMA model components and visualize the data with this prompt:

Explain the components of an ARIMA model and describe how each part captures different aspects of time series data. Then plot the data to visualize trends and seasonality.

Copilot gives us a clear, concise summary:

Checking the data’s suitability for ARIMA

It’s good to have a clear game plan in place if we decide to move forward with ARIMA. But how do we know if ARIMA is actually a good choice for our specific dataset? That’s exactly what we’ll ask Copilot next, with this prompt:

Evaluate the suitability of this dataset for ARIMA forecasting by assessing its characteristics, and discuss whether alternative models should be considered based on the data’s properties.

ARIMA suitability

This explanation is clear and helpful, pointing out important details. Specifically, Copilot notes the strong autocorrelation, meaning past values significantly influence current observations. It also identifies the presence of seasonality and non-stationarity. Non-stationarity means the statistical properties, like mean and variance, change over time… something ARIMA explicitly addresses through differencing.

Copilot also rightly suggests that because of the clear seasonal pattern, we might want to consider a Seasonal ARIMA (SARIMA) or even an Exponential Smoothing model, both of which explicitly handle seasonality. This highlights the importance of carefully matching our forecasting model choice to the unique characteristics of our data.

Checking for stationarity and performing seasonal adjustments

At this point, Copilot has shown us that our dataset has strong seasonal patterns, which could make forecasting challenging if left unaddressed. With this next prompt, our goal is to break down the concept of seasonal differencing into clear, non-technical steps. This helps Excel users grasp exactly how adjusting for seasonality makes the dataset “stationary” for ARIMA to handle effectively.

“Explain what seasonal adjustments mean for this dataset and then apply a seasonal difference to the data step-by-step. Tell me how this helps ARIMA in a way a non-technical Excel user can follow.”

As Copilot explains, seasonal adjustments essentially mean removing regular, predictable patterns (such as spikes in sales every holiday season) from your data. It does this by subtracting, from each month’s value, the value from the same month of the previous year. By doing this, the repeating seasonal pattern disappears, leaving behind clearer information about trends and unpredictable changes.
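In pandas, the seasonal differencing Copilot describes boils down to a one-liner. Here is a sketch on a made-up monthly series standing in for the airline data (the trend and seasonal values are invented for illustration):

```python
import pandas as pd
import numpy as np

# Illustrative stand-in series: a monthly index with a steady trend
# plus a repeating 12-month seasonal pattern
idx = pd.date_range("1949-01", periods=48, freq="MS")
seasonal = np.tile([0, 2, 5, 8, 12, 15, 20, 18, 12, 8, 4, 1], 4)
series = pd.Series(100 + np.arange(48) + seasonal, index=idx)

# Seasonal difference: this month's value minus the same month last year.
# The first 12 values are NaN because there is no prior year to subtract.
seasonal_diff = series.diff(12)
```

Because the invented seasonal pattern repeats exactly, the differenced series is perfectly flat here; real data would leave behind trend and noise.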

Creating the ARIMA forecast

Next, we’ll prompt Copilot to build the model. As we do this, it’s important to clearly see how closely our model’s predictions match actual historical data. We’ll use the following prompt:

Fit a simple ARIMA model to the data. Visualize and explain the model summary in simple language. Focus on what Excel users need to know to see if the model is good.

Copilot’s output is exactly what we want: clear, practical, and visual. The dashed orange line (predictions) closely tracks the blue line (actual data), which means our ARIMA model is accurately capturing the pattern in the historical data. This visual check is crucial because it gives Excel users a quick and intuitive way to judge the quality of the ARIMA forecast without needing to dive deeply into complex statistical metrics.

Visualizing the ARIMA forecast

Since this initial check looks promising, we might next explore using Copilot to evaluate forecast accuracy in greater depth.

Recall that earlier, to make our dataset suitable for ARIMA, we performed seasonal differencing. While differencing helped us make the data stable enough to accurately forecast, it also transformed the data away from its original units, making our predictions harder to interpret directly. That’s why it’s essential we now reverse this step, restoring the forecasts back to their original scale, such as actual monthly passenger counts. This way, Excel users can directly compare forecasts to historical data.

We’ll use the following prompt to have Copilot clearly handle this step and complete the forecast:

Use my fitted ARIMA model to forecast the next 12 months of the original (non-differenced) dataset. After forecasting, reverse the seasonal differencing to return the forecasted values to their original scale. Plot these forecasted values alongside the historical data (on its original scale), clearly labeling actual vs. forecasted values. Explain the forecast in a way that Excel users can easily understand.

ARIMA forecast

Copilot’s visual clearly shows the forecast in the familiar context of our original historical passenger counts, making it easy for Excel users to see how the future predictions naturally follow the historical pattern. By reversing the seasonal differencing step, users can confidently interpret forecasts in practical terms, quickly gauging expected trends and seasonal behaviors going forward.
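The inversion itself is just addition: each differenced forecast gets the actual value from the same month one year earlier added back. A sketch with invented numbers:

```python
import numpy as np
import pandas as pd

# Illustrative: undo a 12-month seasonal difference for a 12-step forecast.
# 'forecast_diff' holds hypothetical forecasts of the *differenced* series.
idx = pd.date_range("1949-01", periods=48, freq="MS")
original = pd.Series(100 + np.arange(48, dtype=float), index=idx)
forecast_diff = pd.Series([12.0] * 12,
                          index=pd.date_range("1953-01", periods=12, freq="MS"))

# Forecast on the original scale = differenced forecast
# plus the actual value from the same month one year earlier
restored = forecast_diff + original.iloc[-12:].to_numpy()
```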

Comparing ARIMA to Holt-Winters visually

We’ve successfully created forecasts with our ARIMA model. But to make sure we’re using the best model possible, let’s compare ARIMA to another popular forecasting method—Exponential Smoothing (specifically the Holt-Winters approach)—which is particularly good at handling clear trends and seasonal patterns like we have in this dataset.

Here’s the prompt we’ll give Copilot to do that:

Now fit an exponential smoothing (Holt-Winters) model with a 12-month seasonal period. Plot its forecast on the same chart as the ARIMA forecast and the actual data so they’re easy to compare. Briefly explain in plain language which model appears to fit better and why that might be the case for this dataset.

Arima vs Holt Winters

If we look closely at the visualization, both forecast lines (ARIMA and Holt-Winters) appear very similar and closely match historical patterns. While Copilot’s explanation gently nudges us toward Holt-Winters—primarily because Holt-Winters explicitly handles trend and seasonality without requiring differencing—it’s tough to confidently pick the “best” model by eye alone.

Comparing ARIMA to Holt-Winters through accuracy metrics

Since these forecasts look pretty close, we can’t make a definitive choice based solely on visual comparisons. To choose confidently, we should dive into summary metrics (such as mean absolute error or mean squared error) next. These numerical measures will clearly quantify which model provides the most accurate predictions.

Evaluate my forecasts by calculating key accuracy metrics like RMSE (Root Mean Squared Error), MAE (Mean Absolute Error), and MAPE (Mean Absolute Percentage Error). Explain each metric simply, focusing on what it means for assessing forecast quality. Then summarize whether my models provide reliable forecasts based on these results.

ARIMA vs Holt Winters model evaluation

Copilot’s summary metrics give us a precise, easy-to-follow comparison. RMSE, MAE, and MAPE all indicate forecast accuracy—the smaller, the better. While visually both models seemed similar, these numbers clarify that ARIMA actually provides slightly more precise forecasts in this case.

This step highlights to Excel users that accuracy metrics can uncover subtle differences between forecasting methods, helping them confidently select the best model for their data.
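For reference, all three metrics are easy to compute by hand. This sketch uses made-up actual and forecast values purely to show the formulas:

```python
import numpy as np

# Illustrative, hypothetical actual vs. forecast values
actual = np.array([100.0, 110.0, 120.0, 130.0])
forecast = np.array([102.0, 108.0, 123.0, 128.0])

errors = actual - forecast
mae = np.mean(np.abs(errors))                  # average size of a miss
rmse = np.sqrt(np.mean(errors ** 2))           # penalizes large misses more
mape = np.mean(np.abs(errors / actual)) * 100  # miss size as a percentage
```

For all three, smaller is better; RMSE will always be at least as large as MAE, and the gap between them hints at whether a few big misses are driving the error.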

Conclusion

Working through ARIMA forecasting step by step shows how each component of the model contributes something valuable. Autoregression lets past values guide our predictions, integration helps manage trends and keep forecasts realistic, and moving averages handle unexpected fluctuations smoothly. ARIMA combines these elements effectively, making it versatile for various datasets with clear historical patterns.

Yet as we’ve seen, exploring alternative methods such as Holt-Winters exponential smoothing provides additional context. Holt-Winters explicitly manages trends and seasonality, often making it an excellent choice for data with strong repeating patterns. Evaluating these models side by side underscores the value of testing multiple forecasting techniques. Accuracy metrics like RMSE, MAE, and MAPE further highlight subtle but meaningful differences between models, guiding more confident model selection.

The next logical steps in your forecasting journey include experimenting with more sophisticated variations. Tools like Prophet or even machine learning methods such as gradient boosting or neural networks might also be appropriate when the data becomes particularly complex or when traditional methods fall short. Many of these advanced approaches remain easily accessible directly in Excel through Python and Copilot integration.

By expanding your toolkit to include methods like ARIMA and Holt-Winters, you gain deeper insight into the strengths and limitations of each forecasting technique, enabling you to make better-informed decisions tailored specifically to your data challenges.

The post Copilot in Excel: How to build ARIMA forecasts with Python first appeared on Stringfest Analytics.

Copilot in Excel: How to build exponential smoothing forecasts with Python https://stringfestanalytics.com/copilot-in-excel-how-to-build-exponential-smoothing-forecasts-with-python/ Tue, 19 Aug 2025 23:11:35 +0000
Forecasting is a constant challenge for Excel users. Business data doesn’t just grow in straight lines… it shifts with trends, cycles, and noise. Exponential smoothing is a family of methods that accounts for these patterns by weighting recent data more heavily while still learning from the past.

Excel does offer basic exponential smoothing through tools like the Analysis ToolPak and Forecast Sheet, but its features are limited compared to Python and Copilot, which make it easier to explore variations, see diagnostics, and understand what’s happening under the hood.

To show the full picture, we’ll use the classic Airline Passengers dataset, a monthly record from 1949–1960 with both a clear upward trend and repeating seasonal cycles, exactly the kind of structure exponential smoothing was built for.

You can follow along using the download file below:

 

Simple Exponential Smoothing

We’ll start simple with Simple Exponential Smoothing (SES). This is the most basic version of exponential smoothing. It assumes the data more or less bounces around a stable level with no trend and no seasonality.

Let’s see what happens if we apply SES to the airline passenger data:

“Fit a simple exponential smoothing (SES) model to the dataset and plot the smoothed line against the actual data. Use this to explain why SES alone isn’t enough when the data has a clear trend and seasonality.”

Simple exponential smoothing

The forecast, shown by the orange dashed line, comes out completely flat. The model is basically saying, “passenger counts will just stay around the same level as the last year or so of data.” But the actuals, shown by the blue line, tell a different story. The series is clearly growing, and it has regular seasonal ups and downs.

This mismatch is important. SES did exactly what it was designed to do: smooth the data and project a constant level. The problem is that this dataset doesn’t fit those assumptions. Starting here gives us a baseline. Now we can ask, what happens if we use models that account for trend and seasonality?

Holt’s linear trend

Now let’s take things a step further with Holt’s linear trend method. This model adds a second smoothing component for trend, so instead of projecting a flat line like SES, it tries to extend the overall direction of the series into the future:

Apply Holt’s linear trend method to the Airline Passengers dataset. Plot actual values through 1960, then overlay only the next 24 months of forecast. Explain how adding a smoothed trend improves the forecast compared to SES, but why it still doesn’t handle seasonality in the data.

Holt's linear trend forecast

This time the forecast rises upward instead of staying flat. Compared to SES, that’s an improvement because Holt’s method is recognizing that the series has been trending up over time. But notice what’s missing. The actual passenger counts go up and down in a regular seasonal cycle, and the forecast doesn’t capture any of that. Holt’s method can follow the long-term direction of the data, but it still smooths away those repeating seasonal swings.

This is progress compared to SES, but we’re still not fully matching what the data is doing. Before we add seasonality, let’s first look at a variation that keeps the trend itself in check.

Damped trend model

The next step is to look at a variation of Holt’s method called the damped trend model. Holt’s linear trend keeps extending upward at the same rate forever, which can be too aggressive for many real-world situations. The damped version adds a mechanism to gradually flatten the trend as the forecast goes further out.

Fit a damped trend model to the Airline Passengers dataset. Plot the actual data through 1960, then add only the forecast for the next 24 months. How does this version of the trend look compared to Holt’s linear trend?

Holt's damped trend forecast

In this forecast, the orange dashed line starts off like Holt’s linear trend but then slowly bends and flattens over time. Compared to the straight-line projection we saw earlier, this version looks more cautious. It avoids predicting unlimited growth at the same steep rate and instead tapers the trend.

This makes the damped trend especially useful when you think a series will keep growing but not as quickly as in the past. At the same time, it has the same limitation as Holt’s linear trend: it doesn’t capture the repeating seasonal ups and downs in the airline passenger data. That’s the next piece we’ll add.

Holt-Winters seasonal model

So far, we’ve tried SES, Holt’s linear trend, and Holt’s damped trend. Each one added something new, but none of them picked up on the repeating seasonal pattern that’s so obvious in the airline passenger data. The Holt-Winters model extends Holt’s method by adding a seasonal component.

“Fit a Holt-Winters seasonal model to the Airline Passengers dataset. Plot the actual values through 1960, then only the 24-month forecast. Do you see anything new in the way this forecast follows the patterns in the data?”

Holt winters forecast

This time the forecast doesn’t just rise upward in a smooth curve. Instead, it rises and falls in a regular cycle, closely matching the seasonal peaks and dips we see in the actual data. That repeating wave pattern was missing from the earlier forecasts.

With Holt-Winters, we now have a model that reflects three things at once: the overall level of the series, the upward trend, and the repeating seasonal cycle. For this dataset, that combination makes the forecast much more realistic than what we saw with SES or Holt’s methods.

Forecast model comparison

So far we’ve been looking at the different smoothing models by eye. That’s helpful, but in practice we often want a more objective way to compare them. One common approach is to split the data into a training set and a test set, then measure how well each model’s forecasts match the actuals in the test period.

Fit SES, Holt’s linear trend, and Holt-Winters seasonal models. Split the data into training (up to 1958) and test (1959–1960). For each model, generate forecasts and compute accuracy metrics such as MAE, RMSE, and MAPE. Summarize the results in a comparison table and explain which model performed best.

Model comparison

The table shows accuracy metrics for each model on the 1959–1960 test period. SES has the largest errors, Holt’s linear trend does a bit better, and Holt-Winters clearly outperforms both. Its MAE, RMSE, and MAPE are all much lower than the other two models.

This makes sense given what we saw visually. The airline passenger data has an upward trend and a repeating seasonal cycle, so the model that can capture both ends up giving the best forecasts. SES was too simple and ignored both features. Holt’s method accounted for the trend but still missed the seasonal swings. Holt-Winters brought everything together.

This exercise shows the value of comparing models in a structured way. Instead of just relying on a chart, you can use error metrics to confirm which model is the best fit for your data.

Conclusion

Working through exponential smoothing step by step shows how each model adds something new. SES gives a flat forecast, which works only when the data has no structure. Holt’s method improves things by adding a trend, and the damped trend variation shows how you can keep that growth from running away. Holt-Winters takes it further by layering in seasonality, which is why it matched our particular dataset of airline passenger counts so well.

The next step in analysis, as usual, is broadening the toolkit. You might experiment with additive vs multiplicative seasonality, add forecast intervals to communicate risk, or move beyond exponential smoothing into ARIMA, SARIMA, or Prophet for more flexible modeling. Machine learning approaches like gradient boosting or neural networks can go even further when the data is complex. Many of these techniques can again be implemented directly in Excel with the help of Python and Copilot.

For Excel users, the key lesson is that built-in tools like Forecast Sheet and the Analysis ToolPak implement only one flavor of Holt-Winters behind the scenes. It’s a good starting point, but not the whole story. By learning the wider family of smoothing methods, you can see when Excel’s defaults fall short and when it’s worth exploring other tools. As with most analyst tasks, there’s no single best forecasting model. The real skill is knowing the strengths and limits of each approach and matching them to the problem.

The post Copilot in Excel: How to build exponential smoothing forecasts with Python first appeared on Stringfest Analytics.

Python in Excel: How to visualize seasonality https://stringfestanalytics.com/python-in-excel-how-to-visualize-seasonality/ Wed, 09 Jul 2025 21:02:27 +0000
Visualizing seasonality is a critical analytical skill in business because many essential metrics like sales, revenue, or demand often follow predictable seasonal patterns. Understanding these patterns helps businesses anticipate periods of peak activity, manage inventory, allocate resources effectively, and create accurate forecasts.

Python in Excel makes seasonal analysis especially straightforward. Unlike traditional Excel, where handling and grouping date-based data can become complex or cumbersome, Python seamlessly manages dates and periods, enabling intuitive grouping, aggregation, and visualization.

Ready to see this in action? Follow along using the famous airline passengers dataset by downloading the exercise file below:

 

Before we start visualizing seasonal patterns, we need to prepare our dataset by extracting date-related components. The original file contains a month column in date format, so we first convert it using pd.to_datetime() to ensure pandas recognizes it as a proper datetime object.

From there, we extract the year, month name, and month number, storing each in its own column. These new fields allow us to group, sort, and plot the data by month and year: critical for showing seasonality over time. This simple prep step makes the upcoming visualizations far easier to build and interpret in Python.
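Those prep steps look roughly like this in pandas; the column names `Month` and `Passengers` are assumptions based on the dataset description:

```python
import pandas as pd

# Illustrative: a tiny stand-in for the airline passengers table
df = pd.DataFrame({
    "Month": ["1949-01-01", "1949-02-01", "1950-01-01"],
    "Passengers": [112, 118, 115],
})

# Ensure pandas treats the column as real dates, then pull out the parts
df["Month"] = pd.to_datetime(df["Month"])
df["year"] = df["Month"].dt.year
df["month_name"] = df["Month"].dt.month_name()
df["month_num"] = df["Month"].dt.month
```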

Seasonal overlay plot

With the data set up, our first visualization is a line chart showing airline passenger counts for each month, with separate lines representing each year. To highlight growth across this period, we’re labeling selected years (1949, 1955, and 1960) at their December data points. It would be possible to label every year, but neatly positioning all these labels is tricky in Python in Excel without access to external libraries like adjustText, which manage overlapping labels automatically.

This plot clearly reveals recurring seasonal patterns: passenger volumes consistently peak around mid-year, particularly July and August, before declining toward December. Overlaying multiple years helps to compare these seasonal cycles directly, allowing us to quickly assess consistency across time periods and identify overall growth trends. Seeing each year as its own line makes it easy to recognize both typical seasonal fluctuations and any unusual deviations, making this visualization especially valuable for exploratory analysis and trend identification.

Monthly passenger trends by year
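A rough sketch of the overlay idea, using an invented miniature dataset in place of the real one:

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Illustrative: three years of monthly values with growth between years
df = pd.DataFrame({
    "year": np.repeat([1949, 1950, 1951], 12),
    "month_num": list(range(1, 13)) * 3,
})
df["Passengers"] = 100 + (df["year"] - 1949) * 20 + df["month_num"]

# One line per year, months on the x-axis
fig, ax = plt.subplots(figsize=(8, 4))
for year, grp in df.groupby("year"):
    ax.plot(grp["month_num"], grp["Passengers"], label=str(year))
ax.set_xlabel("Month")
ax.set_ylabel("Passengers")
ax.legend()
```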

To be fair, you could easily make a plot like this in Excel using PivotTables. But rather than linger on what’s simple in Excel, let’s keep moving and check out some visualizations Python handles even more gracefully.

Seasonal subseries plot

In the next plot, we’re first creating a boxplot for each month using seaborn’s boxplot(). These boxes illustrate the typical passenger counts—showing medians (the central line), interquartile ranges (the box itself), and overall distribution (the whiskers). Next, we layer a stripplot with individual data points jittered to prevent overlapping, providing extra transparency and detail about each month’s specific data.

The resulting visualization highlights seasonal patterns in the passenger dataset. You can quickly see that passenger numbers tend to rise consistently from January, peak strongly during mid-year (particularly July and August), and then gradually decrease toward year’s end. Additionally, the combination of boxplot and individual data points reveals the month-to-month variability and potential outliers—months like July show a higher median and greater spread, while months such as November show lower and tighter distributions. This helps analysts clearly understand not only the typical monthly passenger counts but also how consistent or variable each month’s pattern is.

Box plus strip
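The layering described above can be sketched with seaborn on invented data (the column names are assumptions):

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Illustrative: 10 observations per month for three months
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "month_name": np.repeat(["Jan", "Feb", "Mar"], 10),
    "Passengers": np.concatenate(
        [rng.normal(m, 5, 10) for m in (110, 120, 130)]),
})

fig, ax = plt.subplots(figsize=(8, 4))
# Layer 1: boxes showing median, quartiles, and whiskers per month
sns.boxplot(data=df, x="month_name", y="Passengers",
            color="lightblue", ax=ax)
# Layer 2: the individual points, jittered to avoid overlap
sns.stripplot(data=df, x="month_name", y="Passengers",
              color="black", jitter=True, alpha=0.6, ax=ax)
```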

Small multiples plot

In this next visualization, we’re creating a small multiples plot (also known as a facet grid) to examine monthly passenger trends individually, year-over-year. Using seaborn’s FacetGrid, we’ve broken the data into twelve distinct subplots; one for each month. Within each subplot, the gray line shows how passenger counts for that month changed from year to year, allowing us to easily track long-term trends month-by-month.

Additionally, we’ve calculated and drawn a horizontal dashed line representing the average passenger count for each month across all years. This provides a useful benchmark, helping us quickly assess whether a given month in a particular year was above or below its long-term average.

Small multiples seasonal plot

This type of visualization is valuable because it separates seasonal effects from broader trends. We clearly see the growth pattern repeated across months, confirming an overall increasing passenger volume year by year. At the same time, individual facets let us assess each month separately, which highlights whether certain months grew faster or slower relative to their historical averages. This structured view helps analysts and decision-makers identify subtle variations or exceptional cases more quickly than by looking at a single aggregated plot.

Seasonal heatmap

To wrap things up, we have a seasonal heatmap visualization created with seaborn’s heatmap() function. The dataset has been pivoted so that each row represents a year, each column represents a month, and each cell displays the number of airline passengers. Colors ranging from pale yellow (low passenger counts) to deep blue (high passenger counts) illustrate the intensity of passenger traffic clearly and immediately.

This visualization helps us quickly grasp patterns and trends at a glance. As you scan from left to right, the seasonal pattern we’ve previously seen becomes very apparent—passenger numbers consistently peak around July and August, as indicated by the darker colors in those months. Scanning from top to bottom (earlier to later years), the deepening shades of blue illustrate the overall upward trend in passenger traffic across the entire dataset.

Seasonal heatmap
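The pivot-then-heatmap pattern can be sketched like this, with invented values and assumed column names:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Illustrative: two years of monthly passenger counts
df = pd.DataFrame({
    "year": np.repeat([1949, 1950], 12),
    "month_num": list(range(1, 13)) * 2,
    "Passengers": np.arange(24) + 100,
})

# Rows become years, columns become months, cells hold passenger counts
pivot = df.pivot(index="year", columns="month_num", values="Passengers")

fig, ax = plt.subplots(figsize=(8, 3))
sns.heatmap(pivot, cmap="YlGnBu", ax=ax)  # pale yellow = low, deep blue = high
```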

Conclusion

Being able to visualize seasonality using Python in Excel offers significant advantages to analysts. Python streamlines date management, simplifies the aggregation of data across time periods, and provides flexible and insightful visualizations that can deepen your analytical insights and decision-making abilities.

However, it’s important to recognize some limitations. Python in Excel currently doesn’t support certain advanced visualization features readily available through dedicated Python libraries, such as automated label adjustment or highly interactive plots. For tasks requiring very detailed customizations, you may still need standalone Python environments.

A natural next step would be exploring seasonal decomposition, a more robust and statistically rigorous technique for analyzing seasonality. Seasonal decomposition breaks down a time series into clear components—trend, seasonal, and residual—which sets you up perfectly for forecasting future values. Applying this method will further enhance your analytical capabilities and precision in forecasting.

If you have any questions about visualizing seasonality or Python in Excel more broadly, I’d love to hear from you! Drop your comments and questions below.

The post Python in Excel: How to visualize seasonality first appeared on Stringfest Analytics.

Advanced analysis with Python in Copilot: How to work with time series data https://stringfestanalytics.com/advanced-analysis-with-python-in-copilot-how-to-work-with-time-series-data/ Tue, 01 Oct 2024 18:32:25 +0000
Time series analysis lets analysts identify patterns, trends, and cyclic fluctuations over time. These insights are essential for accurate forecasting, strategic planning, and informed decision-making.

Despite its significance, working with dates and times in a sophisticated manner is often hindered by Excel, which has become the butt of many internet jokes due to its handling of dates and times:

Although Excel has traditionally struggled with handling time series data, integrating Python within Excel has significantly enhanced its capabilities. The popular Pandas package, widely used in this environment, is even named in part after “Panel Data,” a type of time series data.

To make things even more user-friendly, we can now leverage Copilot’s Advanced Analysis features—a generative AI-assisted tool—to run various time series analyses on our data with ease.

To explore this feature on a time series dataset, download the exercise file below, which contains a copy of a well-known monthly shampoo sales dataset:

 

If you’ve never used this feature before and aren’t sure how to load the dataset into Copilot’s Advanced Analysis, check out the post below for step-by-step instructions:

One of the key points covered in that article is that the results from Advanced Analysis can be volatile from one session to another. This means you will likely get different results than what is shown here!

In my run, it began by providing some basic statistics about the data, including descriptive statistics and a basic plot—a pretty good start:

Basic data exploration Copilot

I’ve always found it overly complicated in Excel just to resample data, that is, to change the level of the date aggregation. For example, if I want to visualize the data by quarterly sales instead of monthly, all I have to do is ask Copilot something like, “Can you visualize total shampoo sales by quarter?” and I’ll get the following plot:

Because this dataset tracks sales over time, a line chart would typically be the natural choice here. However, visualizing total sales over time as a bar chart can emphasize individual time periods, making it easier to compare sales across specific months or years. This approach highlights distinct intervals, which can be useful if the data is categorical or if the focus is on total sales in each period rather than the trend between them.

You could, of course, make those changes in the underlying Python code yourself, or you could see if Copilot can adapt and respond by providing you with a new chart option, which I’ll do:

Python advanced analysis from line to bar plot
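If you prefer to do the resampling step yourself rather than asking Copilot, it’s essentially a one-liner in pandas. Here is a minimal sketch with made-up numbers (the actual exercise file’s column layout may differ):

```python
import pandas as pd

# Hypothetical monthly sales series; the real exercise file's
# columns and values will differ.
sales = pd.Series(
    [100.0, 120.0, 90.0, 130.0, 110.0, 150.0],
    index=pd.date_range("2024-01-01", periods=6, freq="MS"),
)

# Resample monthly values to quarterly totals -- the same aggregation
# Copilot applies when asked for total sales by quarter.
quarterly = sales.resample("QS").sum()
print(quarterly.tolist())  # → [310.0, 390.0]
```

Swapping `.sum()` for `.mean()` would give average rather than total sales per quarter, which is the kind of change you can make directly in the generated code.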

Of course, we could continue describing and visualizing this dataset, but let’s dive into building forecasts. It’s always a good idea, when trying to build a predictive model, to take things step by step. For example, with time series data, it’s wise to first assess the stationarity of the dataset.

If you’re not familiar with this concept, stationarity in a time series means that its statistical properties—such as mean, variance, and autocorrelation—remain constant over time. Checking for stationarity is crucial before building a forecast because many forecasting models assume that the time series is stationary for reliable predictions.

While Copilot can definitely provide guidance here, it’s helpful to have some foundational knowledge of the subject as well.

In this case, I’m going to ask Copilot, “Can you check for the stationarity of this dataset?”

The results will include the augmented Dickey-Fuller test, a common method for assessing stationarity, along with some interpretation. Copilot determined that the raw data is not stationary, which could cause issues for forecasting, but it also suggested some potential ways to handle this situation.

Stationarity of dataset
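One of the standard remedies for non-stationarity, and a likely Copilot suggestion, is differencing: working with period-over-period changes instead of the raw values. A minimal pandas sketch with toy numbers:

```python
import pandas as pd

# Toy upward-trending series standing in for the shampoo data.
sales = pd.Series(
    [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
    index=pd.date_range("2024-01-01", periods=6, freq="MS"),
)

# First-order differencing replaces each value with its change from the
# previous period; a steady linear trend becomes a constant series,
# which is (closer to) stationary.
diffed = sales.diff().dropna()
print(diffed.tolist())  # → [2.0, 2.0, 2.0, 2.0, 2.0]
```

Rerunning the augmented Dickey-Fuller test on the differenced series is the usual way to confirm the transformation helped.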

I’ll definitely keep this in mind, as everything’s being stored in the Copilot chat session for easy reference. Now, let’s ask Copilot:

“Based on what you’ve assessed about the stationarity of this data, use a sensible technique to forecast the next three months of sales. Visualize the results.”

This will prompt Copilot to apply a relevant forecasting method based on the stationarity assessment, likely using something like ARIMA or a similar model suitable for non-stationary data, and then generate a visual representation of the forecasted sales for the next three months:

Weird forecast viz output

The jump between the actual and forecasted values in the ARIMA forecast plot above could be due to the model being trained on non-stationary data, making it difficult to accurately predict the transition from historical to forecasted values. Another possibility is a mismatch between the last observed data point and the first predicted point, which may result from how the forecast horizon is set. If the model’s differencing or other parameters aren’t well-tuned, this could also cause a noticeable shift. This is a great opportunity to use Copilot for further analysis or to dig into the Python code to investigate whether there’s truly an issue here.

In conclusion, integrating Python into Excel opens up a wide range of possibilities for time series analysis, far beyond what Excel can traditionally handle on its own. With tools like Copilot, Excel users can now easily perform complex tasks such as resampling data, checking for stationarity, and building advanced forecasting models like ARIMA.

But the potential doesn’t stop there. For example, Excel users can utilize Python’s advanced time series decomposition techniques to break down data into trend, seasonal, and residual components, providing deeper insights into underlying patterns. Additionally, Python enables the application of machine learning models for more sophisticated predictions, anomaly detection to identify unusual events, or even Monte Carlo simulations to model future uncertainties. These techniques allow Excel users to explore time series data in ways that would have been challenging, if not impossible, with Excel alone.
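The decomposition technique mentioned above is simple enough to sketch by hand in pandas (statsmodels automates the same steps with its seasonal decomposition helper). The data here is synthetic, built so the trend, seasonal, and residual components are known:

```python
import numpy as np
import pandas as pd

# Synthetic three years of monthly data: a linear trend plus a fixed
# 12-month seasonal cycle that sums to zero.
idx = pd.date_range("2021-01-01", periods=36, freq="MS")
cycle = [5.0, -3.0, 1.0, 0.0, 4.0, -2.0, 3.0, -1.0, 2.0, -4.0, 0.0, -5.0]
y = pd.Series(100.0 + 2.0 * np.arange(36) + np.tile(cycle, 3), index=idx)

# Classical additive decomposition, done by hand:
trend = y.rolling(12, center=True).mean()                  # moving average
detrended = y - trend
seasonal = detrended.groupby(idx.month).transform("mean")  # monthly averages
residual = y - trend - seasonal                            # what's left over

print(residual.dropna().abs().max())  # effectively zero for this additive toy
```

On real data the residual won’t vanish like this; how large and how patterned it is tells you how much structure the trend and seasonal components failed to capture.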

What questions do you have about working with time series data in Python within Excel or using Copilot’s Advanced Analysis features? Let me know in the comments.

How to understand what “AI-Powered Excel” is all about https://stringfestanalytics.com/how-to-understand-what-ai-powered-excel-is-all-about/ Thu, 07 Mar 2024 02:17:50 +0000 https://stringfestanalytics.com/?p=12906 As someone who has created LinkedIn Learning courses on AI for Excel and written extensively about the topic on my blog, I am frequently asked about how AI will influence the future of Excel. One of the most common—and arguably the most intriguing—questions is, “What exactly is AI-Powered Excel? What does it encompass?” This question […]

The post How to understand what “AI-Powered Excel” is all about first appeared on Stringfest Analytics.

As someone who has created LinkedIn Learning courses on AI for Excel and written extensively about the topic on my blog, I am frequently asked about how AI will influence the future of Excel. One of the most common—and arguably the most intriguing—questions is, “What exactly is AI-Powered Excel? What does it encompass?”

This question is perfectly reasonable, considering the definition of AI can be somewhat vague due to its broad applicability and the rapid evolution of technology. AI spans a wide array of systems and technologies that mimic human intelligence, encompassing learning, reasoning, problem-solving, perception, and language understanding. Yet, this generality can introduce ambiguity, as there is no universal agreement on what precisely constitutes “intelligence” in machines.

In the context of Excel, AI is integrated in various ways, each aimed at making our interaction with data not just easier but also seemingly intelligent. Consider the Flash Fill feature, which intuitively completes data based on observed patterns. It’s akin to having an astute observer at your side, understanding your intentions and stepping in to assist, thereby reducing the monotony of repetitive tasks.

AI further excels in guiding users through Excel’s intricate formula system. With suggestions and explanations, it feels as though you have a guide decoding the language of data for you, making complex calculations more approachable. Similarly, AI intervenes to recommend the most effective data visualization techniques, applying principles that would typically demand a deep level of expertise.

These capabilities hinge on pattern recognition, a cornerstone of AI, where the system intelligently identifies patterns in your formulas and data, automating the completion process.

Excel’s forecasting tools also showcase AI’s potential, using advanced algorithms to predict future trends. Here, AI functions like an experienced analyst, analyzing historical data to forecast upcoming patterns. While the underlying forecasting algorithms may not always be considered AI due to their reliance on static formulas that don’t learn, features like the Forecast Sheet, with its ability to generate sophisticated forecasts and identify potential data issues, edge closer to AI applications.

Moreover, AI acts as a supportive tool in data analysis and visualization, leveraging generative AI capabilities to derive insights and suggest actions. This functionality extends from Excel’s Recommended PivotTables and Analyze Data features to the newer Copilot integration. Copilot is designed to further enhance productivity and analysis within workbooks through natural language interaction.

However, it’s crucial to acknowledge that these generative AI tools might occasionally err. Therefore, maintaining oversight of your data and asserting your authority over Excel’s AI features is vital. Ensure that you don’t forsake your human intelligence under the assumption that the computer’s artificial intelligence is infallible.

Understanding AI in Excel goes beyond strict definitions or the intricacies of machine learning models. It’s about appreciating the practical benefits these features offer. AI in Excel bridges the gap between technology and human assistance, delivering tools and insights that make data analysis not only more potent but also more intuitive.

Have you utilized AI in Excel, perhaps without initially realizing it? Where do you see potential for AI in your Excel projects? I’d love to hear about your experiences and interests in the comments.

How much money do you make writing a technical book? https://stringfestanalytics.com/how-much-money-do-you-make-writing-a-technical-book/ Wed, 29 Nov 2023 17:41:03 +0000 https://stringfestanalytics.com/?p=12152 Writing a technical book is a crowning achievement in my life and career, a milestone that fills me with immense pride. It’s an experience so enriching that I’m eagerly diving into it again. I’m passionate about encouraging others to embark on this journey, ensuring they don’t see it as unattainable. However, when discussing this transformative […]

The post How much money do you make writing a technical book? first appeared on Stringfest Analytics.

Writing a technical book is a crowning achievement in my life and career, a milestone that fills me with immense pride. It’s an experience so enriching that I’m eagerly diving into it again. I’m passionate about encouraging others to embark on this journey, ensuring they don’t see it as unattainable.

However, when discussing this transformative experience, conversations often shift from dreams and ideals to the more pragmatic realm of finances. A question I frequently encounter is, “How much money will I make doing this?”

In this post, we’ll explore why focusing solely on financial gain might not be the best approach, but we’ll also address this important aspect. Let’s delve into the realities of authoring a technical book, balancing passion with practicality.

This is the wrong question to ask

Let me emphasize: if your primary concern in writing a technical book is how much money you’ll make, you might need to reconsider your mindset. There are several reasons for this.

Firstly, writing a book requires immense dedication and perseverance. You won’t see any financial return for many months, and the outcome is highly speculative. It’s akin to asking an entrepreneur the exact profit from starting a business – it’s simply unpredictable, with risks and rewards deeply intertwined.

Moreover, if you’re used to a regular paycheck, particularly as a technical professional, writing a book represents a significant shift. It’s not just about financial gain; it requires intrinsic motivation. This might include a desire to share your knowledge, contribute to your field, or challenge yourself creatively. For technical writers, there’s a substantial opportunity cost in terms of time. Financially, writing may not be the most lucrative option, but its value extends beyond monetary measures. It’s about passion, contribution, and personal growth.

The indirect payments might matter more, anyway

While it’s true that writing a technical book often doesn’t lead to substantial direct earnings, it’s crucial to recognize the indirect benefits that can be far more valuable.

Indeed, the financial rewards from book writing are often not immediate or direct. However, the true ‘gold’ lies in the opportunities that your book can unlock. For instance, authoring a book could be the deciding factor for an organization when choosing you as a keynote speaker at a conference. It can lead to workshops and training sessions that start with one company and can expand to many, all because someone influential read your book and recommended it.

So, it’s more accurate to view your book as an incredibly powerful tool for personal branding. It’s like having an expansive, highly visible business card or billboard. On its own, the book might not seem impactful, but once it falls into the right hands, it can open doors to opportunities that far exceed the book’s initial sales revenue.

Publishing versus self-publishing proceeds

Before delving into the financial specifics of writing a technical book, it’s important to understand the difference in earnings between traditional publishing and self-publishing. My experience is predominantly with traditional publishers, though I’ve also experimented with self-publishing.

With a traditional publisher, authors typically earn about 10 to 15 percent of the book’s sales. To put it in perspective, buying me a beer could financially benefit me more than purchasing a copy of my book—excluding the potential for indirect revenue from increased visibility and opportunities.

Self-publishing, on the other hand, allows you to retain a larger portion of the sales revenue. The exact amount depends on various factors, including whether your book is an ebook or paperback, its pricing, and where it’s sold, with Amazon being a common platform. The royalty rates vary based on these factors, but they are generally higher than what traditional publishers offer.

However, self-publishing means covering costs like cover design, proofreading, and other production expenses out of pocket. You also may not receive the same level of marketing support that a traditional publisher provides. But, if you already have a built-in audience, this might not be a significant drawback.

The complexities and numerous variables involved make it challenging to provide a definitive answer about potential earnings. But for those eager to understand the financial aspect more clearly, let’s continue exploring this topic.

Amazon is the gold standard for overall sales

When trying to estimate potential earnings from writing a technical book, it’s crucial to consider the total book sales, factoring in the royalty rate and any fixed costs. The royalty rate can vary significantly depending on whether you choose traditional publishing (typically 10 to 15 percent) or self-publishing (often much higher), along with other variables.

To get a grasp of potential sales, Amazon serves as a key resource, being the de facto book database on the internet. By analyzing data from Amazon, we can gain insights into the sales volume a book might achieve. However, remember that this is only part of the equation. After determining the sales figures, you need to apply the royalty rate to understand your potential earnings. Additionally, deduct any upfront costs you incurred, especially relevant in self-publishing, to get a clearer picture of the net income from your book.

This process is intricate, and there are numerous factors to consider. But by breaking it down step by step, we can get a more realistic estimate of what an author might earn from writing a technical book.

Book rankings

Analyzing Amazon rankings is a strategic way to estimate book sales, even though Amazon doesn’t explicitly disclose the exact sales numbers associated with these rankings. It’s widely understood that book sales on Amazon follow a power curve distribution, where higher-ranking books sell significantly more copies than those with lower rankings.

For a more conservative estimation, author Edward W. Robertson’s formula can be useful: the daily sales of a book can be roughly calculated as 100,000 divided by the book’s Amazon rank. This method provides a straightforward approach to estimating sales based on Amazon rankings.

Additionally, the book sales calculator widget from TCK Publishing offers a more nuanced approach. This tool likely incorporates various factors and algorithms to provide an estimate of book sales, though the specifics of their calculation might be less transparent compared to Robertson’s straightforward formula.

Both methods offer valuable insights, but it’s important to remember that these are estimations. Actual sales can vary based on numerous factors, including the book’s niche, marketing efforts, and the author’s reputation. By combining these tools with an understanding of your specific context, you can arrive at a more informed estimate of potential book sales.

To estimate the total lifetime sales of a technical book, considering its typical lifespan is crucial. Technical books often have a relatively short shelf life due to the rapid evolution of technology, so sales tend to decay quickly. Typically, you can expect a few years of steady sales before they start to decline.

To apply this method:

  1. Identify a Comparable Book’s Rank: Find a book similar to yours in content and market appeal and note its Amazon rank.
  2. Convert Rank to Sales: Use one of the heuristics mentioned earlier, such as Edward W. Robertson’s formula or the TCK Publishing calculator, to convert this rank into an estimated daily or monthly sales figure.
  3. Extrapolate Sales Over Time: Project these sales over a two-year period. This timeframe can be adjusted based on the specific technical field and how quickly it changes.
  4. Account for Expenses: Deduct any upfront costs or continuous expenses you might incur, especially relevant in self-publishing scenarios.

This approach will give you a broad, somewhat speculative range of potential earnings. It’s important to remember that these are rough estimates. Actual sales can be influenced by various factors such as marketing efforts, changes in technology, competition, and the author’s platform. Nonetheless, this method provides a starting point for understanding the potential financial trajectory of writing a technical book.
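The four steps above reduce to a few lines of arithmetic. Here is a hedged sketch that chains them together; every input (the rank, price, royalty rate, and costs) is hypothetical, and the output is a rough order of magnitude, not a prediction:

```python
# Back-of-envelope lifetime-earnings estimate from an Amazon rank, using
# Edward W. Robertson's rule of thumb (daily sales ~ 100,000 / rank).

def estimate_earnings(amazon_rank, price, royalty_rate,
                      upfront_costs=0.0, years=2):
    daily_sales = 100_000 / amazon_rank          # step 2: rank -> sales
    lifetime_copies = daily_sales * 365 * years  # step 3: extrapolate
    revenue = lifetime_copies * price * royalty_rate
    return revenue - upfront_costs               # step 4: net of expenses

# Example: rank 50,000, a $40 book, 10% traditional-publishing royalty.
net = estimate_earnings(50_000, price=40.0, royalty_rate=0.10)
print(round(net, 2))  # → 5840.0
```

Changing `years` is the easiest way to model a faster- or slower-moving technical niche.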

Number of ratings

Using the total number of ratings or reviews a book has received on Amazon to estimate its lifetime sales is another approach. This method, suggested by author Jeff Schwarting on Quora, involves multiplying the number of reviews by certain factors to approximate total copies sold:

  1. High Estimate: Multiply the number of reviews by 600.
  2. Mid Estimate: Multiply the number of reviews by 140.
  3. Low Estimate: Multiply the number of reviews by 20.

This method provides a range of estimates, from conservative to optimistic. The accuracy of this approach can vary, but my own book’s sales fell within the mid-level range, which adds some credibility to this estimation technique.

Once you have these estimates, remember to perform all necessary calculations to arrive at the total revenue figure. This includes backing out your royalty percentage and subtracting any expenses you’ve incurred.
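Putting the review multipliers and those adjustments together, here is a small sketch. The review count, price, royalty rate, and expenses below are all hypothetical:

```python
# Jeff Schwarting's review-count heuristic: lifetime copies ~ reviews x k,
# with k = 20 / 140 / 600 for low / mid / high estimates.

MULTIPLIERS = {"low": 20, "mid": 140, "high": 600}

def revenue_range(num_reviews, price, royalty_rate, expenses=0.0):
    # Apply the royalty rate and subtract expenses for each scenario.
    return {
        label: num_reviews * k * price * royalty_rate - expenses
        for label, k in MULTIPLIERS.items()
    }

# Example: 50 reviews, a $35 ebook, 70% self-publishing royalty, $2,000 costs.
r = revenue_range(50, price=35.0, royalty_rate=0.70, expenses=2_000.0)
print({k: round(v) for k, v in r.items()})
# → {'low': 22500, 'mid': 169500, 'high': 733000}
```

The thirty-fold spread between the low and high scenarios is the point: these heuristics bound the possibilities rather than pin down a number.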

It’s important to acknowledge that these methods result in a wide range of potential numbers. This variability is inherent in the nature of book sales, particularly for technical books with niche audiences and varying degrees of market longevity. Despite the uncertainty, these methods can offer some guidance for authors trying to gauge the potential financial success of their book-writing endeavors.

Taking the questionably compensated steps

Discussing the financial aspects of writing a technical book can sometimes be daunting, potentially deterring aspiring authors. My intention is not to discourage but to present a realistic perspective.

Much like embarking on a business venture or any significant endeavor, the range of outcomes in book writing is broad, and precise predictions are challenging. However, it’s crucial to approach this journey with confidence in your abilities, believing in the value it will bring to you personally. Regardless of the financial outcome, it’s a pursuit of self-improvement and personal growth.

Do you have any further questions about the financial aspects of writing a technical book? Please feel free to share them in the comments. I’m also happy to provide more insights through my technical book coaching program, which you can find linked below.

AI-Powered Excel: How to create forecasts with Forecast Sheet https://stringfestanalytics.com/ai-powered-excel-how-to-create-forecasts-with-forecast-sheet/ Sat, 29 Jul 2023 20:29:01 +0000 https://stringfestanalytics.com/?p=11522 As analysts, we’re frequently tasked with explaining past data and predicting future outcomes. Building a forecast can be invaluable for this purpose. Excel offers several methods for creating forecasts, but Forecast Sheet stands out as the most powerful and user-friendly option, requiring minimal effort. Let’s see it in action using housing starts data from the […]

The post AI-Powered Excel: How to create forecasts with Forecast Sheet first appeared on Stringfest Analytics.

As analysts, we’re frequently tasked with explaining past data and predicting future outcomes. Building a forecast can be invaluable for this purpose. Excel offers several methods for creating forecasts, but Forecast Sheet stands out as the most powerful and user-friendly option, requiring minimal effort.

Let’s see it in action using housing starts data from the St. Louis branch of the Federal Reserve Bank.

 

Availability

Forecast Sheet is available in Excel for Microsoft 365 along with the last several perpetual-license releases of Office. Get the full list here.

Assumptions of Forecast Sheet

As with other AI features for Excel, it’s important to structure your data in a consistent format. In particular, Forecast Sheet requires consistent spacing between time periods. And although it can handle missing values and duplicates, you’ll achieve better results if you resolve these issues in the data before feeding it into Forecast Sheet. Forecast Sheet will do its best to detect these issues before actually building the model.

Creating the forecast

To create the forecast, click anywhere in the data and go to Data > Forecast Sheet:

Insert forecast sheet

 

Customizing the forecast

Before inserting the forecast, you have a couple of customization options.

Customizing the forecast

Forecast Sheet is designed to work effortlessly, requiring minimal forecasting expertise or intervention. However, if you want to utilize additional options, just click on “Options.”

Forecast sheet custom options

This will let you adjust the time horizon, tweak your seasonality, and more. To learn more about the customization options, check out the documentation.

Customizing the visualization

At the top of the menu, you can also choose between creating a line chart or a column chart:

 

When to use which? The default option, a line chart, is effective when data points follow a natural flow or have a continuous progression. It is ideal for revealing trends or fluctuations clearly. Conversely, a column chart is more suitable when you want to emphasize individual data points rather than the continuous progression of a trend.

Methods used in Forecast Sheet

The Forecast Sheet feature in Excel uses a version of the Exponential Triple Smoothing (ETS) algorithm. This method smooths out historical data by applying exponentially decreasing weights to past observations, giving more influence to recent data while still considering older points. It’s designed to detect and incorporate trends and seasonal patterns, making it a robust choice for time series forecasting when data exhibits recurring cycles or consistent growth patterns.
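Forecast Sheet’s exact implementation isn’t exposed, but the weighting idea is easy to see in single exponential smoothing, the simplest member of the family (Forecast Sheet’s ETS additionally models trend and seasonality). A minimal pure-Python sketch:

```python
def exponential_smoothing(values, alpha=0.5):
    """Single exponential smoothing: each smoothed point is a weighted
    average of the newest observation and the previous smoothed value,
    so older observations receive exponentially decreasing weights."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# A jump from 10 to 20 is absorbed gradually rather than all at once.
print(exponential_smoothing([10.0, 20.0, 20.0, 20.0], alpha=0.5))
# → [10.0, 15.0, 17.5, 18.75]
```

A higher `alpha` weights recent data more heavily and reacts faster to changes; a lower one smooths more aggressively.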

The algorithm automatically calculates seasonality and confidence intervals, which Excel then uses to generate a forecast table and chart, simplifying the process for users without requiring deep statistical knowledge.

This method works best at handling datasets with clear seasonal patterns, such as monthly sales or yearly weather data, and can manage up to 30% missing data through interpolation, enhancing its flexibility. However, its reliance on exponential smoothing means it assumes past patterns will continue, which can falter if sudden shifts or external factors disrupt the trend. It’s less effective for datasets with irregular or non-linear behavior, and its automated nature limits customization, potentially oversimplifying complex scenarios where manual parameter tuning might yield better results.

The Forecast Sheet is best suited for small to medium-sized businesses or individuals analyzing straightforward time series data, like retail sales, website traffic, or inventory levels, where seasonal trends are evident and historical consistency is a reasonable assumption. For volatile or sparse datasets, or those requiring advanced modeling beyond ETS, users might find it lacking, and more specialized tools or manual forecasting methods could be more appropriate. Its simplicity and integration into Excel make it a practical starting point for users needing quick, reliable insights without diving into advanced statistical methods.

Interpreting the results

Forecast Sheet table

When you have finished specifying the output of your forecast, click Create and you should see that your source data gets converted to a table along with some additional calculated columns:

Forecast sheet formula results

Specifically, you will see a forecast column, along with estimated lower and upper confidence bounds. To learn more about the significance of confidence intervals, check out my book Advancing into Analytics: From Excel to Python and R.

Forecast Sheet chart

You should also see your forecast as the specified line or column chart, along with the confidence intervals.

Chart results

In this case, the main forecast suggests that housing starts are expected to continue increasing over the next several years. However, the confidence interval serves as a reminder that the range of possible outcomes is much wider and more volatile.

Is this really AI?

While Forecast Sheet is a remarkable Excel product, it’s arguably not exactly AI.

Forecast Sheet uses exponential smoothing, a popular time series forecasting approach, to develop a forecast. Exponential smoothing shares some AI-like traits: it adapts its estimates to historical data, produces predictions with real-world applications, and can be fine-tuned in a way that resembles the incremental improvement of AI techniques.

However, it relies on fixed formulas rather than explicit learning mechanisms, and it lacks the cognitive abilities, generalization, and abstraction usually seen in AI algorithms. As a well-established statistical method predating AI, its core principles do not involve AI-specific methodologies.

That being said, Forecast Sheet’s ability to generate complex forecasts, provide compelling visualizations, and identify potential data issues before model building brings it close to an AI application.

Compare your forecast to mine below:

 

Do you build forecasts in Excel or another tool? If in Excel, do you prefer Forecast Sheet or something else? Let me know in the comments.
