Stringfest Analytics: Analytics & AI for Modern Excel (https://stringfestanalytics.com)

Python in Excel: How to conduct linear regression with Copilot
https://stringfestanalytics.com/python-in-excel-how-to-conduct-linear-regression-with-copilot/
Wed, 12 Nov 2025 16:56:15 +0000

The post Python in Excel: How to conduct linear regression with Copilot first appeared on Stringfest Analytics.

Data and tech trends come and go, but linear regression has remained one of the most reliable tools in a data analyst’s toolbox. It helps you identify relationships, test hypotheses, and make predictions with a clear view of how different factors influence outcomes. Whether you’re working in finance, marketing, manufacturing, or real estate, few methods match linear regression for both clarity and ease of use.

For all the benefits, however, Excel’s built-in regression tools were never very user-friendly. You had to load the Analysis ToolPak (if you could find it), step through a dated wizard, and then make sense of an output sheet that offered little guidance. Changing your model or presenting the results to others was awkward.

With Copilot, things are much smoother. You can build more advanced models with Python in Excel, understand how they work, and interpret the results directly within your workbook. It’s easier to see what your data is telling you and focus on meaningful conclusions rather than the mechanics.

We’ll explore this using a fuel economy dataset. Download the exercise file below to follow along.


If you haven’t used the Advanced Analysis with Python feature yet, take a look at this post:

To get started, we’ll run a very simple linear regression: just one dependent and one independent variable. It’s a good habit to make the scope of your model explicit, even when you’re testing something small. In this case, it makes sense to treat mpg as the variable we’re trying to explain and weight as the factor we think influences it.

Here’s the Copilot prompt I used:

“Run a linear regression in Python where mpg is the dependent variable and weight is the independent variable. Include a summary of the results.”

Copilot automatically fitted the model using the dataset and produced the following regression summary:

Linear regression output

The interpretation is straightforward: as a car’s weight increases, its fuel efficiency tends to decline. The negative coefficient for weight means heavier vehicles use more fuel. The very small p-value confirms the relationship is statistically significant.

This is the classic starting point for regression analysis: one variable at a time, clear direction, and easily interpretable results. From here, we can begin layering in more predictors to see how horsepower, displacement, or cylinder count refine the story.
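If you're curious what a fit like this looks like under the hood, here is a minimal sketch using plain NumPy. The weight and mpg values below are made up for illustration; the real numbers come from the exercise file, and Copilot's own generated code may use a different library:

```python
import numpy as np

# Hypothetical weight (lbs) and mpg pairs standing in for the dataset
weight = np.array([2130, 2515, 3020, 3449, 3850, 4354], dtype=float)
mpg = np.array([31.0, 27.5, 23.0, 20.5, 16.0, 14.5])

# Ordinary least squares fit: mpg = intercept + slope * weight
slope, intercept = np.polyfit(weight, mpg, deg=1)

print(f"intercept: {intercept:.2f}")
print(f"slope per lb: {slope:.5f}")  # negative: heavier cars get fewer mpg
```

The sign of the slope is what carries the interpretation: a negative coefficient on weight is exactly the "heavier vehicles use more fuel" finding above.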

Adding more predictors and checking model diagnostics

Now that we’ve built our first model, it’s natural to wonder what other factors might influence fuel economy. Weight appears significant, but horsepower and acceleration could also play a part.

As we start refining our models, we need a way to tell if each new version is actually improving. Two standard metrics help with this: R-squared, which shows how much of the variation in mpg is explained by the predictors, and RMSE, which measures the average prediction error in miles per gallon.

Here’s the Copilot prompt:

“Fit a multiple linear regression model in Python predicting mpg using weight, horsepower, and acceleration. Calculate and return the model’s R-squared and RMSE as a small summary table.”

Multiple regression model outputs

The R-squared value of about 0.71 means roughly 71 percent of the variation in fuel efficiency is explained by these three variables. The RMSE of 4.22 means the model’s predictions are off by about four miles per gallon on average. It’s a noticeable improvement over our single-variable model and a good sign that we’re moving in the right direction.
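Both metrics are simple to compute once you have predictions in hand. The sketch below fits a multiple regression on simulated data (the coefficients and noise level are invented, not the dataset's) and derives R-squared and RMSE from first principles:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Hypothetical predictors standing in for the fuel-economy dataset
weight = rng.uniform(1600, 5000, n)
horsepower = rng.uniform(45, 230, n)
acceleration = rng.uniform(8, 25, n)

# Simulated mpg with noise, so the fit is imperfect like real data
mpg = 45 - 0.006 * weight - 0.03 * horsepower + 0.1 * acceleration \
      + rng.normal(0, 3, n)

# Design matrix with an intercept column, solved by least squares
X = np.column_stack([np.ones(n), weight, horsepower, acceleration])
coef, *_ = np.linalg.lstsq(X, mpg, rcond=None)
pred = X @ coef

# R-squared: share of variance explained; RMSE: typical error in mpg
ss_res = np.sum((mpg - pred) ** 2)
ss_tot = np.sum((mpg - mpg.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
rmse = np.sqrt(np.mean((mpg - pred) ** 2))

print(f"R-squared: {r_squared:.3f}")
print(f"RMSE: {rmse:.2f} mpg")
```

Note the trade-off the two metrics capture: R-squared is unitless and comparable across models, while RMSE is in miles per gallon, which makes it easier to explain to stakeholders.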

Visualizing predicted versus actual values

Once you’ve built a model and reviewed the metrics, it’s important to see how well the predictions line up with reality. A quick visual check often reveals patterns or problems that numbers alone can miss.

“Plot the predicted vs actual mpg values from the model to check how well the regression fits. Include a line showing perfect predictions for reference.”

Predicted versus actual scatterplot

Copilot produced a scatter plot comparing the model’s predicted mpg values with the actual ones. Each point represents a car in the dataset. The red dashed line shows what perfect predictions would look like, where predicted and actual values are exactly equal.

This visualization gives a quick gut check on model performance. The tighter the points hug that line, the stronger the predictive power. And while the model isn’t perfect, it’s doing a solid job of explaining how weight, horsepower, and acceleration interact to influence fuel efficiency.

Interpreting model coefficients

You might be wondering how each variable contributes. That’s where interpretation comes in, and Copilot can help you reason through it, not just calculate.

Here’s the prompt:

“Interpret the coefficients of the model using statsmodels. Which features have the biggest impact on mpg and in what direction? Explain in plain language.”

Show regression coefficients

Copilot returned a summary showing that both weight and horsepower have negative coefficients. This means that as either of these increases, fuel efficiency tends to decrease. Weight has the strongest influence. Each additional unit of weight leads to the largest drop in miles per gallon. Horsepower also lowers mpg, though not quite as sharply.

Acceleration, on the other hand, shows a very small and statistically insignificant coefficient, suggesting it doesn’t meaningfully affect fuel economy in this dataset. In other words, how quickly a car accelerates doesn’t matter much for mpg once weight and horsepower are already accounted for.

Together, these results tell a clear story: heavier and more powerful cars use more fuel, while quick acceleration on its own doesn’t add much explanatory value.

Checking model assumptions

Once you’ve built and interpreted your model, it’s a good idea to run a few quick diagnostics to make sure the basic assumptions of linear regression hold. One of the most important checks is to look at the residuals, or the differences between the predicted and actual values.

Here’s the Copilot prompt:

“Plot the residuals of the model. Are they randomly distributed? Is there evidence of non-linearity or heteroskedasticity?”

Residuals plot Copilot regression

Copilot produced a residuals vs. predicted values plot. Ideally, the points should be scattered randomly around the zero line. That pattern suggests the model is capturing the data well and that errors are evenly spread across all prediction levels.

In this case, the residuals look mostly random, but there’s a slight funnel shape as mpg increases. That widening spread hints that the model may fit smaller cars a bit more consistently than larger ones, a mild sign of heteroskedasticity. It’s not severe, but it’s worth noting.

Residual plots are one of several ways to check whether your model is behaving properly. You can also look at whether the relationships between predictors and mpg appear roughly linear, whether residuals seem normally distributed, or whether there’s evidence that one error predicts the next. These checks help confirm that the model’s estimates are trustworthy.
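Beyond eyeballing the plot, a quick numeric heuristic for the funnel shape is to check whether the size of the residuals grows with the predictions. This sketch uses fabricated residuals whose spread widens on purpose; a formal test such as Breusch-Pagan would be the next step:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Hypothetical fitted values, with residuals whose spread grows
# alongside the prediction, mimicking a funnel-shaped residual plot
pred = rng.uniform(10, 40, n)
residuals = rng.normal(0, 0.15 * pred)  # noise scales with prediction level

# Heteroskedasticity heuristic: do |residuals| grow with predictions?
corr = np.corrcoef(pred, np.abs(residuals))[0, 1]
print(f"corr(|residual|, prediction): {corr:.2f}")
```

A correlation near zero is what you want; a clearly positive value, as here, is the numeric counterpart of the widening spread described above.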

Copilot can guide you through these steps, not just by generating plots or statistics, but by explaining what they mean and why they matter. In that sense, it acts less like a calculator and more like a coach, helping you understand the reasoning behind good modeling practice.

Making predictions

Finally, let’s put the model to work in a real-world example. In business settings, the real value of regression often isn’t just understanding relationships. It’s using those relationships to make predictions. Decision-makers care less about the exact slope of a line and more about what it means for future outcomes: how a change in product weight, horsepower, or price might affect performance or profit. A well-built model lets you turn analysis into foresight.

Here’s the Copilot prompt:

“Given a car with 3000 lbs weight, 130 horsepower, and 15 seconds of acceleration, use the regression model to predict mpg.”

MPG prediction Copilot

Copilot returned a predicted fuel efficiency of about 21.6 miles per gallon.

That means for a car with those specifications, the model expects it to travel roughly 21 and a half miles on a gallon of fuel. This is where regression analysis becomes more than just theory. You can use it to estimate outcomes for new observations, guide design tradeoffs, or compare how different features affect performance.
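Mechanically, a prediction is just the fitted equation evaluated at the new values. The coefficients below are illustrative placeholders, not the model's actual estimates, so the result won't match the 21.6 mpg figure exactly; substitute your own regression output:

```python
# Illustrative placeholder coefficients; your fitted model's
# regression summary supplies the real values
intercept = 45.0
b_weight = -0.0055
b_horsepower = -0.04
b_acceleration = 0.05

def predict_mpg(weight, horsepower, acceleration):
    """Plug a new car's specs into the fitted linear equation."""
    return (intercept
            + b_weight * weight
            + b_horsepower * horsepower
            + b_acceleration * acceleration)

print(round(predict_mpg(3000, 130, 15), 1))
```

This is also why interpreting coefficients matters: each term in the sum shows exactly how much a given spec pushes the prediction up or down.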

Conclusion

Linear regression remains one of the most practical and interpretable tools in data analysis, and Copilot makes it easier than ever to use inside Excel. Even a simple model can uncover useful insights when built thoughtfully and checked carefully. Metrics like R-squared and RMSE help quantify performance, but visuals and diagnostics often reveal the places where your model fits well and where it struggles.

And in the business world, the real power of regression lies in prediction. The ability to estimate how changes in one factor might influence another turns analysis into something decision-ready.

That said, linear regression isn’t magic. It assumes straight-line relationships and evenly distributed errors, which don’t always hold up with messy real-world data. Outliers, overlapping variables, or curved relationships can throw things off, and that’s where judgment comes in. Copilot can automate the steps, but it still takes a human eye to decide what makes sense.

From here, you might explore adding interaction terms, adjusting variables to handle nonlinearity, or comparing results to more flexible models like decision trees or random forests. You could even use Copilot to test cross-validation or experiment with feature selection to see how stable your model really is.

How to use Researcher and Analyst agents in Copilot for Excel
https://stringfestanalytics.com/how-to-use-researcher-and-analyst-agents-in-copilot-for-excel/
Tue, 11 Nov 2025 21:16:47 +0000

The post How to use Researcher and Analyst agents in Copilot for Excel first appeared on Stringfest Analytics.

Copilot in Excel started as your built-in AI assistant, something you could chat with right inside your workbook. You could ask natural language questions like “summarize sales by region” or “find trends in this data,” and Copilot would instantly generate the right formulas, charts, or summaries to help you explore your data.

Now we’re entering what Microsoft calls the agentic era, a new stage where AI doesn’t just assist you but begins to take action. Inside Excel, this means you can call on specialized agents, pre-built and task-focused versions of Copilot that can reason about your data, gather context, and act on your behalf.

To see what this looks like in practice, we’ll use a dataset on global CO₂ emissions. Within Excel, two pre-built agents available for Microsoft 365, Researcher and Analyst, offer different perspectives on this data. The Researcher agent helps you understand context and background, while the Analyst agent focuses on computation and insights.

Together they show how Copilot is evolving from a simple assistant into a true reasoning partner for your data. Let’s see it in action. Download the exercise file below and follow along.


To find these agents in Excel, open the Copilot button and select Copilot Chat. Then, toggle the navigation pane in the upper-left corner of the window. You’ll see a list of agents, some preloaded by Microsoft, others that you can sync or even create yourself. You’ll also notice that Copilot Chat is listed among them. The implication here is that Copilot is just one of many agents you can work with.

Analyst researcher

Researcher Agent: context and framing

Unlike the traditional Copilot in Excel experience you might be used to, Researcher doesn’t interact directly with your workbook data. Instead, it draws on trusted sources and knowledge models to help you shape and frame your analysis.

You might start by asking Researcher:

“Summarize global CO₂ emission trends over the last 50 years.”

You can even choose what kinds of sources Researcher can draw from to provide context, whether that’s information from the web, internal company data, specific SharePoint sites or more.

Select source researcher

After a few moments, Researcher produces a neatly formatted, report-style summary that brings together all the information sources it can access… including, yes, the data in your workbook.

Researcher summary

So even though it doesn’t interact directly with your data in the same way Copilot does, it can still read it to inform its analysis. And because our data was stored in a properly structured Excel table with clear column headers, Copilot could understand it much more easily and generate cleaner, more relevant results.

Let’s look at one more example. The goal here again is to think broadly about how you’re approaching your research. Consider what kinds of context you might want to include, what questions are worth exploring, and what insights you might want to explain or connect back to the data itself.

“Explain how CO₂ emissions relate to GDP and industrialization.”

Researcher demo Excel

In other words, think about the bigger story of how you came across this data, why it matters, and what perspective you’re trying to uncover through your analysis. Rather than performing calculations, you’re using Researcher to think through the problem, define your approach, and clarify what matters.

In short, Researcher helps you focus on the “why” behind your work, giving you the background and conceptual grounding to make your data exploration meaningful. What makes this an agent is that it operates with a defined purpose and set of capabilities. It doesn’t just respond to a single question and stop there. It reasons across multiple sources, builds structured outputs like reports or summaries, and adapts its behavior based on your prompts and settings.

Analyst Agent: Data-driven execution

Once you’ve set the direction, the next step is analysis. The Analyst agent operates differently from Researcher because it works directly with your Excel data. You can also upload additional datasets if you want to expand its scope. This is where the idea of an agent really starts to come to life. Analyst can plan, reason, and carry out a sequence of steps to accomplish a defined goal.

You might ask Analyst to do things like:

“Find the top five countries with the highest CO₂ emissions in the most recent year.”

Notice that Analyst doesn’t search the web or pull in outside information. Its focus is entirely on the data at hand, making it ideal for in-depth, workbook-level analysis. In many ways, it operates like the Advanced Analysis feature of Copilot, where you can ask to perform more complex or custom computations.

If you open the dropdown menu while it’s running, you’ll see exactly what’s happening behind the scenes: much like Advanced Analysis, Python code is being generated and executed to carry out the analysis. This gives you transparency into how the results are produced and lets you learn from or even customize the underlying logic if you want to refine your approach further.

Please note that there’s a lot happening behind the scenes here: reasoning loops, code checks, and data validations running in sequence. It may take a little time to process, so be patient while the agent works through each step to deliver accurate and reliable results.

Analyst agent Python output

Next, I’ll ask for a data visualization:

“Create a chart showing total CO₂ emissions over time for the top three countries.”

Copilot analyst tooltip plot

Not only do I get a clean, well-formatted chart in return, but it even includes interactive tooltips. This visualization is created using a Python library called Plotly, which unfortunately isn’t yet supported natively in Python in Excel.

However, the underlying code is fully visible, so you can copy it, save the image, or adapt it as needed. With a bit of help from Copilot, it wouldn’t take much to refactor this Plotly chart into a native Python in Excel visualization.
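To give a sense of what such a refactor might look like, here is a rough matplotlib sketch of the same chart. The emissions figures are invented stand-ins for the CO₂ dataset, and the headless `Agg` backend is only needed when running outside Excel:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside Excel
import matplotlib.pyplot as plt

# Made-up emissions figures standing in for the CO2 dataset
df = pd.DataFrame({
    "country": ["China", "United States", "India"] * 3,
    "year": [2000, 2000, 2000, 2010, 2010, 2010, 2020, 2020, 2020],
    "co2": [3.4, 6.0, 1.0, 8.5, 5.7, 1.7, 10.7, 4.7, 2.4],  # gigatonnes
})

# One line per country, matching the Plotly chart's layout
fig, ax = plt.subplots()
for country, grp in df.groupby("country"):
    ax.plot(grp["year"], grp["co2"], marker="o", label=country)

ax.set_xlabel("Year")
ax.set_ylabel("CO2 emissions (Gt)")
ax.set_title("Total CO2 emissions over time, top three emitters")
ax.legend()
# In Python in Excel, returning fig spills the image into the grid
```

You lose the interactive tooltips, but the static version renders natively in Python in Excel without any extra dependencies.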

What makes Analyst an agent is that it doesn’t just execute a single command or calculation. It operates with intent and autonomy toward a defined analytical goal. It plans a sequence of reasoning steps, checks its own work, and adapts as it goes, much like a skilled analyst would. Rather than returning one-off results, it manages a full analytical workflow: identifying the right computations, generating Python code, validating outcomes, and presenting insights in clear, visual form.

Conclusion

Researcher gives you context. Analyst gives you execution. Together, they show how Copilot in Excel is moving from a simple helper to an intelligent partner that can reason, plan, and act. This builds on the basic Copilot experience. Where Copilot Chat handled single questions or formulas, agents like Researcher and Analyst can connect steps, hold context, and adapt to your goals.

Tools like Copilot Studio and Power Automate will only extend this further, letting you design full workflows that link multiple agents and data sources. Together, they form a growing Excel AI stack built for reasoning, automation, and insight:

Thanks for exploring this with me. I hope it helps you see how these new agentic tools can take your Excel work to the next level and inspire new ways to analyze, automate, and create with your data.

How to use Copilot in Excel for insurance claims data
https://stringfestanalytics.com/how-to-use-copilot-in-excel-for-insurance-claims-data/
Mon, 03 Nov 2025 15:40:44 +0000

The post How to use Copilot in Excel for insurance claims data first appeared on Stringfest Analytics.

A lot of people are excited about Copilot, but they don’t just want to see it summarize the same old generic, synthetic datasets again. They want to see what it can actually do for them.

If you work in insurance, this one’s for you. We’ll walk through how to use Copilot in Excel to extract key details from claims, automate text reviews, and uncover insights that would have taken hours to find manually. From formatting and calculated columns to advanced analysis with Python, here’s how to turn your everyday insurance data into clear, actionable intelligence, all without ever leaving Excel.

To follow along, download the exercise file below:


This dataset contains real-world insurance claims, and we’ll perform a practical, hands-on analysis using that data.

Formatting and calculated columns

The easiest way to get started with Copilot in Excel is often to work with your columns, especially if your data is in a table format (and it should be). Copilot understands structure, so the more organized your data is, the better it performs. A great starting point is reformatting your columns or creating calculated ones. Here are some example prompts.

Notice how I’m being very specific with the column names and phrasing things like a programmer would. That precision helps Copilot understand exactly what you want it to do.

“Convert the ‘policy_bind_date’ and ‘incident_date’ columns into standardized date formats.”

Apply dates formatting Copilot

“Create a new calculated column ‘policy_tenure_days’ showing the number of days between ‘policy_bind_date’ and ‘incident_date’ to measure how long customers held their policy before filing a claim.”

Policy tenure calculated column

“Categorize claims into ‘High’, ‘Medium’, or ‘Low’ severity based on ‘total_claim_amount’.”

Severity category
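For reference, the transformations behind those three prompts map onto a few lines of pandas. The sample rows below are hypothetical, and the severity thresholds mirror nothing official — they're just the buckets a prompt like the one above implies:

```python
import pandas as pd

# Hypothetical claims rows; column names mirror the prompts above
claims = pd.DataFrame({
    "policy_bind_date": ["2014-06-01", "2019-02-15", "2021-11-30"],
    "incident_date": ["2015-01-10", "2019-03-01", "2022-04-18"],
    "total_claim_amount": [12400, 6150, 3200],
})

# Standardize the two date columns
claims["policy_bind_date"] = pd.to_datetime(claims["policy_bind_date"])
claims["incident_date"] = pd.to_datetime(claims["incident_date"])

# Days the policy was held before the claim was filed
claims["policy_tenure_days"] = (
    claims["incident_date"] - claims["policy_bind_date"]
).dt.days

# Bucket claim amounts into severity tiers
claims["severity"] = pd.cut(
    claims["total_claim_amount"],
    bins=[0, 5000, 10000, float("inf")],
    labels=["Low", "Medium", "High"],
)

print(claims[["policy_tenure_days", "severity"]])
```

Seeing the equivalent code helps explain why precise, programmer-style column names in your prompts pay off: Copilot is ultimately generating operations like these against your table.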

For more help working with formulas and functions, check out my LinkedIn Learning course.

One thing we haven’t covered here is using Copilot for one-off formulas and functions, as you might in a financial model. To be fair, this is one area where Copilot struggles a bit. It works best when your data model is structured and clear, not when it has to scan a large, loosely connected workbook.

Financial models often rely on ad hoc formulas and cross-sheet links that only make sense in context, which makes them tricky for Copilot to interpret. In those cases, you might actually find it easier to use Agent Mode to build an updated workbook instead of trying to repair the existing one. Starting fresh gives the AI a cleaner structure to follow and often leads to faster, more reliable results.

Copilot can even help with conditional formatting. This is one of those places where the more specific you are, the more controlled and meaningful the output will be. In the examples below, I’m telling Copilot exactly which colors to use and what thresholds they map to.

“Apply conditional formatting to the ‘total_claim_amount’ column, highlighting amounts above $10,000 in red, amounts between $5,000 and $10,000 in yellow, and amounts below $5,000 in green.”

Conditional formatting Copilot

You could just ask Copilot to “add conditional formatting” to a column and let it take its best guess, but the result might not mean much. Sometimes that’s fine if you’re just brainstorming ideas or exploring trends. Just make sure you can always explain your reasoning and turn those visuals into something useful, not just sugar for your stakeholders.

Summary PivotTables and PivotCharts

Now we can actually get into analyzing the data. I find that PivotTables and charts are usually the best tools to use with Copilot for this stage. It is often a good idea to specifically ask for these artifacts in your prompt. Otherwise, Copilot might generate a random Python code block instead, which probably is not what you wanted (we’ll look at using Python in a more controlled manner later in this post).

If you are looking to explore or summarize a dataset, start by asking Copilot to create PivotTables and PivotCharts. Once they are built, Copilot can help make small adjustments to them, though it may not handle every detail. For more complex refinements, you can ask Copilot more generally for high-level guidance.

If you would like more structured help with PivotTables and Copilot, check out my LinkedIn Learning course on the topic:

“Summarize the total claims by ‘incident_severity’ using a PivotTable and visualize with a bar chart.”

Summarize claims bar chart

“Visualize claims frequency by the hour of the day using a histogram.”

Histogram of hour of day

These summaries go far beyond simple reports. They help you see which claim types drive the most losses, when incidents occur most frequently, and where underwriting or claims operations could focus next. For leaders, that means faster reviews and more consistent insight across teams.
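If you want to sanity-check what a PivotTable like this is doing, the pandas equivalent is a one-liner. The claims rows below are invented examples, not the exercise data:

```python
import pandas as pd

# Hypothetical claims data with the columns used in the prompt above
claims = pd.DataFrame({
    "incident_severity": ["Major Damage", "Minor Damage", "Major Damage",
                          "Total Loss", "Minor Damage", "Trivial Damage"],
    "total_claim_amount": [52000, 6100, 48000, 71000, 5400, 900],
})

# Equivalent of the PivotTable: total claim amount by severity
pivot = pd.pivot_table(
    claims,
    index="incident_severity",
    values="total_claim_amount",
    aggfunc="sum",
).sort_values("total_claim_amount", ascending=False)

print(pivot)
```

Sorting the result descending surfaces the loss-driving categories first, which is usually what a claims review cares about.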

Advanced Analysis with Python

And now we get to my favorite part of Copilot in Excel, the advanced analysis mode.

Maybe you have been a little underwhelmed so far with what Copilot can do. If you are an experienced Excel power user, you can already handle formulas, charts, and PivotTables in your sleep. Fair enough. But once you add Python to the mix, the possibilities expand fast.

Advanced analysis with Python lets you do things that would otherwise be very difficult or impossible to achieve with standard Excel tools. You can run correlations, model claim probabilities, and visualize risk distributions, all with plain-language prompts. You don’t need to be a Python or statistics expert, just analytical enough to know whether the results make sense.

To get started, use the phrase “Advanced analysis with Python” in Copilot. This opens a new message asking if you want to continue in that mode.

Launch advanced analysis

“Create box plots comparing claim amounts across different ‘incident_type’ categories.”

Box plot claim amount by incident type

“Run a basic correlation analysis between ‘insured_age’ and ‘total_claim_amount’ to understand potential relationships, and visualize with a scatter plot.”

Scatterplot age vs total claim amount
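The statistic behind that prompt is a plain Pearson correlation. Here's a sketch on simulated policyholder data (the age-to-claim relationship below is fabricated purely to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 150

# Hypothetical policyholder ages and claim amounts with a weak link
insured_age = rng.uniform(18, 75, n)
total_claim_amount = 2000 + 40 * insured_age + rng.normal(0, 1500, n)

# Pearson correlation coefficient between the two columns
r = np.corrcoef(insured_age, total_claim_amount)[0, 1]
print(f"Pearson r: {r:.2f}")
```

Remember that correlation only captures linear association; the scatter plot Copilot adds alongside it is what reveals curvature or outliers that the single number hides.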

From there, we can ask Python to handle everything from visualizations to exploratory data analysis to full-on predictive modeling. It really is a powerhouse tool, and the best part is that we are doing all of this just through our prompts.

“Perform a logistic regression analysis predicting the likelihood of insurance claim fraud based on policy characteristics and incident details. Generate a ROC curve and confusion matrix to evaluate model performance.”

Copilot confusion matrix ROC curve

For insurers, this capability means identifying outliers faster, testing hypotheses on the fly, and validating models directly in Excel. It transforms spreadsheets from static reports into dynamic analytical workbenches.

Conclusion

Generative AI is changing how insurers use Excel. With Copilot, analysts can automate claim reviews, speed up audits, and uncover insights that support underwriting and risk management. It makes data analysis faster, clearer, and more accessible to teams who already know Excel well.

Copilot still depends on clean, well-structured data and human judgment. It can handle formulas, formatting, and even predictive modeling, but it needs clear direction and critical thinking to ensure accurate results. The best outcomes come when analysts combine their business knowledge with Copilot’s automation and analysis features.

As a next step, insurance teams should start by testing Copilot on a few key workflows such as claim reviews or loss ratio reports, learning where it saves the most time and produces the clearest insights. From there, the benefits can scale across departments.

If you want to make this technology work for your insurance company, book a free discovery call:

Python in Excel: How to edit the initialization environment
https://stringfestanalytics.com/python-in-excel-how-to-edit-the-initialization-environment/
Wed, 15 Oct 2025 18:55:36 +0000

The post Python in Excel: How to edit the initialization environment first appeared on Stringfest Analytics.

In an earlier post I talked about the Python in Excel initialization environment:

I likened it to the starting lineup of packages that load every time you open a workbook. It’s what gives you access to pandas, seaborn, and all the other core tools right inside Excel, along with some behind-the-scenes plumbing that lets Python and Excel talk to each other.

As of October 2025, you can now customize this environment. That means you can decide which libraries, helper functions, and even your own custom classes load automatically whenever the workbook opens.

In practice, this opens up a lot of fun possibilities. You can preload machine learning packages, set global formatting so every chart and table looks the way you want, or even create little interactive summaries that pop up when you hover over a cell.

We’ll try a few of these ideas using the mpg dataset, a simple but useful sample of car data. You can follow along with the exercise file below.


1. Preload extra packages

The first thing we can do with editable initialization is call in extra packages.

Keep in mind that Python in Excel doesn’t include everything yet. If you want to see which packages are currently available, check out the below post:

For example, we might want to use scikit-learn to run a quick linear regression. We can add it to our initialization so it’s always ready whenever we open the workbook.
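The import line is all that belongs in the initialization editor; the short fit below is just a sanity check that the library loads, run here on made-up weight and mpg values:

```python
# Line you might add to the initialization editor:
from sklearn.linear_model import LinearRegression

import numpy as np

# Quick check with hypothetical weight (lbs) vs mpg values
X = np.array([[2100.0], [2800.0], [3500.0], [4200.0]])
y = np.array([31.0, 25.0, 19.5, 15.0])

model = LinearRegression().fit(X, y)
print(model.coef_[0])  # negative slope: heavier cars, fewer mpg
```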

Once that line is in the initialization editor, you can use scikit-learn like any other built-in library.

Regression coefficients sklearn

While you might choose to add imports, I wouldn’t recommend removing anything that’s there by default. Some of it might look random, but a lot of it is important plumbing that makes Python in Excel work properly. Packages like excel handle how data moves between cells and Python, and libraries like pandas and numpy are what most of Excel’s Python features rely on under the hood.

2. Define reusable helper functions

The next thing we can do with editable initialization is add a few helper functions that you find yourself writing over and over.

Think of these like the “utility belt” of your Python in Excel setup. They’re small functions you can reuse across your workbook to clean up data, format values, or calculate common stats. Because they’re baked into your initialization, they’re ready to go every time you open the file.

Here’s an example of a couple we might add to this workbook:

The format_currency() function is great for any data you want to show as dollars and cents — maybe sales figures, costs, or anything you want to make presentation-ready. And kpi_summary() gives you a quick way to summarize your data without writing mean(), min(), and max() in separate cells.
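Here is one possible implementation of those two helpers. The exact code in your own initialization may differ; the signatures below are simply a reasonable reading of the descriptions above:

```python
import pandas as pd

def format_currency(value, symbol="$"):
    """Format a number as a currency string with two decimals."""
    return f"{symbol}{value:,.2f}"

def kpi_summary(series):
    """Return mean, min, and max of a column as a one-row DataFrame."""
    return pd.DataFrame({
        "mean": [series.mean()],
        "min": [series.min()],
        "max": [series.max()],
    })

# Quick demo on a hypothetical mpg column
mpg = pd.Series([18.0, 24.5, 31.0, 15.5])
print(format_currency(1234.5))
print(kpi_summary(mpg))
```

Returning a DataFrame from kpi_summary() is deliberate: Python in Excel can spill a DataFrame straight into the grid, so the summary lands as a tidy little table.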

Once these are part of your initialization, you can call them right in Excel like this:

KPI summary Python in Excel

To make sure our new cost-per-mile column is formatted a specific way, we can apply our formatting function right in the formula. Excel can handle formatting too, but this gives us a simple, formula-driven way to control how the output looks from Python.

If we want to be able to change the gas price easily and use it in other parts of the workbook, we can make it a global variable in our initialization. For example, we could add a line like

gas_price_per_gallon = 3.25

in the initialization editor. That way, if the price changes later, we only have to update it once.

MPG format currency

3. Set consistent styling defaults

Next, let’s set up a global look for our charts and tables:
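A minimal version of that initialization snippet might look like this, assuming the whitegrid style, deep palette, and talk context are the defaults you want:

```python
import seaborn as sns
import matplotlib as mpl

# Global styling defaults applied to every plot in the workbook
sns.set_theme(style="whitegrid", palette="deep", context="talk")

print(mpl.rcParams["axes.grid"])  # whitegrid turns axis grids on
```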

What this does is establish a clean, consistent look for all of your seaborn plots. The white grid background makes it easier to read values, the deep palette gives you balanced colors that work well for presentations, and the talk context increases font sizes so your charts are easier to see on slides or dashboards.

Once you add this to your initialization, every plot you make will follow the same styling automatically. You don’t have to restyle every chart or remember which color palette you used last time.

For example, this scatterplot of horsepower versus miles per gallon uses that same theme without any extra formatting in the code:

MPG vs horsepower scatterplot

4. Create custom classes and tooltips

Last but not least, let’s look at something a little more advanced. So far, we’ve been adding functions, constants, and settings to our initialization. Those are great for quick tasks, but sometimes you want to bundle related things together: for example, a dataset and a few operations that always go with it.

That’s what this class does. It’s a small, reusable object that holds a dataset and a couple of ready-made actions you can run on it.
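Based on the description that follows, the class would look roughly like this (the column names mpg, weight, and horsepower are assumptions from the classic mpg dataset, and the details are a sketch rather than the original code):

```python
import pandas as pd

class QuickStats:
    """Bundle a dataset with a couple of ready-made actions."""

    def __init__(self, mpg_df):
        # Clean up the data a bit and precompute a few summary numbers
        self.df = mpg_df.dropna()
        self.rows = len(self.df)
        self.avg_mpg = self.df["mpg"].mean()
        self.avg_weight = self.df["weight"].mean()

    def tooltip(self):
        # Small DataFrame that reads as an at-a-glance summary card in Excel
        return pd.DataFrame({
            "Rows": [self.rows],
            "Avg MPG": [round(self.avg_mpg, 1)],
            "Avg Weight": [round(self.avg_weight)],
        })

    def chart(self):
        # Quick scatterplot of horsepower vs. miles per gallon, picking up
        # whatever seaborn theme the initialization set
        import seaborn as sns  # local import keeps the class usable without plotting
        return sns.scatterplot(data=self.df, x="horsepower", y="mpg")
```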

Here’s what’s happening.
The word class just means we’re defining a little container for data and the functions that go with it. In this case, the class is called QuickStats. When we create a new one, it expects a dataset (mpg_df) to be passed in. Inside the class, that dataset is cleaned up a bit, and a few summary numbers are calculated — how many rows there are, the average miles per gallon, and the average weight.

After that, the class has two built-in actions, called methods.

  • tooltip() creates a small DataFrame you can use as an at-a-glance summary in Excel.
  • chart() creates a quick scatterplot showing horsepower versus miles per gallon.

Once this is in your initialization, you can use it in Excel like this:

QuickStats mpg tooltip

That produces a neat little table card summarizing your data.

Or you can run:

QuickStats chart

to get a chart, automatically styled with the theme you set earlier.

The win here is that you’ve packaged up a repeatable analysis into something that feels like its own tool. You don’t need to re-type formulas or copy and paste code. You just call the class and it does the work for you.

This is a simple example of object-oriented programming, which just means building small, reusable “objects” that combine your data and the actions you take on it. You don’t need to master OOP to benefit from it, but it’s a great way to keep your analyses organized and consistent across workbooks.

Conclusion

Here’s the complete initialization code with everything we covered in this post:
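A condensed sketch of that script, pulling together the constants, helpers, styling defaults, and class from the sections above (exact names and signatures are approximations):

```python
import pandas as pd
import seaborn as sns

# --- Constants ---
gas_price_per_gallon = 3.25

# --- Helper functions ---
def format_currency(value, symbol="$"):
    return f"{symbol}{value:,.2f}"

def kpi_summary(series):
    return pd.DataFrame({"mean": [series.mean()],
                         "min": [series.min()],
                         "max": [series.max()]})

# --- Styling defaults ---
sns.set_theme(style="whitegrid", palette="deep", context="talk")

# --- Custom class ---
class QuickStats:
    def __init__(self, mpg_df):
        self.df = mpg_df.dropna()

    def tooltip(self):
        return pd.DataFrame({"Rows": [len(self.df)],
                             "Avg MPG": [round(self.df["mpg"].mean(), 1)],
                             "Avg Weight": [round(self.df["weight"].mean())]})

    def chart(self):
        return sns.scatterplot(data=self.df, x="horsepower", y="mpg")
```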

When you edit your initialization script in Excel, make sure to save your changes before closing the editor. Then go to Formulas > Python > Reset > Reset Runtime for the changes to take effect.

That reset step is important because it restarts the Python engine that Excel uses in the background. Until you do that, Excel is still running the previous environment in memory, so your new imports, functions, or variables won’t load.

Editable initialization is one of those quiet but powerful upgrades to Python in Excel. It saves setup time, makes your workbooks more consistent, and helps you keep analysis tools, constants, and styles in one place.

The quick wins are being able to open any file and start working right away, with your preferred imports, formatting, and helper functions already loaded. The main highlight is consistency: you can build a repeatable workflow that looks and behaves the same across projects.

The post Python in Excel: How to edit the initialization environment first appeared on Stringfest Analytics.

How to choose between Copilot, Agent Mode, and Copilot Studio in Excel https://stringfestanalytics.com/how-to-choose-between-copilot-agent-mode-and-copilot-studio-in-excel/ Fri, 10 Oct 2025 18:20:12 +0000

Microsoft keeps expanding what Excel can do with AI. First came Copilot, then Agent Mode. At the same time, Copilot Studio and Agent Flows are entering the picture. The result is powerful but also confusing. Many people are trying to figure out what each tool is for and when to use it.

This post explains how to think about Agent Mode, how it compares with Copilot, and why both still matter. It also looks at where tools like Copilot Studio and Agent Flows fit into the broader Microsoft ecosystem for Excel users.

Copilot vs. Agent Mode

At first glance, Copilot and Agent Mode sound like two versions of the same thing. Both involve AI that interacts directly with Excel. In reality, they have very different design goals.

Copilot is a helper. It is designed for the small, piecemeal tasks that analysts perform every day. You might ask Copilot to clean up a dataset, write a complex formula, summarize a range, or create a quick chart. It provides targeted help within the context of the workbook you already have open.

Agent Mode is a builder. Instead of working cell by cell, it can take a broad instruction and generate a complete workbook. You might tell it to build a quarterly sales dashboard or create a forecasting model for next year. It can create sheets, link formulas, and even write explanations. It is far more autonomous and structured around end-to-end creation.

A simple comparison helps clarify the difference:

Feature       | Copilot                   | Agent Mode
Purpose       | Task assistance           | Full workbook creation
Scope         | One request at a time     | Multi-step process
Strength      | Works with existing files | Builds from scratch
User role     | Active collaborator       | High-level supervisor
Best used for | Quick help and debugging  | Prototyping new reports or dashboards

This distinction matters because it changes how you work with Excel. Copilot sits beside you while you work. Agent Mode takes your prompt, runs with it, and delivers a finished product.

Excel Copilot: When piecemeal still wins

It might sound like Agent Mode is the clear winner. After all, if it can build an entire model for you, why not use it all the time?

The reason comes down to how analysts actually work. Most of us are not starting from a blank sheet. We are maintaining workbooks that already exist. They might be forecasting models, KPI dashboards, or monthly reports that have evolved over years. They are usually mission-critical, connected to multiple data sources, and fragile in places.

In that context, incremental help is often safer and more realistic than full automation. You want a tool that can step in, understand what is there, and fix small issues without breaking the logic. Copilot handles this better right now. It can explain formulas, generate snippets, or reformat data without taking over the file.

Agent Mode, on the other hand, behaves like a blank-slate designer. It is better at starting fresh than at understanding what is already built. From what I have seen so far, it struggles when the goal is to repair or optimize an existing model “in flight.” It tries to interpret your workbook, but the context often gets lost.

Analysts know this feeling well. Sometimes it is easier to start over than to fix what is broken. That is exactly how Agent Mode currently operates. It builds something new rather than carefully weaving into the logic you already have.

The bigger picture: Copilot Studio and Agent Flows

There is also a broader shift happening in how analysts work. Excel is no longer the single destination for analysis. It is part of a much larger ecosystem.

Data now flows through Power BI, SharePoint, OneDrive, Dataverse, and external sources like SQL or Azure. Analysts collaborate in Teams or push reports through Power Automate. In that world, Agent Mode’s single-application focus stands out. It can do amazing things inside Excel but does not yet extend far beyond it.

That is why Copilot Studio and Agent Flows are such important developments. They bring the same agentic logic to the entire Microsoft 365 environment. Copilot Studio allows you to design and deploy your own custom agents that can move between apps. You can connect Excel to Outlook, Teams, or Power BI without writing a line of code.

Agent Flows take that one step further. They combine the logic of Power Automate with the intelligence of AI. Instead of following rigid “if this, then that” rules, an agent can interpret the situation and decide what to do next. It is automation that learns context rather than just repeating instructions.

Seeing the Layers

A helpful way to visualize this evolution is to think in terms of layers:

Each layer builds on the previous one. Copilot helps you perform tasks within Excel. Agent Mode automates the creation of a full workbook. Copilot Studio and Agent Flows orchestrate those agents across the broader Microsoft stack.

Practical takeaways

If you are curious about where to begin, start with Copilot in Excel. It remains the foundation for understanding how AI works inside your spreadsheets. You can take my LinkedIn Learning course on Copilot in Excel for free. The course focuses on practical, real-world examples that help you build confidence before exploring the more advanced agentic tools.

Take the course here →

Once you are comfortable with Copilot, try Agent Mode. Test how it builds reports from scratch and see where it fits into your process. Use it not as a replacement but as a design partner that can show what is possible.

If your team is trying to make sense of all this, whether it’s how to integrate these tools into your existing workflow or how to train analysts for the next generation of AI-powered Excel, I am building training sessions and advisory resources around exactly that.

You can get in touch here to discuss what your organization is exploring, or connect with me on LinkedIn for new articles, sessions, and hands-on tutorials. I am still learning myself where the biggest opportunities for organizations lie, and your feedback helps shape that journey.


The post How to choose between Copilot, Agent Mode, and Copilot Studio in Excel first appeared on Stringfest Analytics.

Python in Excel: How to understand the random walk with Copilot https://stringfestanalytics.com/python-in-excel-how-to-understand-the-random-walk/ Mon, 15 Sep 2025 14:00:06 +0000

As an Excel analyst, you’ve probably had moments where your data seemed to show clear trends: profits rising, sales dropping, or costs climbing steadily. But sometimes these apparent patterns might actually be random, tricking you into thinking there’s more control and predictability than there really is.

That’s the essence of the “random walk.” It illustrates how purely random processes can look structured, meaningful, and even predictable. Why should you care? Because knowing the difference between true signals and random noise helps you make better, smarter decisions in your analysis, whether it’s forecasting sales, tracking performance, or spotting real trends.

Fortunately, exploring random walks doesn’t require complicated setups or fancy software: just Excel, Python, and Copilot. Open a blank Excel worksheet, launch Copilot, start Advanced Analysis, and you’re ready to follow along.

Simulating a random walk with coin flips

Let’s take a moment to explore randomness using a simple example. Imagine flipping a coin 100 times. Each heads moves you one step forward (+1), each tails one step back (-1). After every flip, you track your position to see how far you’ve drifted from the start.

“Make a Python script that simulates flipping a coin 100 times. Each flip moves you up (+1) or down (–1). Keep a running total so we can see how far we drift. Put the results in a DataFrame with columns Step and Position, and make a simple line plot of the position over time. Explain in comments what’s happening in plain English.”
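Copilot’s exact output will vary from run to run, but the script it produces for this prompt looks something like the following sketch:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # seeded so the walk is reproducible

# Each flip moves us up (+1 for heads) or down (-1 for tails)
flips = rng.choice([1, -1], size=100)

# The running total shows how far we've drifted from the start
walk = pd.DataFrame({"Step": np.arange(1, 101), "Position": flips.cumsum()})

# walk.plot(x="Step", y="Position") would draw the line chart of the drift
```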

Copilot responds by creating a neatly organized DataFrame, listing each step alongside your position. It also produces a line chart visualizing your journey through these random flips:

Random walk coin toss results

When you examine this chart, you’ll notice the position line moves unpredictably, sometimes drifting upward, other times downward. Each flip was random, yet it’s natural to try to spot trends or meaningful patterns here. Randomness can trick us into seeing patterns that aren’t really there. By practicing simulations like this, you’ll develop intuition about what’s meaningful and what’s simply noise in real-world datasets.

Simulating stock prices with a random walk

Instead of flipping a coin, let’s start our simulation this time with a stock price of 100, then add a small, random increase or decrease each trading day for one year (252 days). This subtle daily randomness resembles the fluctuations you’d see in actual stock data. Here’s exactly what you can ask Copilot to do:

“Now change the random walk so it looks like a stock price. Start at 100 and add a little random up or down each day for 1 year (252 days). Plot the stock price. Add comments explaining why this looks kind of like a real stock price (sometimes up, sometimes down, but trending).”
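A representative version of what Copilot generates here; the drift and volatility numbers are my own placeholder choices, not values from the post:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Start at 100 and apply a small random percentage move each trading day
days = 252
daily_change = rng.normal(loc=0.0005, scale=0.01, size=days)  # tiny drift, ~1% volatility
price = 100 * np.cumprod(1 + daily_change)

stock = pd.DataFrame({"Day": np.arange(1, days + 1), "Price": price})
```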

Simulated stock prices random walk

Notice how, like the coin toss simulation, the stock price doesn’t follow a straight line. Instead, it fluctuates over time, occasionally dipping downward, sometimes climbing higher again. Even though each individual day’s change is random, the result looks strikingly similar to actual stock market behavior.

But here’s the thing: One single forecast can only show one possible future. In real life, many outcomes are possible.

Think of the typical forecasts you see in business, like projected revenues, sales goals, or stock prices. They’re often presented as one neat line or a single estimate. But relying solely on one forecast can give a false sense of certainty. It’s easy to overlook that the real world rarely matches that neatly drawn line.

Instead, it’s crucial to embrace uncertainty. By visualizing many different scenarios, you can better understand the full range of possible outcomes. This helps you plan for the unexpected, identify potential risks, and avoid overconfidence in a single prediction.

Creating and visualizing multiple random walk simulations

To truly appreciate this idea, let’s extend our simulation and visualize multiple scenarios at once.

“Simulate 100 random stock price paths instead of just one. Plot them all on the same chart so it looks messy. Add comments explaining why one single forecast isn’t enough — in real life we could end up with any of these.”
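The core of what Copilot produces is a two-dimensional simulation like this (re-simulated here with placeholder drift/volatility so the snippet stands on its own):

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, days = 100, 252

# One row per simulated path; each path compounds its own random daily moves
daily = rng.normal(loc=0.0005, scale=0.01, size=(n_paths, days))
paths = 100 * np.cumprod(1 + daily, axis=1)

# Plotting every row of `paths` on one chart gives the deliberately messy view
```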

100 simulated stock prices over 1 year

At first glance, this chart looks chaotic, and that’s precisely the point. Each line represents one possible scenario that might unfold. Some scenarios end positively, some negatively, and others hover in the middle.

Here’s the key insight for you as an Excel analyst: Relying on a single forecast can be risky, because real-life outcomes can vary widely. When you visualize multiple possible futures, you’re reminded of this uncertainty. This helps you make decisions more cautiously and realistically, preparing you for multiple possibilities instead of placing too much confidence in one narrow prediction.

Visualizing the “cone of uncertainty”

The previous chart showed us 100 possible outcomes. But realistically, presenting hundreds of scenarios isn’t always practical or helpful. Instead, let’s summarize all these possibilities by showing a clear range of likely outcomes.

Specifically, we’ll calculate the 5th, 50th (median), and 95th percentiles from these 100 scenarios for every day of our simulated year. This creates a visual “uncertainty cone,” clearly communicating not only what’s possible, but also what’s probable.

Here’s how you’d prompt Copilot:

“From the 100 random paths, calculate the 5th, 50th, and 95th percentiles for each day. Plot a shaded cone (5–95%) with the median line in the middle. Add comments explaining how this cone shows a range of likely futures, not just one line.”
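The percentile math behind the cone looks roughly like this (the paths are re-simulated with placeholder parameters so the snippet is self-contained):

```python
import numpy as np

rng = np.random.default_rng(0)
paths = 100 * np.cumprod(1 + rng.normal(0.0005, 0.01, size=(100, 252)), axis=1)

# Per-day percentiles across the 100 paths: the 5-95% band plus the median
low, mid, high = np.percentile(paths, [5, 50, 95], axis=0)

# Shading between `low` and `high`, with `mid` drawn on top, produces the cone
```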

Uncertainty cone random walk

Instead of placing too much confidence in a single prediction, this visualization helps you recognize the inherent uncertainty of forecasts. As an analyst, presenting data in this way helps decision-makers understand risk, plan more realistically, and prepare for multiple potential futures—not just the one we hope or expect to see.

Visualizing the distribution of final outcomes

The uncertainty cone showed a broad range of likely outcomes. But let’s focus specifically on the end of our forecast period. If we simulate 500 random stock price paths, where do prices usually end up after one year?

You can prompt Copilot with the following instruction:

“After simulating 500 paths, make a histogram of the final prices after 1 year. Mark the average, the low end (5%), and the high end (95%). Write a short caption in plain English: ‘Most of the time the stock ends up around X, but there’s a chance it could be much lower or higher.'”
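The summary numbers behind that histogram come from the final column of the simulation, something like this sketch (again with placeholder parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
paths = 100 * np.cumprod(1 + rng.normal(0.0005, 0.01, size=(500, 252)), axis=1)

# Where do the 500 simulated paths end up after one year?
final = paths[:, -1]
avg = final.mean()
p5, p95 = np.percentile(final, [5, 95])

# A histogram of `final` with vertical lines at avg, p5, and p95 tells the story
```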

Distribution of stock prices random walk

This histogram summarizes clearly how our 500 simulations played out. The average final stock price lands around 114, shown by the solid vertical line. But notice the two dashed lines: at the lower end (5%), the stock price could drop to around 87, while at the higher end (95%), it could climb to around 146.

Conclusion

As explored in this post, randomness plays a larger role in your data than you might initially expect. A single forecast can feel clear and precise, but it rarely captures the true uncertainty of real-world outcomes. By visualizing multiple scenarios, using uncertainty cones, and examining distributions of potential outcomes, you communicate clearly about risk and uncertainty, helping decision-makers set more realistic expectations.

These simulations have limitations, however. Real-life data often includes specific trends, seasonal variations, or sudden shocks that our simplified random walk model doesn’t fully address. To extend your analysis further, consider incorporating historical patterns into your simulations or exploring more sophisticated forecasting techniques like Monte Carlo analysis.

Embracing uncertainty makes you a more effective analyst by improving your judgment, clarifying your communication, and better preparing you and your team for a variety of potential futures.

If you’d like to explore these concepts directly, download the complete Excel sample solution file below:

 

The post Python in Excel: How to understand the random walk with Copilot first appeared on Stringfest Analytics.

Python in Excel: How to check Benford’s Law with Copilot https://stringfestanalytics.com/python-in-excel-how-to-check-benfords-law-with-copilot/ Fri, 12 Sep 2025 20:36:22 +0000

Numbers don’t spread out evenly across digits. In many real datasets, the leading digit “1” shows up about 30% of the time while “9” hardly appears. That’s Benford’s Law. If your numbers don’t follow this pattern when they should, it could mean something odd or even fraudulent is going on.

The reason comes from scale. When values stretch across different magnitudes like dollars, populations, or GDPs, lower digits cover more of the number line. That makes “1” appear more often than “9.”

Excel users should care because this is a quick way to sanity check data. Doing it in plain Excel is messy, but with Copilot and Python in Excel you can generate the code, run the test, and see results right in the sheet.

We’ll be working with GDP data from the World Bank. It spans everything from tiny island nations to the biggest economies, which makes it a perfect test case for Benford’s Law. Download the exercise file below to follow along:

 

Frequency counts of first digit

To check Benford’s Law, I asked Copilot:

“Using GDP data, extract the first digit from each country’s GDP value. Count how frequently each digit from 1 to 9 occurs as the first digit.”
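The code Copilot writes for this step boils down to a leading-digit extraction like the following (the GDP values here are hypothetical stand-ins; the post uses the World Bank dataset):

```python
import pandas as pd

# A few hypothetical GDP values standing in for the World Bank data
gdp = pd.Series([1.2e9, 1.9e10, 2.4e11, 8.7e9, 1.05e12, 3.3e10])

# Leading digit: convert to integer, take the first character of the string
first_digit = gdp.astype("int64").astype(str).str[0].astype(int)
counts = first_digit.value_counts().sort_index()
```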

Benford's GDP frequency counts

Copilot returned a simple frequency table. The digit 1 appeared 69 times as the first digit, far more than any other. The counts then tapered off, with 2 showing up 50 times, and by the time we reach 9, it only appeared 8 times.

That “long tail” is exactly what Benford’s Law predicts: in many naturally occurring datasets, smaller leading digits like 1s and 2s are far more common than larger ones like 8s and 9s. Why? Because numbers grow exponentially across orders of magnitude. A country’s GDP is just as likely to fall between 10 and 20 billion as between 80 and 90 billion, but the first digit in those ranges will be very different.

This quick count shows the GDP data behaving as expected. If we saw a flat or unusual distribution, that would be a red flag. Analysts sometimes use this property of Benford’s Law to detect anomalies or even fraud in financial reporting.

Visualizing the actual distribution versus Benford’s Law

Once I had Copilot tally up the first digits, I asked it to take the next step:

“Calculate the percentage distribution of these first digits and compare them visually against the expected distribution according to Benford’s Law.”
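The expected side of that comparison comes straight from Benford’s formula, which is worth seeing on its own:

```python
import numpy as np

# Benford's Law: P(leading digit = d) = log10(1 + 1/d)
digits = np.arange(1, 10)
benford_pct = np.log10(1 + 1 / digits) * 100

# Digit 1 is expected about 30.1% of the time; digit 9 only about 4.6%
```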

Benford's law plot

Copilot generated a side-by-side chart. The blue bars show the observed GDP data; the orange bars are the theoretical Benford’s Law distribution. Notice how closely they track. About 30% of GDP figures start with the digit 1, while fewer than 5% start with 9: almost a perfect fit with Benford’s prediction.

Interpreting the results

To push the analysis further, I asked Copilot to highlight where the GDP data deviates from Benford’s Law.

“Highlight significant deviations between actual first-digit frequency versus Benford’s Law expected frequency, if any. Suggest possible interpretations or concerns from a business auditing perspective.”

Copilot’s commentary pointed out where the GDP data seemed to deviate from Benford’s Law… a few too many 2s and 4s, a few too few 8s and 9s. But are those differences actually meaningful, or just random noise in the dataset?

Testing for statistical significance

That’s where a statistical test comes in. I asked Copilot to run a chi-square test to formally compare the observed distribution against Benford’s Law.

“Perform a statistical test (e.g., chi-square) to quantify the deviation from Benford’s Law.”
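The test itself is a one-liner with scipy. In the counts below, 69, 50, and 8 come from the post; the middle values are made up just to keep the example self-contained:

```python
import numpy as np
from scipy.stats import chisquare

# Observed first-digit counts (1 through 9); middle values are hypothetical
observed = np.array([69, 50, 28, 21, 14, 13, 11, 9, 8])

# Expected counts under Benford's Law, scaled to the same total
digits = np.arange(1, 10)
expected = np.log10(1 + 1 / digits) * observed.sum()

chi2, p = chisquare(observed, f_exp=expected)
# A p-value above 0.05 means the data is consistent with Benford's prediction
```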

Chi square test on Benford's law data

The result was a chi-square statistic of about 11.44 with a p-value of 0.18. Since that’s well above the usual 0.05 threshold, we don’t have evidence that the GDP data significantly deviates from Benford’s prediction. In other words, the small bumps we saw in the bar chart are well within the range of normal variation.

Conclusion

So what does all this mean? Benford’s Law isn’t just a neat math curiosity. It’s a practical lens for spotting when numbers might not add up. With Copilot and Python in Excel, you can go from raw data to statistical tests without leaving your spreadsheet.

For auditors, that means a quick way to flag suspicious ledgers. For analysts, it’s a sanity check on whether a dataset behaves like it should. And for anyone working with large, messy numbers, it’s a reminder that the patterns in data can be just as important as the numbers themselves.

The takeaway: if your digits line up with Benford’s Law, good… you can have more confidence in your dataset. If they don’t, that’s not proof of fraud, but it is a signal worth investigating. Either way, Excel now gives you the power to test it yourself with just a few prompts.

The post Python in Excel: How to check Benford’s Law with Copilot first appeared on Stringfest Analytics.

How to create a real estate dashboard in Power BI, Part 2: Building the dashboard https://stringfestanalytics.com/how-to-create-a-real-estate-dashboard-in-power-bi-part-2-building-the-dashboard/ Fri, 12 Sep 2025 18:28:04 +0000

In a previous post, I showed how to take a tenancy schedule from a real estate brochure PDF and turn it into a Power BI dashboard. In the last post, we focused on retrieving and cleaning the data:

In this post, we’ll move on to building out the dashboard itself. You can follow along using the exercise file below:

 

Remember, the Power Query steps we did earlier were mainly backend work. If you want numbers to appear in a specific format, like currency or percentages, that’s handled separately in Power BI under Table View. As you continue building the dashboard, be sure to switch back to this view when needed to adjust formatting. I’ll get started by formatting Rent (£ PA) as currency in GBP.

Data view format columns

The first step in the report view is to add KPI numbers across the top of the dashboard. Select the Card visual and place it in the upper-left corner of the canvas, then drag Sum of Rent (£ PA) into the Fields area of the card.

Card Power BI

If your card value isn’t showing in pounds, make sure to revisit the previous step: switch to Table View and format the column correctly.

Sum of rent per annum card

This is the point where we can start validating our numbers against the original source. The PDF (linked in the first post of this series) includes a Total row at the end. Let’s compare our dashboard results with that figure… and they match.

Totals Power BI

Nice work so far. Next, let’s calculate the average rent per square foot. Create a new Card visual, drag Rent (£ PSF) into the Fields, and change the aggregation to Average. You’ll get a result… but it won’t match the figure in the PDF. Why is that?

Wrong rent average Power BI

To match the average rent per square foot shown in the brochure, we can’t just take a simple average of the rent values row by row. That approach gives every lease the same weight, even though some spaces are much larger than others. Instead, we need a weighted average that takes square footage into account. In Power BI, we can calculate this by creating a DAX measure.

Go to Home on the ribbon, select New Measure, and enter the following:
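A weighted average of rent per square foot is just total rent divided by total area, so the measure looks something like this (the table name Tenancy and the column names are my assumptions based on the fields mentioned in the post):

```dax
Avg Rent PSF =
DIVIDE (
    SUM ( Tenancy[Rent (£ PA)] ),
    SUM ( Tenancy[Area (sq ft)] )
)
```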

The resulting measure should now match the total from the original PDF. Great. To format it, switch back to Table View. You won’t see the measure displayed inside the table itself, but you will find it listed with the table’s fields on the far right side of the screen.

Corrected DAX avg rent

Great work so far. The last KPI we’ll add to this dashboard is the vacancy rate. In the original data, you’ll notice a row labeled Vacant (Fitted). We can use this row as the basis for calculating the percentage of total square footage that is vacant. To do that, create the following measure:
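Based on the description that follows, the measure would look roughly like this (table and column names are my assumptions):

```dax
Vacancy Rate =
COALESCE (
    DIVIDE (
        CALCULATE (
            SUM ( Tenancy[Area (sq ft)] ),
            Tenancy[Tenant] = "Vacant (Fitted)"
        ),
        SUM ( Tenancy[Area (sq ft)] )
    ),
    0
)
```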

This function is calculating the vacancy rate. At a high level, it works by dividing the square footage of vacant space by the total square footage. The inner CALCULATE finds the sum of square feet for rows where the tenant is listed as “Vacant (Fitted).” That result is then divided by the sum of square feet across the whole dataset to give us a percentage.

The last piece is the COALESCE function, which just says “if the calculation doesn’t return anything, replace it with zero.” This matters because if the vacant row gets filtered out, the numerator would be blank and the result would look like an error. By wrapping the calculation in COALESCE, we make sure it shows a 0% vacancy rate instead.

It’s completely fine if you don’t follow every detail here. The main idea is that the function finds vacancy as a share of the total, and COALESCE is just a safety net to show zero if the vacant row isn’t present.

KPIs in Power BI thus far

We’ve still got a little room left on the top panel, so let’s add a slicer. In Power BI, a slicer works like an interactive menu that filters what you see on the dashboard. Here, we’ll use it to switch between Cat A and Cat B fit-outs. In real estate terms, Cat A means the space is partially finished, while Cat B means it’s fully customized and move-in ready. Go ahead and drop Fit Out into the slicer’s Fields.

To make this slicer a little bigger and easier to use, I’m going to modify the appearance under Format visual > Style, making it a Tile-style slicer, and then set a larger font under Values.

Format slicer Power BI

Nice work so far! We’ve got some solid KPIs and interactive slicers at the top of the dashboard. Now, what should we add to the bottom? One good option is a Gantt chart to track lease timelines.

A Gantt chart shows data across time: each row represents a tenant, and the bars show when their lease starts and ends. This makes it easy to see lease lengths, overlaps, and upcoming expirations.

For a real estate dashboard, that’s especially useful because vacancies stand out at a glance. Power BI doesn’t include a Gantt chart by default, but Microsoft offers one you can add. Just click the three dots under Visual options and select Get more visuals:

Look up Gantt charts and choose the one published by Microsoft:

Microsoft Gantt chart Power BI visual

This one takes a bit more effort to set up. Start by dragging Tenant to Task, Lease Start to Start Date, and Lease Expiry to End Date. Power BI will aggregate this as the earliest Lease Expiry by default, but since each tenant has only one expiry date, that works fine.

You’ll see a Gantt chart appear, but at first it’s tough to read because it shows progress week by week. To make it clearer, roll it up to the annual level. Go to the format visual options and, near the bottom, change the aggregation to Year. Now the chart is much easier to read.

Aggregate Gantt chart

You may notice that many of the category labels are getting cut off in this chart. To fix this, go to Category Labels in the visual format settings and widen the area. You can also bump up the font size a bit if you’d like.

Category labels Gantt chart

It’s also a good idea to add a title. You can do this under the General section of the format settings.

Gantt chart title

This is looking great so far. To give each lease a bit more context, we can add details to the tooltips, like Area and Rent. Just drag those fields into the Tooltips section. Don’t use the average rent measure here. We want the raw values, and Power BI won’t accept a measure in this case.

Tooltips Power BI

This is coming along nicely. One thing I’m not loving is the date formatting—I don’t need the timestamp. And what if I wanted it in the British format instead? To fix this, go to the visual format settings, open Tooltip Settings, and change the date format to dd/MM/yyyy.

Date format tooltip

This is looking great so far. But you might notice that the Vacant line isn’t showing up in the Gantt chart. That’s because there’s no start or end date. There’s no lease since it’s vacant.

To work around this, we can fill in those missing dates with the earliest and latest lease dates from the other tenants. We’ll do this with calculated columns. To set it up, go back to Table View, then on the ribbon under Home or Column Tools, select New Column and create this column:

This calculation uses each tenant’s lease start date, but if that’s missing it fills it with the latest lease start date in the dataset. In short, it makes sure there’s always a start date available, which forces the vacant rental to appear in the Gantt chart.
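The formula itself appears in the screenshot, but based on the description above it would look something like this DAX sketch (the Tenancy table and column names here are assumptions; adjust them to match your own model):

```
Lease Start (Filled) =
// 'Tenancy' is an assumed table name -- substitute your own
COALESCE (
    Tenancy[Lease Start],
    MAX ( Tenancy[Lease Start] )  // latest lease start in the dataset
)
```

An analogous calculated column does the same for Lease Expiry.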

Add calculated columns Power BI

Next, do the same process to fill in the lease expiry.

Vacant is now showing up in the Gantt chart. As far as I know there isn’t a way to conditionally format its color. We could group vacant rentals under a parent category, but with the limited space here I’ll leave it as is. Still, pretty cool that we turned a static PDF into something like this.

Final real estate tenancy schedule

Bringing this all together, we’ve gone from a static PDF tenancy schedule to a fully interactive Power BI dashboard. Along the way, we cleaned and shaped the data, added KPIs, built a slicer for quick comparisons, and even created a Gantt chart to track lease timelines. We also solved some tricky issues like formatting dates, weighting averages correctly, and ensuring vacant units display properly.

That’s the real power of a dashboard: it transforms flat, static information into something dynamic and insightful. Instead of flipping through rows in a brochure, you can instantly spot trends, validate numbers, and track leases.

The post How to create a real estate dashboard in Power BI, Part 2: Building the dashboard first appeared on Stringfest Analytics.

Python in Excel: How to do price elasticity analysis with Copilot https://stringfestanalytics.com/python-in-excel-how-to-do-price-elasticity-analysis-with-copilot/ Thu, 28 Aug 2025 12:12:40 +0000

Understanding how price changes affect customer demand is one of the most valuable insights you can bring to your business. This concept, known as price elasticity of demand, helps answer practical questions such as what happens to sales if prices increase, how much additional volume promotions really generate, and whether a company is leaving money on the table by discounting too often or not often enough.

Traditionally, analyzing elasticity required exporting data into specialized statistical tools or coding environments. That meant extra complexity and often a steep learning curve. With Python now integrated directly into Excel, and with Copilot guiding you step by step, you can perform this kind of sophisticated econometric analysis without ever leaving your spreadsheet.

In this post, we will use Copilot prompts to generate and analyze a dataset in Excel with Python, estimate the price elasticity of demand for everyday products, visualize the impact of promotions on customer demand, and summarize the business implications of the results.

To make this practical, you can follow along with the exercise file linked below. The dataset contains 200 days of simulated sales data for three products: coffee, tea, and juice. Each record includes variation in price, units sold, promotion status, and revenue. This combination of columns provides just the right balance of simplicity and analytical richness, giving us enough variation to model how price and promotions affect demand while remaining accessible even if this is your first time using Python in Excel.

 

Running the elasticity regression

Price elasticity of demand measures how sensitive customers are to changes in price. If demand is elastic, even a small increase in price will cause sales to fall noticeably. If demand is inelastic, customers will continue buying roughly the same amount even if prices rise. Elasticity is expressed as a number: for example, an elasticity of –1.2 means that a one percent increase in price will cause a 1.2 percent decrease in quantity sold. The negative sign reflects the usual inverse relationship between price and demand.
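You can sanity-check that arithmetic in a couple of lines of Python. This is just the definition above worked out, not anything from the exercise file:

```python
# Quantity scales by (P2 / P1) ** elasticity when price moves from P1 to P2
elasticity = -1.2

price_ratio = 1.01  # a one percent price increase
quantity_ratio = price_ratio ** elasticity

pct_change_in_sales = (quantity_ratio - 1) * 100
print(round(pct_change_in_sales, 2))  # close to a 1.2 percent drop
```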

The most reliable way to calculate elasticity is through a regression model. Instead of simply comparing two price points, regression uses all of the available data to estimate the average relationship between price and quantity sold. A common approach is a log-log regression, where both price and sales are expressed in logarithmic terms. The beauty of this method is that the coefficient on log(price) can be interpreted directly as the elasticity. In other words, Python does the math once, and we get a clean number that tells us how responsive demand is to price.
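Behind the scenes, the fit Copilot produces boils down to only a few lines. Here's a minimal sketch of a log-log fit with NumPy, using a small made-up price and sales series in place of the actual exercise data:

```python
import numpy as np

# Made-up daily price and unit figures standing in for one product
price = np.array([2.00, 2.10, 2.25, 2.40, 2.50, 2.60, 2.75, 2.90])
units = np.array([120, 117, 112, 108, 106, 104, 100, 97])

# Fit log(units) against log(price); the slope is the elasticity
elasticity, intercept = np.polyfit(np.log(price), np.log(units), 1)
print(round(elasticity, 2))  # negative, and between -1 and 0 for this data
```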

Here’s the prompt I’ll use to get these numbers:

“Run a regression in Python to estimate the price elasticity of demand from my dataset.”

Price elasticity

Copilot fitted the model for each product and returned elasticity estimates: Coffee at –0.074, Tea at –0.067, and Juice at –0.108. All three values are negative, as expected, confirming that higher prices lead to lower sales. The magnitude of the number tells us the degree of sensitivity. These results indicate that all three products are inelastic: for instance, raising the price of coffee by one percent would reduce sales by only about 0.07 percent. Tea is slightly less sensitive still, while juice shows the greatest responsiveness of the group.

Visualizing the relationship between price and units sold

Numbers alone can sometimes feel abstract. That’s why it’s helpful to visualize the connection between price and units sold. By plotting these two variables against each other, we can see the demand curve in action.

“Visualize the relationship between price and units sold as a scatterplot with a trendline.”

Scatterplot of price vs units sold

The result is a set of scatterplots, one for each product, showing the observed sales at different price points, with a fitted trendline running through the data. Each blue dot represents a day’s sales. The red line represents the overall relationship between price and units sold.

Looking at the charts, the downward slope of the trendline confirms what the regression already told us. As prices rise, sales tend to fall. For coffee and tea, the slope is very shallow, which aligns with the relatively inelastic elasticity values we saw earlier. Customers keep buying even as the price fluctuates. Juice, however, shows a more noticeable downward slope, indicating that its sales are more sensitive to price changes.
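If you're curious what Copilot's plotting code roughly looks like, here's a minimal Matplotlib sketch, again using a made-up series for one product rather than the actual worksheet data:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # draw off-screen; Python in Excel renders inline instead
import matplotlib.pyplot as plt

# Made-up observations for one product
price = np.array([2.00, 2.10, 2.25, 2.40, 2.50, 2.60, 2.75, 2.90])
units = np.array([120, 117, 112, 108, 106, 104, 100, 97])

# Straight trendline on the raw values, like the plots above
slope, intercept = np.polyfit(price, units, 1)

fig, ax = plt.subplots()
ax.scatter(price, units, color="tab:blue", label="Daily sales")
ax.plot(price, slope * price + intercept, color="tab:red", label="Trendline")
ax.set_xlabel("Price")
ax.set_ylabel("Units sold")
ax.legend()
fig.savefig("price_vs_units.png")
```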

Are these really demand curves?

If you’ve ever sat through an economics class, you probably remember seeing demand curves drawn as smooth, downward-sloping curves. They often look a bit exponential: sales fall gradually at first as prices rise, then more sharply as prices get very high. That shape reflects the idea that the relationship between price and quantity demanded is not perfectly straight, but curved.

So how do the scatterplots we just made fit into that picture? Strictly speaking, what you see above are empirical demand relationships, not demand curves in the textbook sense. Each dot represents a real observation from our dataset: a specific day’s price and the units sold at that price. The red line is a fitted trendline showing the overall direction of the relationship. It looks straight because we asked Python to fit a straight line in this plot.

Behind the scenes, though, the regression model we ran earlier was a log-log regression. That means both price and units sold were transformed into logarithmic terms before fitting the model. This approach essentially assumes that the relationship between price and demand is multiplicative, closer to the curved demand curve you might have seen in an econ diagram. The elasticity coefficient that comes out of that regression captures the slope of that curve at any point, telling us how sensitive demand is to percentage changes in price.

So while these scatterplots are extremely useful for building intuition and for communicating with a broader audience, they’re not demand curves in the pure economic sense. They’re visualizations of the actual data we have. The true “demand curve” is an abstract concept: it’s the smooth function we estimate from the regression that describes how quantity would respond to every possible price, not just the ones we happened to observe in the data.

Examining the impact of promotions

Price is not the only factor shaping demand. Promotions such as discounts, special offers, or marketing pushes can dramatically influence how much customers buy, even at the same price point. To tease apart these effects, we can ask Copilot to separate the data into two groups: sales that happened during promotions and sales that did not. Here’s the prompt I used:

“Show me how promotions shift the demand curve compared to non-promotion days.”

Scatterplot of elasticity vs promotion

The resulting charts plot two sets of points for each product: blue dots represent promotion days, while orange dots represent non-promotion days. Each has its own trendline fitted through the cloud of points.

What do we see? For all three products, the promotion trendline sits higher than the non-promotion trendline. That means that at any given price, promotions are associated with higher units sold. For coffee, the gap is fairly wide, suggesting promotions provide a noticeable boost in demand. Tea shows a smaller but still consistent uplift. Juice again looks the most sensitive, with promotion days pulling sales clearly above the baseline.
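If you want to go beyond eyeballing the trendlines, one standard approach is to add a promotion dummy to the log-log regression. This sketch builds a small synthetic dataset with a known elasticity and promotion lift, then recovers both with an ordinary least-squares fit; the numbers are invented purely for illustration:

```python
import numpy as np

# Synthetic data: the same prices observed with and without a promotion
price = np.array([2.0, 2.2, 2.4, 2.6, 2.0, 2.2, 2.4, 2.6])
promo = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Demand built from a known model: elasticity -0.5, promo lift exp(0.3), ~35%
units = np.exp(4.0 - 0.5 * np.log(price) + 0.3 * promo)

# OLS on log(units) with an intercept, log(price), and the promo dummy
X = np.column_stack([np.ones_like(price), np.log(price), promo])
coefs, *_ = np.linalg.lstsq(X, np.log(units), rcond=None)
intercept, elasticity, promo_effect = coefs
```

Because the data were generated from the model itself, the fit recovers the elasticity (-0.5) and the promotion effect (0.3 in log terms) exactly; with real data you'd see estimates with noise around them.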

Deriving business recommendations

At this point, we’ve run regressions and plotted scatterplots, but what does the elasticity number actually mean for someone making pricing decisions? To bridge that gap, I asked Copilot to summarize the results in straightforward business terms with the following prompt:

“Summarize in plain English what the elasticity estimate means for my pricing strategy.”

The key takeaway here is that your pricing strategy should be deliberate: modest adjustments can capture additional revenue with minimal risk, but aggressive changes may erode demand more sharply.

Conclusion

Elasticity analysis gives us more than just numbers; it provides a way to think strategically about pricing and promotions. There are limits, however, to what this analysis can tell us. Elasticity is an average, not a guarantee. It doesn’t capture seasonal swings, competitive pressures, or differences across customer segments.

Promotions, too, are not all created equal. The type of offer, the timing, and the marketing channel can all change their impact. And of course, our dataset was simplified for illustration. Real-world data will be messier, and interpretation will always require judgment.

The next steps for someone doing this kind of work in a business setting would be to expand beyond a single product and time frame. You might test whether elasticity looks different across categories, examine how it changes during peak versus off-peak seasons, or simulate “what if” scenarios to see how a pricing or promotion strategy could play out.

For the typical Excel user, the bigger picture is that elasticity analysis provides a structured, data-driven way to quantify customer behavior. It helps you gauge how much flexibility you have when adjusting prices or running promotions, turning raw sales data into guidance for real decisions. Approached thoughtfully, it becomes one of the most practical tools in your analytics toolkit.

The post Python in Excel: How to do price elasticity analysis with Copilot first appeared on Stringfest Analytics.

Copilot in Excel: How to do social influencer analysis with Python https://stringfestanalytics.com/copilot-in-excel-how-to-do-social-influencer-analysis-with-python/ Wed, 27 Aug 2025 18:16:39 +0000

Understanding who your most influential customers are—those who consistently recommend your business—is essential for sustainable, cost-effective growth. This approach, called social influencer analysis, identifies customers who act as connectors and referral hubs within your customer base. Historically, performing this type of analysis required complex Excel formulas or specialized analytics software, placing it beyond the reach of many everyday Excel users.

With Python’s NetworkX package now directly accessible inside Excel, performing sophisticated influencer analysis has become intuitive and visual. NetworkX quickly reveals referral relationships, measures influencer importance through centrality metrics, and identifies natural customer communities, giving Excel users a significant analytical advantage. Better yet, by leveraging Copilot and generative AI, you can perform this entire analysis simply by using clear, natural-language prompts.

To get started, download the exercise file below, which includes sample customer referral data you can use to practice these techniques.

 

Creating a referral network

Once you’ve loaded your referral data into Excel, you’re ready to visualize the network of customer referrals. Using Excel’s built-in Copilot and generative AI, you can prompt it to create a clear and intuitive network graph:

“Using NetworkX, create a directed network graph where each node is a customer, and each edge represents a referral (from referrer_customer to referred_customer). Visualize the graph clearly labeling all nodes.”

Directed network graph Copilot

The resulting visualization represents customers as nodes (points), with arrows indicating the direction of referrals from one customer to another. From this directed graph, it’s easy to immediately spot key connectors (customers who refer many others, indicated by multiple outgoing arrows) as well as chains of referrals flowing through the customer network.

At a glance, you can see influential referrers like Alice, Bob, and Grace, each having multiple connections. You can also observe referral chains such as Victor → Uma → Tina → Steve, highlighting how certain customers can indirectly drive growth beyond immediate connections.

This powerful yet intuitive visualization provides immediate insights into referral behavior, identifying influential customers and potential areas to strategically encourage even more referrals.
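If you're curious what Copilot generates behind the scenes, the NetworkX portion is only a few lines. Here's a minimal sketch using a handful of illustrative referral pairs consistent with the names above; the full edge list lives in the exercise file:

```python
import networkx as nx

# Illustrative referral pairs (referrer -> referred), not the full dataset
referrals = [
    ("Alice", "Eva"), ("Alice", "Henry"),
    ("Bob", "Mike"), ("Bob", "David"),
    ("Grace", "Quinn"),
    ("Victor", "Uma"), ("Uma", "Tina"), ("Tina", "Steve"),
]

G = nx.DiGraph(referrals)

# Outgoing edges show who each customer referred
print(sorted(G.successors("Alice")))  # ['Eva', 'Henry']
```

From here, `nx.draw(G, with_labels=True)` is the usual way to render the graph; Copilot handles the layout and labeling details for you.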

Identifying the most influential customers

After visualizing the customer referral network, the next step is to quantify which customers are the most influential: those who directly referred the highest number of new customers. NetworkX simplifies this step by calculating out-degree centrality, a measure of how many referrals each customer made relative to the entire network.

Using Copilot, we enter the following clear and simple prompt:

“Use NetworkX to calculate centrality metrics for each customer node in the referral graph. Identify the top 5 customers based on their out-degree centrality, clearly displaying each customer’s name along with their centrality score.”

Out degree centrality Copilot

The results show the five customers who are most actively referring new customers: Alice, Carol, Grace, and Bob are tied as top influencers, each with an out-degree centrality score of approximately 0.0952. This indicates these customers have made the most referrals compared to the rest of the network. Quinn also appears as an influential customer with a centrality score of about 0.0476, meaning she’s also actively contributing to customer growth.

From a business perspective, these influential customers are ideal candidates for rewards, loyalty programs, or ambassador campaigns. Recognizing and incentivizing these individuals can amplify your growth by motivating them to refer even more new customers, boosting organic growth at a relatively low cost.
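For reference, the centrality calculation itself is a one-liner in NetworkX, and the normalization explains scores like 0.0952: out-degree centrality divides each customer's referral count by the number of other nodes, so a score of roughly 0.0952 is what you'd get from, say, two referrals in a 22-customer network. Here's a sketch with made-up data:

```python
import networkx as nx

# Made-up network: 22 customers, one of whom ("Alice") makes two referrals
G = nx.DiGraph()
G.add_nodes_from(f"Customer {i}" for i in range(1, 22))  # 21 other customers
G.add_edges_from([("Alice", "Customer 1"), ("Alice", "Customer 2")])

centrality = nx.out_degree_centrality(G)

# Out-degree centrality = referrals made / (number of nodes - 1) = 2 / 21
print(round(centrality["Alice"], 4))  # 0.0952
top = max(centrality, key=centrality.get)
```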

Detecting customer communities

Next, we want to discover customer communities, or natural groupings within our referral network, using something called a community detection algorithm. This algorithm will analyze how customers are connected through referrals, finding groups who interact more closely with each other than with the broader network. It starts by taking our referral network (where referrals have a direction, from one customer to another) and simplifying it into connections without direction, effectively just looking at who knows who, without worrying about who referred whom first. Then it systematically identifies groups that are tightly interconnected, giving us natural segments within our customer base.
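The Girvan-Newman algorithm the prompt mentions is available directly in NetworkX. Here's a minimal sketch on a made-up six-customer network, two tight groups joined by a single bridge referral, to show how communities fall out:

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

# Two tight groups joined by a single bridge edge (names are illustrative)
edges = [
    ("Alice", "Carol"), ("Carol", "Eva"), ("Eva", "Alice"),  # group 1
    ("Bob", "Mike"), ("Mike", "David"), ("David", "Bob"),    # group 2
    ("Eva", "Bob"),                                          # the bridge
]
G = nx.Graph(edges)  # undirected: who is connected, ignoring direction

# Girvan-Newman removes the highest-betweenness edge (the bridge) first,
# so the first partition it yields is the two tight groups
communities = [set(c) for c in next(girvan_newman(G))]
print(communities)
```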

Here’s our prompt:

“Convert our directed referral graph into an undirected graph, and apply a community detection algorithm (such as Girvan-Newman from NetworkX) to identify distinct customer communities. Clearly list and visualize the customers within each detected community.”

Undirected referral network

Excel’s traditional grid can have difficulty clearly showing these community clusters, because the algorithm returns groups as Python lists rather than familiar Excel tables (DataFrames). However, when in doubt, simply ask Copilot for a visual representation.

As shown above, Copilot provided a clear, intuitive graph visually highlighting three distinct communities within the network, each identified with its own color. The blue community, for example, contains customers like Victor, Uma, Tina, Steve, Grace, and Quinn, who strongly interact with each other. Another tightly-knit group (the orange community) includes Bob, Mike, David, Ian, and John, clearly showing internal referral activity. Lastly, customers Alice, Carol, Eva, Henry, Laura, and Kate form a cohesive third (green) community.

With this visualization, it becomes straightforward to identify how your customers naturally segment themselves into groups, offering immediate insights into targeted marketing opportunities—without needing deep data science expertise.

Summarizing business insights

To wrap up our analysis, let’s ask Copilot to provide a concise summary highlighting key insights and practical recommendations:

“Provide a concise summary of insights gained from this analysis, emphasizing the most influential customers (referrer_customer), clearly defined customer communities, and recommendations for targeted marketing campaigns.”

Copilot returns the following write-up:

Based on these insights, your marketing team should focus efforts on the key influencers identified within each customer community. By engaging and rewarding these influential referrers, your marketing messages will resonate deeply within their groups, naturally spreading through their trusted connections, ultimately increasing your campaign’s impact and effectiveness.

Conclusion

Social influencer analysis provides Excel users with a powerful yet intuitive method to uncover hidden insights within their customer referral data—insights that were traditionally locked behind specialized analytics tools.

However, as powerful as these methods are, they’re not without challenges. A significant limitation in performing this kind of analysis typically involves the initial step: collecting, cleaning, and structuring your referral data. In many organizations, referral data might be scattered across emails, customer databases, or CRM systems, each with inconsistent formats. Analysts commonly tackle these data challenges by standardizing naming conventions, automating data cleaning tasks with tools like Power Query or Python scripts, and establishing clear data-entry protocols to prevent future inconsistencies.

As a next step, consider pairing this influencer analysis with other customer data you already maintain such as demographics, purchasing history, or customer lifetime value. Integrating this additional context can provide even richer insights, allowing for more targeted and effective marketing campaigns. Finally, remember that the value of this type of analysis extends beyond just identifying your top customers. It builds organizational confidence in making data-driven decisions, empowering Excel users across the business to extract deeper value from their existing tools and data.

The post Copilot in Excel: How to do social influencer analysis with Python first appeared on Stringfest Analytics.
