boxplot - Stringfest Analytics https://stringfestanalytics.com Analytics & AI for Modern Excel Wed, 09 Jul 2025 21:02:36 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.4 https://i0.wp.com/stringfestanalytics.com/wp-content/uploads/2020/05/cropped-RGB-SEAL-LOGO-STRINGFEST-01.png?fit=32%2C32&ssl=1 boxplot - Stringfest Analytics https://stringfestanalytics.com 32 32 98759290 Python in Excel: How to visualize seasonality https://stringfestanalytics.com/python-in-excel-how-to-visualize-seasonality/ Wed, 09 Jul 2025 21:02:27 +0000 https://stringfestanalytics.com/?p=15535 Visualizing seasonality is a critical analytical skill in business because many essential metrics like sales, revenue, or demand often follow predictable seasonal patterns. Understanding these patterns helps businesses anticipate periods of peak activity, manage inventory, allocate resources effectively, and create accurate forecasts. Python in Excel makes seasonal analysis especially straightforward. Unlike traditional Excel, where handling […]

The post Python in Excel: How to visualize seasonality first appeared on Stringfest Analytics.

]]>
Visualizing seasonality is a critical analytical skill in business because many essential metrics like sales, revenue, or demand often follow predictable seasonal patterns. Understanding these patterns helps businesses anticipate periods of peak activity, manage inventory, allocate resources effectively, and create accurate forecasts.

Python in Excel makes seasonal analysis especially straightforward. Unlike traditional Excel, where handling and grouping date-based data can become complex or cumbersome, Python seamlessly manages dates and periods, enabling intuitive grouping, aggregation, and visualization.

Ready to see this in action? Follow along using the famous airline passengers dataset by downloading the exercise file below:

 

Before we start visualizing seasonal patterns, we need to prepare our dataset by extracting date-related components. The original file contains a month column in date format, so we first convert it using pd.to_datetime() to ensure pandas recognizes it as a proper datetime object.

From there, we extract the year, month name, and month number, storing each in its own column. These new fields allow us to group, sort, and plot the data by month and year: critical for showing seasonality over time. This simple prep step makes the upcoming visualizations far easier to build and interpret in Python.

Seasonal overlay plot

With the data set up, our first visualization is a line chart showing airline passenger counts for each month, with separate lines representing each year. To highlight growth across this period, we’re labeling selected years (1949, 1955, and 1960) at their December data points. It would be possible to label every year, but neatly positioning all these labels is tricky in Python in Excel without access to external libraries like adjustText, which manage overlapping labels automatically.

This plot clearly reveals recurring seasonal patterns: passenger volumes consistently peak around mid-year, particularly July and August, before declining toward December. Overlaying multiple years helps to compare these seasonal cycles directly, allowing us to quickly assess consistency across time periods and identify overall growth trends. Seeing each year as its own line makes it easy to recognize both typical seasonal fluctuations and any unusual deviations, making this visualization especially valuable for exploratory analysis and trend identification.

Monthly passenger trends by year

To be fair, you could easily make a plot like this in Excel using PivotTables. But rather than linger on what’s simple in Excel, let’s keep moving and check out some visualizations Python handles even more gracefully.

Seasonal subseries plot

In the next plot, we’re first creating a boxplot for each month using seaborn’s boxplot(). These boxes illustrate the typical passenger counts—showing medians (the central line), interquartile ranges (the box itself), and overall distribution (the whiskers). Next, we layer a stripplot with individual data points jittered to prevent overlapping, providing extra transparency and detail about each month’s specific data.

The resulting visualization highlights seasonal patterns in the passenger dataset. You can quickly see that passenger numbers tend to rise consistently from January, peak strongly during mid-year (particularly July and August), and then gradually decrease toward year’s end. Additionally, the combination of boxplot and individual data points reveals the month-to-month variability and potential outliers—months like July show a higher median and greater spread, while months such as November show lower and tighter distributions. This helps analysts clearly understand not only the typical monthly passenger counts but also how consistent or variable each month’s pattern is.

Box plus strip

Small multiples plot

In this next visualization, we’re creating a small multiples plot (also known as a facet grid) to examine monthly passenger trends individually, year-over-year. Using seaborn’s FacetGrid, we’ve broken the data into twelve distinct subplots; one for each month. Within each subplot, the gray line shows how passenger counts for that month changed from year to year, allowing us to easily track long-term trends month-by-month.

Additionally, we’ve calculated and drawn a horizontal dashed line representing the average passenger count for each month across all years. This provides a useful benchmark, helping us quickly assess whether a given month in a particular year was above or below its long-term average.

Small multiples seasonal plot

This type of visualization is valuable because it separates seasonal effects from broader trends. We clearly see the growth pattern repeated across months, confirming an overall increasing passenger volume year by year. At the same time, individual facets let us assess each month separately, which highlights whether certain months grew faster or slower relative to their historical averages. This structured view helps analysts and decision-makers identify subtle variations or exceptional cases more quickly than by looking at a single aggregated plot.

Seasonal heatmap

To wrap things up, we have a seasonal heatmap visualization created with seaborn’s heatmap() function. The dataset has been pivoted so that each row represents a year, each column represents a month, and each cell displays the number of airline passengers. Colors ranging from pale yellow (low passenger counts) to deep blue (high passenger counts) illustrate the intensity of passenger traffic clearly and immediately.

This visualization helps us quickly grasp patterns and trends at a glance. As you scan from left to right, the seasonal pattern we’ve previously seen becomes very apparent—passenger numbers consistently peak around July and August, as indicated by the darker colors in those months. Scanning from top to bottom (earlier to later years), the deepening shades of blue illustrate the overall upward trend in passenger traffic across the entire dataset..

Seasonal heatmap

Conclusion

Being able to visualize seasonality using Python in Excel offers significant advantages to analysts. Python streamlines date management, simplifies the aggregation of data across time periods, and provides flexible and insightful visualizations that can deepen your analytical insights and decision-making abilities.

However, it’s important to recognize some limitations. Python in Excel currently doesn’t support certain advanced visualization features readily available through dedicated Python libraries, such as automated label adjustment or highly interactive plots. For tasks requiring very detailed customizations, you may still need standalone Python environments.

A natural next step would be exploring seasonal decomposition, a more robust and statistically rigorous technique for analyzing seasonality. Seasonal decomposition breaks down a time series into clear components—trend, seasonal, and residual—which sets you up perfectly for forecasting future values. Applying this method will further enhance your analytical capabilities and precision in forecasting.

If you have any questions about visualizing seasonality or Python in Excel more broadly, I’d love to hear from you! Drop your comments and questions below.

The post Python in Excel: How to visualize seasonality first appeared on Stringfest Analytics.

]]>
15535
How to make a boxplot by group in Excel https://stringfestanalytics.com/making-a-boxplot-by-group-in-excel/ Sun, 08 May 2022 20:46:00 +0000 https://stringfestanalytics.com/?p=8582 Data visualization is not an optional step in data analytics, as Anscombe’s Quartet makes painfully clear: Viagra kopen op het internet https://legdelink.nl/viagra-kopen-zonder-recept/ welke man heeft het niet eens geprobeerd. Het ontwijken van goedkope illegale aanbieders is een sport op zich, maar een alternatief zoals het kopen in een apotheek of winkel is er helaas niet. […]

The post How to make a boxplot by group in Excel first appeared on Stringfest Analytics.

]]>
Data visualization is not an optional step in data analytics, as Anscombe’s Quartet makes painfully clear:

Viagra kopen op het internet https://legdelink.nl/viagra-kopen-zonder-recept/ welke man heeft het niet eens geprobeerd. Het ontwijken van goedkope illegale aanbieders is een sport op zich, maar een alternatief zoals het kopen in een apotheek of winkel is er helaas niet.
De ervaringen van onze klanten liegen er niet om: Alles is absoluut van hoogwaardige kwaliteit. Daarbij is een doos bij ons snel opnieuw besteld en bezorgd dankzij dagbezorging. Pakketjes zijn klein genoeg om door de brievenbus te passen. Thuis blijven voor het bezorgmoment is onnodig. Het is een productnaam voor Sildenafil, bedoeld om prestaties in bed te verbeteren. Het bekende blauwe pilletje helpt tegen erectiestoornissen en enkele hartaandoeningen. Heren gebruiken het medicijn om langdurige erecties te kunnen waarborgen, zonder last te krijgen van verslapping.

To understand a dataset’s distribution, for example, seeing is truly believing.

While the Anscombe’s quartet post makes use of Python to demonstrate that famous statistics example, we’ll use Excel here to build an everyday data visualization: the boxplot, or box-and-whisker plot. This plot is used to visualize the distribution of a variable, making particularly clear its quartiles and any outliers.

To learn more about boxplots, check out this post:

Visualizing the distribution of housing sales prices

For this exercise, we’ll take a look at housing sales prices.

Download the exercise file here

You’ll notice the dataset contains an index column and is enclosed in a Table. I do this to all my Excel datasets and you should too. Read more about how and why to do that here:

Univariate boxplot

First we’ll start by visualizing the distribution of all values in the housing column. Because we are only visualizing one variable at a time, this is considered a univariate visualization.

The univariate boxplot is easy enough highlight the data in column B and select the boxplot chart option (Excel calls it “Box & Whisker”). After that, I’ll cut and paste the boxplot to a new worksheet:

We’ll do a few more things to spruce this up:

  1. Give the chart a helpful name (not Chart Title!)
  2. Format the y axis in currency (double-click on the y axis to get this started)
  3. Remove the x axis entirely (double-click and press Delete this time) … the number 1 is not helpful and since there’s only one entity along the x axis, we really don’t need to label it.

Bivariate boxplot

OK, now for a bivariate boxplot: what if we wanted to visualize the distribution of price by airco? We’re now getting into a bivariate boxplot, and this can be a little trickier to swing in Excel.

It’s not so bad with untidy data…

The crazy thing about this is that it’s really easy to make a bivariate boxplot in Excel when your data is not tidy. To understand what I mean, take a look at the below example from Spreadsheeto’s awesome tutorial on boxplots in Excel.

Do you see the difference in how the data is laid out on the left versus the right? On the left, we’ve got three columns where it should really just be one: subject. You may know this as “pivoted” data; tidy is another way to put it.

“English,” “Math” and “Physics” are really one variable: Subject. This is not tidy.
“no” and “yes” are not separate columns here. This is tidy.

No fear, there is a way yet to make the bivariate boxplot: the “grouping” variable must go immediately to the left of the numeric variable. So in this case, you’re going to have to move airco in front of price, as has already been done in the previous screenshot. From there, you’ll highlight both columns and insert another boxplot:

Make the same cosmetic adjustment to the bivariate boxplot as you did to the univariate (except this time there’s a reasonable x axis, so don’t delete that) and you should have something like this:

Get more dataviz… get more Excel

This isn’t the most straightforward “hack” to get a desired Excel chart, and to be honest I’m not crazy about having to physically move data around to get the chart I want. That’s why I really suggest at some point learning about R and Python for more reliable visualization and analysis. To get started as an Excel user, check out my book Advancing into Analytics:

All that said, I do live in the real world and understand the staying power of Excel (and that there’s staying power for a good reason!). Looking to learn more about data visualization in Excel? I’ve put together a learning path for just that. Get started here:

What questions do you have about data visualization in general or specifically with Excel? Let me know in the comments.

The post How to make a boxplot by group in Excel first appeared on Stringfest Analytics.

]]>
8582