plotnine - Stringfest Analytics
https://stringfestanalytics.com
Analytics & AI for Modern Excel

Python in Excel: Why use Python for data visualization?
https://stringfestanalytics.com/python-in-excel-why-use-python-for-data-visualization/
Tue, 10 Dec 2024 23:25:39 +0000

The post Python in Excel: Why use Python for data visualization? first appeared on Stringfest Analytics.

Excel has for decades been a mainstay for business and data professionals, providing a familiar environment for organizing, analyzing, and visualizing data. Its built-in charts and graphs offer a quick, accessible solution for conveying insights. In many cases, these standard visuals are still perfectly adequate: if all you need is a simple line graph or a standard bar chart, Excel’s native functionality often provides the fastest path to a finished product.

As data grows more complex and the demands for more nuanced visualization options expand, however, some analysts find themselves looking beyond Excel’s native charting capabilities. This doesn’t mean Excel visuals become irrelevant; quite the opposite. They remain a critical tool in your arsenal, an easy go-to when time is short and the audience expects something clean, clear, and uncomplicated. Yet, as challenges evolve, so do the tools at our disposal. Python’s recent integration into Excel offers one such avenue for growth—not as a replacement for Excel’s familiar charts, but as an enhancement, adding another versatile option to the analyst’s toolkit.

Until recently, working with Python for data visualization typically meant operating outside Excel’s environment. If you wanted the flexibility and customization options offered by Python’s libraries, you had to use external Python environments and tools, then write scripts to manually load static images of those visualizations into Excel. This added complexity often deterred Excel-based analysts from experimenting with Python, keeping many locked into the point-and-click world of the spreadsheet.

That’s all changed now that Python can run natively inside Excel. By bringing Python directly to the place where so many analysts are already comfortable, Microsoft has lowered the barrier to entry. Suddenly, it’s possible to marry the best of both worlds: the familiarity and convenience of Excel’s interface with the expansive capabilities of Python’s visualization libraries. This means you can incrementally learn Python while still relying on Excel’s tried-and-true methods, switching between native charts and code-based visuals as needed. Instead of tossing out your Excel playbook, you can add Python as a complementary chapter.

A quick comparison of Excel’s native charts vs Python in Excel visualizations follows:

While Python integration in Excel is a significant step forward, it’s not a silver bullet. Within Excel’s environment, Python-generated charts currently lack some of the interactive bells and whistles analysts might enjoy when working in a standalone Python environment. For instance, popular Python-based tools for interactive elements—such as tooltips, sliders, or clickable legends—aren’t yet supported. If one of your main goals in adopting Python is to create fully interactive dashboards directly inside Excel, you may need to wait and see how this functionality evolves.

These limitations might feel like a letdown, but they shouldn’t overshadow Python’s fundamental advantages. At its core, Python’s visualization libraries bring a code-based approach to creating and refining charts. Instead of manually selecting colors, fonts, or axis options through a series of clicks and menus, you write a few lines of code that define these elements as parameters. This means your visualizations are inherently reproducible: when it comes time to update the data or tweak a detail, you only need to adjust a line or two of code. By streamlining repetitive tasks, Python helps reduce the time and effort spent clicking through formatting dialogs, making it easier to produce consistent, high-quality visuals on a regular basis.

Moreover, Python’s vast ecosystem, being open-source, encourages experimentation and collaboration. Libraries like Matplotlib, which is highly customizable, let you create anything from standard line charts to intricate multi-panel figures. Seaborn builds on Matplotlib to simplify the creation of statistically oriented plots and aesthetically pleasing defaults. Plotnine, inspired by the grammar-of-graphics approach popularized by R’s ggplot2, offers a more conceptual framework for building plots layer by layer. Each of these libraries brings unique strengths, and together they cover a wide spectrum of visualization needs. If you ever find Excel’s native charts too limited or cumbersome, Python provides an escape hatch—one that doesn’t force you to abandon your spreadsheets entirely.

A quick comparison of the Python packages currently available in Excel for data visualization follows:

Outside Excel, Python’s capabilities expand even further. In standalone Python environments or web-based dashboards, interactive elements like tooltips, zooming, and panning are easily accessible. You can integrate with tools like Bokeh, Altair, or Plotly to build rich, interactive visualizations that invite exploration. While these features aren’t currently available inside Excel’s Python integration, the ecosystem is dynamic and evolving. It’s entirely possible that, over time, the lines between what’s feasible inside Excel and outside it will blur, granting Excel analysts even more interactive capabilities without leaving their native environment.

In the meantime, think of Python as one more tool at your disposal. Excel’s charts aren’t going anywhere, and they remain a solid choice for many day-to-day tasks. Python simply broadens your horizons. By investing a bit of time in learning Python’s syntax and core visualization methods, you stand to gain a level of flexibility and efficiency that might otherwise be out of reach. Instead of viewing Python as a replacement for Excel’s charting features, view it as a supplement—an advanced set of features you can call upon when the situation demands something beyond the usual options.

Ultimately, the integration of Python into Excel marks an exciting milestone. Not because it turns Excel into a cutting-edge visualization platform overnight, but because it removes barriers. It encourages Excel’s massive user base to dip a toe into more programmatic, reproducible workflows. Over time, you can gradually incorporate Python’s libraries as you see fit, without losing the simplicity and comfort of Excel’s built-in charts. In doing so, you expand your analytical toolkit and position yourself to take advantage of whatever improvements—and increased interactivity—may come down the line.

What questions do you have about data visualization with Python in Excel? How do you see the strengths of native Excel charts compared to the new Python in Excel charts? Are there any features you wish Python in Excel offered? Let me know in the comments.

How to understand the differences between matplotlib, seaborn and plotnine for Python in Excel data visualization
https://stringfestanalytics.com/how-to-understand-the-differences-between-matplotlib-seaborn-and-plotnine-for-python-in-excel-data-visualization/
Tue, 10 Dec 2024 17:50:29 +0000

The post How to understand the differences between matplotlib, seaborn and plotnine for Python in Excel data visualization first appeared on Stringfest Analytics.

Data visualization is a fundamental aspect of effective data analysis. And although Excel is a versatile tool, certain visualization types remain difficult or even impossible to create with its built-in features alone. That’s where Python libraries such as Matplotlib, Seaborn, and Plotnine come in. These libraries provide advanced visualization capabilities that can be seamlessly integrated into Excel through Python in Excel.

But how do you decide which library to use? And what are some practical, time-saving techniques for creating visualizations that Excel’s native tools struggle with? This blog post will guide you through the essentials.

Download the exercise file below to follow along:

 

Since we’re covering a variety of visualizations, we’ll be using small, one-off datasets to keep things simple. If you need help applying these concepts to your own work—likely with larger datasets—feel free to ask any questions in the comments. AI can also likely assist in adapting these techniques to fit your needs.

For this post, I assume you have some familiarity with inserting plots with Python in Excel. If that’s not the case, you can check out this post:

Matplotlib

Matplotlib is often called the “grandfather” of Python visualization libraries, and it serves as the foundation for many others. At its core, Matplotlib gives you low-level control over every aspect of a chart. You define what your chart looks like from scratch—everything from axes labels and tick marks to colors, grids, and legends is in your hands.

This level of flexibility comes at the cost of verbosity. Tasks that might take one or two lines in another library may take several in Matplotlib. However, if you’re looking for complete customization, it’s unmatched.

How It Works

Matplotlib operates through a figure-based approach: you first create a figure and an axis (or multiple axes) and then define what to draw on those axes. Think of it as creating a blank canvas and painting on it step by step.

Example 1: Creating a Gantt Chart for Project Timelines

A Gantt chart is a classic business visualization for tracking project progress. In Excel, creating one often involves tedious manual work, but with Matplotlib, it’s straightforward:
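As a rough illustration of the idea, here is a minimal sketch of a Gantt chart in Matplotlib. The task names and dates are made up, and the DataFrame `tasks` is a hypothetical stand-in for data you would pull from your workbook:

```python
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Hypothetical project plan; task names and dates are made up for illustration.
tasks = pd.DataFrame({
    "Task": ["Planning", "Design", "Build", "Testing"],
    "Start": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-02-01", "2024-03-01"]),
    "End": pd.to_datetime(["2024-01-14", "2024-01-31", "2024-02-29", "2024-03-15"]),
})

# Convert start dates to Matplotlib's numeric date format and compute
# each task's length in days, so bar positions and widths are plain numbers.
start_num = mdates.date2num(tasks["Start"].to_numpy())
duration_days = (tasks["End"] - tasks["Start"]).dt.days

fig, ax = plt.subplots(figsize=(8, 3))
# One horizontal bar per task: `left` is where it starts, the width is its duration.
ax.barh(tasks["Task"], duration_days, left=start_num, color="steelblue")
ax.xaxis_date()  # interpret x values as dates
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
ax.invert_yaxis()  # first task at the top
ax.set_xlabel("Date")
ax.set_title("Project Timeline")
fig.tight_layout()
```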

Gantt Chart Matplotlib

Here, you define a horizontal bar chart, setting each bar’s starting position and length to represent tasks’ timelines. The result is clean and precise—ready for inclusion in a project report or presentation.

Example 2: Custom Branding for Sales Trends

Suppose you’re creating an executive report and need visuals that align with your company’s branding. Matplotlib provides powerful customization options, allowing you to tailor colors, fonts, and annotations to match your corporate style. Here’s an example:
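A minimal sketch of what such a chart might look like, assuming made-up monthly sales figures and a hypothetical brand color (`BRAND_COLOR`); swap in your own palette, fonts, and annotations:

```python
import matplotlib.pyplot as plt

# Made-up monthly sales figures and a hypothetical brand color.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 128, 150, 162, 171]
BRAND_COLOR = "#1F4E79"

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(months, sales, color=BRAND_COLOR, linewidth=2.5, marker="o")
ax.set_title("Monthly Sales Trend", fontsize=14, fontweight="bold", color=BRAND_COLOR)
ax.set_ylabel("Sales ($K)")
ax.set_ylim(100, 185)

# Annotate the best month; categorical x positions are 0, 1, 2, ...
peak = sales.index(max(sales))
ax.annotate(
    "Peak month",
    xy=(peak, sales[peak]),
    xytext=(peak - 1.5, sales[peak] + 5),
    arrowprops={"arrowstyle": "->", "color": BRAND_COLOR},
)

# Remove the top and right borders for a cleaner look.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
fig.tight_layout()
```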

This flexibility is especially valuable for creating publication-ready visuals directly in Python.

Monthly sales trend Matplotlib

Seaborn

If Matplotlib is the raw material, Seaborn is the refined product. Built on top of Matplotlib, Seaborn simplifies the process of creating aesthetically pleasing and statistically insightful visualizations. It’s designed to work seamlessly with Pandas DataFrames, which are the default data object in Python in Excel. This makes it a natural choice for business users analyzing data directly from Excel workbooks.

Seaborn automates many of the intricate details of plotting—like adding confidence intervals to line plots or managing subplots—so you can focus on interpreting the results. It excels at identifying relationships and patterns in data, making it especially useful for exploring trends, distributions, and correlations.

How It Works

Seaborn uses a dataset-oriented API, allowing you to pass an entire Pandas DataFrame as input. You simply define which columns represent your data of interest, and Seaborn handles the rest, from aesthetic defaults to the layout of the plot.

Example 1: Correlation Heatmap for Marketing Effectiveness

Heatmaps are one of Seaborn’s standout features. Let’s say you’re trying to understand how marketing spend correlates with ROI and customer engagement. A heatmap makes these relationships clear at a glance:
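A minimal sketch of this kind of heatmap, assuming a small made-up dataset; the column names `Marketing_Spend`, `ROI`, and `Engagement` are hypothetical:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up marketing data; column names are assumptions for illustration.
df = pd.DataFrame({
    "Marketing_Spend": [10, 15, 12, 20, 25, 18],
    "ROI": [1.2, 1.5, 1.3, 1.9, 2.1, 1.6],
    "Engagement": [200, 260, 230, 310, 380, 290],
})

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
# annot=True prints each coefficient in its cell; vmin/vmax pin the color scale to [-1, 1].
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation Matrix")
fig.tight_layout()
```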

Seaborn correlation heatmap

This visualization highlights which variables are strongly correlated, providing actionable insights into which factors drive results.

Example 2: Small Multiples for Sales Trends by Region

If you’re monitoring sales trends across regions, Seaborn’s FacetGrid is an excellent tool for creating small multiples—individual plots for each region displayed in a neatly organized grid.

This approach simplifies the process of comparing sales trends across regions by automating the creation of separate plots for each group. It streamlines your workflow, eliminating the need to generate individual charts manually, and provides an intuitive way to visualize data patterns efficiently.

Seaborn’s tight integration with Pandas DataFrames makes it a natural fit for business users working with Python in Excel. Whether you’re analyzing correlations, exploring multi-dimensional relationships, or creating small multiples, Seaborn helps you extract insights with minimal effort and maximum impact.

Plotnine

Plotnine takes a completely different approach to visualization. It’s inspired by the “grammar of graphics” philosophy, popularized by R’s ggplot2 library. Instead of thinking in terms of “chart types,” Plotnine lets you build visualizations layer by layer, combining data elements like axes, geoms (shapes), and facets to create charts.

This structure is intuitive for users who think in terms of breaking down data relationships and is ideal for layered visualizations or multi-panel plots.

How It Works

In Plotnine, you define a plot by specifying the dataset, the aesthetic mapping (e.g., what the axes represent), and the geoms (the actual chart elements). Each of these components is added incrementally, giving you fine-grained control over the final output.

Example 1: Faceted Chart for Regional Sales Trends

If you want to compare sales trends across regions, Plotnine makes it simple to create side-by-side charts:

Faceted plots like this are perfect for breaking down KPIs by categories like region, product line, or customer segment.

Plotnine small multiples

Example 2: KDE plot for distribution

A KDE plot (Kernel Density Estimate) visualizes the probability density of a continuous variable, smoothing the data to show its distribution without the discrete bins of a histogram. It’s useful for understanding the shape and spread of data or identifying multiple modes.

Here, plotnine builds the KDE plot using a layered approach. The ggplot function specifies the dataset and aesthetics, mapping “Sales” to the x-axis. The geom_density layer computes and plots the density curve, filling it with a semi-transparent blue color. Finally, labs adds a title and axis labels. This modular process combines data, aesthetics, and visual elements into a cohesive plot.

Plotnine density plot

Conclusion

Matplotlib, Seaborn, and Plotnine each bring distinct advantages to data visualization, especially for Python users working in Excel. Matplotlib stands out for its extensive customization options, making it ideal for specialized tasks like Gantt charts or branded visuals. However, this flexibility often comes with a steeper learning curve. The table below summarizes the key pros, cons, and differences among these three libraries.

Seaborn, with its seamless integration with Pandas DataFrames, simplifies the creation of insightful statistical visualizations, making it a natural choice for uncovering patterns and relationships with minimal effort. Plotnine’s structured, grammar-of-graphics approach excels at layered and faceted visualizations like small multiples or comparisons but may take some adjustment for users accustomed to Excel’s interface.

These libraries complement Excel’s capabilities, allowing you to overcome its limitations and tell more impactful data stories. The best choice depends on your specific needs—whether it’s control, simplicity, or advanced layering.

Exploring these tools can feel overwhelming at first, but they open up a world of possibilities for enhancing your data visualizations. If you’re unsure which library to start with or how to incorporate them into your workflows, feel free to ask in the comments. I’d also love to hear about your experiences using Matplotlib, Seaborn, or Plotnine—what challenges have you faced, and how have these tools helped? Let me know below.

Python in Excel: Creating layered and faceted visualizations with Plotnine
https://stringfestanalytics.com/python-in-excel-creating-layered-and-faceted-visualizations-with-plotnine/
Mon, 09 Dec 2024 18:23:25 +0000

The post Python in Excel: Creating layered and faceted visualizations with Plotnine first appeared on Stringfest Analytics.

In a previous post, we explored the plotnine package in Python in Excel, highlighting how its implementation of the Grammar of Graphics allows for an almost limitless array of possibilities in creating any plot imaginable:

In this post, we’ll focus on how to craft layered (i.e., multiple elements in a single plot) and faceted (i.e., small multiples) plots. These capabilities are particular strengths of Plotnine and represent areas where typical Excel users can achieve significant gains. You can follow along with the exercise file below:

 

Layered regression plot

First, let’s analyze the relationships between body mass and flipper length by species, using a regression line for each group. Plotting separate regression lines for each species provides a more accurate representation of the relationship between flipper length and body mass within each group. In contrast, a single regression line across all species could obscure these distinctions and misrepresent the trends specific to individual species.

In the following code, we add a linear regression fit line for each group by including method='lm' in the geom_smooth() function, which is used to create the smoothing line geometry. The key to breaking down and layering the plot by species lies in assigning the color aesthetic to species in the overall plot aesthetics.

Layered regression plot

Faceted boxplot

Next, we’ll create a faceted boxplot. Faceting refers to creating small multiples—a series of similar plots displayed side by side, each showing a subset of the data. In this example, each subplot, or facet, corresponds to a different species, allowing for easy comparison of bill length by sex across species.

In Plotnine, the function facet_wrap('~species') creates a separate panel for each unique value of the species variable. The ~ symbol is shorthand for “by,” so ~species can be interpreted as “facet by species.” The facet_wrap() function arranges these small multiples in a grid, with each panel displaying data for a single species. By default, Plotnine determines the layout based on the number of facets.

Faceted boxplot plotnine

Faceting provides several advantages. By segmenting the data into smaller, comparable groups, it becomes easier to identify differences and similarities across categories. This approach avoids clutter by organizing the data into separate, tidy visualizations, ensuring clarity. Each subplot isolates trends within a single group, making them more apparent and eliminating distractions from other categories.

Layered histogram and density

Next, we’ll create a plot that combines a histogram and a density plot to display the distribution of body mass for different penguin species. Adding both a histogram and a density plot to the same visualization can be incredibly helpful for understanding the data. The histogram provides an intuitive, binned representation of the frequency of data points, while the density plot overlays a smoothed curve that represents the underlying distribution of the data. Together, they offer a comprehensive view: the histogram reveals specific counts within intervals, and the density plot highlights overall trends and patterns that might not be as obvious in the histogram alone.

In this code, the aes(y='..density..') in the geom_histogram() layer ensures the histogram is scaled to match the density plot. Instead of showing raw counts, the histogram is normalized to display density, so that each group's bars have a total area of 1, making it directly comparable to the density curve. The position='identity' parameter ensures that the histogram bars for each species are overlaid on top of one another, rather than being stacked, so their individual contributions to the overall distribution are clear. The alpha parameter, which controls the transparency of the layers, is particularly useful here. By setting alpha=0.3 for the histogram and alpha=0.5 for the density plot, you can see overlapping elements without any one layer obscuring the others, allowing for better visual interpretation.

layered histogram density Plotnine

Faceted regression plot

In an earlier example, we created a layered regression plot, where all species shared the same axes and data points were differentiated by color. This provided a unified view of the data, allowing for direct comparison of relationships across groups in a single plot.

Faceted regression plotnine

This example, by contrast, uses faceting, splitting the plot into separate panels for each species. This is achieved with the facet_wrap('~species') function, which creates a grid of plots, each focusing exclusively on one species. The ~ symbol specifies faceting “by” species, making it the key line of code that transitions the plot from layered to faceted.

Layering and faceting each have their pros and cons in this context. A layered plot allows for a direct comparison of regression trends across groups on the same axes, making it easier to detect overarching patterns or differences in slopes. However, it can become visually cluttered, especially if the groups overlap or have very different scales, making individual trends harder to discern.

In contrast, faceting separates the data into individual panels, reducing clutter and providing a clearer view of each group’s regression trend. This makes it easier to focus on within-group relationships without distraction from other groups. The downside is that faceting requires viewers to compare across separate panels, which can make subtle differences between groups less obvious.

Faceted density plot

Last but not least, we’ll create a faceted density plot that visualizes the distribution of penguins’ body mass across different islands, with the density curves differentiated by species. The mechanics of the code combine density plots and faceting to highlight patterns within the data effectively.

By using faceting, the plot separates the data by island, making it easy to identify location-specific patterns in body mass distributions. This avoids the visual clutter that might occur in a single, layered plot, especially if there are substantial differences between islands. The use of transparency with alpha ensures that overlapping curves are distinguishable, enhancing the interpretability of the plot without sacrificing detail.

Faceted density plotnine

Conclusion

Plotnine’s Grammar of Graphics framework makes layered and faceted plots standout features for Python in Excel. By breaking visualizations into reusable components, Plotnine simplifies the process of creating advanced charts that seamlessly combine trends, comparisons, and distributions—all within a single workflow. These plots not only improve efficiency but also enhance the clarity and depth of your data storytelling, making them invaluable for uncovering insights and presenting findings effectively.

What questions do you have about layered and faceted plots in Plotnine specifically, or about data visualization with Python in Excel more broadly? Let me know in the comments.

Python in Excel: How to understand the plotnine package
https://stringfestanalytics.com/python-in-excel-how-to-understand-the-plotnine-package/
Mon, 09 Dec 2024 16:42:07 +0000

The post Python in Excel: How to understand the plotnine package first appeared on Stringfest Analytics.

Python in Excel offers powerful tools for creating impactful visualizations, and one standout option is the plotnine package.

In this post, we’ll delve into the origins and philosophy of plotnine, explore its core building blocks, and demonstrate some easy and impressive visualizations you can create—perfect for Excel users looking to enhance their data storytelling. To follow along, download the exercise file below:

 

The building blocks of Plotnine

Plotnine, inspired by the ggplot2 library in R, is a Python library built on the Grammar of Graphics philosophy. This approach simplifies visualization by breaking down the process into clear, reusable components that define what you want to plot and how you want to display it. Here are the key building blocks of Plotnine:

Data

The foundation of any Plotnine visualization is your dataset, typically a Pandas DataFrame. The data is mapped to visual elements (like points or lines) using aesthetics, such as x, y, color, and size. This close integration with DataFrames makes Plotnine ideal for Excel users who already organize their data in tabular formats.

Aesthetic Mappings (aes)

The aes() function defines how variables in your dataset are mapped to visual properties of the plot. For instance, you might map one column to the x-axis, another to the y-axis, and yet another to the color or size of the points. This is akin to choosing which fields to visualize when creating an Excel chart. Examples:

  • Assigning colors based on a categorical variable (color=category_column)
  • Mapping a column to the x-axis (x=column_name)

Geometric Objects (geoms)

Geometric objects define the type of plot you want to create—scatterplots, line charts, bar charts, etc. Each geom_ function represents a different visual layer. For instance:

  • geom_point(): Scatterplot
  • geom_line(): Line chart
  • geom_bar(): Bar chart

You can layer multiple geoms on the same plot to add complexity, such as adding trendlines or annotations.

Plotnine, like ggplot2, allows for a variety of additional layers to enhance your plots, including scales, themes, and statistical transformations. A comprehensive overview of these can be found on the Quebec Centre for Biodiversity Science website. While this resource is focused on ggplot2 in R, the principles directly translate to plotnine in Python.

Grammar of graphics layers

Plotnine ports the ggplot2 package to Python by implementing a similar Grammar of Graphics framework. It provides a Pythonic API that mirrors ggplot2’s syntax, while integrating seamlessly with Pandas for data manipulation. This layered structure allows users to systematically combine data, aesthetic mappings, and geometric objects, making it easy to create sophisticated and customizable visualizations.

For Excel users, plotnine offers a bridge to powerful, scriptable data visualization, leveraging familiar concepts in a more flexible programming environment. By learning plotnine, you can unlock new ways to present and analyze your data with clarity and impact.

Data and package import

Let’s start by loading two datasets into Python in Excel. I’ll also import various components from plotnine to craft and customize visualizations. Each import, like geom_point or theme_minimal, adds specific plotting or styling features, enabling the creation of complex plots.

Data import Plotnine

Let’s dive in! We’ll explore a variety of common plots used in data analysis, ranging from those that are relatively straightforward to create in base Excel to others that may require a bit more effort.

Scatterplot

Let’s start with a scatterplot. The aes() function defines the plot’s aesthetics: x maps to the “Advertising_Spend” column, y maps to “Monthly_Sales,” and color assigns different colors to points based on the “Store” column, enabling visual comparisons across stores. The geom_point() function adds the data points, while labs() customizes the plot’s title and axis labels.

Although creating a scatterplot in Excel is straightforward, conditionally coloring points by a category, like “Store,” can be cumbersome. In contrast, with the grammar of graphics, it’s as simple as mapping “Store” to color in the aesthetics.

Line plot

Now let’s create a line plot. This time, we use both geom_line and geom_point to add lines connecting the data points and to highlight each individual point on the plot. We also apply theme_classic for a clean, minimalist chart style. As before, we’ll add chart and axis labels to keep the visualization clear and informative.

If you’d like to customize features like line colors, point styles, or other elements, the grammar of graphics makes these adjustments simple and systematic.

KDE plot

Next, let’s use a Kernel Density Estimate (KDE) plot to visualize the distribution of advertising spend in the dataset. Like a histogram, a KDE plot illustrates the distribution of a variable, but instead of grouping data into discrete bars, it uses a smooth curve to estimate the probability density, offering a more continuous view of the data.

In this code, geom_density() generates the KDE plot and fills it with a semi-transparent purple (fill='purple', alpha=0.5) to enhance visibility.

Violin plot

Next, let’s use a violin plot to visualize the distribution of customer ratings by store. A violin plot is similar to a boxplot in that it displays the range, center, and spread of a dataset, but it also incorporates a Kernel Density Estimate (KDE) to show the full distribution shape on either side of the plot, making it easier to see patterns like multimodality or skewness.

In this code, geom_violin() creates the violin plot, with draw_quantiles=[0.5] adding a line to represent the median of each distribution. The fill='Store' aesthetic colors each violin by store, and alpha=0.7 makes the fill semi-transparent for better visual appeal.

Bar plot

Last but not least, let’s explore the familiar bar plot. Here, we visualize the average monthly sales for each store. The data is first preprocessed by calculating the mean monthly sales for each store using groupby and mean(), then sorted in descending order of sales for clarity.

Using the preprocessed data, the ggplot function defines the plot, and geom_bar(stat='identity') creates the bar plot, displaying exact sales values rather than counts. For visual enhancement, scale_fill_brewer() applies a sequential blue color palette, with the order reversed using direction=-1. The theme_classic() function ensures a clean and simple design, while theme(axis_text_x=element_text(rotation=45, hjust=1)) rotates the x-axis labels for improved readability.

Conclusion

Plotnine brings the power of ggplot2’s Grammar of Graphics into Python, offering Excel users a way to create visualizations that transcend the boundaries of traditional charting tools. Whether you’re crafting scatterplots, violin plots, or more complex visuals, plotnine’s flexibility and depth make it easy to transform raw data into clear, compelling stories.

While mastering the Grammar of Graphics requires an investment of time, the rewards are well worth it. By understanding its building blocks, you unlock a versatile system of interchangeable parts, enabling you to design virtually any visualization you can imagine—all while working seamlessly alongside Excel.

Have questions about using plotnine or data visualization with Python in Excel? Drop them in the comments, and happy plotting!
