tables - Stringfest Analytics https://stringfestanalytics.com Analytics & AI for Modern Excel Tue, 26 Mar 2024 18:39:54 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.4 https://i0.wp.com/stringfestanalytics.com/wp-content/uploads/2020/05/cropped-RGB-SEAL-LOGO-STRINGFEST-01.png?fit=32%2C32&ssl=1 tables - Stringfest Analytics https://stringfestanalytics.com 32 32 98759290 How to create named ranges and tables in Excel with openpyxl https://stringfestanalytics.com/how-to-create-named-ranges-and-tables-in-excel-with-openpyxl/ Tue, 26 Mar 2024 18:39:51 +0000 https://stringfestanalytics.com/?p=13105 Working with Python code becomes significantly more manageable when objects are assigned clear, sensible names, and Excel is no different. This blog post covers how to create named ranges and tables in Excel using openpyxl. To begin this demonstration, we will import several modules from openpyxl, along with a sample of rows from the penguins […]

The post How to create named ranges and tables in Excel with openpyxl first appeared on Stringfest Analytics.

]]>
Working with Python code becomes significantly more manageable when objects are assigned clear, sensible names, and Excel is no different. This blog post covers how to create named ranges and tables in Excel using openpyxl.

To begin this demonstration, we will import several modules from openpyxl, along with a sample of rows from the penguins dataset available in the seaborn package.

import openpyxl
from openpyxl.worksheet.table import Table, TableStyleInfo
from openpyxl.utils import get_column_letter
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl import Workbook
wb = Workbook()

import seaborn as sns
penguins = sns.load_dataset('penguins')
penguins = penguins.sample(frac=.05, random_state=1234)
penguins.head()
	species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex
64	Adelie	Biscoe	36.4	17.1	184.0	2850.0	Female
237	Gentoo	Biscoe	49.2	15.2	221.0	6300.0	Male
115	Adelie	Biscoe	42.7	18.3	196.0	4075.0	Male
97	Adelie	Dream	40.3	18.5	196.0	4350.0	Male
37	Adelie	Dream	42.2	18.5	180.0	3550.0	Female

Creating named ranges with openpyxl

The first task is to compile a list of all unique values in the penguins dataset’s species column and export them to an Excel range. If your Python source data is stored as a list, this is straightforward.

The code, penguins['species'].unique().tolist(), retrieves the unique species values and converts them from a numpy array to a list:

species = penguins['species'].unique().tolist()
species
['Adelie', 'Gentoo', 'Chinstrap']

Next, we’ll transfer this data to Excel. Create a worksheet named ws1 and add the species list using ws1.append(species). Following that, define a range in openpyxl with specific code to name the new range in Excel and identify the cells it encompasses:

ws1 = wb.create_sheet('species')
ws1.append(species)

new_range = openpyxl.workbook.defined_name.DefinedName('species', attr_text='species!$A$1:$C$1')
wb.defined_names.append(new_range)

With the range defined in openpyxl, it’s time to add it to the workbook using wb.defined_names.append(new_range).

Creating named tables with openpyxl

Named ranges are incredibly useful for managing one-dimensional data, such as lists.

However, for datasets that include rows, columns, and headers, storing your data in an Excel table is more appropriate. Fortunately, creating tables is also possible with openpyxl.

First, create a new worksheet named penguins. To transfer the penguins DataFrame to this worksheet, use a loop with the dataframe_to_rows() function.

ws2 = wb.create_sheet('penguins')

for r in dataframe_to_rows(penguins, index=False, header=True):
    ws2.append(r)

I’ve chosen to omit the index column and include the header row by setting the index to False and the header to True.

After adding the DataFrame to the worksheet, like with the range, you must define a table object and designate the cell range it covers. Tables offer style options, so begin by setting the style with TableStyleInfo(name='TableStyleMedium9', showRowStripes=True). There are numerous ways to style a table, so feel free to explore and consult the documentation for additional information.

style = TableStyleInfo(name="TableStyleMedium9", showRowStripes=True)

Next, define the table with the table function, naming it “penguins” in Excel and specifying its cell range. This can be achieved with precise openpyxl references to include as many active rows and columns as present in the worksheet.

table = Table(displayName="penguins", 
              ref="A1:" + get_column_letter(ws2.max_column) + str(ws2.max_row))

Once the table is defined, apply the chosen style to it.

table.tableStyleInfo = style

Finally, add the table to the penguins worksheet with ws2.add_table(table) and save the workbook as ranges-tables.xlsx:

ws2.add_table(table)
wb.save('ranges-tables.xlsx')

Open the resulting workbook in Excel, navigate to Formulas on the home ribbon, and open the Name Manager. You’ll find the table and range ready for use..

Name manager results from openpyxl

If you haven’t been utilizing ranges and tables in Excel, the ability to quickly create them via openpyxl should provide ample motivation. This approach lays the groundwork for combining pandas with openpyxl to produce comprehensive analyses and reports.

What questions do you have about working with named ranges, tables, or other Excel entities using openpyxl? Feel free to share your inquiries in the comments. Additionally, you might find my LinkedIn Learning course helpful for an introduction to automating Excel tasks with Python.

The post How to create named ranges and tables in Excel with openpyxl first appeared on Stringfest Analytics.

]]>
13105
Copilot for Excel: How to format data https://stringfestanalytics.com/copilot-for-excel-how-to-format-data/ Thu, 21 Mar 2024 19:28:42 +0000 https://stringfestanalytics.com/?p=13040 Among the numerous exciting features that Copilot is set to introduce for Excel, one significant requirement is that Excel data must be organized in tables. This is crucial because Copilot will not function unless the data is formatted in this manner. Excel tables provide numerous advantages, including their aesthetic appeal. With just a few clicks, […]

The post Copilot for Excel: How to format data first appeared on Stringfest Analytics.

]]>
Among the numerous exciting features that Copilot is set to introduce for Excel, one significant requirement is that Excel data must be organized in tables. This is crucial because Copilot will not function unless the data is formatted in this manner.

Excel tables provide numerous advantages, including their aesthetic appeal. With just a few clicks, you can transform your dataset into a beautifully formatted table featuring banded rows, filter buttons, and more. Copilot enhances this experience, making it remarkably simple to modify your table’s appearance after creation.

Let’s practice with a small dataset that contains basic information about the five boroughs of New York City. Follow along by downloading the exercise file provided.

 

Ensure this file is properly configured to work with Copilot before proceeding.

The first aspect that immediately catches one’s attention regarding the dataset is the formatting of the population column. It lacks commas as thousands separators, which makes it difficult to read. To fix this, I plan to request Copilot to format the population column by incorporating commas as thousands separators, without any decimal points.

Format population column Copilot

This is much easier to read. Let’s see what else we can tweak.

Given that this is New York City data, I’ll format the text using the “Broadway” font. While I wouldn’t typically recommend this font for data display, it’s OK to have a little fun with fonts while learning. Generally, a simpler, sans-serif font like Segoe UI or Roboto works best for clear, readable tables.

Change table font in Copilot

For now, I will undo this change for readability with Ctrl + Z.

Next, I’ll enhance the appearance of the table’s headers and footers. First, I’m adding a filter button to the dataset. Although it’s relatively small and can be easily inspected visually, it might still be beneficial to sort the data by population or land area. So, I’m planning to have Copilot incorporate the filter button into the dataset.

Add filter buttons Copilot

By default, Excel tables don’t come with totals included. However, incorporating them using Copilot is a straightforward process. Copilot can sum the columns in the dataset to create a total row. Unfortunately, at the moment, Copilot isn’t particularly effective in modifying the total row. The most reliable method continues to be the use of the dropdown menu available in each total cell.

Add total row Copilot

There are a variety of formatting options you might consider, including changing colors, adding borders, and more. While Copilot excels in some areas, others may require manual intervention.

For seasoned Excel users, these tips may seem basic. You’re likely already familiar with applying formatting changes through keyboard shortcuts or menu options, which can be faster than using Copilot.

However, Copilot’s real value lies in its intuitive understanding and execution of your commands, no matter how complex. The goal is to streamline your workflow by simply typing out your requirements. Starting with formatting might appear basic, but given its importance in data analysis, it’s an excellent starting point.

Do you have any questions about formatting your data with Copilot? Or are you interested in delving into more advanced techniques? Feel free to share your thoughts or questions in the comments.

The post Copilot for Excel: How to format data first appeared on Stringfest Analytics.

]]>
13040
How to calculate proportions in Excel with the PERCENTOF() function https://stringfestanalytics.com/how-to-calculate-proportions-in-excel-with-the-percentof-function/ https://stringfestanalytics.com/how-to-calculate-proportions-in-excel-with-the-percentof-function/#comments Sat, 17 Feb 2024 21:59:40 +0000 https://stringfestanalytics.com/?p=12576 One of the more recent additions to Excel which I haven’t seen receive much coverage is the PERCENTOF() function. This function calculates the percentage that a subset represents of a given dataset. Although the concept is relatively straightforward, the goal of this post is to explore a method to enhance the readability of its output […]

The post How to calculate proportions in Excel with the PERCENTOF() function first appeared on Stringfest Analytics.

]]>
One of the more recent additions to Excel which I haven’t seen receive much coverage is the PERCENTOF() function. This function calculates the percentage that a subset represents of a given dataset. Although the concept is relatively straightforward, the goal of this post is to explore a method to enhance the readability of its output for users. Please refer to the accompanying exercise file to follow along:

 

In this example, we are given the frequencies of people’s preferred colors from a set of options. While the raw numbers provide initial insights, converting these numbers into proportions or percentages facilitates easier comparison between different groups or categories.

This normalization process aids in understanding the relative significance of the numbers, regardless of total quantities involved. Moreover, it simplifies the communication and comprehension of data by converting it into a standardized format, enhancing its accessibility across various contexts.

There are several methods to calculate the percentage of a total in Excel, among which the PERCENTOF() function is now likely the most straightforward with its recent introduction. Let’s begin with the basics.

In our first example, we aim to determine the proportion of votes for each color.

The PERCENTOF() function is relatively simple to use. We select a subset of data (usually a single cell), followed by the entire data set, and PERCENTOF() efficiently calculates the percentage:

One-way proportion table

That was simple enough.

Now, let’s proceed to the next worksheet in the workbook, titled two-way. This dataset breaks down the color by sex, distinguishing between male and female participants. Analyzing a “two-way” proportion breakdown of this dataset is not significantly more complicated than the first example; we simply need to reference both the Male and Female columns in the source data.

Two way proportion table with PERCENTOF

This is going great, but I have one significant concern which pertains to the formatting. Specifically, I would prefer if the percentages were placed immediately next to the raw figures, rather than be placed together toward the end of the table.

An attempt to simply cut and paste the percentages next to the figures might seem straightforward, but it will lead to a circular reference issue, as the percentage column will be erroneously included as part of the raw data.

This issue underscores a limitation with Excel’s structured references, which do not allow for referencing nonconsecutive columns, thereby causing complications.

To navigate around this problem, we can employ a workaround using a named range. Essentially, we can create a named range that encompasses both the Male and Female columns, and then utilize this range as the input for the PERCENTOF() function.

To proceed with this solution, navigate to the Formulas tab on the ribbon, then click on Name Manager. Choose “New” to establish a new range. Assign a name to this range, such as male_female, and then specify the Male and Female columns from the two-way table in the following manner: =colors_two_way[Male], colors_two_way[Female].

Creating a nonadjacent named range

Now you can arrange the columns in the desired order, ensuring that the proportions are directly adjacent to the raw counts. Modify your PERCENTOF() calculations so that the named range for data_all is updated to male_female. This adjustment will enable you to calculate proportions using a non-adjacent range within the table.

Percentof with nonadjacent range in Excel table

Your calculations will update automatically if you add or remove rows from the table. Additionally, you can rename the source columns, and the named range will adjust dynamically.

Do you have any questions about deriving proportions with the PERCENTOF() function in Excel? Please share them in the comments.

The post How to calculate proportions in Excel with the PERCENTOF() function first appeared on Stringfest Analytics.

]]>
https://stringfestanalytics.com/how-to-calculate-proportions-in-excel-with-the-percentof-function/feed/ 2 12576
How to select two noncontiguous columns in an Excel table https://stringfestanalytics.com/how-to-select-two-noncontiguous-columns-in-an-excel-table/ Fri, 16 Feb 2024 22:30:01 +0000 https://stringfestanalytics.com/?p=12589 Excel tables offer several benefits for data storage and manipulation, one of the most significant being the ability to use structured references to programmatically select different elements of a dataset. Consider, for instance, this product launch dataset. You can obtain your own copy by downloading the exercise file provided below: When the columns you wish […]

The post How to select two noncontiguous columns in an Excel table first appeared on Stringfest Analytics.

]]>
Excel tables offer several benefits for data storage and manipulation, one of the most significant being the ability to use structured references to programmatically select different elements of a dataset.

Consider, for instance, this product launch dataset. You can obtain your own copy by downloading the exercise file provided below:

When the columns you wish to analyze or report on in Excel are adjacent, utilizing Excel tables simplifies the process. All you need to do is use the colon (:) operator to define the start and end of the column selection range:

Refer to consecutive columns Excel table

We have the opportunity to apply various manipulations to this selection, including utilizing the FILTER() function for refinement:

Filtering consecutive columns

When it comes to selecting nonconsecutive columns from an Excel table, the process becomes a bit more complex. Despite the utility of structured references, they fall short in this particular scenario.

However, we can devise our own solution by leveraging a few Excel functions. This approach allows for a customized method to achieve the desired outcome, circumventing the limitations of structured references.

Choosing columns by position with CHOOSECOLS()

The initial step in implementing this workaround involves utilizing the CHOOSECOLS() function. This function is designed to accept an array along with a specified set of column numbers. Once these inputs are provided, Excel will proceed to return the outcomes corresponding to those specified columns.

Basic example of CHOOSECOLS

Choosing columns by index number with MATCH()

While manually selecting columns by index number is technically feasible, enhancing user-friendliness involves specifying the desired columns in a list. This approach allows the utilization of the MATCH() function to ascertain the index number of each specified column, thereby facilitating the retrieval of corresponding data:

Using CHOOSECOLS with MATCH

We can now even combine this with FILTER() to get selected rows and columns:

Combining FILTER, MATCH and CHOOSECOLS

Unfortunately, it’s disappointing that we cannot utilize structured references for noncontiguous columns, which would have offered a more flexible solution. However, the combination of functions provided does serve as an adequate alternative for most cases.

Do you have any questions about handling nonconsecutive columns in an Excel table? Feel free to share them in the comments below.

The post How to select two noncontiguous columns in an Excel table first appeared on Stringfest Analytics.

]]>
12589
How to understand data headers in Python in Excel https://stringfestanalytics.com/how-to-understand-data-headers-in-python-in-excel/ https://stringfestanalytics.com/how-to-understand-data-headers-in-python-in-excel/#comments Tue, 28 Nov 2023 21:27:39 +0000 https://stringfestanalytics.com/?p=12092 If you’re familiar with using tables in Excel, you might know that table headers are structurally different from the data rows. These headers require distinct code for manipulation, setting them apart from the rest of the data. This concept of separation between a dataset’s headers and its data isn’t unique to Excel; it’s also a […]

The post How to understand data headers in Python in Excel first appeared on Stringfest Analytics.

]]>
If you’re familiar with using tables in Excel, you might know that table headers are structurally different from the data rows. These headers require distinct code for manipulation, setting them apart from the rest of the data.

This concept of separation between a dataset’s headers and its data isn’t unique to Excel; it’s also a key feature of Pandas DataFrames in Python. Pandas DataFrames are the cornerstone for data handling in Python, much like tables are in Excel.

This post aims to explore the subtleties of header management, particularly focusing on how Python interacts with Excel headers.

To get the most out of this guide, please download the provided exercise file:

Python in Excel is optimized for DataFrames

In our example, we’re working with a dataset that contains text-based information about the top five all-time leading scorers in the NBA. To efficiently manage and analyze this data in Python, we’ll assign these cells to an object named players. By default, when importing data into Python from Excel, it is stored as a DataFrame:

Players dataframe with no header

DataFrames in Python are specially designed to handle tabular data, which coincidentally aligns with the typical data format used in Excel.

Recognizing this synergy, Python’s integration with Excel automatically converts input ranges into DataFrames. While this feature is generally efficient, it’s not without its quirks.

Consider our current scenario: we attempt to access column names in our players DataFrame using its columns attribute. Ideally, we expect to see column headers like ‘name_first’, ‘name_last’, etc.

Instead, we’re met with numeric indices: 0, 1, 2, and so forth. This outcome signals that Python, when interfacing with Excel, didn’t recognize the first row of our dataset as a header.

Players columns are not numbered

Why does this happen? The issue mirrors a common challenge in Excel itself, where tables sometimes fail to identify the first row as headers. In our case, the entire dataset comprises text data. Without a distinct format or marker, Excel doesn’t automatically distinguish the first row as containing header information.

Adjusting for source headers with xl()

Earlier, we discussed how Python interfaces with Excel data, particularly through the xl() function. This function is pivotal in converting Excel ranges or other inputs into objects that Python can work with. While xl() isn’t highly customizable, it does offer a crucial headers parameter.

By default, this parameter is set to False, which means Python doesn’t treat the first row of the Excel range as headers.

To address our earlier issue where Python failed to recognize the first row as headers, we can adjust the headers parameter within the xl() function. Setting headers=True explicitly informs Python that the first row of our Excel data should be treated as the header row. This simple yet effective change can significantly alter how Python interprets the data.

Upon implementing this adjustment, you’ll notice a marked difference. Instead of the DataFrame columns being numerically indexed as 0, 1, and so forth, they are now appropriately labeled with the actual column names as defined in the first row of the Excel sheet.

This change enhances the readability and usability of the DataFrame, aligning it more closely with our expectations and the structure of the original Excel data.

Yet another reason to use Excel tables

A common dilemma in Excel, particularly when determining if the first row should be treated as headers, often arises with Excel tables. To proactively address this issue when working with Python in Excel, consider using an Excel table instead of raw data ranges. This method provides a clear indication of header rows right from the start.

Here’s how to transform your data into an Excel table, ensuring that headers are correctly identified:

  1. Begin by selecting any cell within your player data.
  2. Navigate to the ‘Insert’ menu and select ‘Table’.
  3. Upon doing this, Excel will prompt you with a confirmation dialog, asking if the first row of your data contains headers. Make sure to affirm this by clicking the option that indicates the first row is indeed the header row.
Confirm data has headers for table

By converting your data into an Excel table and explicitly acknowledging the first row as headers, you’re essentially “nipping the issue in the bud.” This approach significantly improves the data’s structure and clarity, ensuring that when this table is subsequently imported into Python, the headers are accurately recognized and treated appropriately.

Now, when you point to this tabled data source, the xl() function will instantly set headers to True, as you have asserted this to be the situation by storing the data in an Excel table:

Python in Excel dataframe from table

By opting to store your data in an Excel table before importing it into Python, you unlock several key benefits. One significant advantage is the dynamic nature of tables: any expansion in the source data, like adding new rows or columns, automatically reflects in the Python environment.

This dynamic update is a stark contrast to using a fixed alphanumeric range (e.g., A1:D6), which doesn’t automatically adjust to changes in the data size.

Excel tables offer a more flexible and responsive approach to data management, ensuring that your Python DataFrames remain up-to-date with any modifications made in the Excel source. This capability is particularly valuable in data analysis and processing, where data sets are often subject to changes and expansions.

You can get a completed file for this exercise below:

I’m eager to hear your thoughts and questions about this integration between Python and Excel, specifically regarding header management. Do you now see how leveraging tables can significantly enhance your experience with headers and DataFrames? Share your insights or queries in the comments section. I’m here to assist and clarify any aspect of this process.

The post How to understand data headers in Python in Excel first appeared on Stringfest Analytics.

]]>
https://stringfestanalytics.com/how-to-understand-data-headers-in-python-in-excel/feed/ 1 12092
How to get the most of Copilot for Excel (three Excel features you need to know) https://stringfestanalytics.com/how-to-get-the-most-of-copilot-for-excel-three-excel-features-you-need-to-know/ https://stringfestanalytics.com/how-to-get-the-most-of-copilot-for-excel-three-excel-features-you-need-to-know/#comments Thu, 02 Nov 2023 18:29:10 +0000 https://stringfestanalytics.com/?p=11880 Copilot for Excel is now live, bringing a suite of AI-powered features directly into your Excel experience. It’s time to embrace the future of data analysis and ensure you’re fully prepared to utilize Excel to its fullest potential. Here are three key features that are integral to maximizing your AI-driven insights with Copilot for Excel […]

The post How to get the most of Copilot for Excel (three Excel features you need to know) first appeared on Stringfest Analytics.

]]>
Copilot for Excel is now live, bringing a suite of AI-powered features directly into your Excel experience. It’s time to embrace the future of data analysis and ensure you’re fully prepared to utilize Excel to its fullest potential.

Here are three key features that are integral to maximizing your AI-driven insights with Copilot for Excel

Excel tables

Your data should already be structured in Excel tables. Copilot for Excel is specifically engineered to synergize with table-formatted data. This format is crucial for leveraging the AI’s ability to generate formulas and functions automatically. To tap into the precision and efficiency Copilot offers, your data must be in tables.

PivotTables and Charts

Copilot for Excel doesn’t just output raw data; it utilizes PivotTables and Charts to communicate its analysis. Familiarize yourself now with these tools to modify, understand, and update them effectively. They are your portal to infuse Copilot’s data with your unique insights and expertise, making the AI’s findings not just precise but also customized for your use cases.

Power Query

The strength of AI tools, including Copilot for Excel, lies in their ability to process clean and orderly data. Power Query is your indispensable companion for ensuring your data is in the best shape possible. Hone your skills in connecting, transforming, and loading data with Power Query. In particular, focus on the Unpivot function, which is essential for reformatting complex nested tables into a more comprehensible structure.

Those familiar with OpenAI’s Advanced Data Analysis or Excel’s existing Analyze Data feature know the revolutionary impact of AI on data interpretation.

Don’t be caught unprepared! Arm yourself with these Excel capabilities, and fully exploit the capabilities of Copilot for Excel.

To learn more about modernizing your approach to data analytics in Excel, check out my forthcoming book, Modern Data Analytics in Excel:

The post How to get the most of Copilot for Excel (three Excel features you need to know) first appeared on Stringfest Analytics.

]]>
https://stringfestanalytics.com/how-to-get-the-most-of-copilot-for-excel-three-excel-features-you-need-to-know/feed/ 2 11880
How to use structured references in Excel tables https://stringfestanalytics.com/how-to-use-structured-references-in-excel-tables/ Sun, 19 Feb 2023 16:11:31 +0000 https://stringfestanalytics.com/?p=10623 In a previous blog post I shared some reasons for storing data in Excel tables and why to get started now. If you’ve made the leap to tables, perhaps you’re perplexed by the format of their formulas. This post on these so-called structured references is for you. Let’s demonstrate on a small dataset about the […]

The post How to use structured references in Excel tables first appeared on Stringfest Analytics.

]]>
In a previous blog post I shared some reasons for storing data in Excel tables and why to get started now. If you’ve made the leap to tables, perhaps you’re perplexed by the format of their formulas. This post on these so-called structured references is for you.

Let’s demonstrate on a small dataset about the five boroughs of New York City:

 

Structured versus unstructured references

Unstructured references change by location

Think of the typical Excel formula and you’re probably imagining SUM(A1:A4), B5/E5 and so forth. These are known as unstructured references because they point to a location in the workbook, which could change. For example, we could take the sum of A1:C1 below, but add any row or columns and that formula changes.

 

Unstructured references have their place, but it’s worth keeping in mind that this property makes them unstable.

Structured references are based on fixed names

In contrast, structured references refer to specific named ranges in the workbook. These references do not change based on the location of the data.

For example, I’ve converted A1:C1 into a named range called my_range. Now, no matter where this range is located, references to it remain the same.

Named range structured reference example

Structured references in Excel tables

As a two-dimensional named range, Excel tables similarly take structured references. Typically you’ll see these expressed as aggregations (grand totals, averages, etc) or calculated columns. Let’s look at each.

Aggregating with structured references

Taking grand totals or averages are examples of aggregation. We can refer to a column in aggregation using the following structured reference notation:

table_name[column_name]

Let’s combine this notation with SUM() to find the total population of New York.

This table is named nyc. You can confirm that with the Name Manager. We can find the total of the population column like so:

=SUM(nyc[population])

 

If you’re used to writing classic formulas like =SUM(B2:B6), this notation can take some getting used to. But in the long run, the stability and readability of structured references make them hard to beat.

Creating calculated columns

Now let’s take a look at creating calculated columns. For example, we could divide population by area to get a population density column.

To get started, type density in cell D1 and you’ll see a new column automatically gets inserted into your table. Pretty nifty! Next, press = in cell D2 to start the typical formula, but instead of typing in your cell references, hit the left arrow key to grab the relevant columns:

Calculate columns via arrows and structured references

Once you hit Enter, this calculation will automatically apply to the entire column.

This makes the structured reference for creating calculated columns:

[@column_name]

Staying structured

There are a number of other structured references for Excel tables, such as how to pull the header names from a table, its totals and more. For now, focus on these basics and don’t abandon them. It can be tempting to revert back to the familiar, unstructured references. Sticking with structured, however, is the short-term pain, long-term gain you need to succeed as an analyst.

What questions to you have about structured referenes in tables? Do you have examples to share? Still debating whether tables are worth the effort? Let me know in the comments.

The post How to use structured references in Excel tables first appeared on Stringfest Analytics.

]]>
10623
Table time! Reasons to start using Excel tables now https://stringfestanalytics.com/table-time-reasons-to-start-using-excel-tables-now/ Sun, 19 Feb 2023 01:52:37 +0000 https://stringfestanalytics.com/?p=9258 One of the first practices I encourage Excel users to pick up is to store all their data in tables. Tables can take some getting used to, especially for those used to the traditional relative and absolute references (A1:B2, $B$4:$C$10, etc). But the adoption is totally worth it — both for what they can do […]

The post Table time! Reasons to start using Excel tables now first appeared on Stringfest Analytics.

]]>
One of the first practices I encourage Excel users to pick up is to store all their data in tables.

Tables can take some getting used to, especially for those used to the traditional relative and absolute references (A1:B2, $B$4:$C$10, etc). But the adoption is totally worth it — both for what they can do in Excel, and what they prepare you for next.

In this introductory demo to Excel tables, you’ll be working with real world wholesale customers data from the UC Irvine Machine Learning repository.

 

They remind you to include headers

Receiving a dataset without any column headers is pretty useless, right? How are you supposed to make any sense of the data when the data isn’t measured?

Yet I’m sure you’ve received a dataset breaking this very cardinal rule.

A dataset without a header row is like a table without a … mind?

Excel tables serve as a great reminder that a table is only as good as its headers.

Take a look at the start worksheet of the demo workbook — we have a series of data in columns A:F but with no headers — those are tucked away in columns H:L for now. Not a great look.

Go ahead and click anywhere in the main data source, then head to Insert > Tables > OK. The keyboard shortcut is Ctrl + T.

You may notice that, automatically, the Create Table menu asks if your data has headers. If it doesn’t a header column is automatically added to the table:

From here, you can go ahead and copy-paste the headers from columns H:M into the main table. It’s now clear what’s being measured in each column.

Headers and footers are separate elements of the table

Are these header columns part of the dataset proper? To quote a famous meme… “yes, but actually no.” They’re more like metadata that tells us what the data is, rather than the data itself. Classic Excel formulas have no way to programatically distinguish data from headers… but Excel tables do.

Let’s try it out. Head over to cell H3 in your worksheet and hit the equals sign =. Drag cells A1:F1 as your reference, and you’ll see that the formula becomes Table1[#Headers].

Once you’ve established this reference to the headers of Table 1, you can even use “downstream” as a reference to other functions, such as UPPER() to dynamically convert the case of all headers.

Viewing the table footers

If every story contains a beginning, middle and end, then every Excel table consists of headers, data and footers… but footers need to be checked on manually.

To do this, click anywhere in the table and go to Table Design on the ribbon, then check on Total Row in the Table Style Options group:

Turning on totals Excel table

By default, this row will take a sum of the last column in your data, but you can click the dropdown menu on any of the columns to change:

Summarize additional column in Excel totals

Now that you have assembled the three elements of the table, confirm that you can “reconstruct” it using the following formulas:

Formula What it refers to
=Table1[#Headers] Table headers
=Table1 Table data
=Table1[#Totals] Table footers
=Table1[#All] Table headers, data and footers

As you advance your table skills, you’ll find still other useful formula references. But they all rely on the simple structure of headers, body and footers.

It forces you to use named ranges

It’s great to be able to programatically refer to the elements of a table… but not so great that this table has an uninspired name!

To be fair, it’s still better to refer to Table1 than cells A1:F22, for example. But if you’re going to name a range anything, it should be descriptive about the underlying data! Let’s go ahead and do this by heading to Formulas on the home ribbon, then Name Manager > Edit. Change the name to sales, then click OK. Click out of the Name Manager and you’ll see that every reference to Table1 has been changed to sales.

Table headers have been renamed Excel

Excel tables must be named something. And while you can use the defaults, it’s a big win with little effort to provide a more meaningful name.

They look good

As an Excel user, your numbers are often only as good as their formatting. Using tables will instantly improve the aesthetics of your data by adding banded rows. Quickly and easily change the look and feel of the table by heading to Table Design in the ribbon:

This sure beats the tedious and error-prone process of trying to make the data look good on your own.

And, unlike a Maybelline makeover, Excel tables will stay put and looking good at the end of even the longest day…

You can easily add or remove data from the table

We’ve all been there before: you’ve got a total of some range in your workbook, that range changes because data is added to or deleted, that total breaks.

No more with Excel tables!

Here I have taken the sum of the fresh field using Excel tables’ structured references. I can add and remove data from sales and this formula maintains its integrity. Moreover, so does the total at the bottom, and it’s easy to exclude that total from my outside structured reference in cell I443.

Because we are referring to the data by object name, rather than location, it’s much less likely that manipulating the placement and dimensions of our table is going to adversely impact downstream formulas.

It’s the portal into Power Query

Does this sound like a familiar workflow? Every week you get a data extract that you need to report on. But first you’ve got to filter out rows, merge data sources, calculate columns… if so, you need to be using Power Query!

And guess what? The first step to do that is… to use tables.

Click anywhere in the sales dataset, then go to Data > Get Data > From Table/Range. If this data weren’t already in a table, you’d have to do it now anyway!

Insert from table/range in Excel Power Query

You should now see a new screen called the Power Query Editor. This is a workhorse of a tool for data cleaning and preparation. For a more detailed tour of Power Query and the interface, check out this post.

One annoying thing about Power Query is that it imports the Totals row from the underlying Excel table, which can cause errors and easily double-count your data. To prevent this, go to Home on the ribbon, then Remove Rows > Remove Bottom Rows > 1:

Remove bottom 1 row Power Query

Take a look at the Applied Steps menu to the right of your data and you’ll see “Removed Bottom Rows” has been added. Every action taken on the data can be traced and repeated from this menu, making Power Query a boon for automation. And it all started by converting your input data to a table!

One last thing to save your query — head to the Home tab of the ribbon, then click Close & Load. A new, “Power Queried” version of your data will get loaded to another worksheet as… a table!

Power Query resulting table

This table has all of the same magical properties as the one you’ve seen before. That said, I wouldn’t make many changes to this one as any updating of your Power Query steps could wipe them out on refresh.

Tables really are the gateway to the automation dream factory that is Power Query.

Don’t table learning more tables

Were you able to follow along with all the cool stuff in this demo? Check your work below:

And while you’re at it, keep exploring tables! Try converting some of your existing workbooks to tables. Check out Microsoft’s documentation on the subject. And take those first steps into Power Query.

If you’re just getting started with tables, what questions do you have? Do you see any drawbacks from this approach to classic cell references? If tables have already won you over, what benefits did I miss? Please do let me know in the comments.

The post Table time! Reasons to start using Excel tables now first appeared on Stringfest Analytics.

]]>
9258
How to load Excel data into Power Query https://stringfestanalytics.com/how-to-load-excel-data-into-power-query/ Sun, 03 Jul 2022 20:34:44 +0000 https://stringfestanalytics.com/?p=9422 I liken Power Query to the Narnia of Power Query — a magical place, but not so easy to locate: Here’s how to load your Excel data into Power Query…and an example what to do with it when you get there: First, the Data tab The first steps to Narnia, I mean Power Query, is […]

The post How to load Excel data into Power Query first appeared on Stringfest Analytics.

]]>
I liken Power Query to the Narnia of Power Query — a magical place, but not so easy to locate:

Here’s how to load your Excel data into Power Query…and an example what to do with it when you get there:

First, the Data tab

The first steps to Narnia, I mean Power Query, is to place your cursor anywhere inside the data, then go to Data > From Table/Range.

To get your Excel data into Power Query, select any cell in the range and click Data > From Table/Range.

Once you click on this, a Create Table menu will appear. Go ahead and click OK:

This dataset must become a table to be used in Power Query

Moral of the story: Power Query forces any Excel data that comes its way to be in a table first. That’s why it’s good to understand a bit about tables before you use Power Query.

A couple of clicks and congratulations, you’re in the Power Query editor, which should look something like this:

There’s a lot going on in the Power Query editor. Don’t panic; you got this!

Now what?

I hope you can use this walkthrough as a jumping-off point into other tutorials. If you’re good at stopping here, you can back out of Power Query simply by selecting the X at the upper corner of the editor, then selecting Discard. You’ll be launched back to the original Excel worksheet, and you’ll see that the tips dataset is now a table.

Power Query discard changes

Now that you know the secret location to Power Query, I hope you come back often!

What questions do you have about Power Query for Excel? Let me know in the comments.

The post How to load Excel data into Power Query first appeared on Stringfest Analytics.

]]>
9422
How to add data to the left of an Excel table https://stringfestanalytics.com/how-to-add-data-to-the-left-of-an-excel-table/ Sat, 25 Jun 2022 16:48:09 +0000 https://stringfestanalytics.com/?p=9323 This is something that happens to me all the time: I’ve got data in an Excel table and want to add an index column (because I believe every dataset should have an index column). Index columns typically come first in a dataset, which means adding data to the left of the table. But here’s the […]

The post How to add data to the left of an Excel table first appeared on Stringfest Analytics.

]]>
This is something that happens to me all the time: I’ve got data in an Excel table and want to add an index column (because I believe every dataset should have an index column). Index columns typically come first in a dataset, which means adding data to the left of the table.

But here’s the thing: Excel tables don’t add data to the left so good.

Here’s what I mean and how to get around it. Download the exercise file below to follow along.

 

Here we have a set of restaurant transactions; I would like to add an index column id to the left of the data.

Our goal is to add an index column id to this dataset (download here)

Let’s take a look at adding this unique identifier to the dataset a few different ways.

Adding data to the right of a table is easy…

The first, easier option is to add data to the right of the table. All you have to do is start typing! The table will automatically include your new column as part of the table.

Because we’re making an index column in this case, we can even use Flash Fill to quickly populate its contents.

Adding data to the right of a table is as simple as to start typing.

But what about adding data to the left of the column? Index columns are generally over in this part of the table.

Unfortunately, adding data here is a little bit trickier, but nothing we can’t handle.

Adding data to the left is trickier

If you insert a column to the left of the table and start typing… no dice. The data does not become part of the table.

Data added to the left does not automatically become part of the table.

What are the workarounds?

Like with most things in Excel, there are several ways to get around this. Here are three options. If you have another, let us know in the comments.

1. Cut and paste the column

This one may be the most straightforward: simply select the id column right of the table, right-click to cut, then right-click on the current first column and select Insert Cut Cells:

The classic cut-and-paste is one way around this issue

This works well if the column has already been created inside the table and you simply want to relocate it.

2. Resize table

Here’s another possibility: the column you want to add is already created and placed to the left of the table, you just want to extend the table over to include it.

In this case, click anywhere in the table and select Table Design > Resize Table and change the formula reference to include the relevant column.

Use the Resize Table feature to extend a table over to a contiguous left column.

3. Right-click to add to the left

Maybe you just want to create the leftward column first, and then add the data? Here’s how to do that: right click any cell in the current leftward-most column, then select Insert > Table Columns to the Left. From here, you have a column ready to use.

You can also right click on the table to add a column to the left.

Recap

While it’s not quite as simple to add data to the left of a table as it is to the right, there are still some options:

  • Once the column has been created inside the table, cut and paste it to the left
  • If the column has been created outside of the table, move it to the left of the table and resize the table to include it
  • If the column has not been created yet (inside or outside of the table), right-click on the table to add a column to the left.

Tables, the bridge to Power Query

Tables in and of themselves are a powerful tool for data management and analysis. But on top of that, they also serve as the gateway to that magical oasis of data cleaning known as Power Query. Is your team using Power Query for Excel data cleaning and reporting yet? Get started with my Click-and-clean data in Excel Power Query workshop:

How are you using tables now? Which method makes the most sense to you for adding data to the left? Let me know in the comments.

The post How to add data to the left of an Excel table first appeared on Stringfest Analytics.

]]>
9323