data model - Stringfest Analytics https://stringfestanalytics.com Analytics & AI for Modern Excel Wed, 25 Sep 2024 13:17:28 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.4 https://i0.wp.com/stringfestanalytics.com/wp-content/uploads/2020/05/cropped-RGB-SEAL-LOGO-STRINGFEST-01.png?fit=32%2C32&ssl=1 data model - Stringfest Analytics https://stringfestanalytics.com 32 32 98759290 How to load multiple tables to the Data Model at once in Excel https://stringfestanalytics.com/how-to-load-multiple-tables-to-the-data-model-at-once-in-excel/ Wed, 25 Sep 2024 13:14:37 +0000 https://stringfestanalytics.com/?p=14313 In my training sessions, I often have learners practice loading multiple datasets into Power Query and the Data Model, one file at a time. A common question arises: how can we load several datasets simultaneously? Here’s an answer. To follow along, download the exercise files below:   To effectively manage all the files you want […]

The post How to load multiple tables to the Data Model at once in Excel first appeared on Stringfest Analytics.

]]>
In my training sessions, I often have learners practice loading multiple datasets into Power Query and the Data Model, one file at a time. A common question arises: how can we load several datasets simultaneously? Here’s an answer.

To follow along, download the exercise files below:

 

To effectively manage all the files you want to load into the Data Model, ensure they are located in the same folder, such as in the exercise folder provided above. Make sure to unzip the folder and place it in a location that’s easy to find in your workbook.

Next, head to Data > Get Data > Query Options:

Query options Excel

This allows you to customize some of the internal settings of your Power Query Editor. Go to Data Load under Global settings, and then under Default Query Load Settings, set it to Load to Data Model and Load to worksheet by default. I’ve selected Load to worksheet because if we are using Power Pivot, the data sources we import are typically intermediary and primarily relevant within the context of the Data Model. Click OK to apply these settings.

Power Query data load options

You may need to save your workbook and/or restart Excel for these changes to take effect.

Next, head back to Data, and this time From File > From Folder.

Get data from folder Power Query

Navigate to the exercise folder’s location and click OK. You won’t see any files listed in the explorer because we are searching for folders and subfolders, not individual files.

Folder import Power Query

You should now see all the files listed in the folder. There are some hidden files that originate from this folder being stored on GitHub, which you can safely ignore. We want to focus on the three Excel workbooks. Go ahead and click on the first one and select “Add as New Query.”

Add as new query Power Query

You will now see a file icon in the data area. Go ahead and right-click on the file name to confirm that it is an Excel workbook.

Add as Excel workbook

We have now drilled into the data sources found in this workbook. You’ll notice there appears to be one table in the workbook. Go ahead and click on “Table” under Data to drill in and view the actual dataset in the workbook.

Drill on table Power Query

The first row of data in the orders table are the headers of this dataset, but Power Query might not have automatically detected that. If that’s the case, go to the Home tab in the Power Query editor and select “Use First Row as Headers” to promote the first row to the header.

Use first row as header

Awesome work! Now, return to the superstore query and apply the same process to the next workbook, people. This one contains multiple worksheets, so make sure to check which one actually has the data you need.

For a preview, click on the cell just outside the green Table hyperlink on each sheet to see a preview below. It appears that the data we want is in the first worksheet, although we have the same issue with the headers not being promoted. Go ahead and promote those headers too once you’ve imported the data.

Pick  the correct dataset Power Query editor

Finally, repeat the process for the last query. Once you’re done, it’s time for some cleanup. I’ll have you double-click on each query’s name to change it from the filepath to just the name of each individual dataset: orders, people, and returns, respectively.

Lastly, you can delete the superstore dataset by right-clicking it and selecting delete. Don’t worry if you need to add another dataset from this folder. You can always recreate the process by starting with a copy of one of these queries.

Your Power Query editor should now look like this:

Rename and delete the last query

Finally, go to File > Close & Load > Close & Load To with any of these queries selected, and you’ll notice that our defaults are indeed set to create a connection only AND to add to the Data Model. Nice work!

Add data to data model

From here, navigate to Power Pivot > Manage Data Model, and you’ll see that all tables have been added to the Data Model in one go. This is a great timesaver, and we’ve also learned how Power Query can efficiently handle entire folders of files. Amazing!

Orders people returns in Data Model

What questions do you have about setting up your data quickly for Power Query and Power Query for Excel? Let me know in the comments. If you’d like a basic overview of these tools be sure to check out my book Modern Data Analytics in Excel:

The post How to load multiple tables to the Data Model at once in Excel first appeared on Stringfest Analytics.

]]>
14313
How to build an Excel data model that matters: Review of Data Modeling in Excel https://stringfestanalytics.com/how-to-build-an-excel-data-model-that-matters-review-of-data-modeling-in-excel/ Tue, 12 Dec 2023 02:48:19 +0000 https://stringfestanalytics.com/?p=12348 Data modeling, contrary to be nearing obsolescence, remains crucial for data analysts aiming to enhance their reporting speed and deepen their insights into business operations. Bernard Obeng Boateng’s Data Modeling with Microsoft Excel has emerged as a fantastic resource for this topic. Despite Power Pivot’s presence in Excel for over a decade, its capability to […]

The post How to build an Excel data model that matters: Review of Data Modeling in Excel first appeared on Stringfest Analytics.

]]>
Data modeling, contrary to be nearing obsolescence, remains crucial for data analysts aiming to enhance their reporting speed and deepen their insights into business operations. Bernard Obeng Boateng’s Data Modeling with Microsoft Excel has emerged as a fantastic resource for this topic.

Despite Power Pivot’s presence in Excel for over a decade, its capability to build relational models is not widely recognized. Boateng’s book addresses this gap, offering a comprehensive guide on utilizing this underexplored feature. His empathetic approach is particularly noteworthy; he acknowledges the typical challenges faced by analysts, such as slow reports and cumbersome dashboards, guiding the reader without presuming their level of expertise. This reader-centric style makes the book an accessible and engaging read.

From Basics to Advanced Techniques

Boateng delves into the ‘why’ before the ‘how’, helping readers understand the advantages of using a relational data model over flat tables with practical, real-world examples. He skillfully demystifies the process of arranging and normalizing data, addressing the often-overlooked related area of data preparation with Power Query.

An intriguing part of the book focuses on cube functions, a lesser-known yet valuable aspect of Excel data modeling. This inclusion is particularly beneficial for readers unfamiliar with the transformation of data model logic into practical worksheet formulas.

Beyond Data Modeling: Visualization and Dashboards

Boateng doesn’t stop at the back end; he extends his discussion to front end applications, such as data visualization and dashboards. This section links the importance of data modeling to its ultimate purpose: creating meaningful, user-focused end products. It demonstrates that data modeling is not just about the technical process but also about translating data into impactful visual narratives.

Data Modeling with Microsoft Excel stands out as an essential guide for anyone seeking to master data modeling in Excel. It covers the entire process, from initial data markup to the creation of final dashboards, providing a well-scoped, practical approach for professional implementation.

In conclusion, Boateng’s book is a significant contribution to the field, offering a perfect balance of depth and accessibility. It’s a must-read for those looking to harness the full potential of Excel for data modeling.

Have you delved into Boateng’s book? What are your impressions, and what other resources have you found helpful in relational data modeling with Excel? Share your experiences and recommendations in the comments below.

The post How to build an Excel data model that matters: Review of Data Modeling in Excel first appeared on Stringfest Analytics.

]]>
12348
How to understand filter direction in Excel Power Pivot https://stringfestanalytics.com/how-to-understand-filter-direction-in-excel-power-pivot/ Tue, 28 Mar 2023 20:22:36 +0000 https://stringfestanalytics.com/?p=11142 If you’re using Power Pivot, you’re probably dealing with table relationships and applying filters to refine results. This blog post explores filter direction in Power Pivot and how it enables filtering across tables. Let’s look at the relationship between filter direction and cardinality using the people and orders tables in the Superstore dataset:   Head […]

The post How to understand filter direction in Excel Power Pivot first appeared on Stringfest Analytics.

]]>
If you’re using Power Pivot, you’re probably dealing with table relationships and applying filters to refine results. This blog post explores filter direction in Power Pivot and how it enables filtering across tables.

Let’s look at the relationship between filter direction and cardinality using the people and orders tables in the Superstore dataset:

 

Head to Diagram View in this workbook and you’ll see that people and orders are related by the region field. If you look carefully at the line establishing this relationship, you’ll see a small arrow pointing from people to orders:

Orders and people filter direction

This arrow displays how filters on one table impact the other. Filters on the left table will affect the right table, but not the other way around. Let’s see it in action.

Filtering orders with people

First, let’s confirm what happens when a field from the people table is used to filter orders. I will add a PivotTable based on the Data Model into the workbook, placing region from people in the Filters area and Sum of sales from orders in Values:

People can filter orders

As you’d likely expect, placing a filter on region on East, for example, will return only the sum of sales that are found in the East region:

People did filter orders

The fancy way to explain what’s going on here is that the filter gets propogated from people to orders. But isn’t this always what happens when you create a filter? What’s the alternative?

Filtering people with orders

To answer this question, let’s create another PivotTable with region from orders in the Filters area and person from people in the Rows:

Orders can't filter people...

Go ahead and try filtering the data so you only get records from the East region. When you do, no records are dropped.

No, orders didn't filter people

Take a look at the people table and it’s pretty clear that the only person associated with this region is Chuck Magee:

People dataset screenshot

What gives? Why isn’t orders able to filter people? It comes back to filter direction — things just don’t flow this way in the data model!

How was this relationship determined?

Power Pivot’s filter direction is based on relationship cardinality. One-to-many relationships, such as between people and orders, have a direction from the “one” side to the “many” side. That means people can filter orders, but not the other way around.

What’s the reasoning here? It’s more efficient to start from the primary (“one”) table and filter down to the related (“many”) table because the primary table contains unique records. In our case, region and person are only listed once in people, but many times in orders, so it makes more sense to have people filter orders than vice versa. This approach reduces the amount of data that needs to be processed and improves performance.

How can you change filter direction?

You cannot change the filter direction in Power Pivot directly. It is pre-determined by the cardinality of the relationship between the tables. If, for some reason you need to change the filter direction indirectly, check out the CROSSFILTER() function in DAX.

The rhythm guitar of Power Pivot

Jazz guitarist Irving Ashby once said: “Rhythm guitar is like the vanilla in a cake. You can’t taste it, but you know when its been left out.” Filter direction functions similarly in your Data Model. Most of the time, it works just as you would think — so smoothly you forget it’s even there. But when something’s wrong, things get confusing, fast. A proper understanding of filter direction can significantly impact the accuracy and performance of your data analysis.

Has filter direction ever gone awry in your work with Power Pivot? What questions do you have about it? Let me know in the comments.

The post How to understand filter direction in Excel Power Pivot first appeared on Stringfest Analytics.

]]>
11142
How to load a data model from Excel to Power BI https://stringfestanalytics.com/how-to-load-a-data-model-from-excel-to-power-bi/ Wed, 08 Mar 2023 16:00:32 +0000 https://stringfestanalytics.com/?p=9902 One of the best things about Power Pivot and Power Query is that once you’ve learned how to use them in Excel, you can easily transfer that knowledge to Power BI: But not only do your skills transfer over — sometimes the actual work does too. For example, did you know you can import a […]

The post How to load a data model from Excel to Power BI first appeared on Stringfest Analytics.

]]>
One of the best things about Power Pivot and Power Query is that once you’ve learned how to use them in Excel, you can easily transfer that knowledge to Power BI:

But not only do your skills transfer over — sometimes the actual work does too. For example, did you know you can import a data model that was created in Excel Power Pivot into Power BI? Here’s how.

 

Why move from Excel to Power BI?

If both Power BI and Excel contain the same “power”-ful tools, you might be asking, then why ever make the move? To answer the question, let’s take a look at the history of Excel.

Power BI as the third piece of “modern Excel”

You may have heard the term “Modern Excel” to describe a suite of impressive data analytics features rolled out starting in Excel 2010. Among the most famous are Power Query for data cleaning and Power Pivot for data modeling.

But there’s a lost sibling in this story: Power View. Initially built for Excel, Power View was conceived as a tool for building interactive dashboards and reports. Over time, the idea of Power View was merged into Power BI entirely. In fact, newer version of Excel do not even include Power View.

That means Power BI boasts many more features for building interactive dashboards than Excel. At the same time, it’s a less familiar tool that may be difficult for analysts to build and share with. That’s why it’s not a bad idea to start building your data model in Excel — if you hit a point of dashboard sophistication where Power BI is just a better choice, you can always transfer it there. Let’s see how that’s done now.

Making the move

Your exercise files contain a workbook called superstore_data_model.xlsx. This is a simple data model consisting of three tables, with a couple of useful measures and calculated columns created as well. With this file closed in Excel, open Power BI Desktop and head to Import > Power Query, Power Pivot, Power View:

Import from Power Pivot in Power BI

 

From here, you can navigate to and select superstore_data_model.xlsx.

Loading to Power BI

Upon importing your model to Power BI, you’ll get a warning sign that Power BI will do its best to import the data. Click Start:

 

First, you have the option to either make a copy of the Excel data, or keep a live connection. While the ability to change the Excel workbook is more convenient, it comes at the expense of slower performance in Power BI. For the sake of simplicity, I will make a copy of the data for the exercise file, rather than keep the connection.

Keep connection Excel to Power BI

 

You should now see a message that Power BI successfully imported your workbook along with its queries, data model tables, and any KPIs or measures created:

Import Excel workbook contents menu

 

While the previous screen doesn’t state it, our data model’s calculated columns were also successfully imported. We can check that in Data View:

Order size DAX calculated column

 

Finally, head to Model View to confirm the relations between tables were properly defined:

View of data model in Power BI

 

Have you imported a data model from Excel to Power BI? Where do you prefer building data models and why? Let me know in the comments.

The post How to load a data model from Excel to Power BI first appeared on Stringfest Analytics.

]]>
9902
How to add tables to the data model in Excel https://stringfestanalytics.com/how-to-add-tables-to-the-data-model-in-excel/ Mon, 19 Sep 2022 23:38:44 +0000 https://stringfestanalytics.com/?p=10010 Het kopen van Cialis Generiek, ook wel bekend onder de naam Tadalafil, is een goedkoper alternatief voor het bestellen van de originele merkgeproduceerde hulpmiddel. Dit houdt echter niet in dat er veel verschillen zijn tussen de middelen. Welke prijs cialis online in Nederland vraagt zal niet gelijk zijn aan de prijs apotheek winkeliers in het […]

The post How to add tables to the data model in Excel first appeared on Stringfest Analytics.

]]>

Het kopen van Cialis Generiek, ook wel bekend onder de naam Tadalafil, is een goedkoper alternatief voor het bestellen van de originele merkgeproduceerde hulpmiddel. Dit houdt echter niet in dat er veel verschillen zijn tussen de middelen. Welke prijs cialis online in Nederland vraagt zal niet gelijk zijn aan de prijs apotheek winkeliers in het buitenland willen voor dit soort producten. Cialis pillen verschillen op het cruciale punt dat ze geproduceerd zijn door een ander farmaceutisch bedrijf dan het origineel. Buitenlandse productie vermijd octrooirechten.

If you’ve ever cursed your workbook’s labryinthine lookups and poorly performing PivotTables, you need Power Pivot. But, just like with its twin tool Power Query, it’s not so clear how to even load your data into it. Let’s take a look at how to do that with the three Superstore data files made famous by Tableau.

 

Setting up the Power Pivot add-in

Before getting started, check your home ribbon for a Power Pivot tab. If you don’t see it, follow these instructions to turn it on. Please note that Power Pivot is currently not available for all versions of Excel such as those for Mac.

Power Pivot versus the data model

Confused that you enabled something called Power Pivot to use another something else called the data model? I hear you. The difference is that the data model is just one element of the larger Power Pivot add-in:

Power Pivot vs Data Model

As the diagram suggests, the data model is responsible for managing individual tables and relationships between those tables. Other PowerPivot tools such as KPIs and calculated measures in DAX are layered atop that “core” data model.

Loading to the data model through Power Query

While there are a number of ways to load tables to the data model, we will do so through Power Query. This way we can build any repeatable data cleaning processes atop these tables as the need arises.

Let’s start with the orders table. Click on any cell in the table and select Data > From Table/Range in the Get & Transform Data menu:

Get Data From Table/Range menu

From inside the Power Query Editor, select Close & Load To on the home tab of the ribbon:

Power Query Close & Load to

I would suggest selecting “Only Create Connection” as the output option so these queries aren’t needlessly printed to another worksheet. In any case, you must check off “Add this data to the Data Model” at the bottom to load the table into Power Pivot:

Click OK. You can follow the same steps to load the other two tables to the data model.

Viewing the data model

Next, let’s confirm that all tables have indeed been added to the data model. Navigate to the Power Pivot tab of the ribbon and click Manage:

Manage Data Model in Power Q

Here is where we can manipulate and view the data model in a number of ways. Head over to the View group of the home tab and select Diagram View:

Power Pivot Diagram View

This will present a visual schema of each table in the data model and how they are related:

Power Pivot diagram view with unrelated tables

Currently, there are no relationships between these tables and we know this because there are no lines connecting them. Let’s do that now.

Creating a relationship between tables

Power Pivot and the data model is all about leveraging relationships between tables for enhanced data analysis. But we need to create those relationships first before manipulating with them. Think of the shared or “lookup” column that you would use to connect one data source to another with VLOOKUP(). Similar idea here to create the relationships.

First, we will connect returns to orders via the Order ID column. This can be done simply by clicking and dragging from one Order ID to the other. You should see a line form to denote the relationship between the tables.

Create relationship between returns and orders in Power Pivot

We can do the same with the other two tables by connecting orders and people based on Region.

Power Pivot people and orders relationship

Once the relationships are formed you should see something like this:

Power Pivot relationship diagram of superstore data

Loading the results back to Excel

Nice work! You have now created a data model with relationship right from your Excel workbook. Head back to the Home tab of the Power Pivot editor, then click PivotTable. Go ahead and add the PivotTable to a new worksheet when prompted.

Power Pivot load to PivotTable

You now have a “Power” PivotTable in your workbook that will allow you to use the full power of DAX! This PivotTable will allow you to work with columns and measures from all three tables at once:

Because our source tables are located in the same workbook as the data model, you will see the table names listed twice in this PivotTable. You want to use the tables which include the cylindrical database icon as those that are part of the data model.

Illustration of which tables to use in the Excel data model

For example, we can pull data from both the returns and orders tables into the same PivotTable without a problem:

Example of "Power" PivotTable using Superstore data

Recap

There is much more you can do with Power Pivot than our little PivotTable above, and even better ways to make the same PivotTable! But it all starts with loading in and setting up the data model, which you did successfully here. Nice work!

What questions do you have about Power Pivot? Do you see where this could be helpful for you? Let me know in the comments.

The post How to add tables to the data model in Excel first appeared on Stringfest Analytics.

]]>
10010
A tour of the Power Query Editor in Excel https://stringfestanalytics.com/a-tour-of-the-power-query-editor-in-excel/ Sat, 13 Aug 2022 14:19:58 +0000 https://stringfestanalytics.com/?p=9638 Power Query is possibly the most exciting new Excel feature of its generation… but you might never know it considering how hidden it is from the Excel interface! If you’re looking to connect a data source to Power Query, check out this post: If you’ve already loaded the data into Power Query and are wondering […]

The post A tour of the Power Query Editor in Excel first appeared on Stringfest Analytics.

]]>
Power Query is possibly the most exciting new Excel feature of its generation… but you might never know it considering how hidden it is from the Excel interface! If you’re looking to connect a data source to Power Query, check out this post:

If you’ve already loaded the data into Power Query and are wondering what the heck to do next, here’s that post: a tour of the Power Query editor.

 

You can find a completed copy of the workbook here.

Your data is loaded… now what?

When you download the starter file, you will see a penguins table that needs to get loaded into Power Query. To do that, click anywhere in the table and then select Data > From Table/Range. If you need help here, check out this post on Power Query.

Once the data’s loaded, you will see the Power Query Editor menu open and something like this:

Penguins dataset in the Power Query editor start

Interesting, right? It kinda looks like Excel, but also not. So what are we supposed to do now?

Let’s walk through each element to understand what you’re dealing with.

The Ribbon tab

On the top of the menu (labeled “Power Query Editor” in case you really weren’t sure), you’ll see the familiar ribbon interface:

You should see four tabs on the ribbon here, which are:

Home

Just like with regular Excel, the Home tab has the most fundamental operations in Power Query. But rather than see options to change font size or color here, for examples, most of these tasks are about data cleaning and transformation: choosing rows, deleting columns, merging and more.

Transform

But wait, there’s more! The Transform tab provides even more options for data cleaning and transformation. Notice how many more menu options there are in Power Query than regular Excel specifically for cleaning data. We’ll play around with these options here in a moment.

Add Column

This is really another column specifically for data transformation, right? In particular, it’s to create new columns: whether that’s from a calculation, by using Excel’s pattern-matching algorithm or something else.

View

Finally, the View tab allows us to customize what we see in the Power Query Editor. As I highlighted in this post, these features allow us some nifty data profiling possibilities. You can learn more about those features at this post:

For now, simply click on “Formula Bar” under the Layout group. You will see a formula bar appear on top of your dataset, similar to Excel that you can easily expand with the dropdown icon:

Power Query formula bar

The Formula Bar

If you’re following along with the demo, you should see something like this in the formula bar:

= Table.TransformColumnTypes(Source,{{"species", type text}, {"island", type text}, {"bill_length_mm", type number}, {"bill_depth_mm", type number}, {"flipper_length_mm", Int64.Type}, {"body_mass_g", Int64.Type}, {"sex", type text}, {"year", Int64.Type}})

Doesn’t look like your typical Excel function, right? That’s because it’s not… it’s the M programming language used only in Power Query.

Does that mean you need to learn how to code in Power Query? Nope! Using the home ribbon and other features, you can accomplish nearly all your day-to-day tasks without dipping into M at all. However, as you make changes to your query you will see the formula in this bar change, which you are able to edit as necessary.

Having the entire query expressed as code is great because it’s easy to track exactly what is happening, when and repeat the process. Having this code generated is great so you don’t have to write it.

Query properties

OK, let’s move from the top of the Editor and over to the left:

Power Query editor Query Properties

At the top-left of the Editor you will see a list of queries available to you. Right now we’ve only imported one data source into Power Query, but if you had more they’d be listed here. Keep in mind this source data can be everything from a csv file to a database to a web page.

Then, wayy toward the bottom-left you will see the number of rows and columns in your current query. There is also a rather cryptic message about “Column profiling” which will make more sense if you read the data profiling blog post 😼.

The imported data

OK, now let’s get to the part that takes up most of the Editor’s real estate, the part you’re used to working for in Excel: the data!

Go ahead and play around with the data here in Power Query: hide some columns, drop a formula into a cell or something like that.

Spoiler: you can’t.

The Query Settings

Power Query has this data locked down tight! No more random hard-coding formulas or hiding columns. Everything must be done programatically.

Take, for example, something as simple as deleting a column. Let’s drop the year column from our data. Right-click on the column label, then select Remove.

Remove column Power Query penguins dataset

The column is gone from our dataset for good! Or, is it? To see how changes to the data work in Power Query, check out the “Applied Steps” menu over to the right.

The Applied Steps menu

We have a record of the deletion! Every step we take in Power Query will be maintained in the Applied Steps, shown in the box toward the right of your data:

Penguins Query Settings -- removed columns

You’ll see this is actually the third Applied Step:

  • The first step is to connect to the data. That’s the Source.
  • The next step is to set data types. Unlike Excel, in Power Query every value in a column must be of the same type.

If you click on any of the steps, you can “go back in time” to how the data looked at that point:

Power Query walk through applied steps

You’ll also see the M code that was used to generate that step in the Formula Bar. To see the entire script in one place, go to the View tab of the ribbon and Advanced Editor:

Power Query Advanced Editor

Let’s see what happens when we do more to the data: we’ll add an index column.

Go over to the Add Column tab on the ribbon, then Index Column > From 1:

Add index column in Power Query

You should now see this extra step in Applied Steps.

Not only can you step in and out of these steps in the menu, you can even delete them. Let’s say for example we want to get the year column back in our dataset. Simply click the X to the left of Removed Columns:

Removing applied steps Power Query

However as the warning message says, once a step has been deleted there is no way to undo that delete.

Exiting the Query editor

This is all pretty cool, right? Feeling a little more comfortable with the Power Query editor? Now let’s try making our exit.

If you’re in the Power Query editor and try going back to regular Excel, you’ll notice you are more or less locked out. The spinning blue wheel prevents you from making any changes.

To exit the query and bring your prepared data into Excel for further analysis, you must close and load the query. To do this, click Close & Load on the Home tab of the ribbon. You should see a menu like this:

Load to options

You’ve got a few options for what to do with the results of your query:

  • First, you have the option to load the data into either a table, PivotTable “Report” (it’s just a PivotTable), PivotChart, or connection only. If you select the last option, the results of your query will not be loaded to Excel but the query will be available in the workbook.
  • If you select to load the data to your workbook, you can place it in a new or existing worksheet, similar to inserting a PivotTable
  • You also have the option to add the data to the Data Model. This enters Power Pivot and DAX territory and is beyond the scope of this post.

The default loading option is to a table in a new worksheet; let’s go with that. You should now see a second worksheet in your workbook containing an orange-colored table. Neat!

Organizing your worksheets

It can be a little confusing that you now have two worksheets of the penguins table: one is the original source file and the other has been modified via Power Query. By default, you’ll see Excel does use different color tables to signify each kind. It’s also a good idea to give each worksheet a name to indicate whether it’s the input or output data source, such as penguins-in and penguins-out.

Keep in mind too that Power Query does not just work on worksheets from inside the same workbook. You can connect to CSV files, databases, the web and more.

Returning to the Power Query Editor

Missing Power Query? I thought so. If you want to return to your query to continue transforming the data, head to the Data tab on the main Excel ribbon, then Queries and Connections. You should see a menu to the right pop up listing your penguins query. Right click on it and select Edit to be returned to the Power Query Editor:

Queries & Connections menu in Power Query

The Excel you never knew

If you haven’t been using Power Query yet, I don’t blame you… it wasn’t the easiest to find, right? But I hope this post got you acclimated to the Query Editor and what this fantastic tool can do for you.

 

What challenges are you thinking of solving with Power Query? What questions do you have about the Editor interface? Let me know in the comments. If you are looking to introduce Power Query to your team, please get in touch for corporate training.

The post A tour of the Power Query Editor in Excel first appeared on Stringfest Analytics.

]]>
9638
Marshall on Modeling https://stringfestanalytics.com/marshall-on-modeling/ Sat, 13 Feb 2016 13:00:18 +0000 http://georgejmount.com/?p=1269 The 19th Century economist’s famous steps on building and communicating economic models. One of my all-time favorite quotes.

The post Marshall on Modeling first appeared on Stringfest Analytics.

]]>
The 19th Century economist’s famous steps on building and communicating economic models. One of my all-time favorite quotes.

MarshallonModelling

The post Marshall on Modeling first appeared on Stringfest Analytics.

]]>
1269
Abraham Lincoln, Data Analyst https://stringfestanalytics.com/abraham-lincoln-data-analyst/ Fri, 01 May 2015 02:52:30 +0000 http://georgejmount.com/?p=410 Give me six hours to chop down a tree and I will spend the first four sharpening the axe. Lincoln’s wisdom is easily applied to data analysis. Very little of working with data is actually running the model. Most of it is defining the problem, gathering data, and preparing a presentation. Do not swing at your […]

The post Abraham Lincoln, Data Analyst first appeared on Stringfest Analytics.

]]>

Give me six hours to chop down a tree and I will spend the first four sharpening the axe.

Lincoln’s wisdom is easily applied to data analysis. Very little of working with data is actually running the model. Most of it is defining the problem, gathering data, and preparing a presentation. Do not swing at your data with a dull axe!  Here is how to analyze data like Lincoln:

1. Frame the problem.  

Most business problems are unstructured. Our challenge is to find a sales-boosting promotion or a staffing pattern that enhances customer satisfaction.

The first step is to define the problem. Offer a hypothesis. Define your dependent and independent variables. Understand how the variables relate clearly.

2. Define — and defend — your assumptions

This one is difficult when you are modelling data on behalf of stakeholders who don’t know much about analytics. They want every exception and exclusion included in the model. Any assumptions you provide are presented as unrealistic.

Remember, though, that this is only a model. No model will ever be as complex or complete as real life. You just want something that will help you make sense of your data.

Assumptions allow you to control for factors so that you can test relationships between other variables. It’s the best solution we’ve got in business to a “control.”

You must make assumptions in your model because you are never going to capture everything in your data. Don’t try to make your data jive with the model 100%. In fact, if you do, then you’re going to overfit your data.

A model is like a pair of glasses. Don’t look at the glasses to test value. Look through them.

3. Collect and validate necessary data

Now you can gather data. Having all the data you need before starting your analysis will save time and frustration for later on.

In many cases, this will be the most time-consuming part of your project. You need to create a data set from several sources.

This can get ugly. That’s why it’s called  “data wrangling.” But carefully gathering data lets you check for discrepancies or irregularities. Are you missing something? Is some data formatted incorrectly? What about outliers? Is there a normal distribution? Answer these questions now before you get too deep into your analysis.

Abraham Lincoln, Data Analyst

4. Conduct your analysis.

This is the “tree chopping” segment of our exercise.  If you did everything else right you will spend the least amount of time on this problem.

5. Communicate results.

I like to think of this part as “reserve data wrangling.” You take results and translate them into a newer format still: actionable insights.

If you have developed a well-framed model with the proper assumptions and the necessary data, you should have a reasonable solution. It shouldn’t be unbelievable, but it should provide some new information.

Unless you are communicating to other analysts, don’t present in math-speak.  Alfred Marshall, the 19th Century economist, has the right idea here.

(1) Use mathematics as a shorthand language, rather than an engine of inquiry. (2) Keep to them till you have done. (3) Translate into English. (4) Then illustrate by examples that are important in real life. (5) Burn the mathematics. (6) If you can’t succeed in (4), burn (3). This last I did often.

When given a problem, data analysts are tempted to start swinging the axe immediately. Instead, take a lesson from Abraham Lincoln. Spend your time sharpening that axe, and your data problem tree will fall much easier.

The post Abraham Lincoln, Data Analyst first appeared on Stringfest Analytics.

]]>
556
Data Visualization: All Trends Subject to Change https://stringfestanalytics.com/all-data-subject-to-change/ Sat, 21 Jun 2014 15:58:00 +0000 http://georgejmount.wordpress.com/2014/06/21/all-data-subject-to-change Rather than my usual news-item commentary, I am going to write about something I’ve learned from data modelling and visualization. The biggest expense for many employers is salaries & wages.  This can be a tricky expense to model and visualize, because there is a lot of seasonality in the way payroll expenses are paid.  I […]

The post Data Visualization: All Trends Subject to Change first appeared on Stringfest Analytics.

]]>
office-1574717_960_720

Rather than my usual news-item commentary, I am going to write about something I’ve learned from data modelling and visualization.

The biggest expense for many employers is salaries & wages.  This can be a tricky expense to model and visualize, because there is a lot of seasonality in the way payroll expenses are paid.  I will give one example of a visualization that looked great until June or so — then started failing.
 
Why?  
 
My graphs were focused on showing the trend of different pay types (overtime, maternity leave, etc.) month-over-month.  What special kind of payroll would peak in summer?  Vacation!
 
A full year of data would have showed that vacation time dwarfs most other pay types.  The comparisons were much harder to see with this huge gap drowning out the scale.  I needed to consider a better visualization to account for the abnormal build in vacation time versus other pay types.
 
Moral of the story?  Think about where your data is going.  Think about seasonality and what might come next before you get too confident in your visualization.  Do a lot of backtesting, or fitting old data into your model or visualization.

The post Data Visualization: All Trends Subject to Change first appeared on Stringfest Analytics.

]]>
494