Skip to content

fserrey/google-analytics-sql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

google-analytics-sql

Open Source? Yes!

Useful scripts to get basic ecommerce KPIs querying GA 360 BigQuery export

Description

This repo contains a set of queries in standard SQL useful to extract some of the main KPIs out of GA 360 (mainly, as we are moving to GA4 now). I hope you'll find them useful. Feel free to send your comments or make some suggestion in a PR form.

Some context

It's true that you might get some of these KPIs straight from the GA UI. However, getting access to the raw data pulled from GA gives you the ability to personalise the type of calculous that sometimes you need to perform in an external tool (i.e. Excel, Google Spreadsheet, etc). Additionally, if you are dealing with terabytes of data, the GA UI mostly provides sampled data and, spoiler alert, you can get them all with BigQuery.

Each row in the Google Analytics BigQuery dump represents a single session and contains many fields, some of which can be repeated and nested, such as the hits, which contains a repeated set of fields within it representing the page views and events during the session, and custom dimensions, which is a single, repeated field. This is one of the main differences between BigQuery and a normal database. (source)

How to access nested fields?

One of the trickiest parts of working with the GA BigQuery export schema is that data is not organized in nice little rows and columns. Instead, you have something that reminds a JSON object. Don't panic! We will get through it together.

So, how can I access this data? Well, something that helped me a lot is to think about the schema as a tree (inverted). You have the main body and, there are some branches that grow from it (and even more branches that grow from those previous branches). In the end, you just have a bunch of fields that comes from others. Like a nested JSON: Example of nested JSON

So, basically, we expand the NESTED fields and then JOIN them with the table. In other words, we unfold them using the UNNEST function so we can later CROSS JOIN these fields with it's equivalents. The issue here is that as you "unfold" some fields, those that are not-nested are going to repeat themselves as much as "nested rows" are held inside a nested field.

You can see a detailed explanation here

Additionally, check out these bonus resources:

About

Useful scripts to get basic ecommerce KPIs querying GA360 BigQuery export

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors