5 months ago
Like many marketers, I’m sure that your marketing data comes from several platforms and you have to juggle with a mix of siloed data to be able to analyze and understand your customer journey. Quite annoying, right?
Siloed data coming from different sources rarely communicate with each other and make the interpretation of KPIs and important metrics complicated. In the end, it leads to situations where you don’t know how to make decisions…
That’s why, more than ever, you need to take control of your data and gather it in one place. Generally, the main solution is to use spreadsheets, but spreadsheets can’t handle huge datasets. This is precisely where a data warehouse like BigQuery becomes interesting.
BigQuery is a data warehouse from the Google Cloud platform that helps you to store, visualize, analyze and ingest your data. The goal is to directly upload all your data in real-time from various sources: CRM, Google Analytics, Google Sheets, Google Ads, Google Optimize, Facebook Ads,… The objective is to centralize all your raw data in the cloud.
In terms of pricing, the model is quite simple. You’ll pay both storage and processing costs. Storage costs only depend on the volume of data stored. For processing, you will pay for querying data, data storage, and streaming inserts. Loading data is free of charge.
BigQuery provides free tiers for storage (10 Gb) and processing (1Tb). You only pay above those thresholds. For more information about pricing, you can check the overview of BigQuery pricing.
By being scalable and fast, Google BigQuery enables you to only focus on analyzing your data and allows large-scale analytics. As it’s connected to intelligence tools such as Data studio, it helps you to easily visualize your data in a clear and centralized report to get clear insights about your customer journey and get predictions about your clients for example.
BigQuery ML allows you to create and execute machine learning models directly on the data warehouse. You can create regression models, K-means clusters, decision trees, or time series just by a “CREATE model” query. It saves loads of time and makes it handy to have data and models managed in the same place.
Of course, BigQuery is not the only data warehouse. Amazon ( Amazon Redshift) or Microsoft (Microsoft Azure Synapse Analytics) also offer similar services.
If you use BigQuery or another solution, it will be a decision based on your project: type and structure of data, purpose (machine learning ?), or endpoint (dashboarding, data platform,..). But also current technology stack: are you already using Google Cloud Platform or Google data such as GA 4. “Multi-cloud” (aka using multiple cloud solutions) is often possible but brings development and sometimes complications.
The main question is: why should Marketers start to store their data in this cloud and how to use it?
If your project does not match with Big Data 3 V’s: if your data are not big in Volume (Size), doesn’t require Velocity (speed of execution) or are not Varied enough, maybe you’ll find other solutions within Google Cloud Platform that fit best with your needs (for example Google Sheets or Google SQL).
Don’t forget Google is a US company: they have no GDPR adequacy decision from Europe and are not under the protection of Privacy Shield (invalidated by EU in July 2020). That means uploading personal data from EU citizens into Big Query without data protection assessment should be considered as a risk (by the way, when playing with big data, a privacy assessment is always a good idea, US company or not: am I respecting the privacy of my customers, do I process sensitive data like health/opinions/sexual orientation/…, how do I respect the right to be forgotten, etc..)
No, “I want everything” is not a good answer (remember, you have a data protection assessment to fill, the purpose of processing is the first chapter).
You need to think about what you want to reach and the value you want to unlock. Is it a prediction of your customer’s behavior (churn, acquisition, ..), optimizing website visitors’ behavior, CRO, information about a specific category, enriching your CRM with Analytics data (or the other way around), optimizing relevancy, doing marketing automation,… Describe exactly what you need to do, be very specific, and evaluate the value (in terms of return or time saved) for each purpose.
The second step is to map all your data required to achieve this goal (automation tools, CRM, Ads platforms,…). Class them between “need to have” and “nice to have”.
Focus on needs to have, and check if your data sources have a public API or if automated data export is enabled. If it’s not the case you will have to find another tool. In marketing data, Supermetrics extracts data from your marketing platforms and uploads them into Big Query.
There are two main processing models :
BigQuery fits very well with the ELT paradigm. As BigQuery is very scalable, you can handle huge transformations on a big amount of data. That’s why Google Analytics 4 works best with BigQuery: data flows into BigQuery and you can make your analysis after. See below for a tutorial on how to import your GA4 data into BigQuery.
In the case of ELT, the transformation is done via SQL on BigQuery. You’ll find handy the documentation of bigQuery query syntax. Long story short: it is very similar to “traditional” SQL, with additional concepts such as ‘nesting’ : a table can be inserted (“nested”) into another table. That means you’ll have to “unnest” nested data in a record before using it.
If you’re familiar with SQL, you’ll also need to review the schema and especially the partitioning part. The way you organize your data is very important: remember, you pay per amount of data processed. If you tell BigQuery how to organize (“partition”) your data (per day, per site, ..) BigQuery will process only the data needed, and then charge you less.
Once your query is ready and optimized, you’ll have to visualize the results somewhere. There is no shame in using Google Spreadsheets to run the “last mile” of the analysis. Spreadsheets has built-in connectors (Data -> Data connectors -> Connect to BigQuery)
The most obvious choice is Data Studio, you connect to your view or table, and create a dashboard. Search for BigQuery in the connectors. Choose your project – table, and enjoy the full GA4 schema.
Note other visualization tools like PowerBI or Tableau can also connect to BigQuery.
At Clicktrust, BigQuery allows us to consider the consumer journey as a process. We design steps the consumer should take, from product discovery until the purchase, which defines the completion of the process.
We collect and store “events”, analyze the performance of the process, a little bit like a supply chain company would do :
To realize this, we can add different layers:
To conclude, BigQuery (or another data warehouse) will help you to take advantage over your competitors by enabling detailed and in-depth analysis. With ITP, IOS 14, the death of cookies,… It’s more than essential today to find alternatives to continue to understand the needs of your customers to make sure they continue to meet the needs of your business.
To learn more about the death of cookies, we recommend you to read the article on the death of cookies and how to activate CRM data.
Finally, if you’re interested in knowing how to put all of this into practice, stay tuned, a second article is coming soon.