itsopensource

Best way to keep users safe while using Google Analytics

July 11, 2020

Google Analytics is the most widely used web analytics service, and Google has made it easy to implement and effective to use through its dashboard UI. It provides detailed demographic data and many other features that justify its vast usage. The most common and easiest way to enable Google Analytics on a website is to add the global site tag (the code snippet Google provides) to the page.

<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-XXXXXXXXX-1"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());

  gtag('config', 'UA-XXXXXXXXX-1');
</script>

Problem

Adding this tag allows Google to run code on your website whenever someone visits it or performs an action (I would request you to read this line once more). Based on the options selected in the Analytics UI, Google inserts scripts into the website and sends data back over HTTP requests. This data includes the complete URL (with query params) and many other details that may be classified as Personally Identifiable Information (PII). Konark Modi, a privacy advocate, has written a detailed case study in this blog on how sensitive user data or PII leaks to third parties, including Google Analytics. As developers, we always want complete control over whatever is served on our website. The Google tag takes that control away.

[Image: Leak 1]
[Image: Leak 2]

Solution

This can be avoided by using the Google Analytics Measurement Protocol.

From the docs:

"The Google Analytics Measurement Protocol allows developers to make HTTP requests to send raw user interaction data directly to Google Analytics servers."

TL;DR: Do not load Google's scripts; create and send the HTTP requests yourself. This gives you far more control over what you send to Google and keeps you in complete control of your website. You can send a request for whatever action needs to be recorded for analytics: page visits, clicks, or any other event.

How

Google Analytics receives the data as URL-encoded parameters, either in the body of a POST request or in the query string of a GET request. A typical request looks like this:

POST /collect HTTP/1.1
Host: www.google-analytics.com

payload_data
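
The request above could be sketched in JavaScript roughly as follows. This is only an illustration, not an official client: `buildCollectRequest` and `sendHit` are hypothetical helper names, and the `no-cors` and `keepalive` options are assumptions about what suits a fire-and-forget analytics hit.

```javascript
// Endpoint taken from the request shown above.
const COLLECT_URL = "https://www.google-analytics.com/collect";

// Describe the POST request without sending it, so it is easy to inspect.
function buildCollectRequest(payload) {
  return {
    url: COLLECT_URL,
    options: {
      method: "POST",
      body: payload,   // URL-encoded parameters, e.g. "v=1&tid=..."
      mode: "no-cors", // assumption: we never need to read the response
      keepalive: true, // assumption: let the hit finish if the page unloads
    },
  };
}

// Hypothetical helper: actually send the hit with fetch().
function sendHit(payload) {
  const { url, options } = buildCollectRequest(payload);
  return fetch(url, options);
}
```

Keeping the request description separate from the call to `fetch` makes it easy to log or review exactly what would leave the page before anything is sent.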

The mandatory parameters are:

v=1              // Protocol version.
&tid=UA-XXXXX-Y  // Tracking ID / Property ID.
&cid=555         // Anonymous Client ID.
&t=              // Hit type, e.g. pageview or event.
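
As a minimal sketch, the payload for these four parameters can be assembled with the standard `URLSearchParams` class. The function name `buildPayload` is made up for this example, and the tracking ID and client ID are the same placeholders used above.

```javascript
// The four parameters every hit must carry.
const REQUIRED = ["v", "tid", "cid", "t"];

// Hypothetical helper: turn a plain object into a URL-encoded payload,
// refusing to build one if a mandatory parameter is missing.
function buildPayload(params) {
  const missing = REQUIRED.filter((key) => !params[key]);
  if (missing.length > 0) {
    throw new Error("Missing mandatory parameter(s): " + missing.join(", "));
  }
  // URLSearchParams takes care of URL-encoding keys and values.
  return new URLSearchParams(params).toString();
}

// Placeholder values, as in the parameter list above.
const payload = buildPayload({ v: "1", tid: "UA-XXXXX-Y", cid: "555", t: "pageview" });
console.log(payload); // v=1&tid=UA-XXXXX-Y&cid=555&t=pageview
```

The resulting string is what would go in place of `payload_data` in the request body.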

Google provides many more parameters in case you want more detailed analytics, say for e-commerce; check the parameter guide here. Some interesting parameters that you can control are:

  • dr - Document referrer // = document.referrer
  • dl - Document location URL // = document.location.origin + document.location.pathname (optionally followed by document.location.search)
  • aip - Anonymize IP; if present, the IP address of the sender will be anonymized // = 1
  • npa - Disable advertising personalization; if set, the hit won’t be used for advertising personalization, for example when populating a remarketing audience for “past purchasers” // = 1
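
To illustrate how these parameters might be combined, here is a rough sketch of a pageview payload that leaves the query string out of `dl` and sets both `aip` and `npa`. The function name `buildPrivatePageview` is hypothetical, the IDs are placeholders, and trimming the referrer down to its origin is an assumption of this example rather than anything the protocol requires.

```javascript
// Build a pageview payload that deliberately leaves out the query string.
// "UA-XXXXX-Y" and "555" are placeholders, not real IDs.
function buildPrivatePageview(location, referrer) {
  const params = new URLSearchParams({
    v: "1",
    tid: "UA-XXXXX-Y",
    cid: "555",
    t: "pageview",
    dl: location.origin + location.pathname, // no location.search, so no tokens or emails
    aip: "1", // anonymize the sender's IP
    npa: "1", // disable advertising personalization
  });
  if (referrer) {
    // Assumption: the referring origin is enough, so path and query are dropped.
    params.set("dr", new URL(referrer).origin);
  }
  return params.toString();
}

// In a browser this could be called as:
// buildPrivatePageview(document.location, document.referrer)
```

Because the function only reads `origin` and `pathname`, anything sensitive in the query string (reset tokens, email addresses) never reaches the payload.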

The fewer the parameters, the less data is sent and the better the privacy.

Google also provides a tool, Hit Builder, to create and validate a hit.

If you find this a bit exhausting, then at least follow the best practices to make sure you are not sending your users’ personal data to Google: https://support.google.com/analytics/answer/6366371

--- Keep your users safe ---


Trishul Goel

Trishul Goel is a frontend developer who loves to talk about modern JavaScript, frontend architecture, and open source, promotes PWAs (#teamWeb), and is an expert in developing browser extensions.
@trishulgoel