Cookieless Google Analytics

Privacy-oriented, cookie-free tracking with Google Analytics using the Measurement Protocol.

Since the GDPR came into force, I’ve been looking for a cookie-free alternative to Google Analytics. I’m interested in which pages of my blog are viewed, where the visitors come from and, since the relaunch of my blog, which language they speak. I also want to be able to view the data in aggregated form, for example as a simple dashboard. And since my blog is just my private pleasure, I don’t feel like paying tons of money for analytics data every month. All this would be easy to implement with Google Analytics - if it weren’t for the GDPR.

The problem

If I want to use Google Analytics in a privacy-compliant way, I need the visitors’ consent. So I slap a banner right in front of every new user, with “Agree” and “Decline” as options, only to then hand their data to Google and make them trackable across numerous websites.

Another disadvantage: More and more people use extensions like Ghostery, which block Google Analytics directly on page load. And as browsers evolve (see Intelligent Tracking Prevention in WebKit), the data becomes less and less reliable.

There are a few alternatives to Google Analytics: direct GA competitors like Simple Analytics, plugins like Statify, or server log analysis tools such as Netlify Analytics. Most of those cost money, though, and limit me to the data collected with that particular tool. I’m a big fan of Google Data Studio, where I can analyze data from Search Console and other sources at the same time.

I have even experimented with Firestore. I stored the user data there with my own tracking pixel - but then I have to build EVERYTHING myself, from the simple bot filter, to the sorting of sources into channels, to the interface with Data Studio.

I want Google Analytics - only without cookies and GDPR compliant.

Option 1: Standard integration with parameters

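Cookie-free tracking is also possible with the standard integration of Google Analytics. For this, only a few fields have to be set manually:

ga('create', 'UA-XXXXX-Y', {
  'storage': 'none',    // do not store the client ID in a cookie
  'storeGac': false,    // boolean, not the string 'false' (which would be truthy)
  'anonymizeIp': true   // shorten the IP before it is transmitted
});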

This means that GA will no longer set cookies, and the user’s IP will be shortened within the GA library and only then transmitted. So from a privacy point of view this is a good start. Nevertheless, the full 40 kB of the Google Analytics library are still loaded - that can be done better. And what Google collects in the background with the script, I don’t even want to know :D

The tracking is limited to pageviews

To record a user’s complete browsing history, I would have to assign multiple pageviews to the same user. That works with the Measurement Protocol - but I would have to store a user ID in a cookie or in LocalStorage to keep it constant across multiple pageviews. This contradicts my basic idea of doing completely without cookies (and also without LocalStorage).

Instead, every pageview generates a new user ID and therefore starts a new session. This means: no page flow, no meaningful user count (one is displayed, but it equals the number of pageviews), and no meaningful entrances or exits. Every pageview is a new entrance and every page change is an exit.

Option 2: Measurement Protocol

Option 2 is the Measurement Protocol. This is nothing more than an interface for sending raw data as HTTP requests. If you want to play with the Measurement Protocol, you can use the Hit Builder.
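
To get a feel for what such a raw request looks like, here is a minimal sketch that only assembles a URL. It assumes the validation endpoint /debug/collect, which, as far as I know, returns a JSON parsing result instead of recording the hit. The function name buildDebugUrl, the client ID and the page path are made up for this example.

```javascript
// Sketch only: builds a URL for the Measurement Protocol validation endpoint.
// Assumption: /debug/collect checks a hit without recording it.
function buildDebugUrl(params) {
  // URLSearchParams takes care of the URL encoding of every value.
  const query = new URLSearchParams(params).toString();
  return "https://www.google-analytics.com/debug/collect?" + query;
}

const debugUrl = buildDebugUrl({
  v: "1",            // protocol version
  t: "pageview",     // hit type
  tid: "UA-XXXXX-Y", // placeholder property ID
  cid: "555",        // placeholder client ID
  dp: "/hello",      // hypothetical page path
});

console.log(debugUrl);
```

Pasting the printed URL into the browser should be enough to see whether the parameters are understood.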

The solution to the cookie-free Google Analytics script is a request in which I package a basic set of user data and pass it to the Measurement Protocol. I decided to use this data:

  1. user ID (required by Google for a valid request)
  2. user agent (for filtering bot traffic)
  3. page title
  4. URL
  5. referrer
  6. set language

So only the data I really need for a reasonable analysis is transferred. The user ID is generated completely randomly for every pageview. And all this in 17 lines of JavaScript:

if ("sendBeacon" in navigator) {
  navigator.sendBeacon(
    "https://www.google-analytics.com/collect",
    new URLSearchParams([
      ["v", "1"],
      ["t", "pageview"],
      ["tid", "UA-XXXXXXXX-X"],
      ["aip", "1"],
      ["cid", `${new Date().getTime()}${Math.random()}`],
      ["dt", document.title],
      ["dl", location.href],
      ["dr", document.referrer],
      ["ul", navigator.language],
      ["ua", navigator.userAgent],
    ]).toString()
  );
}

No IE

Internet Explorer does not support the Beacon API. You can work around this (for example with XMLHttpRequest instead of Beacon, since IE does not support the Fetch API either), but the Beacon variant is significantly more performant.
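
If you do need such a workaround, it could look roughly like the following sketch. It prefers the Beacon API and falls back to XMLHttpRequest, which older browsers like IE do support. The helper name sendHit and its return values are my own invention; navigator and the XMLHttpRequest constructor are passed in as arguments so that the logic can be tried outside of a browser.

```javascript
// Hypothetical helper: prefers the Beacon API, falls back to XMLHttpRequest.
// Written in ES5 syntax on purpose, so that old browsers can parse it.
function sendHit(url, body, nav, XHR) {
  if (nav && typeof nav.sendBeacon === "function") {
    // sendBeacon returns false if the browser could not queue the request.
    return nav.sendBeacon(url, body) ? "beacon" : "failed";
  }
  if (typeof XHR === "function") {
    var xhr = new XHR();
    xhr.open("POST", url, true); // true = asynchronous
    xhr.setRequestHeader("Content-Type", "text/plain;charset=UTF-8");
    xhr.send(body);
    return "xhr";
  }
  return "unsupported";
}

// In the browser the call would be:
// sendHit("https://www.google-analytics.com/collect", payload, navigator, window.XMLHttpRequest);
```

Keep in mind that the payload itself would then also have to be built without URLSearchParams and template literals, because IE knows neither of them.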

No target group analysis

I only transmit a basic set of information. Since I don’t give Google enough information to chain multiple pageviews together, no cross-page tracking is possible, and I also don’t get any information about the target group. That means: no demographic data, no location, no interests and no network info. It could be included, but that would contradict the basic idea.

More detailed tracking is only possible with great difficulty

Although events can also be transmitted via the Measurement Protocol, doing so would be very cumbersome. If you rely on detailed tracking, this is the wrong solution for you and you should rather use the Google Tag Manager with individualized triggers and events.
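
To give an idea of the effort: an event hit uses the hit type event plus the parameters ec (event category), ea (event action) and optionally el (event label). Here is a minimal sketch, in which the function name buildEventPayload and its arguments are made up for this example:

```javascript
// Sketch: builds the body of an event hit for the Measurement Protocol.
// "buildEventPayload" and its argument names are hypothetical.
function buildEventPayload(trackingId, category, action, label) {
  const params = new URLSearchParams([
    ["v", "1"],
    ["t", "event"],      // hit type "event" instead of "pageview"
    ["tid", trackingId],
    ["aip", "1"],
    ["cid", `${Date.now()}${Math.random()}`], // random ID per hit, as above
    ["ec", category],    // event category
    ["ea", action],      // event action
  ]);
  // The label is optional, so it is only added when one is given.
  if (label) {
    params.append("el", label);
  }
  return params.toString();
}

console.log(buildEventPayload("UA-XXXXXXXX-X", "outbound", "click", "https://example.com"));
```

Every event would need its own listener that builds such a payload and passes it to navigator.sendBeacon, which is exactly the part that gets cumbersome quickly.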

Option 3: Measurement Protocol with proxy script

To completely decouple the user from Google, I now put a script in between. Instead of sending the request directly to https://www.google-analytics.com/collect, it is sent to an endpoint on my server, which then expands the data and sends it on. This turns the previous script into an even shorter version:

if ("sendBeacon" in navigator) {
  const data = JSON.stringify({
    title: document.title,
    url: location.href,
    ref: document.referrer,
    language: navigator.language,
    useragent: navigator.userAgent,
  });
  navigator.sendBeacon("/.netlify/functions/pixel", data);
}

For further processing I use a Netlify serverless function - but it works just as well with PHP, the main thing is that somehow the data arrives at Google. Google only receives the data that I forward from the server. My Netlify function looks like this:

const fetch = require("node-fetch");

exports.handler = async (event) => {
  if (event.httpMethod !== "POST") {
    return {
      statusCode: 405,
      body: "Method Not Allowed"
    };
  }
  if (event.body == null) {
    return {
      statusCode: 400,
      body: "Bad Request",
    };
  }
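  let body;
  try {
    body = JSON.parse(event.body);
  } catch (err) {
    // Reject payloads that are not valid JSON.
    return { statusCode: 400, body: "Bad Request" };
  }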
  const endpoint = "https://www.google-analytics.com/collect?";
  const payload = new URLSearchParams([
    ["v", "1"],
    ["t", "pageview"],
    ["tid", "UA-XXXXXX-X"],
    ["aip", "1"],
    ["cid", `${new Date().getTime()}${Math.random()}`],
    ["dt", body.title],
    ["dl", body.url],
    ["dr", body.ref],
    ["ul", body.language],
    ["ua", body.useragent],
  ]);

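  try {
    const response = await fetch(endpoint + payload, {
      method: "POST",
      cache: "no-cache"
    });
    // Pass Google's status through, whether the request succeeded or not.
    return {
      statusCode: response.status,
      body: response.statusText,
    };
  } catch (err) {
    console.error("Error: " + err);
    // Always return a response, so the function never ends without one.
    return {
      statusCode: 502,
      body: "Bad Gateway",
    };
  }
};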
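
Since the user agent already arrives at the proxy, obvious bots could also be sorted out there before anything is forwarded to Google. This is not part of my function above, just a sketch of how it might look. The name isLikelyBot and the pattern list are assumptions and by no means complete.

```javascript
// Hypothetical bot check for the proxy; the pattern is illustrative only.
const BOT_PATTERN = /bot|crawler|spider|crawling|headless/i;

function isLikelyBot(userAgent) {
  // A missing or empty user agent is treated as a bot as well.
  if (!userAgent) {
    return true;
  }
  return BOT_PATTERN.test(userAgent);
}

// Inside the handler, right after parsing the body, this could be used as:
// if (isLikelyBot(body.useragent)) {
//   return { statusCode: 204, body: "" };
// }

console.log(isLikelyBot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"));
```

Such a filter only catches bots that identify themselves honestly, so it complements the bot filtering in Google Analytics rather than replacing it.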

Bonus: Data Studio template

With these two files, the data already flows cleanly into Google Analytics, where it can be filtered and analyzed as usual. But I’m a fan of Google Data Studio, so I made myself a suitable dashboard with the transferred data and the search queries from Search Console.

You can check out and copy the Data Studio dashboard here. Just swap the data sources and you have a good starting point for your own dashboard.