Tumbled Logic

Jun 6 2012

Cookies! Coooooookies!

Okay, so Cookiegeddon is upon us. Are you suitably enraptured?

Cookies! Good, then I shall begin.

It has come to my attention that some of you don’t entirely know what cookies are, how they work, or what the different sorts are. As a good Internet sort, I shall explain. Right here and now.

First, the common explanation of “cookies are files which websites download onto your computer…” is notable for being slightly accurate in certain respects some of the time, completely wrong in other respects, and decidedly unhelpful in terms of actually explaining what’s going on, all at the same time.

When you browse the Web, each page, image, video, script, stylesheet, piece of audio, or anything else, is retrieved by your computer making a request to the server, and the server sending a response back. For the most part, this is what’s known as “stateless” — if everything’s working properly and you’re dealing with a static piece of content, you should get exactly the same response back the hundredth or thousandth time as you do the first time.

Lots of things on the Web do need state, though. A shopping basket is the canonical example — as you navigate from page to page, the server needs some way to know that it’s still you. In order to achieve this, cookies were invented.

Each request your browser sends to a web server, and each response it receives back, is accompanied by some headers. These headers provide a means to fine-tune the request or response and provide additional information about it. For example, it’s not the fact that a filename ends in .png which tells a browser that it’s a PNG image, but an accompanying header named Content-Type.
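
To make the headers idea a bit more concrete, here is a minimal Python sketch that picks apart the top of a response. The response text is made up for illustration, and the helper name parse_response_head is my own invention rather than anything standard; the point is simply that the type comes from the Content-Type header and not from the filename.

```python
from email.parser import Parser


def parse_response_head(raw_head: str):
    """Split a raw HTTP response head into its status line and headers.

    Hypothetical helper for illustration only; a real client does far more.
    """
    status_line, _, header_block = raw_head.partition("\r\n")
    # HTTP headers use the same "Name: value" layout as email headers,
    # so the standard library's email parser can read them.
    headers = Parser().parsestr(header_block, headersonly=True)
    return status_line, headers


# A made-up response head. Note that nothing here mentions ".png".
raw_head = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/png\r\n"
    "Content-Length: 1024\r\n"
    "\r\n"
)

status_line, headers = parse_response_head(raw_head)
content_type = headers.get_content_type()
print(status_line)
print(content_type)
```

Header names are not case-sensitive, so looking up "content-type" or "Content-Type" gives the same answer.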

Two of these headers (one included in requests, one included in responses) are how cookies are implemented.

When a server wants to maintain some sort of persistent state which should be sent back to it in future requests (so that it knows you are still you, for example), it will send a header named Set-Cookie in the response. This is a polite request to your browser to, if possible, store the things included in the header and include them in future requests back to that server. How exactly these are stored is entirely up to the browser and pretty much completely irrelevant (some browsers do store them as individual files, but most don’t).
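
As a rough sketch of the server’s side of this, Python’s standard http.cookies module can build the header line for you. The cookie name basket_id, its value, and the helper build_set_cookie_header are all assumptions made up for this example.

```python
from http.cookies import SimpleCookie


def build_set_cookie_header(name: str, value: str) -> str:
    """Return a Set-Cookie header line asking the browser to store a cookie.

    Hypothetical helper: the name and value are whatever the server chooses.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    # Ask for the cookie to be sent back for every path on this server.
    cookie[name]["path"] = "/"
    return cookie.output()


header_line = build_set_cookie_header("basket_id", "abc123")
print(header_line)
```

Remember that this is only a request: the browser is free to ignore it.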

The precise format of the Set-Cookie header is fairly unimportant, but it’s worth knowing the basics. Each cookie has a name, and a value, and there’s usually a limit on the maximum number of characters the value can contain (although it’s quite large for most purposes). Accompanying the cookie can be some extra pieces of information about how the server would like the cookie to be stored and used. One of those pieces of information is an expiry time, which means that instead of the cookie being stored until you close your browser, it’s stored until the expiry time is reached. This allows things like “Remember my login for the next two weeks” checkboxes to work. Cookies with expiry times are called “persistent”, while cookies without expiry times are called “session cookies” (because they only last for as long as your browsing session).
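
The difference between the two sorts can be sketched in a few lines of Python. This is a deliberately simplified parser of my own (real browsers follow much fussier rules), and the cookie names are invented. It treats a cookie as persistent if it carries an Expires attribute or its close cousin Max-Age, and as a session cookie otherwise.

```python
def parse_set_cookie(header_value: str) -> dict:
    """Split a Set-Cookie value into its name, value and extra attributes.

    Simplified for illustration: it just splits on semicolons.
    """
    first, *rest = [part.strip() for part in header_value.split(";")]
    name, _, value = first.partition("=")
    attributes = {}
    for part in rest:
        if not part:
            continue
        key, _, attr_value = part.partition("=")
        # Attribute names are not case-sensitive, so normalise them.
        attributes[key.strip().lower()] = attr_value.strip()
    return {"name": name, "value": value, "attributes": attributes}


def is_persistent(cookie: dict) -> bool:
    """A cookie with an expiry time is persistent; without one, a session cookie."""
    attributes = cookie["attributes"]
    return "expires" in attributes or "max-age" in attributes


remember = parse_set_cookie(
    "remember_me=1; Expires=Wed, 20 Jun 2012 12:00:00 GMT; Path=/"
)
basket = parse_set_cookie("basket_id=abc123; Path=/")

print(remember["name"], "persistent" if is_persistent(remember) else "session")
print(basket["name"], "persistent" if is_persistent(basket) else "session")
```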

The next time your browser makes a request to the same server, any cookies the server previously set are included in the request headers in a Cookie header. When it receives the header, the server is able to tailor the page content and available actions based upon the information in the cookie, because it now has a way to tie your requests together into a seamless stateful session instead of a set of distinct requests.
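
Here is a small sketch of what the server might do with that Cookie header, using the shopping basket example from earlier. The in-memory baskets dictionary, the basket_id cookie name, and the function basket_for_request are all hypothetical; a real site would keep this state in a database or similar.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side state: basket contents keyed by cookie value.
baskets = {"abc123": ["tea", "biscuits"]}


def basket_for_request(request_headers: dict) -> list:
    """Look up the shopping basket belonging to whoever sent this request."""
    cookie = SimpleCookie()
    # The Cookie header looks like "name1=value1; name2=value2".
    cookie.load(request_headers.get("Cookie", ""))
    morsel = cookie.get("basket_id")
    if morsel is None:
        # No cookie was sent, so the server cannot tell who this is.
        return []
    return baskets.get(morsel.value, [])


returning_visitor = basket_for_request({"Cookie": "basket_id=abc123; theme=dark"})
new_visitor = basket_for_request({})
print(returning_visitor)
print(new_visitor)
```

Without the cookie, the two requests look identical to the server, which is exactly the statelessness described above.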

Practically none of the furore around cookies concerns session cookies — the cookies which allow you to be logged in and shop online and so on; rather, it’s all about persistent cookies, especially those with expiry times far in the future.

When you use “Private Browsing” (or “Incognito” in Chrome) mode in your browser, the expiry times in Set-Cookie headers are effectively stripped, making any cookies behave as though they’re session cookies even if the server intended them to be persistent. This means that although things work as normal while you’re browsing, as soon as you close the window and start again, you’re back to a clean slate.
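
To illustrate the effect (and only the effect), here is a sketch that drops the expiry information from a Set-Cookie value. I’m assuming nothing about how any particular browser actually implements its private mode, which may well work quite differently internally; strip_expiry is a made-up helper that models the end result described above.

```python
def strip_expiry(set_cookie_value: str) -> str:
    """Return the Set-Cookie value with any Expires or Max-Age attribute removed.

    A model of the observable effect of private browsing, not of how
    any real browser does it.
    """
    first, *attributes = [part.strip() for part in set_cookie_value.split(";")]
    kept = [first]  # the name=value pair itself is always kept
    for attribute in attributes:
        key = attribute.partition("=")[0].strip().lower()
        if attribute and key not in ("expires", "max-age"):
            kept.append(attribute)
    return "; ".join(kept)


original = "remember_me=1; Expires=Wed, 20 Jun 2012 12:00:00 GMT; Path=/"
private = strip_expiry(original)
print(private)
```

With no expiry left, the cookie lasts only until the window is closed.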

Of particular concern to those worried about privacy are third-party cookies — these are cookies set by servers other than the site you’re actually on.

Once upon a time, you might have heard the term “web-bugs”, which were transparent images embedded in web pages and which were used to track you as you browsed different sites. Nowadays, these have largely been replaced with pieces of third-party JavaScript and cookies. Many modern browsers refuse to honour overt third-party cookies (that is, cookies set directly by third-party web servers whose resources, such as images or scripts, are being embedded in the page you’re looking at), and Microsoft has announced that Internet Explorer 10 will do so, too.
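
A very crude sketch of how “third-party” might be decided: compare the site you’re on with the server a resource came from. The big assumption here is that the last two labels of a hostname identify the site, which is wrong for names like example.co.uk; real browsers use a far more careful notion of which domains belong together. The URLs and function names are invented for the example.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Crude guess at the 'site' a URL belongs to: its last two hostname labels.

    Assumption for illustration only; this breaks for domains such as
    example.co.uk.
    """
    hostname = urlsplit(url).hostname or ""
    return ".".join(hostname.split(".")[-2:])


def is_third_party(page_url: str, resource_url: str) -> bool:
    """True if the embedded resource comes from a different site than the page."""
    return site_of(page_url) != site_of(resource_url)


page = "http://www.shop.example/basket"
print(is_third_party(page, "http://static.shop.example/logo.png"))
print(is_third_party(page, "http://tracker.example.net/pixel.gif"))
```

A browser refusing third-party cookies would ignore a Set-Cookie header on the second response but accept one on the first.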

More recently, there has been a rise in third-party cookies masquerading as first-party cookies, often used for things like Google Analytics (which is fairly invaluable to modern web teams, incidentally), but also for targeted advertising and so forth. These are cookies which are set and read by a piece of JavaScript on the page you’re viewing which has come from a third-party source, but which, as far as the browser is concerned, is a normal, integral part of the site you’re on.

It’s this sort of thing which privacy advocates are concerned about and which has triggered “Cookiegeddon”, and not especially unfairly, although I must confess I’m not convinced that the way in which it’s been implemented has done much to really improve transparency.

Here’s a thought exercise for you, though:

You use a modern up-to-date browser with third-party cookies disabled. You visit a site you’ve never visited before and it has a Facebook “Like” button on it, along with a list of your Facebook friends who have also “Liked” the site. You haven’t actually clicked on anything yet.

How does the site know who you are on Facebook? Does the site even know at all? Does Facebook (and by extension, Facebook’s advertisers) know that you visited the site?

Photo © Peter Taylor; used under the terms of the CC-BY-2.0 license.

