The Privacy Risk Of Behavioral Profiling

Want to download the plugin right now for Chrome? Here it is.
For detailed information on the KeyboardPrivacy plugin, visit Paul Moore's blog post.

Historical background

During World War II, British intelligence operators listening to German Morse code operators built anonymous profiles of the individual people doing the signaling. The speed of the code, typing errors and similar traits were used to differentiate between operators.

Somewhere around 1998-1999 I got my first fingerprint reader. It was a big & clumsy device, using RS-232c for signaling & PS/2 for power; it was expensive & pretty much useless. In other words… a must-have for a security geek. “How can I defeat this?” I thought to myself. If you think like that, you're a “hacker”. If not, I'd still recommend that you read this blog post.

When I got into biometrics

Fast forward to February 15th, 2011.
I was at the official opening of NISlab, the Biometrics Research Lab at Gjøvik University College in Norway. I had already learned a lot about biometrics from Professors Patrick Bours and Christophe Busch, as well as many other brilliant people in this area. Not to be missed: the MythBusters (official page) and the movie Sneakers. “My voice is my passport”.

Out of the blue, I was invited by Professor Busch to participate on a panel about biometric authentication. When I asked him “why me?”, his response was basically

1) Your fight for passwords is pretty well-known.
2) You'd make an excellent devil's advocate here.

So I did. I also learned about “plain old biometrics”, which is “something you are”. That's stuff like your fingerprint, blood vein patterns in your palm (used in Japanese ATMs for many years now), retina patterns etc. Then you have behavioral biometrics, which is described as “something you do”, like speaking, singing, walking, moving etc… and for this blog post: HOW YOU TYPE ON A KEYBOARD. Bruce Schneier actually mentioned keystroke biometrics back in April 2007.

At another event that I attended in 2011, I learned that Professor Christophe Rosenberger at ENSICAEN in France had collected so much data that they could differentiate with more than 50% probability between men and women after rather few keystrokes. Now I fully believe there are tons of differences between men and women, but I didn't see that one coming. I had one of those serious jaw-dropping moments there, and I invited him to do a talk at Passwords12 in Oslo. You can watch his talk here: “Enhancing the password security with keystroke dynamics”. Christophe and his colleagues also have a free Windows application for testing “keystroke dynamics”, which is the official phrase used. Here's a screenshot of it, showing my characteristics after typing in “Password” 4 times. The spike you see is a very slight delay between 2 keystrokes:

GREYC Keystroke Dynamics demo
This technology is really interesting and can be used to increase security in a lot of use cases. In fact, I asked Professor Rosenberger back in 2012 if they could possibly set up a web page to leverage this technology, collecting how every user entered their username and password. As this technology is dependent upon thresholds – how sure are we it's actually you typing on the keyboard – we couldn't let it decisively allow or disallow access. But we could use it as a 2nd factor for authentication, while the user still only uses username + password to log into our demo site. Unfortunately, that site never came to fruition.
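To make the threshold idea concrete, here is a minimal sketch in Python. It is not the GREYC application or any vendor's algorithm; the function names, the sample timestamps and the 40 ms threshold are all assumptions made up for illustration. It builds a profile from the delays between key presses over a few typings of the same word, then scores a new attempt against that profile.

```python
from statistics import mean


def flight_times(key_down_ms):
    """Delays (ms) between consecutive key-down timestamps."""
    return [b - a for a, b in zip(key_down_ms, key_down_ms[1:])]


def enroll(samples):
    """Build a profile: the average delay at each position,
    taken over several typings of the same text."""
    rows = [flight_times(s) for s in samples]
    return [mean(col) for col in zip(*rows)]


def distance(profile, attempt_ms):
    """Mean absolute difference (ms) between the profile and one attempt."""
    delays = flight_times(attempt_ms)
    return mean(abs(p - d) for p, d in zip(profile, delays))


def looks_like_owner(profile, attempt_ms, threshold_ms=40.0):
    """Hypothetical decision rule: a low distance suggests the same typist.
    The threshold is an arbitrary illustrative value, not a tuned one."""
    return distance(profile, attempt_ms) <= threshold_ms


# Made-up key-down timestamps (ms) for a 4-character text.
enrollment = [[0, 100, 300, 400], [0, 120, 300, 420]]
profile = enroll(enrollment)

same_typist = [0, 110, 300, 410]
other_typist = [0, 50, 100, 150]

print(looks_like_owner(profile, same_typist))
print(looks_like_owner(profile, other_typist))
```

Because the answer is a score compared against a threshold rather than a yes/no match, it fits better as an extra signal next to the password than as the only gatekeeper. Real systems typically use more features (such as how long each key is held down) and more robust statistics than this.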

Increasing the pace

Fast forward to 2013.

Edward Snowden. Need I say more?

Surveillance is everywhere, it seems, and “the Dark Web” was added to our buzzword list.
Protip: If you're not paranoid enough, go read GCHQ (Kindle edition), by Richard Aldrich. Even if you are paranoid, it doesn't mean people aren't watching.

Fast forward to 2015.
I went to the wonderful city of Dublin to speak at the Smart Business Show on April 22-23, along with Runa Sandvik and several others. As part of the visit to Dublin, we also visited TOG, a Hackerspace in the centre of Dublin, to talk & meet up with lots of people. It's a cool place and there are lots of interesting people to talk to. As part of my talk, I also mentioned keystroke dynamics and the possibilities of such technology for good and bad. That's when I heard from someone in the audience that this kind of technology was already in use by several banks. I was intrigued by this, so I started Googling for companies developing similar technology: online demos, use cases, risk analysis, potential security weaknesses and privacy issues raised. The science and technology seem well documented; potential attacks and weaknesses in the science and technical implementations not so much; and privacy concerns even less.

Yes, that's correct.
Privacy issues concerning the use of “standard” biometrics, the “something you are” part such as fingerprints and palm blood vein patterns, have been discussed pretty well. Decisions have been made by regulatory government agencies, and laws have been written and updated to handle the use of such technology. But the newer, more advanced inherence factor (“something you do”) doesn't seem to have attracted that much attention… yet. In fact, the Norwegian Data Protection Authority said this was more of a discussion topic at the moment and not something they had specifically handled any requests or sketchy cases about. Let me give you a few examples of how this technology could be used for purposes that could seriously violate your anonymity and privacy online.

The Tor Browser profiling example

I created and trained a biometric profile of my keystroke dynamics using the Tor browser at a demo site. I then switched over to Google Chrome, without using the Tor network, and the demo site correctly identified me when logging in and completing a demo financial transaction. As soon as somebody manages to build a biometric profile of your keystrokes at a network/website where you are otherwise completely anonymous, that same profile can be used to identify you at other sites you're using, where identifiable information is available about you.

A library of research papers shows many ways to build profiles and de-anonymize browsers and traffic on the Tor network. Based on browsing the titles themselves, I can't see any work using keystroke dynamics to identify users across multiple networks, computers & browsers. Your favorite government agency – pick your country – could set up spoofed and fake pages on the dark web as well as in the real world, in order to identify people across them. For oppressive regimes, this is most certainly of high interest.

The Advertiser profiling example

Advertisers want to know who you are, what your interests are and where you go shopping etc. The list is endless and the better they can identify YOU across multiple platforms, systems, networks etc, “the better they can serve you targeted advertising”. If you cannot see any problems with that, you still have yet to receive ads for Viagra, young Russian brides, illegal drugs and gambling.

Today, tracking technology is used by advertisers to track you. “Unfortunately” for them, that tracking technology isn't really connected to YOU, but to your digital representation. With keystroke dynamics applied, advertisers could identify you without using any of the current tracking technologies – in a worst-case scenario.

This technology will reduce password security

For quite some time, we've seen sites and services getting hammered on Twitter because they have implemented a function to disallow pasting into the password field.

“For security reasons”, they said.

I'm one of those participating in the howling wolfpack, as this seemed totally unreasonable. Not only that, it was easy to circumvent, and one or more password managers have already implemented workarounds to deal with such sites. Wired published the story “Websites, please stop blocking password managers. It's 2015”. Trust me, we laughed. Although we haven't examined the sites named in the article specifically, there's no doubt that the collection of keystroke dynamics from users logging in could be the secret reason they won't tell us about.

Now imagine that every time you want to log in to a website using this technology, you would be required to enter any and all text manually. I'm pretty sure you would stop using a 75-character random password ASAP, and use the shortest possible password for your own convenience and sanity. So my claim will be this: this technology has the potential to kill almost every piece of password advice we've been trying to educate users about for a long time. I'm really not happy about that.

Time to break it – in the name of privacy

So let's go back to my initial thoughts on how to defeat anyone trying to build such a profile on me.
You see, in normal situations I wouldn't mind if this was used to flag high or low probability of it actually being me logging in to my online bank, insurance company, online electronics store or the local library. Heck no, I'd support the implementation of such additional and “invisible” authentication security! But I'm a bit paranoid you see, and I can most certainly see situations where I don't want to be profiled and identified like this. Doing forensics & investigations into the “dark web” is one. Intelligence gathering for my government – or any other government – is another; one doesn't want to leave fingerprints all over the place. Visiting the pages of your favorite attorney, or the leak submission website of a news organisation, employer or government organisation, are other situations. Basically there are LOTS of situations where you may not want to be easily identified. Ashley Madison, say no more.

My basic idea was aligned with some of the design principles of the Tor Browser. Build a piece of hardware that will collect all my keystrokes, cache them for a few milliseconds, and pass them on to the computer at a certain constant pace. All keystrokes appear equal, just like all Tor Browsers initially appear equal. I can't hide my typing errors, spelling mistakes, lousy grammar or writing style, but at least I can conceal HOW I actually type on a keyboard.
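The constant-pace idea can be sketched in a few lines of Python. This is only a simulation of the concept described above, not the KeyboardPrivacy plugin's actual implementation; the function name `release_schedule` and the 200 ms tick are assumptions picked for illustration. Each keystroke is held back until the next free tick of a fixed clock, so the computer only ever sees evenly spaced key events.

```python
import math


def release_schedule(arrival_ms, tick_ms=200):
    """Return the time (ms) at which each buffered keystroke is passed on.

    A key is never released before it was typed, and every release
    lands on a multiple of `tick_ms`, at least one tick after the
    previous release.
    """
    released = []
    last = -tick_ms  # lets the very first key go out at time 0
    for typed_at in arrival_ms:
        earliest = max(typed_at, last + tick_ms)
        slot = math.ceil(earliest / tick_ms) * tick_ms  # snap to the clock
        released.append(slot)
        last = slot
    return released


# Two made-up typists with clearly different rhythms...
typist_a = [0, 35, 180, 210, 600]
typist_b = [0, 150, 170, 500, 520]

# ...produce the same timing once their keys have been through the buffer.
print(release_schedule(typist_a))
print(release_schedule(typist_b))
```

The trade-off is latency: the slower the tick, the less of the original rhythm survives, but the longer each character takes to show up. If you type slower than the tick, long pauses still show through as skipped ticks, so this hides the fine-grained rhythm rather than every timing detail.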

Since I can't code anything at all, I discussed the idea with many people over the past years. On Wednesday, July 22 2015, I hit the jackpot: Paul Moore. He's a friend of mine in the UK, and he's already been a valuable help to me, setting up my own website properly in terms of security features. So I explained to him what I wanted to do, found one demo site for the initial testing of the technology, and said “defeat this, please.” Paul accepted the challenge, and we spent the next days testing, reading, developing, discussing and laughing hard when it worked.

So I will let Paul do the more technical explanations here.

 

Media mentions

How the way you type can shatter anonymity—even on Tor (arstechnica)

Biometric behavioural profiling: Fighting that password you simply can't change (The Register)

How to bust keyboard biometrics, and why you might want to (Tripwire blog, by Graham Cluley)

Chrome extension thwarts user profiling based on typing behavior (Help Net Security)

Snoops Can Silently Track You Just Looking At Your Typing, Clicking And Battery Status (Forbes)