The Privacy Risk Of Behavioral Profiling

Want to download the plugin right now for Chrome? Here it is.
For detailed information on the KeyboardPrivacy plugin, visit Paul Moore's blog post.

Historical background

During World War II, British intelligence operators listening to German Morse code operators built anonymous profiles of the individuals sending the signals. Sending speed, characteristic errors and similar traits were used to tell the operators apart.

Somewhere around 1998-1999, early fingerprint readers were big, clumsy devices that used RS-232C for signaling and PS/2 for power; they were expensive and pretty much useless. In other words… a must-have for a security geek. “How can this be defeated?” was the immediate thought. Those who think like that are “hackers”. For those who don't, this article is still worth reading.

Getting into biometrics

Fast forward to February 15th, 2011.

Per was at the official opening of NISlab, the Biometrics Research Lab at Gjøvik University College in Norway. He had already learned a lot about biometrics from Professors Patrick Bours and Christophe Busch, as well as many other brilliant people in this area. Not to be missed: the MythBusters (official page) and the movie Sneakers. “My voice is my passport”.

Out of the blue, Per was invited by Professor Busch to participate on a panel about biometric authentication. When he asked “why me?”, the response was basically

  1. Your fight for passwords is pretty well-known.
  2. You'd make an excellent devil's advocate here.

So he did. He also learned about “plain old biometrics”, which is “something you are”. That's stuff like fingerprints, blood vein patterns in the palm (used in Japanese ATMs for many years now), retina patterns etc. Then there are behavioral biometrics, which are described as “something you do”, like speaking, singing, walking, moving etc… and for this article: HOW YOU TYPE ON A KEYBOARD. Bruce Schneier actually mentioned keystroke biometrics back in April, 2007.

At another event that Per attended in 2011, he learned that Professor Christophe Rosenberger at ENSICAEN in France had collected so much data that they could differentiate with more than 50% probability between men and women after rather few keystrokes. While there are obviously many differences between men and women, this particular one was unexpected. It was a serious jaw-dropping moment, and Per invited Rosenberger to give a talk at Passwords12 in Oslo. The talk can be watched here: “Enhancing the password security with keystroke dynamics”. Christophe and his colleagues also offer a free Windows application for testing “keystroke dynamics”, which is the official term. Here's a screenshot of it, showing Per's characteristics after typing in “Password” 4 times. The spike visible is a very slight delay between 2 keystrokes:

GREYC Keystroke Dynamics demo

This technology is really interesting and can be used to increase security in a lot of use cases. In fact, Per asked Professor Rosenberger back in 2012 if they could possibly set up a web page to leverage this technology; to collect how every user entered their username and password. As this technology is dependent upon thresholds – how sure can one be that it's actually the right person typing on the keyboard – it couldn't decisively allow or disallow access. But it could serve as a 2nd factor for authentication, while the user still only uses username + password to log into the demo site. Unfortunately, that site never came to fruition.
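To make the threshold idea concrete, here is a minimal sketch in Python. It is not the GREYC algorithm or any vendor's method; the feature set (how long each key is held, and the gap between releasing one key and pressing the next), the averaged template, the mean-absolute-difference score and the 25 ms threshold are all assumptions chosen purely for illustration.

```python
from statistics import mean


def timing_features(events):
    """events: list of (key, press_ms, release_ms) tuples in typing order.

    Returns the hold time of every key followed by the gap between
    releasing one key and pressing the next.
    """
    hold = [release - press for _, press, release in events]
    gap = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return hold + gap


def enroll(samples):
    """Average several feature vectors of the same typed text into a template."""
    return [mean(column) for column in zip(*samples)]


def distance(template, features):
    """Mean absolute difference in milliseconds; lower means more similar."""
    return mean(abs(t - f) for t, f in zip(template, features))


def second_factor_ok(template, features, threshold_ms=25.0):
    """Threshold decision (threshold_ms is an illustrative value).

    The result is a signal to weigh alongside the password check,
    not a decisive allow/deny verdict.
    """
    return distance(template, features) <= threshold_ms
```

A site could enroll a template from the first few logins and then treat a large distance as a reason to ask for additional verification. Moving the threshold trades false rejections of the real user against false acceptances of somebody else, which is why a score like this cannot decisively allow or deny access on its own.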

Increasing the pace

Fast forward to 2013.

Edward Snowden. Need more be said?

Surveillance is everywhere, it seems, and “the Dark Web” was added to the buzzword list.
Protip: For those not paranoid enough, go read GCHQ (Kindle edition), by Richard Aldrich. Even if you are paranoid, it doesn't mean people aren't watching.

Fast forward to 2015.

Per went to the wonderful city of Dublin to speak at the Smart Business Show on April 22-23, along with Runa Sandvik and several others. As part of the visit to Dublin, they also visited TOG, a Hackerspace in the centre of Dublin, to talk & meet up with lots of people. It's a cool place and there are lots of interesting people to talk to. As part of his talk, Per also mentioned keystroke dynamics and the possibilities of such technology for good and bad. That's when someone in the audience mentioned that this kind of technology was already in use by several banks. Intrigued by this, Per started Googling for companies developing similar technology: online demos, use cases, risk analysis, potential security weaknesses and privacy issues raised. The science and technology seem well documented. Potential attacks and weaknesses in the science and its technical implementations are covered far less, and privacy concerns even less.

Yes, that's correct.

Privacy issues concerning the use of “standard” biometrics, the “something you are” part such as fingerprints and palm blood vein patterns, have been discussed pretty well. Decisions have been made by regulatory government agencies, and laws have been written and updated to handle the use of such technology. But the newer, more advanced inherence factor (“something you do”) doesn't seem to have attracted that much attention… yet. In fact, the Norwegian Data Protection Authority said this was more of a discussion topic at the moment and not something on which they had handled any specific requests or sketchy cases. The following are a few examples of how this technology could be used for purposes that could seriously violate anonymity and privacy online.

The Tor Browser profiling example

Per created and trained a biometric profile of his keystroke dynamics using the Tor browser at a demo site. He then switched over to Google Chrome without using the Tor network, and the demo site correctly identified him when logging in and completing a demo financial transaction. As soon as somebody manages to build a biometric profile of a user's keystrokes at a network/website where they are otherwise completely anonymous, that same profile can be used to identify them at other sites they're using, where identifiable information is available about them.
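The linking step described above can be sketched as a nearest-profile lookup. This is a hypothetical illustration, not how the demo site works: the profile names, the per-keystroke timing vectors (in milliseconds) and the 25 ms cut-off are all made up, and a real system would use far richer features.

```python
def identify(profiles, features, max_distance_ms=25.0):
    """Return the name of the closest stored profile, or None if none is close.

    profiles: dict mapping a profile name to a list of timing values (ms)
    features: timing values captured from a new typing sample, same length
    """
    best_name, best_distance = None, float("inf")
    for name, template in profiles.items():
        # Mean absolute difference between stored and observed timings.
        d = sum(abs(t - f) for t, f in zip(template, features)) / len(template)
        if d < best_distance:
            best_name, best_distance = name, d
    return best_name if best_distance <= max_distance_ms else None


# Profiles collected where the visitors believed they were anonymous.
profiles = {
    "tor-session-a": [85, 80, 85, 80, 70, 70, 70],
    "tor-session-b": [150, 120, 150, 130, 250, 280, 250],
}

# Timings captured later on a site where the visitor is logged in by name.
match = identify(profiles, [80, 80, 90, 80, 70, 70, 60])
```

Nothing in the lookup depends on the browser, the network or the machine: the timing pattern alone connects the two sessions.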

A library of research papers shows many ways to build profiles and de-anonymize browsers and traffic on the Tor network. Based on browsing the titles themselves, there doesn't appear to be any work on using keystroke dynamics to identify users across multiple networks, computers & browsers. Any government agency – pick a country – could set up spoofed and fake pages on the dark web as well as in the real world, in order to identify people across them. For oppressive regimes, this is most certainly of high interest.

The Advertiser profiling example

Advertisers want to know who their users are, what their interests are, where they go shopping and so on. The list is endless, and the better they can identify individuals across multiple platforms, systems and networks, “the better they can serve targeted advertising”. Those who cannot see any problem with that have yet to receive ads for Viagra, young Russian brides, illegal drugs and gambling.

Today, advertisers use tracking technology to follow users. “Unfortunately” for them, that tracking technology isn't really connected to a real person, but to their digital representation. With keystroke dynamics applied, advertisers could identify individuals without using any of the current tracking technologies – in a worst-case scenario.

This technology will reduce password security

For quite some time, sites and services have been getting hammered on Twitter because they have implemented a function to disallow pasting into the password field.

“For security reasons”, they said.

Per is one of those participating in the howling wolfpack, as this seemed totally unreasonable. Not only that, it was easy to circumvent, and one or more password managers have already implemented workarounds to deal with such sites. Wired published the story “Websites, please stop blocking password managers. It's 2015”. The security community laughed. Although the sites named in the article haven't been examined specifically, collecting keystroke dynamics from users logging in could well be the secret reason those sites won't disclose.

Now imagine that every time a user wants to log in to a website using this technology, they would be required to enter any and all text manually. Most people would likely stop using a 75-character random password pretty quickly, and use the shortest possible password for convenience and sanity. The claim here is this: this technology has the potential to kill almost every piece of password advice security professionals have been trying to educate users about for a long time. That's not a welcome development.

Time to break it – in the name of privacy

So the question returns to the initial thoughts on how to defeat anyone trying to build such a profile.
In normal situations, most people wouldn't mind if this was used to flag high or low probability of it actually being the right person logging in to an online bank, insurance company, online electronics store or the local library. The implementation of such additional and “invisible” authentication security is worth supporting! But there are certainly situations where people don't want to be profiled and identified like this. Doing forensics & investigations into the “dark web” is one. Intelligence gathering for a government – nobody wants to leave fingerprints all over the place. Visiting the pages of an attorney, a leak submission website for a news organisation, employer or government organisation are other situations. Basically there are LOTS of situations where people may not want to be easily identified. Ashley Madison, say no more.

Per's basic idea was aligned with some of the design principles of the Tor Browser. Build a piece of hardware that will collect all keystrokes, cache them for some brief milliseconds, and pass them on to the computer at a certain constant pace. All keystrokes appear equal, just like all Tor Browsers initially appear equal. It can't hide typing errors, spelling mistakes, lousy grammar or writing style, but at least it can conceal HOW someone actually types on a keyboard.
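The constant-pace idea can be simulated in a few lines. The sketch below is an illustration only, not the KeyboardPrivacy plugin's code or a hardware design: the function name, the 50 ms tick and the choice to model only key-press times are assumptions.

```python
import math


def repace(arrival_ms, tick_ms=50):
    """Map keystroke arrival times onto a fixed tick grid.

    Each keystroke is held until the next tick boundary, and no two
    keystrokes share a tick, so their order is preserved and every gap
    between emitted keystrokes is a whole multiple of tick_ms.
    """
    emitted = []
    next_free_tick = 0
    for t in arrival_ms:
        # Earliest tick at or after arrival that is not already taken.
        tick = max(math.ceil(t / tick_ms), next_free_tick)
        emitted.append(tick * tick_ms)
        next_free_tick = tick + 1
    return emitted


# Irregular typing, including a fast burst at the end.
typed = [0, 130, 145, 310, 318, 322]
released = repace(typed)
```

With these inputs the keystrokes leave the buffer at 0, 150, 200, 350, 400 and 450 ms. The fine-grained rhythm is gone, but pauses longer than one tick still show up as multi-tick gaps, so a complete solution would also need to deal with those and with how long each key is held down.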

Since Per can't code anything at all, he discussed the idea with many people over the past years. On Wednesday, July 22 2015, he hit the jackpot: Paul Moore. Paul is a friend of Per's in the UK, and he had already been a valuable help, setting up Per's website properly in terms of security features. So Per explained what he wanted to do, found one demo site for the initial testing of the technology, and said “defeat this, please.” Paul accepted the challenge, and they spent the next days testing, reading, developing, discussing and laughing hard when it worked.

Paul provides the more technical explanations here.

Media mentions