Beyond fingerprints: Discussing the challenges of behavioral biometrics security

MATEUSZ LEWCZAK

01 December, 2023

Behavioral biometrics is an increasingly common element of the security of our bank accounts. It considers the way we type on a keyboard, move a mouse, use audio/video equipment, and even how we hold our phone. As it turns out, each of us performs these activities in a different way, and although the differences are small, machine learning makes it possible to assess whether banking operations are being performed by the account owner. However, these systems are not yet reliable enough to be trusted with direct intervention in user actions, such as blocking a session or refusing to perform an operation. At this stage, the human factor is crucial in actual verification: a person makes the final decision on what to do if the score indicates potential fraud.

As a penetration tester, I have had the opportunity to test the security of several implementations of behavioral biometrics. In this article, I will present the threats that should be considered when pentesting such systems, based on my own experience.

Signature Replay Attack

The most common recurring problem is the low uniqueness of the collected data. Let’s use data from our audio/video devices as an example. It turns out that it is not only we humans who perform certain activities differently: graphics cards and sound cards also produce slightly different output depending on the model and the manufacturer.

Graphical Signature

A classic case of this behavior for graphics cards is a triangle filled with a gradient:

WebGL triangle

Source: https://browserleaks.com/webgl

This seemingly simple graphic is rendered with small differences, imperceptible to the naked eye, on different models of graphics cards. However, if we encode the rendered image as Base64 and then calculate its SHA256 hash, the resulting hash will differ between GPUs. This mechanism is commonly known as WebGL browser fingerprinting. If we also take into account the list of extensions supported by a given model and the name of the manufacturer, we get a fairly unique hardware identifier.
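The hashing step can be sketched in Python. This is a minimal illustration and not the code of any real product: the byte buffers stand in for the pixels a browser would read back from a WebGL canvas, and the function name `gpu_signature`, the vendor string, and the extension names are all hypothetical.

```python
import base64
import hashlib


def gpu_signature(pixels: bytes, vendor: str, extensions: list[str]) -> str:
    """Hash the rendered image together with vendor and extension data."""
    # Encode the raw image as Base64, as described in the text.
    image_b64 = base64.b64encode(pixels).decode("ascii")
    # Sort the extensions so their reporting order does not change the result.
    material = "|".join([image_b64, vendor, ",".join(sorted(extensions))])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Hypothetical stand-ins for two renderings of the same gradient triangle.
# The second differs from the first in one colour channel of a single pixel.
render_a = bytes([10, 20, 30, 255] * 64)
render_b = bytearray(render_a)
render_b[5] ^= 0x01
render_b = bytes(render_b)

sig_a = gpu_signature(render_a, "Vendor A", ["EXT_one", "EXT_two"])
sig_b = gpu_signature(render_b, "Vendor A", ["EXT_one", "EXT_two"])

print(sig_a)
print(sig_b)
print("signatures differ:", sig_a != sig_b)
```

A difference of one bit in one pixel is enough to produce an unrelated hash, which is why rendering differences invisible to the eye still separate GPU models.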

What distinguishes this triangle from other graphics? The use of a gradient means that the graphic does not have a uniform color, and the transitions between colors are calculated differently on different GPUs.

But this is where the security problems associated with implementing this mechanism begin. Usually, the graphic used by the biometrics system is the same for all users; often, it is this very triangle. Consequently, if an attacker obtained the victim’s graphics card signature, they could replay it to improve their score during an attack.

The fact that the generated graphic is predefined means that carrying out this attack only requires convincing the victim to visit a specified link. On the attacker’s page, there would be a script that, in the background, would collect all the necessary information to impersonate the victim’s GPU. Additionally, it is worth mentioning that the rule of rendering result uniqueness has its exceptions, such as all Nvidia RTX family cards giving the same result.

The solution to this problem is to use unique graphics for each user individually. This graphic should be generated based on certain values received directly from the server. An example of such an algorithm could be Perlin Noise, an algorithm that generates noise based on a seed.

Perlin Noise example

Source: https://pl.wikipedia.org/wiki/Szum_Perlina#/media/Plik:Perlin_noise_example.png

The seed is a 64-bit natural number, which should be unique for each user. In this situation, an attacker who does not know the seed will not be able to steal the signature.
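The idea of a per-user seed can be sketched as follows. To keep the example short, it uses a plain seeded random grid in place of Perlin Noise, and it hashes the grid directly, whereas a real client would hash the image the GPU rendered from it. The function names are hypothetical.

```python
import hashlib
import random
import secrets


def new_user_seed() -> int:
    """Server side: draw an unpredictable 64-bit seed for one user."""
    return secrets.randbits(64)


def noise_grid(seed: int, size: int = 8) -> list[list[float]]:
    """Client side: build a deterministic grid of noise values from the seed."""
    rng = random.Random(seed)  # the same seed always yields the same grid
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def challenge_digest(grid: list[list[float]]) -> str:
    """Hash the grid values (a stand-in for hashing the rendered image)."""
    data = ",".join(f"{value:.6f}" for row in grid for value in row)
    return hashlib.sha256(data.encode("ascii")).hexdigest()


seed_alice = new_user_seed()
seed_bob = new_user_seed()

print(challenge_digest(noise_grid(seed_alice)))
print(challenge_digest(noise_grid(seed_bob)))
```

Because the rendered input now depends on a value only the server and the legitimate user receive, a signature captured on the attacker's page with the well-known triangle no longer matches what the server expects for that user.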

Audio Signature

The same problem applies to audio signatures, which are meant to identify our sound systems. In this case, the system uses a low-level browser API to play a sound and record the amplitude values at the output. The principle of operation is identical: when the user opens the page, a signature is calculated from the amplitude values, and then a SHA256 hash of it is computed, which serves as a unique identifier for this model.

Again, as with GPUs, the values from which the hash is generated are predefined in the JavaScript code and are the same for every user, so the attacker can obtain them using social engineering methods. The solution is, again, to use unique input data for each user.

External Attacks

Hardware signatures are generated from measurements of how the electronic devices in our computers behave. As with any measurement, external factors play a huge role, and audio signature measurements are no exception. Sound in a computer is processed in parallel and combines many internal sound sources (programs): for example, we can listen to a podcast, play music, and play a game at the same time. All these sounds overlap, and yet we can still hear them (although they may be distorted); programs do not fight over who gets to “play” at any given moment.

Because of this feature of the sound system, it is possible to influence the measurement of the audio signature. Although the entire process takes place internally in the browser and other programs do not influence the generated sounds, using the same tools as for measurement, it is possible to create a malicious JavaScript script that will distort the measurement and significantly affect the “score”. The script must operate in the background when the user tries to go through verification, e.g., it could be open on a page in another tab.

The API most commonly used in this case is AudioContext, which is supported by most modern browsers. If an attacker prepared a script on their page that played sounds through AudioContext at very short intervals (e.g., every 100 ms), they could influence the measurement. For example, below is a line graph of the audio signature for my computer:

Original audio signature

Below, the two signatures are overlaid, the original and the one obtained while the malicious script was running:

Modified audio signature

The changes are significant enough to negatively affect the user’s assessment and lead to a false-positive identification of the user as malicious.

In extreme cases, with the right parameters (e.g., an interval of 10 ms), it is possible to block access to the audio functionality in the browser for other tabs. The victim will then be unable to pass the verification.
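The effect of an overlapping sound on a hashed signature can be illustrated offline. The sketch below is a simplified model and not a reproduction of the browser behavior: it assumes the interfering sound is simply added to the measured samples, and the sample rate, frequencies, and gain values are arbitrary choices made for the example.

```python
import hashlib
import math
import struct

SAMPLE_RATE = 44100  # samples per second (assumed for this example)


def tone(freq_hz: float, gain: float, n: int = 512) -> list[float]:
    """Generate n amplitude samples of a sine tone."""
    return [gain * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def mix(a: list[float], b: list[float]) -> list[float]:
    """Model overlapping sources as a sample-by-sample sum."""
    return [x + y for x, y in zip(a, b)]


def signature(samples: list[float]) -> str:
    """Pack the amplitudes as 32-bit floats and hash them with SHA256."""
    packed = struct.pack(f"<{len(samples)}f", *samples)
    return hashlib.sha256(packed).hexdigest()


measured = tone(440.0, gain=1.0)        # the verification page's own test sound
interference = tone(1000.0, gain=0.05)  # quiet sound from an attacker's tab
disturbed = mix(measured, interference)

max_shift = max(abs(x - y) for x, y in zip(measured, disturbed))
print("clean    :", signature(measured))
print("disturbed:", signature(disturbed))
print("largest amplitude shift:", round(max_shift, 4))
```

Even an interfering tone at 5% of the measured amplitude changes the hash completely, so a scheme that compares exact hashes has no tolerance for this kind of disturbance.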

Lack of Cryptographic Mechanisms

One of the key aspects of security in behavioral biometrics is the need to protect confidential data from unauthorized access. Encrypting data is a basic method for ensuring the confidentiality of information, but in the case of behavioral biometrics, the problem is often a lack of appropriate cryptographic mechanisms, such as hybrid encryption.

Hybrid encryption involves combining symmetric and asymmetric encryption to secure data. Symmetric encryption is efficient and fast but requires sharing a key between the sender and receiver. On the other hand, asymmetric encryption, such as RSA, relies on public and private keys, allowing for safe key exchange, but is significantly slower. Combining these two techniques in hybrid encryption enables effective and efficient data encryption.

Unfortunately, in behavioral biometrics such an approach is often lacking: biometric data, such as cursor movements, typing dynamics, or walking patterns, is often stored and transmitted unencrypted or protected only with simple symmetric encryption. This increases the risk of interception and unauthorized access to the data, which can lead to serious consequences, such as identity theft or privacy breaches.

Another important aspect of the lack of cryptographic mechanisms in behavioral biometrics is the absence of a proper hash that would ensure data integrity. Hash functions are commonly used in cryptography to generate fixed-length digests of input data. They are one-way, meaning it is computationally infeasible to recover the original data from the hash. Moreover, even a small change in the input data results in a completely different hash.
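Both properties are easy to demonstrate with Python's standard library. Note one deliberate substitution in this sketch: a bare hash can be recomputed by anyone who alters the data in transit, so the integrity check uses HMAC, a keyed construction built on the hash. The payload format, the function names, and the key are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical biometric sample: intervals between keystrokes in milliseconds.
payload = json.dumps({"key_intervals_ms": [112, 95, 130, 101]}).encode("utf-8")
# The same sample with a single value changed by one millisecond.
tampered = json.dumps({"key_intervals_ms": [112, 95, 130, 100]}).encode("utf-8")

# A small change in the input gives a completely different digest.
print(hashlib.sha256(payload).hexdigest())
print(hashlib.sha256(tampered).hexdigest())


def make_tag(key: bytes, data: bytes) -> str:
    """Compute an HMAC-SHA256 tag that only holders of the key can produce."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def is_authentic(key: bytes, data: bytes, tag: str) -> bool:
    """Verify the tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, data), tag)


session_key = b"hypothetical-32-byte-session-key"  # placeholder, not a real key
tag = make_tag(session_key, payload)

print("original accepted:", is_authentic(session_key, payload, tag))
print("tampered accepted:", is_authentic(session_key, tampered, tag))
```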

It is also worth considering a more commonly recommended option: using the Elliptic-curve Diffie-Hellman (ECDH) protocol to derive a shared key for symmetric encryption.
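To show how such a key agreement works, here is a sketch of ECDH in pure Python. The article does not name a curve; X25519 (RFC 7748) is assumed here, followed by HKDF-SHA256 (RFC 5869) to turn the shared secret into a symmetric key. This code is for illustration only: it is not constant-time, and a production system should rely on a vetted cryptographic library instead.

```python
import hashlib
import hmac
import secrets

P = 2**255 - 19                    # field prime of Curve25519
A24 = 121665                       # curve constant from RFC 7748
BASE = (9).to_bytes(32, "little")  # u-coordinate of the base point


def _clamp(k: bytes) -> int:
    """Clamp a 32-byte private key as required by RFC 7748."""
    b = bytearray(k)
    b[0] &= 248
    b[31] &= 127
    b[31] |= 64
    return int.from_bytes(b, "little")


def x25519(k: bytes, u: bytes) -> bytes:
    """Montgomery ladder: multiply the point with u-coordinate u by scalar k."""
    scalar = _clamp(k)
    x1 = int.from_bytes(u, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3 = 1, 0, x1, 1
    swap = 0
    for t in reversed(range(255)):
        bit = (scalar >> t) & 1
        swap ^= bit
        if swap:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


def hkdf_sha256(secret: bytes, info: bytes, length: int = 32) -> bytes:
    """Derive a key from the shared secret (HKDF with an all-zero salt)."""
    prk = hmac.new(b"\x00" * 32, secret, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Each side generates a private key and publishes only the public key.
client_priv, server_priv = secrets.token_bytes(32), secrets.token_bytes(32)
client_pub, server_pub = x25519(client_priv, BASE), x25519(server_priv, BASE)

# Both sides arrive at the same secret without ever transmitting it.
client_secret = x25519(client_priv, server_pub)
server_secret = x25519(server_priv, client_pub)
session_key = hkdf_sha256(client_secret, b"hypothetical-biometrics-session")

print("secrets match:", client_secret == server_secret)
```

The derived `session_key` would then be used with an authenticated symmetric cipher to protect the biometric data in transit.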

Final thoughts

In conclusion, while behavioral biometrics offers innovative ways to secure user data, it also presents unique challenges. Understanding these problems is the first step towards developing more robust security measures. By continuously evolving our approaches and employing trusted countermeasures like dynamic algorithms and randomized sound sequences, we can stay ahead in the cybersecurity game.
