What is a device fingerprint?
Community submission. Author: Anonymous
In computing, fingerprinting refers to the process of creating unique identifiers for all kinds of digital data. When these techniques are used to identify individual users or computing devices, we speak of browser or device fingerprinting.
Essentially, this process consists of collecting information about the user's smartphone, computer, or other devices. Sometimes this can be achieved even when the user hides their IP address or switches between browsers.
For many years, web analysts have collected information from devices and browsers in an effort to measure legitimate web traffic and detect potential scammers. Today, more advanced approaches make it possible to collect very specific parameters.
Earlier device fingerprinting methods focused primarily on computers. Modern techniques can identify devices of any type, and the growing mobile market is of particular interest.
How does it work?
Device identification is based on collecting several data sets, which are then combined and passed to a hash function. The output (hash value) can then be stored as a unique device (or user) ID.
The information collected is often stored in a database rather than on the device itself. While a single type of data is fairly anonymous on its own, a combination of multiple data sets can be unique.
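The combine-and-hash step can be sketched as follows. This is a minimal illustration, not a production scheme: the attribute names and values are made up, SHA-256 is an assumed choice of hash function, and the "database" is a plain dictionary.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Combine collected attributes and hash them into a single ID."""
    # Serialize with sorted keys so the same attributes always produce
    # the same input string, regardless of collection order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
device = {
    "os": "Windows 10",
    "language": "en-US",
    "timezone": "UTC+02:00",
    "screen": "1920x1080",
    "fonts": ["Arial", "Calibri", "Verdana"],
}

# Stand-in for a server-side database: device ID -> stored attributes.
database = {}
device_id = fingerprint(device)
database[device_id] = device
print(device_id)
```

Note that the ID lives on the server side, so clearing cookies or local storage on the device does not remove it.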
Device identification can be done using both active and passive methods. The goal of both approaches is to collect information about the device. Even if thousands of computers run the same operating system, it's likely that each computer will have a unique set of software, hardware, browsers, plugins, languages, time zones, and general settings.
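A small sketch of why combinations matter: with a made-up population of five devices, no single attribute singles anyone out, but the combination does. The data and the helper name `count_unique` are hypothetical.

```python
from collections import Counter

# Hypothetical population of devices; every value here is made up.
devices = [
    {"os": "Windows 10", "language": "en-US", "timezone": "UTC-05:00"},
    {"os": "Windows 10", "language": "en-US", "timezone": "UTC+01:00"},
    {"os": "Windows 10", "language": "de-DE", "timezone": "UTC+01:00"},
    {"os": "macOS", "language": "en-US", "timezone": "UTC-05:00"},
    {"os": "macOS", "language": "de-DE", "timezone": "UTC+01:00"},
]


def count_unique(devices, keys):
    """Count devices whose values for `keys` are shared with no other device."""
    combos = Counter(tuple(d[k] for k in keys) for d in devices)
    return sum(1 for n in combos.values() if n == 1)


print(count_unique(devices, ["os"]))                          # 0
print(count_unique(devices, ["os", "language"]))              # 3
print(count_unique(devices, ["os", "language", "timezone"]))  # 5
```

Each added attribute splits the population into smaller groups, until every device sits in a group of one.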
Passive identification
As the name suggests, passive methods collect information in less obvious ways, without querying the user (or remote system). The data is collected based on what each device sends anyway, so passive identification tends to yield less specific information (for example, the operating system).
For example, someone could build a passive data collection system that analyzes and stores information about the wireless drivers of network devices, such as Internet modems. Passive analysis can distinguish between types of drivers without requiring any action from the device. In other words, different devices use different methods to scan for available connections (access points). An attacker can use these differences to accurately identify the driver on each target device.
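As a simpler illustration of the passive idea, a web server can guess the operating system purely from the User-Agent header that a client already sends with every request. The sketch below assumes a small, hand-written rule list; real detection libraries are far more thorough, and the header can be spoofed.

```python
def guess_os(user_agent: str) -> str:
    """Rough operating system guess from a User-Agent header."""
    ua = user_agent.lower()
    # Order matters: Android user agents also contain "Linux", and
    # iOS user agents contain "like Mac OS X".
    rules = [
        ("android", "Android"),
        ("iphone", "iOS"),
        ("ipad", "iOS"),
        ("windows", "Windows"),
        ("mac os x", "macOS"),
        ("linux", "Linux"),
    ]
    for needle, name in rules:
        if needle in ua:
            return name
    return "unknown"


# Example header values, used here for illustration only.
samples = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15",
]
for ua in samples:
    print(guess_os(ua))
```

Nothing is requested from the client here; the server only reads what arrives with a normal request.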
Active identification
On the other hand, active identification is based on active network interactions that allow better identification of the client side. Some sites run JavaScript code to learn more information about the user's devices and browsers. The information received may include window size, fonts, plugins, language settings, time zone, and even hardware details.
A notable example of active client data collection is canvas fingerprinting, which is used on both computers and mobile devices. The method is usually based on a script interacting with the canvas (graphical element) of HTML5 web pages. The script tells the canvas element to draw a hidden image and then records information that the rendered image reveals, such as screen resolution, fonts, and background colors.
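The drawing itself happens in the browser, in JavaScript. The sketch below covers only a plausible final step, which is an assumption rather than something the article specifies: hashing the pixel bytes read back from the canvas. The two byte strings are made-up stand-ins for the same image rendered on two devices with slightly different font smoothing.

```python
import hashlib


def canvas_hash(pixels: bytes) -> str:
    """Hash the raw pixel bytes read back from a rendered canvas."""
    return hashlib.sha256(pixels).hexdigest()


# Made-up RGBA bytes for two pixels. On device B one colour channel
# differs by a single step, as a tiny rendering difference might cause.
device_a_pixels = bytes([255, 255, 255, 255, 120, 120, 120, 255])
device_b_pixels = bytes([255, 255, 255, 255, 121, 120, 120, 255])

print(canvas_hash(device_a_pixels))
print(canvas_hash(device_b_pixels))
```

Because a hash function changes its output completely for any change in input, even a one-step difference in a single pixel yields a different identifier.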
What is it used for?
Device fingerprinting methods provide advertising systems with a way to track consumer behavior across multiple browsers. They also allow banks to determine whether the request is coming from a trusted device or from a system that has previously been observed to engage in fraudulent activity.
In addition, device fingerprinting helps sites prevent a single user from registering multiple accounts, and helps search engines flag devices that exhibit suspicious activity.
Device fingerprinting can be useful in detecting and preventing identity theft and credit card fraud. However, these techniques threaten user privacy, and, depending on the implementation, the data collection may go undetected, especially when passive methods are used.
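A minimal sketch of how a site might apply these checks, assuming it already has a device ID for each request. The class name, the threshold, and the three outcomes are all hypothetical.

```python
from collections import defaultdict

MAX_ACCOUNTS_PER_DEVICE = 2  # hypothetical policy threshold


class RegistrationGuard:
    """Tracks which accounts were registered from which device ID."""

    def __init__(self, known_fraud_ids=()):
        self.accounts_by_device = defaultdict(set)
        self.known_fraud_ids = set(known_fraud_ids)

    def check(self, device_id: str, account: str) -> str:
        # Devices previously seen in fraudulent activity are rejected outright.
        if device_id in self.known_fraud_ids:
            return "block"
        accounts = self.accounts_by_device[device_id]
        accounts.add(account)
        # Too many distinct accounts from one device is suspicious.
        if len(accounts) > MAX_ACCOUNTS_PER_DEVICE:
            return "review"
        return "allow"


guard = RegistrationGuard(known_fraud_ids={"device-bad"})
print(guard.check("device-1", "alice"))    # allow
print(guard.check("device-1", "bob"))      # allow
print(guard.check("device-1", "carol"))    # review
print(guard.check("device-bad", "dave"))   # block
```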
What are the disadvantages?
Active identification depends on scripting languages such as JavaScript being available on the client. Users, including those on mobile devices, may run privacy software or plugins that limit what tracking scripts can do, making identification more difficult. Such software includes browser extensions that block trackers and ads.
In some situations, however, privacy-conscious users may be easier to identify, for example when they use unpopular software and plugins or special settings that, ironically, make them stand out even more.
The effectiveness of fingerprinting can be limited by high variability on the client side. Users who constantly change settings or use multiple virtual operating systems cause inaccuracies in the data collection process.
Using different browsers can also lead to inaccuracies in the information collection process, but modern cross-browser fingerprinting techniques can be used to work around this.
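One simplified way to picture a cross-browser approach is to hash only the attributes that are expected to stay the same when the user switches browsers. Which attributes qualify is an assumption made for this sketch (`STABLE_KEYS` and all values are hypothetical); real techniques rely on more elaborate signals.

```python
import hashlib
import json

# Assumption: these attributes do not depend on the browser in use.
STABLE_KEYS = ("os", "timezone", "screen", "cpu_cores")


def fingerprint(attributes: dict, keys=None) -> str:
    """Hash all attributes, or only the subset named in `keys`."""
    selected = attributes if keys is None else {k: attributes[k] for k in keys}
    canonical = json.dumps(selected, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same (made-up) machine seen through two different browsers.
machine = {"os": "Windows 10", "timezone": "UTC+02:00",
           "screen": "1920x1080", "cpu_cores": 8}
browser_a = {**machine, "user_agent": "BrowserA/1.0", "plugins": ["pdf-viewer"]}
browser_b = {**machine, "user_agent": "BrowserB/7.2", "plugins": []}

# Hashing everything gives two different IDs for the same machine.
print(fingerprint(browser_a) == fingerprint(browser_b))                            # False
# Hashing only the stable subset gives the same ID in both browsers.
print(fingerprint(browser_a, STABLE_KEYS) == fingerprint(browser_b, STABLE_KEYS))  # True
```

The trade-off is that fewer attributes go into the hash, so more devices may end up sharing the same ID.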
Conclusion
There are several ways to implement and use device fingerprinting. How effectively data can be collected and a single device identified varies significantly from one method to another.
By itself or in combination with other methods, device fingerprinting can be an effective tool for tracking and identifying users. This powerful technology can be used for both legitimate and malicious purposes, so a basic understanding of how it works is a good starting point.
