Tracking users – privacy in e-commerce spaces

Why are Internet users tracked?

One of the biggest risks of tracking is global surveillance. This surveillance can be performed by governments, for security or political reasons, or by companies, for commercial reasons. As detailed in a New York Times article, marketers have long understood the benefits of learning and influencing consumers’ habits. Detecting major changes in behavior increases the odds of getting customers to switch to a different product. This monitoring was previously performed via various types of loyalty cards. Internet tracking is a more powerful tool since it allows marketers to adapt their strategies almost instantaneously. Marketers use prediction models that can tell from a change in a user’s behavior whether she is pregnant or getting divorced. Although tracking has huge economic benefits, it raises serious privacy concerns.

There are a variety of motivations behind online tracking. First-party tracking is often performed by website owners to personalize the user experience across sessions, for example by maintaining the user’s shopping cart and preferences. First-party tracking is also used for fraud detection and law enforcement. In fact, several regulations require websites to log users’ activities for the purposes of fraud prevention, anti-money laundering, national security and law enforcement. Two major reasons for third-party tracking are user profiling, which is used in targeted advertising, and measurement/analytics. These two aspects are detailed in the rest of this section.

User profiling

The intention of behavioral targeting is to track users over time and build profiles of their interests, their characteristics (such as age and gender) and their shopping activities. Online advertising systems use behavioral targeting to display advertisements that reflect users’ interests. To a first approximation, these systems are composed of three main entities: the advertiser, the publisher and the ad network. The advertiser is the entity, such as a car manufacturer or a hotel, that wants to advertise a product or service. The publisher is the entity, such as an online newspaper company, that owns one or several websites and is willing to display advertisements in return for payment. Finally, the ad network is the entity that collects advertisements from the advertisers and places them on publisher sites. If a user clicks on an advertisement (in the ‘cost-per-click’ model), the ad network collects payment from the corresponding advertiser and pays out a part of it to the publisher. There is, therefore, a strong incentive for the ad network to generate accurate and complete profiles in order to maximize the ‘click-through rate’ and, consequently, revenues.
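The money flow in the cost-per-click model described above can be sketched in a few lines of Python. This is only an illustration: the `Campaign` class, the `settle` function, the price per click and the 50% publisher share are all assumed values, not figures from any real ad network. The sketch shows why a higher click-through rate, which better profiles are meant to produce, directly increases what the ad network keeps.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Hypothetical campaign terms; names and numbers are illustrative only."""
    cost_per_click: float   # amount the advertiser pays the ad network per click
    publisher_share: float  # fraction of that amount passed on to the publisher


def settle(campaign: Campaign, impressions: int, clicks: int) -> dict:
    """Split click revenue between the ad network and the publisher."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    advertiser_pays = clicks * campaign.cost_per_click
    publisher_gets = advertiser_pays * campaign.publisher_share
    return {
        "click_through_rate": clicks / impressions,
        "advertiser_pays": advertiser_pays,
        "publisher_gets": publisher_gets,
        "network_keeps": advertiser_pays - publisher_gets,
    }


# Same number of ad impressions, but targeting is assumed to raise the clicks.
campaign = Campaign(cost_per_click=0.50, publisher_share=0.5)
untargeted = settle(campaign, impressions=10_000, clicks=10)
targeted = settle(campaign, impressions=10_000, clicks=40)
print(untargeted["network_keeps"], targeted["network_keeps"])  # 2.5 10.0
```

With identical impressions, the assumed fourfold increase in clicks quadruples the revenue of both the ad network and the publisher, which is the incentive the paragraph above refers to.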


E-commerce sites, in the first-party context, also use behavioral tracking and profiling to recommend products that are likely to be of interest to users. For example, Amazon recommends products to online users based on individuals’ past behaviors (personalized recommendation), on past behaviors of similar users (social recommendation) and, of course, on searched items (item recommendation). With the emergence of smartphones, many applications record users’ locations and movements. Location information enables many useful services, such as driving directions, locating friends or recommendations for nearby restaurants. However, this information is also collected by marketers to improve profiling. While the benefits provided by these systems are indisputable, they unfortunately pose a considerable threat to location privacy, as illustrated by the recent iPhone and Android controversies.
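As an illustration of the ‘social recommendation’ idea mentioned above (recommending items based on the past behavior of similar users), here is a minimal sketch. It is not Amazon’s algorithm: the purchase data, the `jaccard` similarity measure and the `recommend_from_similar_users` function are assumptions chosen to keep the example small.

```python
from collections import Counter

# Hypothetical purchase histories: user -> set of purchased items.
purchases = {
    "alice": {"tent", "stove", "lantern"},
    "bob": {"tent", "stove", "sleeping bag"},
    "carol": {"novel", "lantern"},
}


def jaccard(a: set, b: set) -> float:
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_from_similar_users(user: str, top_n: int = 3) -> list:
    """Rank items the user lacks by the similarity of the users who own them."""
    own = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        weight = jaccard(own, items)
        for item in items - own:
            scores[item] += weight
    return [item for item, score in scores.most_common(top_n) if score > 0]


print(recommend_from_similar_users("alice"))  # ['sleeping bag', 'novel']
```

Even this toy version makes the privacy trade-off visible: the recommendation only works because every user’s purchase history is stored and compared with everyone else’s.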

Web analytics/measurement

Tracking is also used for various types of aggregate measurements, such as website traffic statistics or the effective exposure of advertising. Although it is technically feasible for first parties to carry out these measurements on their own, many websites use third-party web analytics tools, such as Google Analytics, to obtain aggregate traffic statistics such as the most visited pages or visitors’ countries. These tools typically track users to collect their browsing activities and periodically compile them into aggregated statistics. These statistics are often used by websites to measure the effectiveness of ad campaigns or to optimize their content.

Companies often advance the ‘nothing-to-hide’ argument to justify their activities – why would a user be concerned about his privacy if he has nothing to hide? Solove refutes this argument by pointing out that it stems from a narrow conception of privacy as secrecy or concealment of information. Privacy dangers do not necessarily manifest as visceral injuries or damage. Information-gathering programs are problematic even if no information that people want to hide is uncovered. Collected information can be incorrect or distorted and can lead to incorrect decisions, which in turn cause frustration. The potential harms are error, abuse, and lack of transparency and accountability.
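To make the aggregation step performed by web analytics tools more concrete, the following sketch compiles a handful of raw tracking events into the kind of statistics mentioned earlier in this section (most visited pages, visitors’ countries). The event format, the visitor identifiers and the `aggregate` function are hypothetical and do not reflect how any particular analytics product works.

```python
from collections import Counter

# Hypothetical raw tracking events: (visitor_id, country, page).
events = [
    ("v1", "FR", "/home"),
    ("v1", "FR", "/cart"),
    ("v2", "DE", "/home"),
    ("v3", "FR", "/home"),
    ("v2", "DE", "/home"),
]


def aggregate(events):
    """Compile per-visitor events into aggregate traffic statistics."""
    # Every event counts as one page view.
    page_views = Counter(page for _, _, page in events)
    # Deduplicate (visitor, country) pairs so each visitor is counted once.
    unique_visitors = {(visitor, country) for visitor, country, _ in events}
    visitors_by_country = Counter(country for _, country in unique_visitors)
    return page_views, visitors_by_country


page_views, visitors_by_country = aggregate(events)
print(page_views.most_common(1))  # [('/home', 4)]
print(visitors_by_country["FR"], visitors_by_country["DE"])  # 2 1
```

Note that counting unique visitors requires a per-visitor identifier in the raw data, so the individual browsing records exist even though only the aggregates are reported.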


The risks of personalization

As described previously, profiling is often used by service providers to personalize their content for users. A news site may display only news matching users’ previous reading patterns. A merchant site may propose only products that match the user’s inferred interests, needs or preferences. Search engines may refine results based on a user’s previous queries and clicks. And of course, online advertisements are often behaviorally targeted. This personalization is a cause for concern. With service personalization, users can become trapped in a ‘filter bubble’ and are not exposed to information that could broaden their worldview. In authoritarian states, personalization could also be used to increase censorship by selecting which news to show to specific users. In addition, content and service personalization can be a source of information leakage, as it is often possible to retrieve a user’s interests from the content and services provided to him using various inference techniques. For example, it was shown that a user’s Google history can be partially reconstructed from his query recommendations and that a user’s interest profile can be inferred from his targeted ads. In another example, a man discovered that his teenage daughter was pregnant because he received coupons for baby food from the US superstore Target. The teenager had been profiled as pregnant on the basis of her purchase behavior.

Source: Privacy Online