Data mining is being used to target individuals by both advertisers and organised crime. Andy Smith FBCS CITP examines the role played by data aggregation.

One of the key issues online, and in the modern world generally, is the ability of organisations, including sales and marketing departments, advertising companies and serious organised crime, to use data aggregation and data mining.

Putting the pieces together

Aggregation is the compilation of individual items of data, databases or datasets to form large datasets, e.g. bringing together social media accounts, internet searches, shopping preferences, emails and even dark web data for millions of people. Data mining is taking a large dataset and using tools to search for particular words or phrases, then refining the search with combined search terms to find individual records of interest.
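The two steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `aggregate` and `mine`, the choice of an email address as the joining key, and all of the sample records are invented for the example and do not describe any real tool or dataset.

```python
from collections import defaultdict


def aggregate(*datasets, key="email"):
    """Merge records from several datasets into one profile per key value."""
    profiles = defaultdict(dict)
    for dataset in datasets:
        for record in dataset:
            # Records sharing the same key value are folded into one profile.
            profiles[record[key]].update(record)
    return dict(profiles)


def mine(profiles, *terms):
    """Return the profiles whose combined field values contain every term."""
    hits = []
    for profile in profiles.values():
        text = " ".join(str(value) for value in profile.values()).lower()
        if all(term.lower() in text for term in terms):
            hits.append(profile)
    return hits


# Invented sample data standing in for two separate sources.
social = [
    {"email": "a@example.com", "name": "Alex Doe", "city": "Leeds"},
    {"email": "b@example.com", "name": "Bo Roe", "city": "York"},
]
shopping = [
    {"email": "a@example.com", "last_purchase": "hiking boots"},
    {"email": "b@example.com", "last_purchase": "garden shed"},
]

profiles = aggregate(social, shopping)
# Combining search terms narrows the result to individual records.
matches = mine(profiles, "leeds", "hiking")
```

Neither source on its own links a city to a purchase; the link only appears once the records are combined, which is why aggregation matters as much as the search itself.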

Online marketing can be aggressive and unwanted

We are all victims of spam, adware and other unwelcome methods of trying to separate us from our money. Most online targeted marketing is far better than blanket marketing and can actually be very useful. However, to achieve this, advertising organisations need to track and hold a significant amount of information about users and their preferences.

Some of this can be personal, such as age and location. When companies are tracking spending profiles and the types of products people buy, this can become very sensitive. Basically, marketeers are gathering (aggregating) huge amounts of information and then mining this for marketing purposes. However, in the wrong hands, this data can also be misused.

The limitations of GDPR

In Europe, there are laws to protect the public from aggressive marketing and invasion of privacy, and to ensure data protection, including the General Data Protection Regulation (GDPR) and Article 8 of the European Convention on Human Rights (ECHR). These laws cover the type of data that is held and ensure it is properly protected and, to a certain extent, not misused. But this only applies to reputable companies and those in jurisdictions covered by such laws.

The same capability is available to organised crime, which is a wholly different and much more serious problem, since criminals do not abide by such laws. Between law-abiding professional organisations and organised crime, there is a spectrum of actors, ranging from slightly intrusive targeted marketers to malicious code authors who install adware on devices to replace legitimate adverts with nefarious ones. This is one major area of data mining and the one most people think of, but it is not the only one.

Organised crime, terrorist organisations, investigative journalists and private investigators can all use data sources on the internet and data mining tools to find and target people and groups. It is amazing to consider what is now achievable; even small snippets of information can be used as keys in different databases that yield further information, which in turn can be used as search keys in other databases. Given large aggregated datasets and the right search terms, it’s possible to find a lot of information about people, including information that would otherwise be considered confidential, from medical to marital.
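The chaining of search keys described above can be sketched as a simple loop. This is a hypothetical illustration only: the function `pivot`, the three dictionaries standing in for databases, and every value in them are invented, and real data sources would be far larger and messier.

```python
def pivot(seed, databases, max_rounds=5):
    """Use every known value as a key in every database until nothing new appears."""
    known = {seed}
    for _ in range(max_rounds):
        found = set()
        for database in databases:
            for value in known:
                # Each hit yields further values that become keys next round.
                found.update(database.get(value, []))
        if found <= known:
            break  # no new information was uncovered in this round
        known |= found
    return known


# Invented stand-ins for three unrelated data sources.
forum_posts = {"sunny_walker": ["alex@example.com"]}
mailing_list = {"alex@example.com": ["Alex Doe", "LS1 postcode"]}
address_list = {"Alex Doe": ["12 Example Street"]}

known = pivot("sunny_walker", [forum_posts, mailing_list, address_list])
```

Starting from nothing more than a forum username, the loop ends up holding an email address, a name, a postcode area and a street address, none of which any single source would have revealed on its own.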

Anonymity is becoming even harder to preserve

In oppressive regimes, these tools can be used to suppress human rights. For example, a regime could find a posting on a newsgroup that criticises it, use the IP address to identify the service provider, then use credit card details to identify the individual poster and track them down, even though they thought their post was anonymous. The trouble is, not all organisations do a good job of protecting their data. Worse still, individuals are very bad at protecting their own information.

One aspect of preventing data mining is helping the naïve to protect themselves: parents, extended family, friends and children often do not understand the implications of giving out sensitive personal information, such as the kind posted on social media. Something like posting a holiday photo to Facebook may be all that is needed to indicate to a criminal that the person is not at home. Then, the metadata from a picture of that person’s recent BBQ can be used to find out the exact location of the currently empty house…
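To show how little work the location step involves: cameras and phones typically record GPS position in a photo's EXIF metadata as degrees, minutes and seconds plus a hemisphere letter. The sketch below converts such values into the decimal coordinates a mapping site accepts. The function name `dms_to_decimal` and the sample values are assumptions made for illustration, and reading the fields out of an image file is deliberately left out.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal


# Invented sample values in the shape EXIF GPS fields usually take.
latitude = dms_to_decimal(51, 30, 0, "N")
longitude = dms_to_decimal(0, 7, 30, "W")
```

Because the conversion is this simple, removing location metadata from photos before posting them is a worthwhile habit.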

One of the most important aspects is making people aware that they are sharing their lives, not just with friends and family, but also with anyone who has a good search engine, from marketeers to organised criminals. This is especially true when social media sites change their terms and conditions and open up privacy settings. Suddenly, the site might now ‘own all photographs posted on xxxx’. Some may also remove privacy settings, exposing users’ information until they notice and update their privacy controls again.

How secure are the security settings?

Millions of people still do not realise that their information is public. Even simple things like putting too much detail in a CV uploaded to job sites can be a bad thing. It does not take much for a criminal to open an account as a potential employer and browse CVs, which can include full names, addresses, contact details and so on; or for your boss to find out you are looking for a new job. It is vital that people think carefully about what information they are putting on the internet and why.

A short CV with an email address and a note that a full version is available on request is all that is needed on job sites. Searching for medical websites and other sensitive information should be done with caution, including ensuring the browser is set to ‘do not track’. If possible, use a different web browser for sensitive sites, one that does not share cookies or cache with the main browser, and preferably uses a VPN via another country.

The need to protect the innocent

Practical steps can also be taken when looking after computers for children and family members (who may be adults new to the internet). It is best to ensure that their computer has a full internet security package, which includes parental controls. This can be configured for them, preventing personal information from being exposed and blocking access to blacklisted websites. Though this will not solve every issue, it will certainly help to protect the less aware.

As storage gets cheaper, processing power increases exponentially and the internet becomes more pervasive in everyone’s lives, the data mining issue will just get worse. Criminals are going to follow the money online; they are going to target people for identity theft, blackmail and worse. Private investigators and investigative journalists are going to use these massive data sources to their benefit, and marketing will become even more accurate and targeted - even down to exact current location.

Easy-to-remember passwords, not ‘password’

However, this does not have to be as bad as it sounds: fear, uncertainty and doubt can be just as bad, as they prevent people from making full use of the advantages offered by the internet. Simple tricks can help, such as never using the same password across multiple sites and instead using a formula that’s easy to remember, like ‘company+constant’, e.g. ‘eBay!771492’ for eBay and ‘Amazon!771492’ for Amazon.
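The ‘company+constant’ formula can be written down as a one-line rule. This is a minimal sketch of the scheme as described in the text, not a password manager; the function name `site_password` and the `separator` parameter are invented for the example.

```python
def site_password(site, constant, separator="!"):
    """Build a per-site password from the site name and a memorised constant."""
    return f"{site}{separator}{constant}"


# The examples from the text, using the same constant for both sites.
ebay_password = site_password("eBay", "771492")
amazon_password = site_password("Amazon", "771492")
```

Only the constant needs to be memorised, and each site still ends up with a different password.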

Finally, accurate data shouldn’t be given to websites unless it is for official purposes, such as for government or banks. It’s amazing how many sites will ask for a date of birth and mother’s maiden name, so even changing a DOB by a couple of days on these sites will stop the real one being misused. If personal data is protected online as it would be in the real world, and personal data exposure and storage on third-party databases is minimised, the internet can be enjoyed with low risk.