Much is already known and experienced about data security. This blog slices and dices the data security paradigm differently, with the aim of providing a fresh perspective. The hope is that different perspectives lead to different viewpoints, thoughts, realizations and ideas. If this blog sparks a new thought or idea, and leads to an ‘aha’ moment, then its purpose is well served.
Is Your Data Malignant or Benign? Are They Just Labels?
We commonly classify your data into Personal data and Profile data. Personal data is the data you have learnt to safeguard, since it can serve as a legal means of identifying you in many systems. Profile data is the behavioral data about you, which contains a lot of interesting information: what you browse, where you pause, which services you use, what you buy, the places you visit, your lifestyle, your health, who you connect with, and so on. The list keeps growing, and keeps getting more interesting. The interesting part is that you safeguard personal data because you possess and control it. Profile data is all about you, but is never actually with you. I am sure you would be interested to know what your profile data looks like, and what analysis and conclusions are drawn from it. Having roughly clarified what I mean by Personal and Profile data, I will now try to define what is Malignant and what is Benign in your data. I will keep the definitions simple:
| Label | Working definition |
| --- | --- |
| Malignant Data | Your personal data, which can be used to malign you or cause harm if someone with bad intent gets access to it. So we will start by labeling your personal data as malignant data. |
| Benign Data | Your profile data, based on your behavior and preferences, which you assume exists, but don’t really know who owns different pieces of it or how they all fit in. We will assume that profile data is interesting, but should cause no immediate harm to you. |
I want to share here that half of the people who reviewed the categorization in this blog argued that the labels should be interchanged! With the increasing accuracy of big data algorithms, and the increasing consolidation of your profile data by a few players on the web, your profile data can establish your identity almost accurately. That is what makes it so tricky to decide where to stick the malignant and benign labels. For the arguments presented in this blog, we will go with the labels I have proposed.
[Some extra thought: if identifying you has traditionally rested on ‘what you know’, ‘what you have’ and ‘who you are’, the assumption has always been that ‘who you are’ is biological. These days, the question arises: is ‘who you are’ biological or behavioral?]
Obvious and Non-Obvious Aspects
For both your malignant and benign data, we will examine the obvious and non-obvious aspects of how someone can use it once they have access to it, what you will lose, and what the ramifications are. Along the way, we will come across a few interesting views that give good clarity on how the non-obvious aspects of data security are evolving, and what is driving them.
One Person’s Loss is Another Person’s Gain
What happens when someone lays hands on your data, be it malignant or benign? What will you lose, and what will they gain?
There are different ways of using your data, and different things you stand to lose, but to keep the flow of this blog simple, we will mention only the most obvious ones in the boxes above.
When someone lays hands on your personal data, which we have called malignant data, the obvious thing they can do is steal, be it information or money. The non-obvious thing they can do, knowing your coordinates, is sell to you, through unsolicited phone calls; in other words, you lose your space.
Knowing that someone has access to your profile data, and can analyze your behavior, is not really comfortable. The obvious concern is that your private space is continuously watched, and that you are being studied without your knowledge. The non-obvious use is that you can be influenced. We are talking about influence beyond the ads that make you buy something or click on something. We are talking about influencing what you see online, which means controlling the information you receive, the options you are shown and the views you are exposed to. The subtle shift is from presenting you choices to deciding what you even get to see at all. It is almost along the lines of presenting you views that will change your view of the world.
As you move from the bottom-left corner to the top-right corner, what you lose changes from tangible to intangible.
Means of Protection – Regulations and Technology
The bottom-left quadrant has always been the first to be addressed by technology. A wide range of technologies, from access control to encryption, protects personal data from obvious misuse. Regulations have covered the obvious misuse of profile data; that takes care of the top-left quadrant. Many people I interacted with differ in their opinion on the level of protection that regulations offer against the non-obvious use of personal data. Quite a few believe that once someone gets your personal data, they are bound only by self-regulation to prevent non-obvious misuse. That is the bottom-right quadrant. The top-right quadrant, the non-obvious use of your profile data, is where it gets interesting. Both the regulations and the technologies in this area are evolving rapidly.
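Since the quadrant diagram itself is not shown here, the framework described above can be summarized as a simple lookup. This is a minimal sketch; the key and value strings are my own paraphrase of the text, not taken from any figure:

```python
# Quadrants of the data security framework described above.
# Keys are (data label, kind of use); values paraphrase the dominant
# means of protection the blog associates with each quadrant.
QUADRANTS = {
    ("malignant", "obvious"):     "technology (access control, encryption)",
    ("benign",    "obvious"):     "regulation of obvious profile-data misuse",
    ("malignant", "non-obvious"): "largely self-regulation by data holders",
    ("benign",    "non-obvious"): "rapidly evolving regulation and technology",
}

def protection_for(label: str, use: str) -> str:
    """Return the dominant means of protection for a quadrant."""
    return QUADRANTS[(label, use)]
```

For instance, `protection_for("benign", "non-obvious")` picks out the top-right quadrant that the rest of this blog focuses on.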
First, a few major events of the last year throw some light on the grey areas in the non-obvious use of profile data. The biggest among them was the fallout of Cambridge Analytica mining data from Facebook profiles through an app. The realization that up to 87 million users’ data could be profiled opened eyes to new possibilities of non-obvious use. So far, the focus had really been on data privacy, and on protecting sensitive data pertaining to a user that could obviously be misused. Now the focus started shifting to the impact of being able to profile millions of users. The spotlight moved from providing better services and customized information to an individual, to psychographic profiling of large populations, and to customizing information or deriving analyses that could be used in areas with far larger impact. There have been many discussions around Brexit and elections in the US, India and other countries, especially around customized campaigns and the role of social media.
Regulations have been evolving to address this quadrant, considering its far-reaching social impact. The CONSENT Act and the GDPR are recent examples. Overall, the regulations have changed their interpretations and coverage to adapt to the changed landscape. Personally Identifiable Information (PII) has started to include behavioral or profile data alongside personal data. Making both ‘data controllers’ and ‘data processors’ responsible for data privacy will start reducing the immunity currently available to ‘impartial’ processors of information.
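To make the broadened PII interpretation concrete, here is a hedged sketch. The field names and the two categories are my own illustration, not taken from the text of any regulation:

```python
# Illustrative sketch: under the expanded interpretation discussed above,
# both personal (identity) data and behavioral (profile) data count as PII.
PERSONAL_FIELDS = {"name", "email", "national_id", "phone"}
PROFILE_FIELDS = {"browsing_history", "purchase_history", "location_trail"}

def is_pii(field_name: str) -> bool:
    """True if the field falls under the broadened PII definition,
    i.e. it is either personal data or profile (behavioral) data."""
    return field_name in PERSONAL_FIELDS or field_name in PROFILE_FIELDS
```

The point of the sketch is only that the check no longer stops at `PERSONAL_FIELDS`: behavioral fields trigger the same obligations.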
There will still be many challenges for regulations to cover. Considering that much of this data can be synthesized from multiple ‘holders’, and that big data is about ‘new insights’ and ‘new data’ created as an outcome of processing what was captured, the regulations can again turn grey. This is the truly non-obvious use for the user. Consent is good enough for obvious uses, but the non-obvious uses can be many. Consent covers the explicit data that is collected, but users still will not know the kinds of data that can be derived by correlating consented data from multiple sources. For example, even in the recent Cambridge Analytica–Facebook case, there was no violation of consent in terms of what data was collected.
In terms of technology, the regulations are forcing rapid creation and adoption of technologies to serve the new needs. Companies need technologies to clearly understand the data they possess, as well as the obligations associated with it. Considering that data can have multiple ‘holders’ and ‘controllers’, a unified view becomes all the more important. The technologies also have to enable audits and compliance with the new regulations that will come in and continue to evolve. The explicitness of consent, and the control of personal and profile data, will introduce the need for a life cycle of consent management! This is in addition to the “forget me” option.
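A consent life cycle could be modeled, very roughly, like this. The class, field names and the `forget_me` helper are invented for illustration; real consent-management systems are far more involved:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Consent:
    """Illustrative consent record: granted for a purpose, it can expire
    or be withdrawn, and must be checked before each use of the data."""
    purpose: str
    granted_at: datetime
    valid_for: timedelta
    withdrawn_at: Optional[datetime] = None

    def withdraw(self) -> None:
        """User revokes consent; further use becomes invalid."""
        self.withdrawn_at = datetime.utcnow()

    def is_valid(self, at: Optional[datetime] = None) -> bool:
        at = at or datetime.utcnow()
        if self.withdrawn_at is not None and at >= self.withdrawn_at:
            return False
        return at < self.granted_at + self.valid_for

def forget_me(store: dict, user_id: str) -> None:
    """The 'forget me' option: erase the subject's data itself,
    not just the consent record."""
    store.pop(user_id, None)
```

The sketch captures the life-cycle idea in the text: consent is granted, checked on every use, can lapse or be withdrawn, and sits alongside a separate erasure path.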
Ramifications – The Non-Obvious Use of Profile Data
A good example: in recent years, many concerns have emerged about behavioral profiling analysis being used in elections, be it for the selection of candidates or for campaign strategies, including targeted messaging.
The potential impact of the non-obvious use of benign data, and the concerns around it, have been evident in some of the major events the world saw in the last year. Regulations have started to address this area immediately. There have been technology advances too, like WhatsApp introducing a ‘forwarded’ tag on its messages. And there have been social changes, like social media companies running special campaigns to educate users and voters not to fall prey to targeted campaigns or fake news.
The major factor driving evolution in this quadrant is the fact that the analysis of group behaviors can lead to targeted solutions, which means we are looking at ‘Weapons of Mass Persuasion’. In effect, benign data can be used in non-obvious ways to create malignant impact.
As a result, we see a sudden spurt in awareness across societies and governments, as well as regulations and technologies trying to address this concern of malignant impact, leading to rapid evolution in the top-right quadrant.