In a previous blog post I talked about how data became 'big' and what that means for us. I asserted that this growth can be seen as a natural process, and that we may be facing a filter problem rather than an overload. The second observation is that, besides its unprecedented scale, data has gained so much importance because of its messy nature, which enables us to make stronger and more accurate predictions. But that also opens up an ethical dimension of big data, in which we are ruled by algorithms and our personal data is turned into profit.
Order out of Chaos
Knowledge has taken on the shape of the internet: it is networked, much like the architecture of the net, and thus messy and chaotic. This view is continuously expressed by scholars like Mayer-Schönberger and Weinberger. This messiness is valuable: it allows for contradictory opinions and open discourse, which is what the WWW was designed for at CERN in 1989. However, given the sheer amount of data, we lose track. Algorithms shape our world and tell us what is interesting and relevant to us; they bring order to the chaos. Meaning is calculated and curated by algorithms. We love the simplicity and orderly nature of the results presented on glossy websites, but we are caught in filter bubbles. Behind these shiny facades, mathematical models calculate our thoughts, and our personal data shapes our digital avatars. These profiles claim to tell the truth about us, yet really show a distorted perspective of it, and companies happily take advantage of that to make their profits.
"Data doesn't lie."
Greg Linden, inventor of the Amazon.com recommendation engine
The Human Perspective
By using digital communication technologies, we contribute to the growing data stream: we constantly create, publish and share content (with or without conscious consent) and enrich it with metadata: we "like" content, we comment on it, we retweet, we tag. While this networked knowledge and the seemingly endless possibilities for predicting future behaviour may be fascinating, the ethical dimension of this development is often overlooked. People's data is turned into profit, which seems to bother only a few. However, data literacy and awareness of data ownership appear to be on the rise; hopefully we will reach a balance between technological determinism and humanistic values.
When talking about big data, it is easy to see data as a given resource, while forgetting that most data is created from the daily activities of people all around the world. It is shaped by people and their lives, and manipulating and visualising it directly affects their lives. Jaron Lanier suggests that "data concerning people is best thought of as people in disguise", and that we should refrain from thinking of data as just a new resource that we can exploit at will.
"Our core illusion is that we imagine big data as a substance, like a natural resource waiting to be mined."
Jaron Lanier
The vast amount of data and the emergence of data as a raw material for business (the 'new oil', as it is often called) paint a rather unpleasant picture, in which humans are overshadowed and controlled by data. But we have seen that the notion of a 'data flood' may be wrong: yes, a lot of data is generated every moment, but that only means we have to design better filters to bring this data to light. In the same way, the notion of data as the 'new oil' is misleading. Do we really want to speak of 'data mining', 'data drilling' and 'data spills', all of which carry negative connotations? Data can be seen as a material, but it should not be forgotten that it is people's data we are dealing with.
Data visualisation artists have also raised their voices against the exploitation of data and the misconception of data as a natural resource. The most notable figure in this discussion has been Jer Thorp, a data artist living in New York. In his article for the Harvard Business Review entitled 'Big Data Is Not the New Oil', Thorp offers a humorous yet serious comment on the glorification of big data as the solution to all problems. Using the image of a dying, oil-soaked bird to attract attention, he argues that making profit from people's data is unethical and that we should not repeat the mistakes we made with oil. The "deeply human" data collected from people must be treated as such and re-framed in a human context. To achieve this, Thorp makes three proposals. First, he wants to educate people about data ownership by giving the public tools to explore their own individual data. Second, he calls for a more open discussion about data and ethics, coining the attribute "data humane" for companies that work to defend people's rights to their own data. Finally, Thorp wants to "foster a deep understanding of data in society", which can be facilitated through an open discussion that involves artists, poets and performers.
This is an edited excerpt from a chapter of my MA dissertation, entitled 'Making Big Data Small'.