Swimming or drowning in the data ocean? Thoughts on the metaphors of big data

Tsunami hazard sign (Photo credit: Wikipedia)

There is no doubt that there is a current fascination in both popular culture and academic research with big data – the vast quantities of data that are generated from people’s interactions with digital technologies. The term ‘big data’ is appearing with ever-greater frequency in the popular media, government reports, blogs and academic journals and conferences.

The ways in which big digital data are described rhetorically reveal much about their contemporary social and cultural meanings. As Sue Thomas writes in her book Technobiophilia: Nature and Cyberspace, organic metaphors drawn from the natural world have been continually used to describe computer technologies since their emergence. Such natural terms as the web, the cloud, bug, virus, root, mouse and spider have all been employed in attempting to conceptualise and describe these technologies. These metaphors work to render digital technologies more ‘natural’, and therefore less threatening and alienating. However, nature is not always benign: it may sometimes be wild, chaotic and threatening, and these meanings of nature may also be bestowed upon digital technologies.

By far the most commonly employed metaphors to discuss big data are those related to water or liquidity: streams, flows, leaks, rivers, oceans, seas, waves and so on. Both academic and popular cultural descriptions of big data have frequently referred to the ‘fire hose’ of data issuing from a social media site such as Twitter and to the data ‘deluge’, ‘flood’ or ‘tsunami’ to which we as internet users contribute and which threatens to ‘swamp’ or ‘drown’ us. These rather vivid descriptions of data as a fluid, uncontrollable entity possessing great physical power emphasise the sheer volume and fast nature of digital data movements, as well as their unpredictability and the difficulty of control and containment. They suggest an economy of digital data and surveillance in which data are collected constantly and move from site to site in ways that cannot themselves easily be monitored, measured or regulated.

Other metaphors sometimes employed to describe the by-product data that are generated include data ‘trails’, ‘breadcrumbs’, ‘exhausts’, ‘smoke signals’ and ‘shadows’. All these tend to suggest the notion of data as objects that are left behind as tiny elements of another activity or entity (‘trails’, ‘breadcrumbs’, ‘exhausts’), or as less material derivatives of the phenomena from which they are viewed to originate (‘smoke signals’, ‘shadows’).

Digital data are also often referred to as living things, as having a kind of vitality in their ability to move from site to site and morph into different forms. The rhizome metaphor is sometimes employed to describe how digital data flow from place to place, or from node to node, suggesting that they are part of a living organism such as a plant. This also suggests a high level of complexity and a network of interconnected tubes and nodes.

The focus on liquidity, ceaseless movement, flux and vitality, while accurately articulating the networked nature of contemporary societies and the speed and ease with which information travels across the networks, also tends to obscure certain dimensions of digitisation. The blockages and resistances, the solidities that may impede the fluid circulation of data, tend to be left out of such discussions. The rhetoric of free streams of flowing communication tends to obscure the politics and power relations behind digital and other information technologies. The continuing social disadvantage and lack of access to economic resources (including the latest digital devices and data download facilities) that many people experience belie the discourse of free-flowing digital data and universal, globalised access to and sharing of these data.

Liquidity metaphors evoke the notion of an overwhelming volume of data that must somehow be dealt with, managed and turned to good use. Instead of ‘surfing the net’, a term that was once frequently used to denote moving from website to website easily and playfully, we now must cope with huge waves of information or data that threaten to engulf us. When we think of digital data as ‘breadcrumbs’ or ‘shadows’, they are less overtly threatening, but are also depicted as subtle means of tracking and tracing our movements and activities. As we grow increasingly aware of the use of digital data for surveillance and espionage purposes, these metaphors may take on a more malign meaning, suggesting that we are being monitored constantly whether we agree or not. Digital data surveillance systems are beginning to know more about us than we ourselves do in their capacity to silently watch and record our actions. When we conceptualise digital data and the systems that produce them as complex living organisms, they appear more benign, part of ‘good nature’, but also again as potentially wild and uncontained, growing out of our control.

What the rhetoric of big data tends to suggest is that we harbour both attraction towards and fear about this phenomenon. Big data may offer many benefits, but they also generate anxieties due to their volume, power, ceaseless movement, complexity, mystery and ability to generate knowledge about us that we may not want others to see.

For more on the social and cultural aspects of big data see my Bundlr ‘The Social Life of Big Data and Algorithms’.

Why should sociologists study digital media?

Why should sociologists be interested in the new digital media technologies? This is a question I have been thinking and writing about recently in developing my next book project on digital sociology (to be published by Routledge next year). Here are some of the reasons that have emerged in the literature:

  • Social life is increasingly being configured through and with digital media.
  • What counts as ‘the social’ is increasingly framed via digital media.
  • Digital media use and practice are structured through gender, social class, geographical location, education, race/ethnicity and age, all social categories in which sociologists have traditionally been interested.
  • Digital media are integral parts of contemporary social networks and social institutions such as the family, the workplace, the education system, the healthcare system, the mass media and the economy, again phenomena that have long been foci for sociological research and theorising.
  • Digital media configure concepts of selfhood, social relationships, embodiment, human/nonhuman relations, space and time – all relevant to sociological inquiry.
  • Digital media have instituted new forms of power relations.
  • Digital media have become central to issues of measure and value.
  • Digital media offer alternative ways of practising sociology: of researching, teaching and disseminating research.
  • Digital media are important both to ‘public sociology’ (engaging with people outside of academia) and ‘private sociology’ (personal identities and practices as sociologists) (see my previous post on this).
  • Digital media challenge sociologists’ role as pre-eminent social researchers: sociologists need to address this.
  • Digital media technologies can contribute to ‘live sociology’ and ‘inventive methods’, or new, creative ways of practising sociology.

As this list implies, digital sociology goes well beyond simply a focus on ‘the digital’. It raises major questions about what should be the focus and methods of contemporary sociological research and theorising. As such, sociologists writing about digital media are important contributors to debates about the future of sociology and how the discipline can remain vibrant, creative and responsive to new developments and social change.