The thirteen Ps of big data

Big data are often described as being characterised by the ‘3 Vs’: volume (the large scale of the data); variety (the different forms of data sets that can now be gathered by digital devices and software); and velocity (the constant generation of these data). An online search of the ‘Vs’ of big data soon reveals that some commentators have augmented these Vs with the following: value (the opportunities offered by big data to generate insights); veracity/validity (the accuracy/truthfulness of big data); virality (the speed at which big data can circulate online); and viscosity (the resistances and frictions in the flow of big data) (see Uprichard, 2013 for a list of even more ‘Vs’).

These characterisations principally come from the worlds of data science and data analytics. From the perspective of critical data researchers, there are different ways in which big data can be described and conceptualised (see the further reading list below for some key works in this literature). Anthropologists Tom Boellstorff and Bill Maurer (2015a) refer to the ‘3 Rs’: relation, recognition and rot. As they explain, big data are always formed and given meaning via relationships with human and nonhuman actors that extend beyond data themselves; how data are recognised qua data is a sociocultural and political process; and data are susceptible to ‘rot’, or deterioration or unintended transformation as they are purposed and repurposed, sometimes in unintended ways.

Based on my research and reading of the critical data studies literature, I have generated my own list that can be organised around what I am choosing to call the ‘Thirteen Ps’ of big data. As in any such schema, this ‘Thirteen Ps’ list is reductive, acting as a discursive framework to organise and present ideas. But it is one way to draw attention to the sociocultural dimensions of big data that the ‘Vs’ lists have thus far failed to acknowledge, and to challenge the taken-for-granted attributes of the big data phenomenon.

  1. Portentous: The popular discourse on big data tends to represent the phenomenon as having momentous significance for commercial, managerial, governmental and research purposes.
  2. Perverse: Representations of big data are also ambivalent, demonstrating not only breathless excitement about the opportunities they offer but also fear and anxiety about not being able to exert control over their sheer volume and unceasing generation and the ways in which they are deployed (as evidenced in metaphors of big data that refer to ‘deluges’ and ‘tsunamis’ that threaten to overwhelm us).
  3. Personal: Big data incorporate, aggregate and reveal detailed information about people’s personal behaviours, preferences, relationships, bodily functions and emotions.
  4. Productive: The big data phenomenon is generative in many ways, configuring new or different ways of conceptualising, representing and managing selfhood, the body, social groups, environments, government, the economy and so on.
  5. Partial: Big data can only ever tell a certain narrative, and as such they offer a limited perspective. There are many other ways of telling stories using different forms of knowledges. Big data are also partial in the same way as they are relational: only some phenomena are singled out and labelled as ‘data’, while others are ignored. Furthermore, more big data are collected on some groups than others: those people who do not use or have access to the internet, for example, will be underrepresented in big digital data sets.
  6. Practices: The generation and use of big data sets involve a range of data practices on the part of individuals and organisations, including collecting information about oneself using self-tracking devices, contributing content on social media sites, the harvesting of online transactions by the internet empires and the data mining industry and the development of tools and software to produce, analyse, represent and store big data sets.
  7. Predictive: Predictive analytics using big data are used to make inferences about people’s behaviour. These inferences are becoming influential in optimising or limiting people’s opportunities and life chances, including their access to healthcare, insurance, employment and credit.
  8. Political: Big data is a phenomenon that involves power relations, including struggles over ownership of or access to data sets, the meanings and interpretations that should be attributed to big data, the ways in which digital surveillance is conducted and the exacerbation of socioeconomic disadvantage.
  9. Provocative: The big data phenomenon is controversial. It has provoked much recent debate in response to various scandals and controversies related to the digital surveillance of citizens by national security agencies, the use and misuse of personal data, the commercialisation of data and whether or not big data pose a challenge to the expertise of the academic social sciences.
  10. Privacy: There are growing concerns about the privacy and security of big data sets as people become aware of how their personal data are used for surveillance and marketing purposes, often without their consent or knowledge, and of the vulnerability of digital data to hackers.
  11. Polyvalent: The social, cultural, geographical and temporal contexts in which big data are generated, purposed and repurposed by a multitude of actors and agencies, and the proliferating data profiles on individuals and social groups that big data sets generate, give these data many meanings for the different entities involved.
  12. Polymorphous: Big data can take many forms as data sets are generated, combined, manipulated and materialised in different ways, from 2D graphics to 3D-printed objects.
  13. Playful: Generating and materialising big data sets can have a ludic quality: for self-trackers who enjoy collecting and sharing information on themselves or competing with other self-trackers, for example, or for data visualisation experts or data artists who enjoy manipulating big data to produce beautiful graphics.

Critical Data Studies – Further Reading List

Andrejevic, M. (2014) The big data divide, International Journal of Communication, 8, 1673-89.

Boellstorff, T. (2013) Making big data, in theory, First Monday, 18 (10). <http://firstmonday.org/ojs/index.php/fm/article/view/4869/3750>, accessed 8 October 2013.

Boellstorff, T. & Maurer, B. (2015a) Introduction, in T. Boellstorff & B. Maurer (eds.), Data, Now Bigger and Better! (Chicago, IL: Prickly Paradigm Press), 1-6.

Boellstorff, T. & Maurer, B. (eds.) (2015b) Data, Now Bigger and Better! Chicago, IL: Prickly Paradigm Press.

boyd, d. & Crawford, K. (2012) Critical questions for Big Data: provocations for a cultural, technological, and scholarly phenomenon, Information, Communication & Society, 15 (5), 662-79.

Burrows, R. & Savage, M. (2014) After the crisis? Big Data and the methodological challenges of empirical sociology, Big Data & Society, 1 (1).

Cheney-Lippold, J. (2011) A new algorithmic identity: soft biopolitics and the modulation of control, Theory, Culture & Society, 28 (6), 164-81.

Crawford, K. & Schultz, J. (2014) Big data and due process: toward a framework to redress predictive privacy harms, Boston College Law Review, 55 (1), 93-128.

Gitelman, L. & Jackson, V. (2013) Introduction, in L. Gitelman (ed.), Raw Data is an Oxymoron. Cambridge, MA: MIT Press, pp. 1-14.

Helles, R. & Jensen, K.B. (2013) Making data – big data and beyond: Introduction to the special issue, First Monday, 18 (10). <http://firstmonday.org/ojs/index.php/fm/article/view/4860/3748>, accessed 8 October 2013.

Kitchin, R. (2014) The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. London: Sage.

Kitchin, R. & Lauriault, T. (2014) Towards critical data studies: charting and unpacking data assemblages and their work, Social Science Research Network. <http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2474112>, accessed 27 August 2014.

Lupton, D. (2015) ‘Chapter 5: A Critical Sociology of Big Data’ in Digital Sociology. London: Routledge.

Lyon, D. (2014) Surveillance, Snowden, and Big Data: Capacities, consequences, critique, Big Data & Society, 1 (2). <http://bds.sagepub.com/content/1/2/2053951714541861>, accessed 13 December 2014.

Madden, M. (2014) Public Perceptions of Privacy and Security in the post-Snowden Era, Pew Research Internet Project: Pew Research Center.

McCosker, A. & Wilken, R. (2014) Rethinking ‘big data’ as visual knowledge: the sublime and the diagrammatic in data visualisation, Visual Studies, 29 (2), 155-64.

Robinson, D., Yu, H., and Rieke, A. (2014) Civil Rights, Big Data, and Our Algorithmic Future. No place of publication provided: Robinson + Yu.

Ruppert, E. (2013) Rethinking empirical social sciences, Dialogues in Human Geography, 3 (3), 268-73.

Tene, O. & Polonetsky, J. (2013) A theory of creepy: technology, privacy and shifting social norms, Yale Journal of Law & Technology, 16, 59-134.

Thrift, N. (2014) The ‘sentient’ city and what it may portend, Big Data & Society, 1 (1). <http://bds.sagepub.com/content/1/1/2053951714532241.full.pdf+html>, accessed 1 April 2014.

Tinati, R., Halford, S., Carr, L., and Pope, C. (2014) Big data: methodological challenges and approaches for sociological analysis, Sociology, 48 (4), 663-81.

Uprichard, E. (2013) Big data, little questions?, Discover Society, (1). <http://www.discoversociety.org/focus-big-data-little-questions/>, accessed 28 October 2013.

van Dijck, J. (2014) Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology, Surveillance & Society, 12 (2), 197-208.

Vis, F. (2013) A critical reflection on Big Data: considering APIs, researchers and tools as data makers, First Monday, 18 (10). <http://firstmonday.org/ojs/index.php/fm/article/view/4878/3755>, accessed 27 October 2013.

Medical diagnosis apps – study findings

Over 100,000 medical and health apps for mobile digital devices have now been listed in the Apple App Store and Google Play. They represent diverse opportunities for lay people to access medical information and track their body functions and medical conditions. As yet, however, few critical social researchers have sought to analyse these apps.

In a study I did with Annemarie Jutel we undertook a sociological analysis of medical diagnosis apps, and two articles have now been published from the study. Annemarie is a sociology of diagnosis expert and we were interested in investigating how these apps represented the process of diagnosis. We drew on the perspective that apps are sociocultural artefacts that draw on and reproduce tacit norms and assumptions. We argue that from a sociological perspective, digital devices such as health and medical apps have significant implications for the ways in which the human body is understood, visualised and treated by medical practitioners and lay people alike, for the doctor-patient relationship and the practice of medicine.

In one article, published in Social Science & Medicine, we focused on self-diagnosis apps directed at lay people. We undertook a search using the terms ‘medical diagnosis’ and ‘symptom checker’ for apps that were available for download to smartphones in mid-April 2014 in the Apple App Store and Google Play. We found 35 self-diagnosis apps that claimed to diagnose across a range of conditions (we didn’t include apps directed at diagnosis of single conditions). Some have been downloaded by tens or hundreds of thousands of smartphone owners and, in the case of WebMD and iTriage Health, by millions.

Our analysis suggests that these apps inhabit a contested and ambiguous site of meaning and practice. The very existence of self-diagnosis apps speaks to several important dimensions of contemporary patienthood and healthcare in the context of a rapidly developing ecosystem of digital health technologies. They also participate in the quest for patient ‘engagement’ and ‘empowerment’ that is a hallmark of digital health rhetoric (or what I call ‘digital patient engagement’).

Self-diagnosis apps, like other technologies designed to give lay people the opportunity to monitor their bodies and their health states, engage with the discourses of healthism and control that pervade contemporary medicine. We found that app developers combined claims to medical expertise with appeals to algorithmic authority to promote their apps to potential users. While the developers also used appeals to patient engagement as part of their promotional efforts, these were undermined by routine disclaimers that users should seek medical advice to effect a diagnosis. The cautions offered on the apps, stating that they are for ‘entertainment purposes only’ and not designed to ‘replace a diagnosis from a medical professional’, may be added for legal reasons, but they detract from the authority that the app may offer and indeed call into question why anyone should use it.

In our other article, published in the new journal Diagnosis, we directed attention at diagnosis apps that are designed for the use of medical practitioners as well as lay people. We analysed 176 such apps that we found in Google Play and the Apple App Store in December 2013. While 36 of these were directed at lay people, the remainder were for medical practitioners. The Diagnosis article mainly concentrates on the latter, given that our other article was about the self-diagnosis apps for lay people.

Our research suggests that these apps should be used with great caution by both lay people and practitioners. The lack of verifiable information provided about the evidence or expertise used to develop these apps is of major concern. The apps are of very variable quality, ranging from those that appear to have the support and input of distinguished medical experts, specialty groups or medical societies to those that offer little or nothing to support their knowledge claims. While at one end of the spectrum we can see apps as a delivery system for information which has been subject to the conventional forms of academic review, at the other extreme, we see apps developed by entrepreneurs with interests in many topics outside medicine, with little input from medical sources, or with inadequate information to ascertain what the sources might be. The lack of information provided by many app developers also raises questions about how users can determine the presence of conflicts of interest and commercial interests that might determine content.