Internet Geographer


Posts tagged geoweb
Rethinking the Geoweb and Big Data: Future Research Directions

This short chapter is a reflection on future directions that research on the geoweb and big data could take. It is derived from a reflection that the editors of this volume asked me to provide to a session on the geoweb and big data at the 2014 meeting of the Association of American Geographers.

You can read the full piece here:

Graham, M. 2018. Rethinking the Geoweb and Big Data: Future Research Directions. In Thinking Big Data in Geography: New Regimes, New Research. Thatcher, J., Eckert, J., and Shears, A. (eds). University of Nebraska Press. Lincoln. 231-236.

Semantic Cities: Coded Geopolitics and the Rise of the Semantic Web.

In order to understand how Jerusalem’s contested political contexts are embedded into its digital layers, we traced how the city is represented on online platforms that house facts about much of the world. We did this by analyzing representations of Jerusalem across the Arabic, Hebrew and English versions of Wikipedia (working with a translator on the Arabic and Hebrew versions), as well as on the platforms of Wikidata, Freebase and Google. As our cities become increasingly digital, and as the digital becomes increasingly governed by the logics of the semantic web, there are important questions to ask about how these new alignments of code and content shape how cities are presented, experienced, and brought into being. What we found is a paradoxical situation whereby, through connecting datasets, semantic web initiatives detach localized information from the contexts of its creation. By divorcing content from its contexts, this process establishes new contexts in which necessarily political decisions are being made with far-reaching consequences.

This is the topic of a new chapter (that I wrote with Heather Ford) that just arrived on my desk this morning. You can download the piece here:

Ford, H., and Graham, M. 2016. Semantic Cities: Coded Geopolitics and the Rise of the Semantic Web. In Code and the City. eds. Kitchin, R., and Perng, S-Y. London: Routledge. 200-214.

Otherwise, here’s a shorter version I wrote in Slate:

Graham, M. 2015. Why Does Google Say Jerusalem Is the Capital of Israel? Slate. Nov 30, 2015.

We also have an earlier blog and webcast on the topic (and here's the Washington Post’s coverage of our work).


My response to the geoweb and ‘big data’ alt.conference at #AAG2014

At the recent #AAG2014 alt.conference on the geoweb and ‘big data’, I was asked to serve as a panellist at the end of the day: summarising some of the day’s themes, and reflecting on how they speak to future directions in the discipline. The responses that I prepared are below. Forgive the scattered nature of the notes, as they were hastily put together. 

We were asked to engage with the lightning talks and the ways that they factor into the potential future directions of research. Let me go through a few themes that emerged.

First, I’m not sure we’re all talking about the same thing when we speak about ‘big data’ and the geoweb. This isn’t necessarily a problem, but I’d hope that future conversations could focus more on what exactly the ‘geoweb’ is. What exactly do we mean when we speak about it? Where are the boundaries between the web and the geoweb? (I’m not sure I clearly see them). Where are the boundaries between the geoweb and what we might think of as the underlying/offline/material geo that seems to underpin, augment, or inform it? I’m also not sure I clearly see those boundaries, in part because of the ways that place is always transduced: constantly remade and reenacted. So, whilst I don’t think we have to agree on any definitions, I do think that we should avoid taking for granted some of the assumptions wrapped into these very powerful terms.

Second, we hear a lot about the need for more mixed methods research. Yes. Absolutely. But I also think that we need to avoid creating caricatures to argue against. Is there anyone out there who is actually saying that big data can answer all facets of all societal questions? How then should we best channel our energies into creating, carrying out, and enacting those hybrid approaches? Jin-Kyu and others offered us some helpful beginnings here.

Third, it’s nice to see the beginnings of some more cross-pollination between geography, computer science, information studies, internet studies, and other social sciences. There is definitely a lot that we can contribute as geographers, but we also need to make sure that we aren’t reinventing the wheel. So, for instance, we often talk about crowdsourcing or VGI, but there’s a lot of work being done in information studies, psychology, and internet studies trying to understand motivations for crowdsourcing. We could do more to allow that work to cross over to geography and geoweb research, and then hopefully feed back into it.

Fourth, a lot of our conversations about big data often seem to forget the truly massive amount of paid human labour that goes into the filtering, sorting, cleaning, manipulating, and managing of it. We seem to talk about big data as something that pings around between sensors, datasets, machines, and algorithms. But one of the things that I’m working on is looking at those digital sweatshops, the micro workers, the click workers, the gold farmers - those labourers in the background that are keeping our networks chugging along. And I hope we’ll start to see more of this work - remembering that automation is often an illusion. What should we be asking about those millions of workers in the shadows, doing unorganised, low-paid, alienated work - and making many of our ‘big data’ ecosystems function?

Fifth, building on Jeremy’s comments this morning, I wonder if we should be leading a charge to address concerns about privacy - what I think is one of the most pressing issues of our time. I think that - as geographers - we’re maybe somewhat unwisely ceding this space to computer scientists - who do tend to be very informed on the topic - and politicians - who, well, don’t tend to be informed on the topic. What should we be doing and saying and researching as geographers, to draw on our expertise and the strengths of our discipline to make a difference - and I want to emphasise - make a difference - in this new world of always-on tracking and monitoring and the datafication of everything?

But how do we also make sure that privacy isn’t used as an excuse for the wholesale locking away of social data by large companies - meaning that we can’t use those data to address the social and human questions that really matter? So, where do we stand on the transparency/privacy spectrum? And, again, what should we be doing about it?

Sixth, a lot of people today spoke about focusing on what, who, and where is left out. I very much agree that this is a crucial first step. Castells puts it well, when he says that “the costs of exclusion from networks increases faster than the benefits of inclusion in the network.” And this is an area of work that we tend to do very well as geographers (this is a question that people in other disciplines often seem to miss), but it is precisely that - a first step. How can we move beyond it? What can or should we do about it? If we establish that the digital layers that augment place are inherently uneven, unrepresentative, and imbalanced, what can we do with that knowledge; what should we do with that knowledge?

We should also think about the flip side of this issue. Whilst there’s been a lot of focus on where there isn’t enough data, or where data might not be able to capture the complexities of any given situation, what about contexts where we have too much data? Some of the talks guided us through methods for dealing with ‘big data’, but we probably need more of this. Should we be having more conversations about what to actually do with it? It would be nice to have conversations about cluster computing, graph databases, agent-based models and other methods for grappling with unmanageable volumes of data. Yes, we always need to remember what those data leave out; but unless we want to abandon the whole big data project we should also be - critically - trying to figure out what those datasets do tell us about society - and how they help us to answer the big questions that we need to ask.

Finally, let’s keep our eyes on the prize. Let’s make sure that we’re asking the questions that matter, and not being too driven by just what data are available. Let’s make sure our research continues to focus on questions about things like inequality, power, voice, control, and human welfare. And I say continue because I was very impressed by the topics that the presentations today were tackling.

We can make sure that we’re shaping not just the questions being asked, but also the data being collected. Some of this means doing things like always being explicit that there is never any such thing as ‘raw data’. Data are always socially and humanly constructed. And recognising that, in many ways, we’re the privileged ones in this room. We have the knowledge, the skills, and the desire to be the ones doing the constructing and doing the shaping of data.

A few weeks ago, Tony Benn - who was a British Labour Party politician - passed away. He famously had a set of five questions that he said that we should always ask any powerful person: “What power have you got? Where did you get it from? In whose interests do you exercise it? To whom are you accountable? And how can we get rid of you?” Well, I wonder if we shouldn’t adapt those questions to the data intermediaries, systems, platforms, and algorithms that we’re dealing with. “What power have you got? Where did you get it from? In whose interests do you exercise it? To whom are you accountable? And how can we get rid of you?” It’s been nice to see a lot of the work on big data and the geoweb tackling these questions, and I hope we see more of it in years to come.

Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty (new paper)

After years of work, the first peer-reviewed paper to emerge from our research on Wikipedia is now officially ‘in press’:

Graham, M., Hogan, B., Straumann, R. K., and Medhat, A. 2014. Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers (forthcoming).

The paper has some very interesting and important findings, summarised in the abstract below:

Geographies of codified knowledge have always been characterized by stark core-periphery patterns: with some parts of the world at the center of global voice and representation, and many others invisible or unheard. However, many have pointed to the potential for radical change as digital divides are bridged and 2.5 billion people are now online.

With a focus on Wikipedia, which is one of the world’s most visible, most used, and most powerful repositories of user-generated content, we investigate whether we are now seeing fundamentally different patterns of knowledge production. Even though Wikipedia consists of a massive cloud of geographic information about millions of events and places around the globe put together by millions of hours of human labor, it remains that the encyclopedia is characterized by uneven and clustered geographies: there is simply not a lot of content about much of the world.   

The paper then moves to describe the factors that explain these patterns, showing that while just a few conditions can explain much of the variance in geographies of information, some parts of the world remain well below their expected values. These findings indicate that better connectivity is only a necessary, but not a sufficient, condition for the presence of volunteered geographic information about a place. We conclude by discussing the remaining social, economic, political, regulatory, and infrastructural barriers that continue to disadvantage many of the world’s informational peripheries. The paper ultimately shows that, despite many hopes that a democratization of connectivity will spur a concomitant democratization of information production, internet connectivity is not a panacea, and can only ever be one part of a broader strategy to deepen the informational layers of places.

This is the first of a handful of papers that are in the works, and I’ll post any updates that we have. In the meantime, feel free to get in touch if you have any comments, critiques, or questions about this contribution.