This is the second in a short series of posts about making scholarly practices more visible and material, and the useful outcomes which might result from that. The previous one was about flying—you can read it here. This one describes my work over the past couple of years in assembling “The Past Futures Database”, a collection of about 25,000 articles about science, technology and medicine from British and American publications across the twentieth century. (At the time of writing the database is almost ready for launch; this post will be updated when it goes live.) The collection aims to offer its users a way to explore how techno-social futures were represented in popular media during the twentieth century. It does this by including a wide range of utterances about possible futures related to science and technology; the way in which these utterances have been collected is meant to unsettle some received notions about how we collect images of the future. It also allows users to search week by week for the whole twentieth century, to see what discussions of science, technology and the future were happening at which points. A number of important magazine collections—including the New Scientist, Popular Mechanics, and Life magazine—are available through Google Books. These, however, can only be searched by keyword and can be cumbersome to use. The goal of our database is to provide an overview which allows for explorations of various kinds.
The database was a digital humanities (DH) project, and producing it involved thinking hard about digital methods of scholarship and archives. In an article from a few years ago for Science, Technology and Human Values, Claire Waterton argues that there has recently been a “move toward exposure of the guts of our archives and databases, toward exposing the contingencies, the framing, the reflexivity, and the politics embedded within them.” Waterton analyses these processes in terms of a “convergence between the world of social theory and those worlds concerned with building archives”, exploring in detail three examples of “active experimentation with the collection, representation, and display of data about natural/social worlds, which are partly informed by the work of STS and partly by other influences, including social theory”.
My work on the database has been very much concerned with these questions of how the technical minutiae involved in the production of digital resources can have serious epistemic, political and organisational consequences.
In a recent review of works on the aesthetics of the Digital Humanities, Jessica Hurley observes that “the infrastructures of the digital humanities are, like all the best infrastructures, simultaneously omnipresent and invisible”. The field depends on, and operates through:
A vast, interlocked network of objects, capital, people, and ideologies: ASCII code; fiber-optic cables; tenure lines; server farms; research centers and literature labs; wage laborers and graduate students who scan, attach metadata and program search functions; the Defense Advanced Research Projects Agency (DARPA); the manpower, capital, and geopolitical location required to apply for a .edu domain name ($185,000, US institutions only); laptops; postdoctoral fellowships; silicon mines; Silicon Valley; the contemporary fetish for STEM in higher education.
Thinking about building a database involves reflecting on these things as well.
There have been quite a number of extraordinarily successful digital resources produced in history, going back twenty-five years or so. Some, such as the digitised collections of Charles Darwin’s correspondence and the Newton project, are direct heirs to massive scholarly undertakings from the nineteenth and twentieth centuries. Vast in scope and with digital aspects which mean they can be investigated by a large number of different audiences, both scholarly and not, they are some of the great intellectual monuments of our time. The visions of Newton and Darwin which are now readily available are critically, historiographically, and politically more interesting and more available than the sources to which previous generations had access. Why are you even reading this? Go and find out about Newton.
Other recent DH projects have focused on crowdsourcing: using platforms to identify illustrators in tens of thousands of pages of Victorian periodicals, for example. All of this is valuable on an empirical level and raises interesting theoretical questions about the sources through which we tell historical stories for large publics, and how those publics can contribute to our work. The use of digital crowdsourcing recalls the great outsourced philological enterprises and correspondence networks of the nineteenth century, such as the far-flung groups of people who made contributions to the Oxford English Dictionary. This in turn brings up questions about how such contributors are recognised, and how credit is given to them for their labours. These questions have far-reaching and under-addressed consequences for the digital humanities, for reasons I’ll return to below.
Our database couldn’t be like these projects, for a few different reasons. One: we didn’t have enough money. Two: a lot of the materials we wanted to include were in copyright. Three: even the ones which weren’t in copyright had often already been digitised as parts of large-scale projects by private companies such as ProQuest and Gale. (What I’m going to say next is a little critical of these companies, and I want to be clear about my stance. I don’t think they are ‘corrupting academia’. I also don’t think that they should make all the results of their extremely expensive digitisation programs available for free. But I think that we who work and study at universities should be a lot clearer on the landscape which working with these resources actually leaves for us. I also don’t want to turn this into a ‘the U.K. is lagging behind digitally’ type of argument. Many digital resources, such as the searchable records and historical archive of Hansard, are world-class and incredibly useful.)
When I started I was pretty confident in using Microsoft Access due to a past job. I thought (naively) that I’d be able to design the database using this software and then straightforwardly put it online on the website which Sam Robinson had built for our project. This turned out to be impossible: there’s no good integration between WordPress (which our site is built in) and Access. We found various workarounds but they weren’t particularly elegant.
Who cares, right? I can imagine two kinds of reader who will have got this far in this post. The first is people who understand the humanities, and who will see these technical problems as glitches which should be solved using bought-in technical expertise. The second is people who know databases, who will be laughing about how I thought Access would do what I wanted it to. Of course there are plenty of humanists with good groundings in computer science and data architecture—they’ll be a bit scornful on both levels, I imagine.
For good or ill, though, I was anxious that we should get things right before hiring in a database designer, and keen that we on the project should do as much as possible ourselves. Why this obstinacy? I’ve encountered situations in the past where a design for a site or a database has gone out to a designer, and this has locked in features which turn out to cause real problems later on. Take, for example, bulk importing—updating a database with multiple entries at a time. Now I know you almost certainly don’t care about bulk importing, unless you have spent any significant amount of time doing data entry. But whether a database needs to deal with records one at a time, or whether it can pull a whole huge spreadsheet all at once can make the difference between spending the next six weeks inputting and being able to go off in twenty minutes to drink sweet filter coffee at the Aberystwyth arts centre, or whatever your local equivalent might be. It’s the difference between looking at a large collection of data and seeing your Monday, your Tuesday, your Thursday evaporate, and being able to finish the task and do something else. Plus data inputting hurts when you do it for a long time—it involves little tiny movements, often very repetitive and can set off your RSI. For the workflow of the person who is entering data, these features make huge differences.
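For readers who do care about bulk importing, here is a minimal sketch of what the feature amounts to, written in Python with the standard library's csv and sqlite3 modules. Everything in it is an assumption made for illustration: the articles table, its three columns, and the placeholder rows are invented and are not the schema or contents of our database.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export, one row per article.
# The rows are invented placeholders, not real records.
SPREADSHEET = """title,publication,date
Example article A,Example Weekly,1955-08-08
Example article B,Example Weekly,1978-02-20
Example article C,Example Monthly,1976-07-29
"""


def bulk_import(conn, csv_text):
    """Load every row of a CSV export in one call and return the row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # executemany repeats the INSERT for each row, so nobody has to
    # type the records in one at a time.
    conn.executemany(
        "INSERT INTO articles (title, publication, date) "
        "VALUES (:title, :publication, :date)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, publication TEXT, date TEXT)")
imported = bulk_import(conn, SPREADSHEET)
print(f"Imported {imported} records in a single step")
```

A system which only accepts records one at a time is, in effect, asking a person to perform that loop by hand for every row of the spreadsheet.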
To deal with our Access issue, I spent a long time looking for off-the-shelf packages to produce online databases. These fall into two main categories: those aimed at academic users and those intended for business users. The former group are pretty cheap but often rigid and can require quite a bit of expertise to use. The latter group are usually eye-wateringly expensive, flexible, and extremely well-supported. I tried a standard app-builder programme called Zoho Creator and it helped me clarify the structure for the database. In the end it turned out that because Zoho charge by the record, our database was going to be too big to be affordable on their platform. In addition, some of the ways I wanted to relate tables on our database exceeded their current functionality. Evaluating the product in this way was a good exercise: it gave me a better idea of how to relate different bits of data. And I think that suggesting new uses or features for commercial products is a good service which academics can perform. In principle, it’s a much better idea for us to work with companies which can deliver the kinds of things we want at scale, rather than constantly starting from scratch and insisting on bespoke designs for every digital project we undertake.
Alongside Zoho, I worked a lot with Zotero—a program designed for researchers which extracts metadata (such as title, author, publication date) from webpages you visit and gathers it in a database. Zotero is amazingly powerful, in my opinion, though it struggles to import data from other sources. Partway through the biggest gathering of data, the way Zotero integrates with Firefox changed. This little change meant it no longer extracted publication dates from a lot of the sources I was using, which multiplied the time each entry took by about 1.5. Across thousands of entries, that adds up to a lot. But there’s no question that for my purposes Zotero was a fantastic resource.
The portal for the database is in Omeka, which I didn’t discover until very late in the day. Omeka cheaply does lots of the things we wanted: it imports bulk files nicely. Its two major drawbacks are that it is not really a relational database, because it treats all entries as essentially being the same kind of data, and that it doesn’t deal with dates in a very sophisticated way. For the former of these problems I developed a solution; the latter is, as far as I can make out, intractable, at least for now. I would recommend Omeka as an accessible format for anyone who’s happy to muck around a little but lacks extensive coding experience.
For sources, we couldn’t rely on publications being out of copyright. I also had little appetite to attempt to work through very long runs of magazines in print form when digital versions were already available, even if they were inaccessible. (I could have done this: perhaps I should, as it would have acted as a finding aid for collections which can appear daunting in size. Knowing that I would be duplicating information which was readily available for subscribers, however, stayed my hand.) Instead I looked mainly at publications whose copyright remained with the original publisher, and which had made previews of their articles available online as a way to entice people to subscribe. These publicly available collections of metadata were often of substantial size—thousands of articles were available in this format—and by featuring them in our database we would be drawing attention to these magazines as going concerns, encouraging scholarly and other publics to buy access if they wanted to see the whole article. This seemed like an elegant solution to what I had come to think of as the ProQuest problem.
Still, the labour involved in assembling these collections was considerable. I had substantial help from Sam Jackson, an undergraduate intern at Aberystwyth, as well. Sam went through more than two decades’ worth of the magazine. His help was invaluable. For TIME magazine, which makes up the most significant collection in the database, it involved looking at every single issue for the period between 1923 and 2007 (about five and a half thousand in total), opening each relevant article in a separate tab and then adding it to the initial collection using Zotero. Other magazines involved more or less work. I wanted to do the Spectator, all of whose archives are available online for free, but the quality of digitisation is extremely poor and cleaning up the articles would have taken more time than I had.
I am aware that these mundane considerations are a long way from the tone in which Digital Humanities are usually discussed. Among its advocates, DH is supposed to offer a radically changed model of scholarship, at once a pushback against the previous dominance of critical theory, and a way of opening up the academy to new users. Among its critics, it is regarded as a threat to the intellectual integrity of scholarship in the humanities—a presumptive challenge based on a fetishisation of data, science and technology, which has failed to live up to its promises. I am moved by arguments on both sides of the divide, but think that much of both the advocacy and the complaint is misguided. Both are misleading because they model the role of the human scholar and the role of digital technologies in ways which abstract from the kinds of labour and dependency which this kind of work actually involves. Sometimes this is in support of a belief that methods derived from Artificial Intelligence will soon be able to take over many routine clerical tasks, or the claim that the kinds of reading at a distance which digital approaches enable can give access to a larger, and hence superior, range of texts. For more critical scholars it is intended to present digital methods as foreclosing on the less quantifiable, more particularistic aspects of scholarship—the thorny paths of inquiry which are held to be our mode of resistance.
What is missing from both approaches is a serious engagement with the massive amount of—often precarious and poorly paid—human labour which digital processes actually continue to involve. Anyone who has worked a lot with Google Books will have come across a page with an inadvertently scanned image of a thumb on it—these have something like the same status as monks’ marginal notes from the medieval period, noting how cold it is. In much less cutesy ways, the appalling conditions in which (for example) Facebook moderators have to work have been the subject of occasional press reports. The increasing numbers of moderators now employed by companies like Alphabet for its YouTube subsidiary are part of the digital landscape; so are the ‘Trolls’ who work in so-called click farms. Of course everyone working in these fields has access to a greater or lesser degree of automation: moderators can, to a degree, use AI; and the Trolls are so adept at creating puzzling variants of existing kids’ programmes that a nostalgic clip show produced in 2047 looking back on childhoods of the Twenty Teens is likely to feature fond memories of watching some frightening shapes and traumatising off-brand cartoon pigs.
For the foreseeable future, though, a very high proportion of digital work will continue to involve human judgment and exposure of workers to content which can be very unpleasant indeed. Absent the enforced leveraging of huge crowdsourcing which Google Books was able to employ, digitisation projects will also continue to demand huge amounts of routine, often boring labour. Because the people doing this work are often casual employees of private contractors, they are often not regarded as making serious contributions to the scholarly enterprise. For the nineteenth-century Oxford English Dictionary we have a good idea of at least some of the people who read through texts and made excerpts and sent their slips in to the lexicographers. There has rightly been a move to celebrate the highly skilled but low status human calculators and computers who contributed to everything from the construction of the Nautical Almanac to the Space Programme*. But by and large academics simply do not care about who scanned the documents; who added tags to the old articles on which our new articles are based; who used their judgment to correct the optical character recognition which made the text readable.
A lot of discussion around the role of the digital economy has focused on how much better paid digital jobs are than those in other sectors. But this exclusive focus on the better jobs in the sector risks obscuring the myriad ways in which bad jobs are outsourced. If the fact that moderators are at serious risk of PTSD from routine exposure to images of animal cruelty** (never mind anything else) online were treated as a serious issue of occupational health, wouldn’t this make a difference for how those jobs are rewarded and supervised? It is easy to lament this as one of the ill effects of capitalism and perhaps more difficult to see how better recognition of the hazardous conditions of much digital work could be achieved.
It may be objected that most academic digital labour is not hazardous in the same way as content moderation. However true this may be, I think we should think more about the divisions of labour which lead some laborious tasks to be handed off to external contractors, to poorly paid lower-level members of staff, and to others. Instead of accepting that academics should primarily adopt a managerial role with respect to these other processes, we should be more interested in how never having to do the really boring routine stuff might feed obscurantist fantasies about how digital tools actually work. If we never see the work being done, and never participate in it ourselves, it is perhaps easier to believe ‘a robot did it.’ This makes it more difficult to reflect seriously on the combination of human and non-human agency through which the digital realm is maintained.
I wouldn’t compare the database we’ve created for this project with the major scholarly works described above. But I hope it might provide an example of how such resources can be created in-house, using the resources which are immediately available to us. Bumping up against those restrictions is a way to start thinking more critically about the human labour which digital work demands.
*I am not thinking exclusively of those who were promoted to do more interesting work, like Katherine Johnson, but the other calculators as well, who as David Allen Grier has argued were for a long time written out of history.
**Here’s Wired’s description of the experiences of one YouTube moderator: “‘If someone was uploading animal abuse, a lot of the time it was the person who did it. He was proud of that,’ Rob says. ‘And seeing it from the eyes of someone who was proud to do the fucked-up thing, rather than news reporting on the fucked-up thing—it just hurts you so much harder, for some reason. It just gives you a much darker view of humanity.’”
This post is about flying. More specifically, it’s about the connections between being a scholar and pressures to travel by aeroplane, and what might be involved in resisting those pressures. I believe this would be a kind of anticipatory action which would allow us to think more clearly about our identities. A lot of what I have to say is about the ways in which academic historians can find ways of working which might not involve flying. This is because history is what I know a bit about; other fields and other professions will have different needs and different workarounds, and will find their own ways of developing more robust and sustainable communities which rely less on international travel by plane.
It is based on conversations I’ve been having with friends and colleagues for a long time, and is also inspired by some arguments about international conferences which Jim Grozier makes in a recent article for the blog of the British Society for the History of Science. Noting advice that sustainable futures would require significant reductions in flight emissions, Jim writes:
Academics whose careers depend on global collaboration and discussion will clearly find that difficult. Scaling down international conferences in favour of smaller, nationally focussed events is at least a partial solution; so is greater use of technology to allow remote attendance, and even remote participation. The technology required to do this isn’t new: five years ago I sat in a conference room in Moscow and listened to a talk being given from Chicago by a colleague who could not make the trip. We were able to listen to his voice and see his visual presentation; later, we were able to ask questions and listen to the answers. Indeed, “all-electronic” conferences are now being held, and I recently heard of one which had managed to provide the necessary facilities without charging a fee to delegates. At the same time, we must acknowledge that remote participation does not provide all the benefits we normally expect from a conference; the informal “networking” element is missing.
I think this is right, regarding international conferences, and it’s worth extending further to think about other academic practices as well. It is quite easy to pathologise international conferences—one of the first things climate change deniers generally do when they’re complaining that climate scientists are hypocrites is to talk about how they fly all over the world. But as Jim suggests, finding alternatives to air travel can involve slightly different ways of working as well.
I last travelled by aeroplane about fifteen years ago, when I went to visit family in Stockholm. Shortly after this, a friend told me about a talk which she had attended by the government’s Chief Scientific Advisor, at which he had discussed the disproportionately destructive effects of air travel. I remember that at the time I had hoped to visit Japan, though I no longer remember exactly why. I found this news hard to take—it was an affront to my sense of what I thought I could want from my life, a narrowing of horizons. I remember wanting very much to reject it and then feeling an increasing sense of calm when I realised that I didn’t need to; that in the extremely narrow realm of my own actions, I could simply decide not to fly. And in the time since, I haven’t.
Maybe a decade ago, there seemed to be quite a lot of people I knew who had made a similar decision. My impression is that this number is now quite a lot smaller. I suspect that this mainly reflects a greater awareness of what the job market requires than we possessed in 2002. These days, among the people I know, talking about a decision not to fly is halfway between a boast and the admission of an embarrassing ailment. The embarrassment comes in part from the sense that this decision has no real practical effect. Individual decisions not to fly do not ground aeroplanes. Colleagues talk about having to go to international conferences, and about the appalling waste of travelling so far to see little more than the hotel and a conference centre. But the logic of working in universities requires us to meet colleagues from all over the world; and I’ve benefitted again and again from the wisdom and companionship of people who have flown in. Another friend, who delights in travel, offered to trade one of the long haul flights which they were planning to take so that I could take one—a transactional view which would have had exactly as much effect as my refusal to fly at all. I refused. I can’t see a way back through the decision which I made fifteen years ago: this is no longer a rational decision (if that’s ever what it was): it is now a habit, a reflex, a way of seeing the world. My recurring nightmares most often involve being made to drive a car, and being made to get in an aeroplane.
Although I believe the environmental motivations behind my decision not to fly are sincere, it is not hard to see how they could point towards a vicious and exclusionary politics. Many people need to fly—really, seriously, need—in order to see family, to return to their homelands, to go to work, to conduct negotiations, and so on. In the present political climate, there is a straight path between something like a refusal to fly and something like the prime minister’s denunciation of people who are “citizens of the world” as being “citizens of nowhere”. This aggressively anti-cosmopolitan politics offers fantasy reassurances to people deemed native subjects (“you belong here—we will look after you”) while terrorising people without those protections. I do not want to believe that my refusal to fly connects with this kind of nativism; in practice, though, I can see real risks of turning inward if we are less personally connected to other members of our scholarly communities in locations we can only reach by air.
At the same time, it is worth asking exactly why living a cosmopolitan identity requires us to travel by plane—and what ways there are of living openly towards the world, which don’t involve the destructiveness of air travel. For an academic (in my case, an academic historian), doing this involves unpicking some of the assumptions about what our work demands and finding different ways of working, and especially of building trusting relationships at a distance. It also involves recognising how a scholarly profession which is based around air travel, to which people have different levels of access, is characterised by a lot of inequalities. To put it crudely, most Ivy League professors are able to clock up a lot of air miles, while most people who work at other institutions can’t afford to do that. And scholars from wealthy institutions, mainly based in the Global North, who can afford, and are legally entitled, to travel the world have opportunities which scholars from less wealthy institutions, mainly based in the Global South, may lack. Perhaps developing strategies for working at a distance more effectively would be a way of supporting colleagues who will never be able to access the same resources as those in wealthier institutions.
Trying to do this is a good idea for a number of reasons—not least, that it is a way for scholars to feel less helpless. The circuit of international conferences, brief opportunities to visit institutions which can pay, and visiting archives all around the world can feel like a grind even when you do have time to do it. When you have serious caring responsibilities, whether for infirm parents, children, or others, it often becomes much harder to keep up. Scholars who are not able to find full-time academic appointments are similarly disadvantaged by a set-up which is based on networking and face-to-face contact. Obviously, it would take a lot more than reducing flying to effect serious change in this respect, but finding ways to work differently from less advantaged institutions can be a way of building different kinds of relationships.
For historians, some ways of working remotely have been fairly well explored already, while others are less clear. What we know about quite well is using proxy researchers to undertake research in archives which we cannot access ourselves. I saw a presentation a few days ago by the historian Sergey Radchenko, in which he talked about the limitations and strengths of this approach. It works well, he said, when we already have a fairly good idea of what documents we are looking for. It is usually more of a problem when we don’t already know what an archive contains—it is not a good way of doing exploratory work. Historians working with proxies are also unlikely to get a “feel” for an archive—the logics through which documents have been collected, the specificities of the ways in which materials are arranged, and so on. Lack of knowledge of these things can be a serious impediment to historical understanding. Finally, archival practices around the world differ significantly, and historians may need to develop trusting personal relationships with archivists. It is not clear that they would be able to do this, working through intermediaries.
An established problem of this kind of remote working is how credit is awarded. If a proxy researcher is the one doing the bulk of the archival labour, the work is theirs—but if they are doing so at the direction of another historian, how should credit be awarded appropriately? In particular, if researchers in wealthier institutions are collaborating with proxies without an institutional basis or who are based in poorer institutions, there seems a real risk of exploitative dynamics arising.
A second established part of archival practice in some locations is that historians are able to contact archivists and archivists send copies of materials directly—usually (but not always) for a fee. Again practice varies enormously, and the problems about lack of direct acquaintance with primary materials also persist. Moreover, it is possible to imagine a situation where archivists have to field large numbers of trivial or malicious requests, if such a practice were to become more prevalent.
This connects to another issue, about which I have seen less discussion: how we build trusting relationships with colleagues we have never met face to face. (Jim’s suggestion about local networking sessions goes some way towards this, but probably would not allow for close collaboration with people we haven’t met.) Colleagues often talk about using teleconference programs to work together, but also note that face to face meetings are essential. It is not easy to specify why meeting face to face makes such a difference, but everyone I have ever spoken to about this subject agrees that it does. It has to do with the greater range of communication—through body language, posture, touch, and so on—which is available in person, as well as the various ways in which communication tends to get lost in translation when we are writing, speaking on the phone, or talking over Skype. I would very much like to see research about how to improve trust in remote working.
If developing trust at a distance is a problem when working with colleagues in the same profession, who are likely to share something of the same worldview, it is likely to be an even greater difficulty when dealing with proxy researchers or other collaborators. I am aware of the very rich history of go-betweens in the construction of scientific knowledge, and that colleagues who are journalists or who work for NGOs have significant experience and established procedures for working with people they will not be able to meet, but with whom they need to collaborate. I hope that scholars can learn something from these other fields about working with trusted intermediaries. I am aware, however, that serious journalism and serious NGO work still usually require extensive meeting face to face, enabled by air travel.
I do not think that any of this is without risk—it could be that any increased focus on working remotely would simply reproduce existing inequities—but I would argue that it is worth trying to find a vision of scholarly life which is less dependent on highly destructive socio-technical systems. Doing this is a way to feel less helpless in the face of rapid and often destructive shifts within universities, and to understand how our material practices contribute to exclusions from scholarly communities. It cannot provide an answer for everyone—people will still need to fly, for all kinds of reasons. But at least some of us should try to develop more robust and resilient ways of working together remotely.
There are some obvious practical steps which would help with this. Some archives maintain directories of people who are willing to do proxy research for researchers who are unable to access collections themselves. Others are very generous in supporting researchers from overseas. It would be good to gather best practice in this area. Of course we will need to be careful that archivists are not overloaded with requests. Other equally practical questions will probably require rather more muddling through. But we will not get very far by simply rejecting the bits of the job, such as international conferences, which are widely regarded as tiresome and burdensome. We need alternatives for the parts which make us feel good about ourselves and our work as well.
Part of scholarly reflexivity should be to examine the things we do and the infrastructures they rely upon. We would not expect an argument to be entirely right the first time it is made—others will offer criticisms, denunciations, suggest alternatives. Through this scrutiny it will improve. It should be the same in trying different ways of working. We should want to develop lots of them and as far as possible explore their strengths and weaknesses in practice. If the name wasn’t already taken, you could call this “grounded theory”.
Author: Amy C. Chambers
Westworld was my favourite series of 2016. It presented a rich science fiction future that managed to be fresh and exciting despite being a remake based upon a 1973 movie by the same title. It had and continues to have lots of opportunities for developing exciting and prescient narrative that can be explored in what I hope will be a long running series. I was mesmerised from the opening credits, which I wrote about here. Westworld played around with time and I will have to rewatch all ten episodes as I attempt to distinguish between ‘fact’ and ‘fiction’, and past, present and/or future.
**The following post contains spoilers**