8A: Network Analysis

The advent of the internet, and especially of its more socially connected Web 2.0 variant, has ushered in a golden age for the concept of the network.  The interconnected world we now live in has changed not only the way we study computers and the internet, but the very way we envision the world and humanity’s place in it, as Thomas Fisher has argued.  The digital technologies that we are learning to use in this class are tightly linked to these new understandings, making network analysis a powerful addition to the Digital Humanist’s toolkit.  According to Fisher,

The increasingly weblike way of seeing the world … has profound implications for how and in what form we will seek information. The printed book offers us a linear way of doing so. We begin at the beginning—or maybe at the end, with the index—and work forward or backward through a book, or at least parts of it, to find the information we need. Digital media, in contrast, operate in networked ways, with hyperlinked texts taking us in multiple directions, social media placing us in multiple communities, and geographic information systems arranging data in multiple layers. No one starting place, relationship, or layer has privilege over any other in such a world.

To study this world, it can therefore be helpful to privilege not the people, places, ideas or things that have traditionally occupied humanistic scholarship, but the relationships between them.  Network analysis, at root, is the study of the relationships between discrete objects, which are represented as graphs of nodes or vertices (the things) and edges (the relationships between those things).  This is a very active area of research that emerged from mathematics but is being explored in a wide array of disciplines, resulting in a vast literature.  (Scott Weingart offers a gentle introduction for the non-tech-savvy in his Networks Demystified series, and you can get a sense of the scope from the Wikipedia entry on Network Theory.)  As hackers, we are not going to get too deep into the mathematical underpinnings and will rely mostly on software platforms that make network visualization relatively easy, but it is important to have a basic understanding of what these visualizations actually mean in order to use them critically and interpret them correctly.
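
As a minimal sketch of these terms, a small graph can be written as a list of edges, with the nodes implied by the names that appear in them.  The names below are invented for illustration, and counting the edges that touch each node (its degree) is only one simple measure among the many that network analysis offers.

```python
from collections import Counter

# Hypothetical edges: each pair links two nodes (people) in an undirected graph.
edges = [
    ("Ana", "Ben"),
    ("Ana", "Cara"),
    ("Ana", "Dev"),
    ("Ben", "Cara"),
]

def nodes_of(edge_list):
    """Return the set of nodes that appear in any edge."""
    return {name for edge in edge_list for name in edge}

def degree(edge_list):
    """Count how many edges touch each node."""
    counts = Counter()
    for a, b in edge_list:
        counts[a] += 1
        counts[b] += 1
    return counts

print(sorted(nodes_of(edges)))
print(degree(edges).most_common(1))  # the best-connected node and its degree
```

Storing only the edges keeps the focus on relationships: the list of people falls out of the list of connections, not the other way around.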

 


Exercise: Your (analog) social network

The basics of visualizing a network are fairly intuitive and can be done with pen and paper.

  • Draw a simple diagram of your own social network including
    • 10-12 people as nodes and
    • your relationship to them as edges
  • Put yourself at the center and then place other people around you.
    • Start with your immediate family (your kinship network) and then expand out to include extended family, friends, people you know through clubs or activities, etc.
  • Draw lines to connect these people to yourself
  • Now draw lines to connect them to each other.
    • How many have relationships that do not run through you?
  • As undifferentiated lines, these are probably not very informative, so code the lines to indicate the nature of each relationship

What takeaways emerge from your diagram?

Are there connections that surprised you or figures that emerge as more central to your network than you had realized?

  • Swap diagrams with your neighbor and see if the diagram helps you understand their network more easily.

 


Exercise: Your (digital) social network

The relationships you just drew can be expressed in a simple data model as a “triple” composed of a subject, a predicate, and an object.  My relationship to my friend Chris, for instance, can be expressed as a triple in the following format:

Austin — is friends with — Chris

subject — predicate — object

Each relationship in your whole network can be represented this way as a set of triples that allow for easily readable data storage and ready network visualization.  Many DH projects make extensive use of the RDF (Resource Description Framework) specifications for modeling large sets of data as an RDF graph of triples.  For our small example, we are going to recast our personal network as a set of triples and visualize it as a digital network using Google’s Fusion Tables application.
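
To make the triple model concrete, here is a rough sketch that stores triples as plain Python tuples and groups them by predicate.  It does not use an RDF library or RDF syntax; the first triple comes from the example above, and the other names are hypothetical.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object).
# The first is the example from the text; the rest are hypothetical.
triples = [
    ("Austin", "is friends with", "Chris"),
    ("Austin", "is a sibling of", "Jo"),
    ("Jo", "is friends with", "Chris"),
]

def group_by_predicate(rows):
    """Map each predicate to the (subject, object) pairs it connects."""
    grouped = defaultdict(list)
    for subject, predicate, obj in rows:
        grouped[predicate].append((subject, obj))
    return dict(grouped)

for predicate, pairs in group_by_predicate(triples).items():
    print(predicate, pairs)
```

Grouping by predicate is one way to see which kinds of relationship dominate a network before visualizing it.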

As we’ve already seen, Fusion Tables is an experimental platform for data visualization that Google has developed to allow spreadsheet data to be quickly visualized in any number of ways, from traditional bar and line charts to maps and network visualizations.  Google launched its first MOOC around Fusion Tables a while back, called Making Sense of Data, which you can still view if you want an in-depth look at how to use all the features of this application.  For now, we are going to focus on its Network Graph capabilities.

Our first step will be to populate a Google Sheet with triples representing our own network data, and then import it into Fusion Tables and visualize it.

  • Launch Google Drive and create a new sheet with the following three columns: Person A, Relation, and Person B
    • Go through your hand-drawn diagram and translate each network relationship into a triple following the model above

(One word of caution: there are two types of relationship that can be expressed here, mutual and unreciprocated.  “Is friends with” or “is a sibling of” would be mutual relationships that produce an undirected graph.  Directed graphs map one-sided relationships like “is the parent of,” “is the student of” or “is in love with” by drawing a directional arrow for the edge.  Both are possible and can be used, but you should be aware of the distinction as you draw up your triples and stick to one or the other.)
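
The difference between the two kinds of graph can be sketched in a few lines.  In this illustrative example (the function name and the person "Pat" are assumptions, not part of any tool used in class), an undirected edge is recorded in both directions, while a directed edge is recorded only from source to target.

```python
def neighbors(edge_list, directed):
    """Build an adjacency map from (source, target) pairs."""
    adjacency = {}
    for source, target in edge_list:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())
        if not directed:
            # A mutual relationship is reachable from either side.
            adjacency[target].add(source)
    return adjacency

friendships = [("Austin", "Chris")]  # "is friends with": mutual
parenthood = [("Pat", "Austin")]     # "is the parent of": one-sided, hypothetical

print(neighbors(friendships, directed=False))
print(neighbors(parenthood, directed=True))
```

Mixing the two kinds in one table makes a graph hard to read, which is why it helps to pick one convention before entering data.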

This data model is unlike a relational database in that you will be repeating names in order to express all of the relationships in the graph.

  • Try to connect each person or node with at least two others
  • Make sure you are logged in and save your sheet
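
If you would rather prepare the table outside of a spreadsheet, the same three-column layout can be sketched as CSV text.  The column names follow the steps above; the rows are hypothetical, and the helper function is only an illustration, not part of Google Sheets or Fusion Tables.

```python
import csv
import io

HEADER = ["Person A", "Relation", "Person B"]

def triples_to_csv(rows):
    """Return CSV text with one triple per row under the three-column header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()

# Names repeat across rows because each row records one relationship.
rows = [
    ("Austin", "is friends with", "Chris"),
    ("Austin", "is a sibling of", "Jo"),
    ("Jo", "is friends with", "Chris"),
]

print(triples_to_csv(rows))
```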

Import your data into Fusion Tables

Go to the Fusion Tables start page, click on Google Spreadsheets and import your data, checking the Export box if you wish to make the data public and downloadable.

  • A window should open showing your data table.  Add a new chart by clicking the red plus sign and choosing the “Network graph” type, then change the options to show the link between your Person A and Person B columns.
  • Congratulations!  You have just made a graph of your social network.  Explore the limited options and apply some filters, then click and drag around the graph to see how you can change the visualization.

 


 

Assignment

Now that you know the basics of what a network graph is and how to create a rudimentary one, let’s explore some much more sophisticated network analysis DH projects.  With your group, explore one or more of the following projects:

As you explore the project, consider the following questions about the nature of this network analysis:

  • What (or who) are the nodes and what are the edges?
  • How are the relationships characterized and categorized?
  • What interactions does the project allow?
    • How does this impact their effectiveness and/or your engagement?
  • How was the project created?
    • See if you can dig around in the documentation and discover what tools or data manipulation steps produced the outcome you see.
    • Does the project combine network analysis with any other information or technique, like spatial analysis or text mining?

 

4A: Spatial Humanities: GIS/Mapping 101

For the next week we will be exploring the spatial humanities — a vibrant and increasingly popular area of digital humanities research.   Humanities scholarship is currently undergoing a “spatial turn” akin to the quantitative, linguistic and cultural “turns” of previous decades, and many are arguing that the widespread adoption of Geographic Information Systems (GIS) technology and user-friendly neogeography tools are fundamentally reshaping the practice of history and other disciplines.  Yet while these powerful computer tools are certainly new, the mode of thinking “spatially” is not unprecedented, and may in fact be seen as a move away from the universalizing tendencies of modern western scholarship towards more traditional understandings of the lived experience of place, emphasizing the importance of the local context.

In practice, much of this scholarship involves creating maps — an act that is not without controversy.  Maps are conventional representations of space that come laden with the embedded cultural worldviews of their makers.  Maps are also highly simplified documents that often paper over contested or fuzzy boundaries with firm lines; it is hard to express ambiguity with maps, but it is very easy to lie with them.  The familiarity of widespread tools like Google Maps and Google Earth might fool us into thinking these are unproblematic representations of space, but it must be remembered that all maps contain embedded assumptions and cannot be taken at face value. Maps produced in the course of humanities scholarship are not just illustrations but arguments, and they must be read with the same level of critical analysis that you would apply to articles or monographs.

(For more concrete suggestions along these lines, see Humanizing Maps: An Interview with Johanna Drucker.)


Example 1

One area of historical research that saw an early adoption of GIS is economic land use.  A good example is Michael McCormick’s book on the Origins of the European Economy, which layered many different types of evidence against each other in a GIS to argue for a much earlier origin to Europe’s medieval economy than had been accepted previously.  McCormick has since made his database publicly available and continues to add to it with collaborators at Harvard, as the Digital Atlas of Roman and Medieval Civilization.


The DARMC provides a rich resource and a good introduction to the potential of GIS to reveal patterns and connections through the spatial layering of disparate datasets.  It also offers a good orientation to the basic layout of most GIS systems, with a map view window on the right and a list of layers on the left that can be turned on and off.

  • Explore the DARMC.
    • What layers have been included?
    • What patterns show up when you juxtapose cultural, environmental and economic data in this way?
    • What connections do you see?
  • Also take the opportunity to explore the measurement tools at the top of the window to interrogate the spatial attributes of the data.
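
To get a feel for what a distance measurement involves, here is a minimal sketch of the haversine formula, a common way to approximate the great-circle distance between two latitude/longitude points.  It is not necessarily how the DARMC tools calculate distance, and the coordinates below are rough values chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; treats the globe as a sphere

def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Rough coordinates (latitude, longitude), for illustration only.
rome = (41.9, 12.5)
constantinople = (41.0, 28.98)

print(round(haversine_km(*rome, *constantinople)), "km")
```

A straight-line figure like this ignores roads, sea routes and terrain, which is worth keeping in mind when reading distances off any map.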

Example 2

The quantitative data compiled in projects like the DARMC can help address many historical problems, but they don’t necessarily answer more qualitative research questions concerned with the lived experience of the past.  For this objective, we must move beyond bird’s-eye-view 2D maps of spatial distributions and attempt to visualize particular places at particular moments in time.  Such “geovisualization,” the digital reconstruction of past landscapes, is another booming area of scholarship that allows us to virtually experience a place as it might have been, and also has the potential to answer important scholarly questions.

Anne Kelly Knowles’ digital reconstruction of the Battle of Gettysburg is an excellent example of this potential that uses a combination of digitized information from historical maps, documentary accounts and environmental data on the physical geography of the battlefield to answer the question of what the generals could see during the battle and how those sightlines influenced their decision making.

  • Read the brief introductory article at Smithsonian magazine and then explore the “story map” in detail.
    • How does the map combine geographic and temporal information?
    • Does it effectively give you a sense of the experience of being on the battlefield?
    • What does this reconstruction offer that more traditional publications could not?
    • What could be improved in the representation?
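
The idea of a sightline can be illustrated with a deliberately simplified, one-dimensional sketch: given ground heights along a single line, which points can an observer at the start see?  This is not the method used in the Gettysburg project; the elevation profile and the observer height are assumptions made up for the example.

```python
def visible(elevations, observer_height=1.7):
    """Return one boolean per cell: can an observer standing at index 0 see it?

    elevations: ground heights in metres, sampled at equal spacing along a line.
    """
    eye = elevations[0] + observer_height
    result = [True]  # the observer's own position
    max_slope = float("-inf")
    for distance, ground in enumerate(elevations[1:], start=1):
        # Slope of the line from the observer's eye to this patch of ground.
        slope = (ground - eye) / distance
        # A cell is hidden if nearer terrain already rises at a steeper angle.
        result.append(slope >= max_slope)
        max_slope = max(max_slope, slope)
    return result

profile = [100, 101, 105, 103, 104, 112]  # hypothetical heights in metres
print(visible(profile))
```

In this profile the rise at the third cell hides the two cells behind it, while the higher ground at the end is visible again.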

 


Group Exercise: The Varieties of Maps

(Exercise borrowed from Lincoln Mullen)

The next step is to become familiar with as wide a variety of maps as possible, including digital maps and analog, maps that have been made by scholars and maps that have not. Below is a list of online mapping projects.

In a group, pick three projects from the list to explore and compare. Your aim is to gain familiarity with projects involving maps and mapmaking, both by scholars and on the web generally.

As you look through these projects, consider the following questions or prompts.

  • Create a taxonomy of maps. What categories do these maps fit into? You might consider the purposes of the maps, their audience, their interfaces, among other axes of comparison.
  • What is the grammar of mapping? In other words, what are the typical symbols that mapmakers use, and how can they be put in relation to one another?
  • Which maps stood out to you as especially good or clear? Why?
  • Which maps were the worst? What made them bad?
  • How do scholarly maps differ from non-scholarly maps?
  • What kind of data is amenable to mapping? What kinds of topics?
  • What accompanies maps? Who controls their interpretation? What is their role in making an argument?
  • How do recent web maps compare to maps made online in the past few years? How can maps be made sustainable?
  • Which of these maps are in your discipline? Which maps might be helpful models for your discipline?

 


Exercise (Georeferencing)

In order to reconstruct past landscapes like the Gettysburg battlefield, the first step is often digitizing the data recorded in a historic map by georeferencing (or georectifying) that image, that is, aligning the historical map or image with its location on the earth in a known coordinate system.  There are many ways to do this, but we will start with a cloud-based solution requiring no complex software.

The David Rumsey Map Collection is a vast archive of scanned historic maps, mostly covering North and South America.  The collection has enabled a crowdsourcing technique to get the public to help georeference these images for use in GIS applications, but the David Rumsey Georeferencer is very buggy and not very accurate.

Instead we will use the MapWarper online tool to rectify an historic map from an online collection of scanned images.


  • Go to Boston Public Library’s Norman B. Leventhal Map Center
    • Search for and download a historic map that covers an area of interest
  • Then follow these instructions from Lincoln Mullen to Georectify a map with MapWarper
    • Use their tool to set control points (at least 5 are recommended), clip the map area, and rectify the map
    • When you are finished, you will see the map overlaid on a basemap of the world.
  • You can download your newly rectified map as a KML file by going to the Export tab
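
To see what control points do, here is a rough sketch that fits a simple affine transform (scale, rotation, shear and shift) from pixel positions to longitude and latitude using least squares.  It is an assumption for illustration and not a description of how MapWarper rectifies images; the control points are hypothetical, and real tools often offer more flexible transforms.

```python
def solve3(matrix, vector):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(matrix, vector)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def fit_affine(control_points):
    """Least-squares fit of lon = a*x + b*y + c and lat = d*x + e*y + f.

    control_points: list of ((pixel_x, pixel_y), (lon, lat)) pairs.
    Needs at least three points that do not lie on one line.
    """
    rows = [(px, py, 1.0) for (px, py), _ in control_points]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coefficients = []
    for k in range(2):  # k = 0 fits longitude, k = 1 fits latitude
        atb = [sum(r[i] * world[k] for r, (_, world) in zip(rows, control_points))
               for i in range(3)]
        coefficients.append(solve3(ata, atb))
    return coefficients

def pixel_to_world(coefficients, px, py):
    """Apply the fitted transform to one pixel position."""
    lon_c, lat_c = coefficients
    return (lon_c[0] * px + lon_c[1] * py + lon_c[2],
            lat_c[0] * px + lat_c[1] * py + lat_c[2])

# Hypothetical control points: (pixel x, pixel y) matched to (lon, lat).
points = [
    ((0, 0), (-71.10, 42.40)),
    ((1000, 0), (-71.00, 42.40)),
    ((0, 1000), (-71.10, 42.30)),
    ((1000, 1000), (-71.00, 42.30)),
    ((500, 500), (-71.05, 42.35)),
]

print(pixel_to_world(fit_affine(points), 250, 750))
```

Using more control points than the minimum lets small placement errors average out, which is one reason at least five are recommended.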

Exercise (Digitizing)

Now that you have coordinates for your image, you can bring it into a GIS program, align it with other spatially-aware data, and digitize the information by creating vector geometry from it.  We’re going to do that in Google Earth as a first pass, since it is free, widely available and often people’s first introduction to using a GIS.

  • Open Google Earth and load the newly rectified map by choosing File > Open from the menu and navigating to your downloaded KML file.
  • Alternatively, double-clicking the file should launch Google Earth and show you the map aligned over the modern satellite imagery.


  • Now use the Add Placemark, Add Polygon, and Add Path tools in the top menu to digitize features from your georeferenced map.
  • For instance you could pinpoint the old courthouse (a placemark, or point), trace the old shoreline (a path, or line feature), or trace the outline of a neighborhood (a polygon).
    • As you create features, the Get Info window lets you add metadata, change the symbols, and set a camera position to save alongside the feature.
  • Finally you can right click your newly created features and save them as KML or KMZ files (a zipped version of KML) for use in other programs.
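
It can help to see roughly what such a KML file contains.  The sketch below builds a minimal KML document by hand with one point, one path and one polygon; the helper functions, feature names and coordinates are all hypothetical, and files saved from Google Earth will include additional styling and metadata.

```python
from xml.sax.saxutils import escape

def coords(points):
    """KML writes coordinates as lon,lat pairs separated by spaces."""
    return " ".join(f"{lon},{lat}" for lon, lat in points)

def point(lon, lat):
    return f"<Point><coordinates>{lon},{lat}</coordinates></Point>"

def path(points):
    return f"<LineString><coordinates>{coords(points)}</coordinates></LineString>"

def polygon(ring):
    # A polygon's outer ring must end on the point where it started.
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    return ("<Polygon><outerBoundaryIs><LinearRing><coordinates>"
            f"{coords(ring)}</coordinates></LinearRing></outerBoundaryIs></Polygon>")

def placemark(name, geometry_xml):
    """Wrap a geometry in a named Placemark element."""
    return f"<Placemark><name>{escape(name)}</name>{geometry_xml}</Placemark>"

def kml_document(placemarks):
    body = "".join(placemarks)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<kml xmlns="http://www.opengis.net/kml/2.2">'
            f"<Document>{body}</Document></kml>")

# Hypothetical features traced from a georeferenced map.
doc = kml_document([
    placemark("Old courthouse", point(-71.06, 42.36)),
    placemark("Old shoreline", path([(-71.05, 42.35), (-71.04, 42.36)])),
    placemark("Neighborhood", polygon([(-71.07, 42.35), (-71.06, 42.35), (-71.06, 42.36)])),
])

print(doc)
```

The three geometry types correspond to the placemark, path and polygon tools described above, which is why features digitized in Google Earth can be reused in other GIS programs.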

 


Resources

Lincoln Mullen of the Roy Rosenzweig Center for History and New Media at George Mason University has developed a fantastic resource for getting started with mapping for the humanities.

The Spatial Humanities Workshop site he developed will give you a detailed introduction to the different types of maps you might want to make as a digital humanist, the software and libraries that are out there to use, and, most importantly, the academic issues and theoretical questions that are raised by mapping humanities data in a digital space.