Just to recap, our project entailed taking Zoobook data and converting it to text. The text was then used to map the data points in ArcGIS. This plan was easier said than done, as we encountered a host of problems and roadblocks along the way. Read all about it here.
You can see my tutorial on my website: AwsomeBeasts
For the next week, we are planning on working hard on cleaning our data to the point that it is in a usable format for the mapping program. As the association we had planned to make between alumni location and major ended up not panning out, we are working on obtaining a replacement data set. For the new data, we are contacting the Archives and Athletics departments to ask for old sports team rosters.
By next weekend we should be able to input our data into the mapping software and examine the results.
Over the past week, we have delved deeper into the world of 3D modeling. We have explored programs such as CityEngine, with its procedural generation, and we have toyed with photogrammetry programs such as Photoscan. After experiencing these two new types of modeling and recalling experiences with simpler programs such as SketchUp, a few quick questions come to mind. What are the pros and cons of each of these options? What is the best use of each of these different approaches?
Procedural modeling is best used for the creation of cities and towns in which high detail is not required. All one needs to do is create a starting rule set that instructs the software on how to model each of the points provided. Once the program has been created (which may take some time), all one needs to do is enter the data, and the software does the work.
This is good when you need to model a lot of objects with relatively similar appearance.
Lack of detail: all the buildings come out looking the same and relatively flat in appearance, with a uniform structure caused by the modeling engine. This approach is bad at fine detail and historical accuracy.
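CityEngine drives this process with its own CGA rule language, but the core idea, one shared rule applied to every lot with only parameters varying, can be sketched in plain Python. All the numbers and field names below are made up for illustration; this is a toy, not how CityEngine actually works internally:

```python
import random

def generate_building(lot_width, lot_depth, seed=None):
    """Apply one shared rule to a lot: pick a height, derive floors.

    Every lot goes through the same rule, which is why procedurally
    generated cities look uniform -- only the parameters vary,
    never the rule itself.
    """
    rng = random.Random(seed)
    height = round(rng.uniform(6.0, 30.0), 1)   # building height in meters
    floors = int(height // 3)                   # assume roughly 3 m per floor
    return {"width": lot_width, "depth": lot_depth,
            "height": height, "floors": floors}

# One rule, many buildings: a whole "city" comes from a single loop.
city = [generate_building(10, 12, seed=i) for i in range(5)]
for building in city:
    print(building)
```

This also makes the uniformity complaint above concrete: every building is a box from the same function, so variety comes only from the random height parameter.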
Easy to create: this process produces a highly realistic and authentic replica of the original item. The system is also easy and fast to use, making it fantastic for museum collections or other objects with easy access.
Only usable if you have unobstructed access to the object. It does not work for items that no longer exist or that you wish to reproduce in a different state than they are in now.
Authentic replication of a product. The original object does not need to still exist.
Time-consuming. Can take days to create even one project in high detail.
Examination of an existing project: Marie Saldana
Marie Saldana’s Rome project is a fantastic example of procedural modeling from computer code. Through this project, the Roman city can be viewed in full and appreciated in awe as a whole. The sense of scale expressed in this project is also impressive. Where the project falls short, however, is in its detail: most of the buildings feel flat and without variation. This flatness was to be expected, as everything was modeled by a computer using the same code for each building.
For this week’s part of the project, our plan was to gain access to the data that we will use in the following weeks to create our map. The first step of the data-gathering process was to decide on and narrow down what type of data we were hoping to display. After some discussion, we settled on gathering information on the following variables: Name, Year, Location (State, City, Town), High School, Major, and Gender.
At the moment, we are running into two large problems, each related to one of the two sources we are planning to use for our data.
i) Alumni Directory:
While all the information here is easily available in digitized form, it is not all located in one condensed location, meaning that to go through and transcribe all the information by hand would be overly time-consuming.
ii) Zoobook:
The problem with the Zoobook is the direct inverse of the one we had with the Directory: while all our information is in one place, it is stored in a separate, unusable PDF format.
We are planning on using a PDF-to-text converter to make the Zoobooks usable. After we have the information in text format, we are going to use a self-written Python script to reformat and import the massive text blocks (created by the converter) into an Excel spreadsheet.
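Our actual script depends on the Zoobook layout, but the reshaping idea can be sketched in a few lines of Python. The sample text and field names below are invented for illustration; the real converter output is far messier:

```python
import csv
import io

# Hypothetical sample of the converter's output: one entry per line,
# fields separated by semicolons. The real Zoobook text blocks differ,
# but the reshaping idea is the same.
raw = """Jane Doe; 1998; Northfield; MN; Biology
John Smith; 2001; Duluth; MN; History"""

rows = []
for line in raw.splitlines():
    if not line.strip():
        continue                        # skip blank lines the converter emits
    rows.append([field.strip() for field in line.split(";")])

# Write a CSV file that Excel opens directly; StringIO stands in here
# for a real open("zoobook.csv", "w", newline="") file handle.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Name", "Year", "City", "State", "Major"])
writer.writerows(rows)
print(out.getvalue())
```

The heavy lifting in practice is the parsing step: deciding where one alumni entry ends and the next begins once the PDF converter has flattened everything into a wall of text.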
We are expecting to be a little behind. However, as long as we get all our data over the course of the week, then we should be back on track for data scrubbing and formatting next weekend.
For my building, I chose Skinner Chapel because it is one of the most iconic landmarks on campus, and I thought that modeling it would be an interesting endeavor.
I started by grabbing the Google Earth image for the building via “Geo-location” in SketchUp. Once I had the base image, I drew the building’s basic exterior structure on top of it and extruded the building upward. With a rough exterior model on which to overlay the pictures, I could begin inserting the images I had found in the Archives database.
After spending a good chunk of time overlaying and adjusting the model and images, I realized that the back side of the building was largely uncovered, so I imported some pictures from the general web to complete the creation.
After some good tweaking, I ended up with a somewhat cohesive model; however, I can definitely understand why this technique is better suited for quantity over quality.
When processing data, the foremost and maybe even the central choice one needs to make for data entry and management is the choice between a flat database and a relational database. Both of these options have pros and cons, which in turn grow or lessen depending on how you intend to use your data.
First, however, some definitions: flat databases are built from a single file, a single table. Relational databases are built from multiple files or tables that relate to each other via special key fields – hence the term, “relational.”
Extremely easy to start using: anyone with access to a computer is likely to have some variety of spreadsheet program, be it Excel for Windows users or Numbers for macOS. Even if both of these options are unavailable, Google Sheets is free for everyone with a Google account. All of these options can be opened almost immediately upon starting up one’s computer, allowing data entry to start promptly.
User-friendly: in all the options listed above, as soon as the program has started, a table is easily constructed, with data entry being self-explanatory. New headers and columns can easily be added and manipulated in these programs, making the initial setup a breeze. For relatively little work, you have a table that is easily readable and cohesive.
Bad at handling large amounts of data: most of the problems with flat data emerge as the spreadsheet grows larger. I estimate that about 200+ entries is the cutoff point at which one should begin thinking about switching to another method (I can’t imagine it working above 1,000 entries) – see why below.
Starts looking sloppy: if you go flat, the more data you enter into the spreadsheet, the more variation and duplication you are likely to see in that data. As a result, your spreadsheet will contain columns of different lengths and widths, and the various categories for each data point further complicate spreadsheet filing. Sloppiness may lead to increased error rates.
Rigid: flat data and spreadsheets are often quite rigid and time-consuming to manipulate, especially as more information is entered. If a mistake is found late in the data entry process, or if one wishes to add or remove a category, complications can emerge.
Over the internet: most relational databases will, at minimum, have internet compatibility or exist as web programs. It is also possible to establish permission rights for various parts of the data, such as view-only or edit-only access to particular sections.
This option allows for easy collaboration or sharing of data.
Logical categories/subcategories: there is a large amount of compartmentalization, with each value existing under its own header, which itself sits under a larger parent category, keeping all the information organized.
Easy manipulation and searchability: once all the data has been successfully entered into the relational database, manipulation becomes a breeze. Because almost every value is separately compartmentalized, it is extremely easy to find the value or group of values one is searching for and then to move or reposition them in a new location.
Harder to get working and somewhat of a general pain: when one first opens the program, it is a pain to design all the separate headers and categories under which to store the information. I attempted to recreate the system described in Stephen Ramsey’s article; however, initially learning to use the program proved difficult, not to mention that if one were to do it from scratch, a clear picture of the goals would be necessary.
Conclusive thoughts: when choosing a database format, the most important thing to consider is its purpose. If you are collecting large amounts of data under various categories and groupings, hoping to then compile all the information, then a flat spreadsheet is not the way to go; a relational database will likely give you more bang for your buck. However, if the project is a quick analysis of 200-300 points with under 5 groupings, then flat data will probably save you a lot of time.
Other Issues to keep an eye out for:
Plan – Know what you want. Know where you’re going: It is always good to have a rough outline planned for how much data you are going to collect and your plans for processing the information before you get started. This planning can help you avoid problems down the road, such as selecting the wrong database format.
What are the Categories?: Know the categories and groups under which you sort the information before you begin entering it. This preparation will allow for less confusion and avoid redundancy.
Consistency is Key: Be consistent in how your data is entered and in your project as a whole. The more consistent you can be, the cleaner the final product; the nicer the information you can draw from it.
I was unsure about where to post the newest blog, so I ended up posting it on the new web pages we had created. It can be seen here.
Sorry about the inconvenience.
The digital project that I chose to reverse engineer was the Mapping Occupation project by Gregory Downs and Scott Nesbit. This digital humanities project,
“captures the regions where the United States Army could effectively act as an occupying force in the Reconstruction South… it presents the basic nuts-and-bolts facts about the Army’s presence, movements that are central to understanding the occupation of the South.”(1)
When entering the website, one is faced with a large map on the right and a navigation/information bar on the left. There are two options for exploring the data and information presented. The first method is exploratory, allowing the viewer to freely zoom in and out on the large exploratory map that fills 80% of the screen, adjusting the displayed information relative to date. The information that can be adjusted includes total troops, cavalry, black troops, zones of access, occupation relative to railroads, and voting data. This approach allows for a free examination of the data. The second approach is more structured than the first, featuring a narrative path through the site. This approach tells the story of army occupation from near the end of the war into the 1870s, allowing for an analysis of the data that draws on politics and culture to add contrast.
In describing how its data and analysis were produced, the website outlines two inquiries. The first was the collection of the data that forms the backbone of the website, such as the location, number, and time that US troops occupied certain areas. These data were collected from the final monthly military reports. When information was unavailable for a particular month, data from other reports made in that time period were used; conflicts were resolved by using the most detailed report. Likely sources of error and a more detailed description of the data compilation are available on the website.
The second inquiry was a little more complex to conduct, as it involved defining the likely zones of occupation and access. These zones were calculated by extrapolating from the troops’ station data. Occupation zones were found by using the distance that a unit of cavalry or infantry could travel in a day’s walk or a 6-hour train ride. Zones of access were formed by “places from which freedmen might reasonably travel to a military outpost.”(2) These, however, were more constrained, as the former slaves could generally only travel by foot, not owning horses and with train access restricted. All the data are available for download on the website.
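The project built its zones with GIS tools over road and rail networks, but the underlying idea, a travel-distance threshold around each outpost, can be illustrated with a toy Python check. The coordinates and daily-travel figures below are invented for illustration, not taken from the project:

```python
import math

def within_zone(outpost, point, max_travel_km):
    """Toy check: treat a zone as a simple circle of straight-line
    travel distance around an outpost. (The real project computed its
    zones over actual road and rail networks; this only illustrates
    the distance-threshold idea.)
    """
    dx = outpost[0] - point[0]
    dy = outpost[1] - point[1]
    return math.hypot(dx, dy) <= max_travel_km

# Hypothetical figures: a unit might cover ~30 km in a day's walk,
# while a freedman traveling only on foot might manage ~20 km.
outpost = (0.0, 0.0)     # station coordinates, in km on a flat grid
farm = (15.0, 15.0)      # a point roughly 21 km away
print(within_zone(outpost, farm, 30))   # inside the occupation zone
print(within_zone(outpost, farm, 20))   # outside the zone of access
```

The same point can thus sit inside the army's occupation zone yet outside the zone of access, which is exactly the asymmetry the project maps.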
My main annoyance with the website is its separation of numerical data and narrative data. The map does a nice job of presenting the large amounts of numerical data visually; however, I find myself getting annoyed when attempting to learn about a specific location. When I click on a particular troop station, it would be functional and user-friendly if an info window appeared with troop numbers and maybe even a short blurb on the current situation. For example, what is that one unit of some 5,000+ men doing in the middle of the ocean in the Gulf of Guinea? Instead, if I wish to discover more on the narrative front, I need to read through long paragraphs of white text on a black background, which is difficult to read for extended time on a computer. This brings me to my secondary annoyance: the black-and-white color scheme becomes difficult to navigate after a few minutes. A white/grayscale scheme with sparing amounts of black would have been more appealing.
In summary, I found the website informative and can see its value under the right circumstances, say for a Civil War analysis or for information on racial tensions in the post-war US. The website could use some fine-tuning in map navigation and interface; however, I’m just nitpicking. In the end, it provided an interesting interface through which to study the US troop occupation of the South.
I learned from a friend that this Carleton course, “Hacking the Humanities,” was supposedly lots of fun, featuring a plethora of digital technologies. Since my friend told me about it, I had been looking forward to taking the course. As such, when the first day of class was postponed, I was deeply disappointed and redirected my sorrow to the internet. I googled the syllabus to find the first homework assignment and get a taste of the class, hoping that the 2017 course matched its previous direction.
Upon discovering that the first assignment revolved around mostly free software, I downloaded the software and went about teaching myself how to use it. In retrospect, this is where things get kind of funny, because without the in-class introduction to the program, completing even simple actions was difficult. For example, on the first house I tried to design, I attempted to draw a box using the line tool in 3D space, resulting in a mutated box. It was only after 30 minutes and some YouTube videos that I discovered the whole process could be made easy by using the rectangle tool and then extruding the shape into 3D space.
With the help of YouTube’s faculty of guidance, drawing became easier, and my house started to take form. It is at this point that I would like to mention both the amazing benefits and the shortfalls of YouTube videos: while the videos taught me how to draw complicated shapes and gave tips on good design, they skipped the simpler things. For example, two and a half hours into the process, with the help of the tutorial videos, my house looked amazing, featuring nicely indented windows with multiple layering and a varied roofline. It was at this point that what the videos did not teach me came back to haunt my construction: the “pan” ability. To be clear, I knew how to pan via the pan icon on the screen and rotate the image with the mouse. However, I did not know about the shift-click pan hotkey; thus, since panning took a large amount of effort, I used it as little as possible. This meant that I did not often view the house from a variety of angles, and as anyone who has modeled before surely knows, that is not a good thing. The end effect of this mistake was that my house was slightly crooked, necessitating that I start anew. Perhaps the friendly neighborhood Cubist would have been fine with it, but who wants to live in a Picasso-esque dwelling?
After a few more hours of work, with backtracking and remodeling, I arrived at something close to a final product. I then painted the house, using a combination of provided and custom texture files, and rescaled it. At this point, there were only two problems left: one of the house’s faces would not let me indent the window, and another insisted that it was divided into two faces even though there was no dividing line.
The problem with the window was peculiar: I originally thought the problem was in how I drew the exterior frame. However, after redrawing it multiple times and attempting to move or redefine the object, nothing changed. Finally, I gave up on the window and simply removed the entire face of the house to redraw it, which ended up working. I assume what had happened was that multiple layers overlapped in that section (which I noticed while removing it), causing the software to get confused.
The invisible dividing line problem was fixed only by tenacious trial and error – i.e., luck and the fortune of the gods – as I tried everything from the eraser tool to removing that portion of the house, all to no avail. The program continued to insist on cutting the face diagonally down from the top right to the bottom left. In the end, I discovered what was causing the problem while attempting to export the file: I stumbled upon a button entitled “Unhide.” I wish all problems in life had an unhide button. The irony of life is that the unhide button is often hidden. Upon clicking the option and selecting “All,” a line running nearly the full length of my construction was revealed, having been hidden before.
For new users I would like to share two tips:
The first is aesthetic. Tip One:
When modeling, I found that the more small indentations one can make, the better the end appearance. For example, when adding windows, don’t just draw one in: extrude it inward to represent the inset in the building, and then add a small frame on the sides that is also indented, creating a nice layered image. See the window above. I went into similar detail on the doors.
The second – Tip Two:
Use the “Pan” and movement functions as often as you can!
Pan: Shift + Left mouse button
Rotate: Center mouse button
Pivot: Center mouse button + Ctrl