MacKay Data Analysis

From Norsemathology
Revision as of 17:02, 8 September 2016 by Madison (talk | contribs)

Small Multiples

These are small multiples of the graphic that MacKay used in his 1901 report on phenology, featuring the summarized variables that MacKay chose:

  • Mayflower,
  • Strawberry,
  • Apple,
  • Lilac, and
  • Blackberry.

"Small multiples" is an idea due to Edward Tufte, who is famous for his work on how to present information graphically. I found an article entitled "Tufte in R", which might be worth investigating. Ironically, its section on "Small Multiples" is "in preparation". Maybe we could do this in R. In fact, I do have an example of small multiples in R, which I've made available on mathstat (/var/www/html/mackay/R/SmallMultiples/generate_graphs.R). It also illustrates how to read an Excel spreadsheet in R, along with some MySQL commands for selecting data. It may all be a little overwhelming, but it's worth digging into.

Here is the Mathematica File that produces the following plots [This needs to be updated Madison -- also the file for year 07 is called 1917m....]:

G1901.png G1902.png G1903.png G1904.png G1905.png G1906r.png G1907r.png G1908.png G1909.png G1911.png G1912.png G1913.png G14.png G15.png G16.png G17.png G18.png G19.png G20.png G21.png G22.png G23.png

Good work Madison and Steve: Animate.gif

  • I added "padding" for missing years, by repeating the preceding year.
  • To create the animation I used the ImageMagick command convert: convert -delay 200 -loop 0 *19*png animate.gif
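
The padding step described above can be sketched in Python. This is only an illustration, not the procedure that was actually used to build the animation: the function name pad_missing_years and the sample years are assumptions chosen for the example (the image list has G1909.png and G1911.png but no G1910.png, which is the kind of gap the padding fills).

```python
def pad_missing_years(available, first, last):
    """Map each year in [first, last] to the year whose frame should be shown.

    A year that has its own frame maps to itself; a missing year maps to the
    most recent preceding year that has a frame. Years before the first
    available frame are left out, since there is nothing to repeat.
    """
    available = set(available)
    frames = {}
    previous = None
    for year in range(first, last + 1):
        if year in available:
            previous = year
        if previous is not None:
            frames[year] = previous
    return frames


# Hypothetical example: 1910 has no frame, so the 1909 frame is repeated.
frames = pad_missing_years([1908, 1909, 1911], 1908, 1911)
for year, source_year in sorted(frames.items()):
    print(year, "uses the frame for", source_year)
```

The mapping could then be used to copy or link image files so that every year has a frame before running convert.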

Now to the important question: What do we learn?


  • To get a (very) crude estimate of the centroid of each region, I reproduced the regions in cardboard and balanced each one on a pen to find its balance point. I then used a computer program to find coordinates for each centroid to compare with the point I had found. Eight of the regions coincide with county borders, so I was able to find coordinates for each county and enter them into a program that returns a geographical midpoint. Comparing those results with the cardboard estimates, I found that, surprisingly enough, they matched quite well. The estimates for regions six and seven are cruder still because those regions split a county; finding their coordinates involved a lot of trial and error.
  • Here are the coordinates I ended up with (in degrees):
    • Region 1: 43.9, -65.8
    • Region 2: 44.2, -65
    • Region 3: 44.85, -64.9
    • Region 4: 45.25, -63.6
    • Region 5: 45.1, -62.45
    • Region 6: 45.5, -63.97
    • Region 7: 45.7, -62.75
    • Region 8: 45.85, -60.45
    • Region 9: 46.4, -60.6
    • Region 10: 46.2, -61.1
  • Here is the link to the website I used to find midpoints
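
For reference, here is a minimal sketch of one common way to compute a geographical midpoint: convert each (latitude, longitude) pair to a point on the unit sphere, average the points, and convert the average back to degrees. Whether the website used this method is not known, so treat it as an assumption; the function name geographic_midpoint is also made up for the example. The input below reuses the ten region coordinates listed above purely as an illustration.

```python
import math


def geographic_midpoint(points):
    """Return the unweighted midpoint (lat, lon), in degrees, of (lat, lon) points.

    Each point is converted to Cartesian coordinates on the unit sphere,
    the coordinates are averaged, and the mean vector is converted back.
    """
    if not points:
        raise ValueError("at least one point is required")
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


# Region coordinates from the list above (latitude, longitude in degrees).
region_centroids = [
    (43.9, -65.8), (44.2, -65.0), (44.85, -64.9), (45.25, -63.6),
    (45.1, -62.45), (45.5, -63.97), (45.7, -62.75), (45.85, -60.45),
    (46.4, -60.6), (46.2, -61.1),
]
overall_lat, overall_lon = geographic_midpoint(region_centroids)
print(round(overall_lat, 2), round(overall_lon, 2))
```

For a single region, the same function could be applied to the coordinates of the counties that make it up.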

Missing Data