MacKay Data Analysis


Revision as of 17:02, 8 September 2016 by Madison (Talk | contribs)

Small Multiples

Small multiples for the graphic that MacKay used in his 1901 report on phenology, featuring the summarized variables that MacKay chose:

  • Mayflower,
  • Strawberry,
  • Apple,
  • Lilac, and
  • Blackberry.

"Small multiples" is an idea due to Edward Tufte, who is well known for his principles for presenting quantitative information graphically. I found this article entitled "Tufte in R", which might be interesting to investigate. Ironically, its section on "Small Multiples" is "in preparation". Maybe we could do this in R. I do have an example of small multiples in R, which I've made available on mathstat (/var/www/html/mackay/R/SmallMultiples/generate_graphs.R). It also illustrates how to read an Excel spreadsheet in R and how to select data with some MySQL commands. It's all a little overwhelming perhaps, but it's worth digging into.

Here is the Mathematica file that produces the following plots [This needs to be updated, Madison. Also, the file for year 07 is called 1917m....]:

Image:g1901.png Image:g1902.png Image:g1903.png Image:g1904.png Image:g1905.png Image:g1906r.png Image:g1907r.png Image:g1908.png Image:g1909.png Image:g1911.png Image:g1912.png Image:g1913.png Image:g14.png Image:g15.png Image:g16.png Image:g17.png Image:g18.png Image:g19.png Image:g20.png Image:g21.png Image:g22.png Image:g23.png

Good work Madison and Steve: Image:Animate.gif

  • I added "padding" for missing years, by repeating the preceding year.
  • To create the animation I used the ImageMagick command convert: convert -delay 200 -loop 0 *19*png animate.gif (here -delay 200 shows each frame for 2 seconds, and -loop 0 repeats the animation indefinitely).
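
The padding step described above can be sketched in Python. This is only an illustration, not the script that was actually used: the function name pad_missing_years is hypothetical, and the set of available years assumes, from the image list above, that 1910 is the only gap between 1901 and 1923.

```python
def pad_missing_years(available, first, last):
    """Map every year in first..last to the year whose plot should be shown.

    A year with no plot of its own reuses the plot of the closest
    preceding year that has one.
    """
    frames = {}
    previous = None
    for year in range(first, last + 1):
        if year in available:
            previous = year
        if previous is None:
            # There is nothing to repeat before the first available plot.
            raise ValueError(f"no plot at or before {year}")
        frames[year] = previous
    return frames


# Assumption: plots exist for 1901-1923 except 1910 (read off the image list).
available = set(range(1901, 1924)) - {1910}
frames = pad_missing_years(available, 1901, 1923)
print(frames[1910])  # the year whose plot stands in for 1910
```

Each entry of frames could then be used to copy or link the chosen plot under the missing year's name before running convert, so that every year occupies one frame of the animation.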

Now to the important question: What do we learn?


  • To get a (very) crude estimate of the centroid of each region, I cut the regions out of cardboard and balanced each one on a pen to find its balance point. I then used a computer program to find coordinates for each centroid, to compare with the point I had found. Eight of the regions coincide with county borders, so I found coordinates for each county and put them into a program that gives a geographical midpoint. Surprisingly enough, the results matched the cardboard estimates quite well. The estimates for regions six and seven are even cruder, because those regions split a county; finding their coordinates involved a lot of trial and error.
  • Here are the coordinates I ended up with (latitude, longitude, in degrees):
    • Region 1: 43.9, -65.8
    • Region 2: 44.2, -65
    • Region 3: 44.85, -64.9
    • Region 4: 45.25, -63.6
    • Region 5: 45.1, -62.45
    • Region 6: 45.5, -63.97
    • Region 7: 45.7, -62.75
    • Region 8: 45.85, -60.45
    • Region 9: 46.4, -60.6
    • Region 10: 46.2, -61.1
  • Here is the link to the website I used to find midpoints
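
For reference, one common way to compute a geographical midpoint is to convert each latitude/longitude pair to a unit vector, average the vectors, and convert the result back to degrees. The sketch below assumes that method; the website used above may do something different (for example, weight the points), and the function name geographic_midpoint is hypothetical.

```python
import math


def geographic_midpoint(points):
    """Return the (lat, lon) midpoint, in degrees, of (lat, lon) pairs in degrees."""
    if not points:
        raise ValueError("need at least one point")
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        # Unit vector on the sphere for this location.
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    # Convert the averaged vector back to latitude and longitude.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


# Example input: the Region 1 and Region 2 estimates from the list above.
print(geographic_midpoint([(43.9, -65.8), (44.2, -65.0)]))
```

For points as close together as the counties of one region, the result is very near the plain average of the coordinates; the vector form matters mainly for widely separated points.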

Missing Data
