# MacKay Data Analysis


## Small Multiples

These are small multiples of the graphic that MacKay used in his 1901 report on phenology, showing the summarized variables that he chose:

• Mayflower,
• Strawberry,
• Apple,
• Lilac, and
• Blackberry.

"Small multiples" is an idea due to Edward Tufte, who is famous for his work on how to present graphical information well. I found this article entitled "Tufte in R", which might be interesting to investigate. Ironically, its section on "Small Multiples" is "in preparation". Maybe we could do this in R. I do have an example of small multiples in R, actually, which I've made available on mathstat (/var/www/html/mackay/R/SmallMultiples/generate_graphs.R). It also illustrates how to read an Excel spreadsheet in R, and some MySQL commands for selecting data. It's all a little overwhelming perhaps! But it's worth digging into....

Here is the Mathematica file that produces the following plots [This needs to be updated, Madison; also, the file for year 07 is called 1917m....]:

• I added "padding" for missing years, by repeating the preceding year.
• To create the animation I used the ImageMagick command convert: convert -delay 200 -loop 0 *19*png animate.gif
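The padding step described above (repeating the preceding year wherever a year is missing) can be sketched in a few lines. This is only an illustration in Python, not the Mathematica code used for the plots: the function name `pad_missing_years` and the sample values are hypothetical, and it is an assumption that the padding is applied to yearly values rather than directly to the image files.

```python
def pad_missing_years(values_by_year, first_year, last_year):
    """Return (year, value) pairs for every year from first_year to last_year.

    A year with no entry reuses the value of the closest preceding year.
    Years before the first available entry are left as None.
    """
    padded = []
    last_seen = None
    for year in range(first_year, last_year + 1):
        if year in values_by_year:
            last_seen = values_by_year[year]
        padded.append((year, last_seen))
    return padded


# Hypothetical values, for illustration only (not MacKay's data).
example = {1901: 120, 1902: 118, 1904: 125}
print(pad_missing_years(example, 1901, 1905))
```

In this sketch 1903 and 1905 have no entry, so they repeat the values for 1902 and 1904 respectively. The same idea works for frames of an animation: a missing year simply shows the previous year's picture again.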

Now to the important question: What do we learn?

## Centroids

• To get a (very) crude estimate of the centroid of each region, I reproduced the regions in cardboard and balanced each one on a pen to find its balance point. I then used a computer program to find coordinates for each centroid to compare with the point I had found. Eight of the regions coincide with county borders, so I was able to find coordinates for each county and put them into a program that gives a geographical midpoint. Surprisingly enough, the results matched the cardboard estimates quite well. The estimates for regions six and seven were even cruder because those regions split a county, and finding their coordinates involved a lot of trial and error.
• Here are the coordinates I ended up with (latitude, longitude, in degrees):
• Region 1: 43.9, -65.8
• Region 2: 44.2, -65
• Region 3: 44.85, -64.9
• Region 4: 45.25, -63.6
• Region 5: 45.1, -62.45
• Region 6: 45.5, -63.97
• Region 7: 45.7, -62.75
• Region 8: 45.85, -60.45
• Region 9: 46.4, -60.6
• Region 10: 46.2, -61.1
• Here is the link to the website I used to find midpoints
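For anyone who wants to reproduce the midpoint step without the website, here is a minimal sketch of one common way to compute a geographical midpoint: convert each (latitude, longitude) pair to a unit vector on a sphere, average the vectors, and convert the average back to degrees. It is an assumption that the website works this way; the function name `geographic_midpoint` is hypothetical, and every point is given equal weight.

```python
import math


def geographic_midpoint(points):
    """Midpoint of (latitude, longitude) pairs in degrees, equally weighted.

    Each point becomes a unit vector on a sphere; the vectors are averaged
    and the mean vector is converted back to latitude and longitude.
    """
    if not points:
        raise ValueError("at least one point is required")
    x = y = z = 0.0
    for lat_deg, lon_deg in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    mid_lon = math.atan2(y, x)
    mid_lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(mid_lat), math.degrees(mid_lon)


# Illustration only: the midpoint of the Region 1 and Region 2 estimates above.
print(geographic_midpoint([(43.9, -65.8), (44.2, -65.0)]))
```

For points as close together as these, a plain average of the latitudes and longitudes gives nearly the same answer; the vector version simply stays correct over larger distances.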