CURM Thirty-first Meeting, 5/18/2009
Agenda
Announcements
Suzanne has just recently submitted the request for meal reimbursement -- hold on!
Future Business
- Figure 3: Katie will produce them (please do one with symbol by location, and one with symbol by size -- small, medium, large -- per Jon's request); Grayson will comment on it (rough version here)
- Wasp RWL to Mass: Katie will finalize, and comment
- Cicada RWL to Mass: Grayson and I will continue to work on the step-wise regression, then Katie will produce the final figure. Grayson will comment.
- Stair-step figure: I will finalize the model, and get the approximate SEs. I will comment.
- Crude opportunistic statistic: I will calculate and comment.
Old Business
Last time we were looking at some of the results of some propositions in materials Jon sent from the literature. Katie was able to contest the notion that there was a change in the regression of RWL versus Mass for wasps: we don't see the hypothesized decrease in RWL growth with increasing Mass here's an animation of the results... In fact, the relationship seemed to be in the opposite direction.
Grayson was able to determine that the Dow data provided evidence consistent with about a 25% mass conversion relationship.
I presented some information related to the non-linear regression standard errors, which we will need to present along with our models.
These relationships will not be playing a role in the current paper, however. We will focus on the more limited objectives required by Jon Hastings in this manuscript for the Florida Entomologist
- Here's the current draft
- Table 1
- Table 2
New Business
For this time:
- Do I recall that Katie agreed to look at the Florida Entomologist requirements for authors? Sorry, seems like a long time ago....
- Here are the directions from the Florida Entomologist requirements for tables and figures:
- Submit all figures and photographs as .jpg or .tiff files. All figures and tables must be referenced in the text with Arabic numerals in the order in which they appear in the text. Table footnotes are written below the table and indicated as superscript Arabic numerals. Table legend in uppercase. All captions for figures are listed together on a separate page. All illustrations must be complete and final. Make a composite figure if numerous line drawings or photographs are needed; do not combine photographs and drawings in the same figure.
- Here are the directions from the Florida Entomologist requirements for tables and figures:
- I have been talking to Jon about what we need to put together for the paper.
- First of all, what are the analyses we are going to present:
- Wasp RWL and Mass (use FLORIDA.xls)
- Cicada RWL and Mass (use CICADA_PREY_SPECIES_size_RWL_FL08.xls
- Stair-Step Model (using newprovisioning_florida.xls)
- In discussions with Jon, we will discuss how we arrived at our final model, but only present the final stair-step model
- I was able to obtain approximate standard error estimates, using our non-linear regression approximations discussed last time:
baseline 26.588 0.342 step-one 16.038 0.671 step-two 10.960 1.299 center-one 28.372 0.009 width-one 0.008 0.010 center-two 33.664 0.008 width-two 0.007 0.006
- Reproduce Figure 3 from Grant's paper (using newprovisioning_florida.xls) Here's newprovisioning_florida_table_3.txt
- Use the breakpoints of our stair-step model and the census provided by the wasps in St. John's to calculate the probability that the wasps could be choosing their cicadas opportunistically.
- We need to assure that we're using the same data. Let's check numbers against the files that Jon wants to use (above):
- First of all, what are the analyses we are going to present:
R code for Cicada stepwise regression
Data files:
Here's my take, Grayson:
- xlispstat came up with two models: either lnRWL alone, or lnRWL with the middle*lnRWL terms. There is one rule of thumb which states that you don't throw an interaction term in unless you throw in the intercept term.
- Using R, we see, through both add1 and drop1, that the conclusion is that it's certain that we need lnRWL, and then it's a toss-up between medium and medium*lnRWL interaction term. Since it's virtually a toss-up, I'd say that we go with the three term model
Fit <- lm(lnMass ~ lnRWL + medium)
Results:
Call: lm(formula = lnMass ~ lnRWL + medium) Residuals: Min 1Q Median 3Q Max -0.309325 -0.081741 0.001887 0.090330 0.273934 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -4.08275 0.14327 -28.50 < 2e-16 *** lnRWL 3.02359 0.04220 71.66 < 2e-16 *** medium 0.22861 0.03008 7.60 3.87e-12 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1137 on 140 degrees of freedom Multiple R-squared: 0.9829, Adjusted R-squared: 0.9827 F-statistic: 4033 on 2 and 140 DF, p-value: < 2.2e-16
R-code:
ctable<-read.delim("cicada_prey_species.txt") species<-ctable[,1] mass<-ctable[,2] RWL<-ctable[,3] lnMass <- log(mass) lnRWL <- log(RWL) levels(species) <- c("s", "s", "l", "m") # Grayson: T.b. comes before T.g. Fit <- lm(lnMass ~ lnRWL + factor(species) + factor(species)*lnRWL) lm1 <- lm(lnMass~1, data=ctable) Fit <- lm(lnMass ~ lnRWL + factor(species)*lnRWL) # automatically does constant plus cross terms summary(Fit) # Let's try some new data, in which I add the indicator variables: ctable<-read.delim("cicada_prey_species_with_indicators.txt") lnMass<-ctable[,1] lnRWL<-ctable[,2] small<-ctable[,3] medium<-ctable[,4] large<-ctable[,5] mediumCross <-medium*lnRWL largeCross <-large*lnRWL lm1 <- lm(lnMass~1,data=ctable) add1(lm1, ~ lnRWL + medium + large + mediumCross + largeCross) lm2 <- lm(lnMass~ lnRWL) summary(lm2) add1(lm2, ~ lnRWL + medium + large + mediumCross + largeCross) lm3 <- lm(lnMass~ lnRWL + medium) summary(lm3) add1(lm3, ~ lnRWL + medium + large + mediumCross + largeCross) Fit <- lm(lnMass ~ lnRWL + medium + large + mediumCross + largeCross) summary(Fit) drop1(Fit, test="F") Fit <- lm(lnMass ~ lnRWL + medium + large + mediumCross) summary(Fit) drop1(Fit, test="F") Fit <- lm(lnMass ~ lnRWL + medium + mediumCross) summary(Fit) drop1(Fit, test="F") Fit <- lm(lnMass ~ lnRWL + medium) summary(Fit) drop1(Fit, test="F")
Output from xlispstat:
- Regression:
Linear Regression: Estimate SE Prob Constant -2.74044 (0.803583) 0.00085 lnRWL 2.61456 (0.244668) 0.00000 medium-ind -5.52714 (2.58019) 0.03395 large-ind 0.682505 (2.92283) 0.81572 medium-ind*lnRWL 1.58078 (0.695402) 0.02457 large-ind*lnRWL -0.100850 (0.750453) 0.89330 R Squared: 0.983722 Sigma hat: 0.112296 Number of cases: 143 Degrees of freedom: 137
- Step-wise Regression:
Linear Regression: Estimate SE Prob Constant -4.63425 (0.146302) 0.00000 lnRWL 3.19455 (4.227782E-2) 0.00000 R Squared: 0.975899 Sigma hat: 0.134690 Number of cases: 143 Degrees of freedom: 141
- Backward Step-wise Regression:
Linear Regression: Estimate SE Prob Constant -4.08042 (0.143162) 0.00000 lnRWL 3.02288 (4.216392E-2) 0.00000 medium-ind*lnRWL 6.088277E-2 (7.979796E-3) 0.00000 R Squared: 0.982977 Sigma hat: 0.113600 Number of cases: 143 Degrees of freedom: 140
Output from R:
- Regression:
Call: lm(formula = lnMass ~ lnRWL + factor(species) * lnRWL) Residuals: Min 1Q Median 3Q Max -0.280596 -0.081156 0.001235 0.082626 0.251145 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -2.7404 0.8036 -3.410 0.000853 *** lnRWL 2.6146 0.2447 10.686 < 2e-16 *** factor(species)l 0.6825 2.9228 0.234 0.815715 factor(species)m -5.5271 2.5802 -2.142 0.033950 * lnRWL:factor(species)l -0.1008 0.7505 -0.134 0.893295 lnRWL:factor(species)m 1.5808 0.6954 2.273 0.024570 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1123 on 137 degrees of freedom Multiple R-squared: 0.9837, Adjusted R-squared: 0.9831 F-statistic: 1656 on 5 and 137 DF, p-value: < 2.2e-16
- Step-wise Regression and Backward Step-wise Regression give virtually a dead-heat between the following two models::
Call: lm(formula = lnMass ~ lnRWL + medium) Residuals: Min 1Q Median 3Q Max -0.309325 -0.081741 0.001887 0.090330 0.273934 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -4.08275 0.14327 -28.50 < 2e-16 *** lnRWL 3.02359 0.04220 71.66 < 2e-16 *** medium 0.22861 0.03008 7.60 3.87e-12 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1137 on 140 degrees of freedom Multiple R-squared: 0.9829, Adjusted R-squared: 0.9827 F-statistic: 4033 on 2 and 140 DF, p-value: < 2.2e-16
Call: lm(formula = lnMass ~ lnRWL + mediumCross) Residuals: Min 1Q Median 3Q Max -0.309276 -0.081488 0.001927 0.090330 0.273895 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -4.08042 0.14316 -28.50 < 2e-16 *** lnRWL 3.02288 0.04216 71.69 < 2e-16 *** mediumCross 0.06088 0.00798 7.63 3.29e-12 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1136 on 140 degrees of freedom Multiple R-squared: 0.983, Adjusted R-squared: 0.9827 F-statistic: 4042 on 2 and 140 DF, p-value: < 2.2e-16
Non-linear Regression
For our models, we're looking at models that contain several variables (not simple linear)
Last time, we examined how standard errors are handled in the simple linear regression case, and how the Hessian matrix can be used to approximate them. The non-linear regression approximation was verified in this case:
where
- is the sum of squared errors (which is what we minimize) for the estimated parameter values ;
- N is the length of the data vector x;
- p is the length of the parameter vector ;
- diagonal extracts the diagonal of a matrix as a vector; and
- H is the Hessian matrix.
This code dumped into this site illustrates that the approximation works (or follow the link in this file).
The Hessian matrix approach is important because, whereas we don't have the matrix in the non-linear case, we can always approximate the Hessian matrix numerically, so we can get something reasonable.