| Title: | Diversity-Interactions (DI) Models |
|---|---|
| Description: | The 'DImodels' package is suitable for analysing data from biodiversity and ecosystem function studies using the Diversity-Interactions (DI) modelling approach introduced by Kirwan et al. (2009) <doi:10.1890/08-1684.1>. Suitable data will contain proportions for each species and a community-level response variable, and may also include additional factors, such as blocks or treatments. The package can perform data manipulation tasks, such as computing pairwise interactions (the DI_data() function), can perform an automated model selection process (the autoDI() function) and has the flexibility to fit a wide range of user-defined DI models (the DI() function). |
| Authors: | Rafael de Andrade Moral [aut, cre], John Connolly [aut], Rishabh Vishwakarma [ctb], Caroline Brophy [aut] |
| Maintainer: | Rafael de Andrade Moral <[email protected]> |
| License: | GPL (>=2) |
| Version: | 1.3.3 |
| Built: | 2026-05-17 08:16:32 UTC |
| Source: | https://github.com/rafamoral/dimodels |
The 'DImodels' package is suitable for analysing data from biodiversity and ecosystem function studies using the Diversity-Interactions (DI) modelling approach introduced by Kirwan et al. (2009) <doi:10.1890/08-1684.1>. Suitable data will contain proportions for each species and a community-level response variable, and may also include additional factors, such as blocks or treatments. The package can perform data manipulation tasks, such as computing pairwise interactions (the DI_data() function), can perform an automated model selection process (the autoDI() function) and has the flexibility to fit a wide range of user-defined DI models (the DI() function).
Introduction to Diversity-Interactions models
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses; for example, the effect of increasing community diversity on biomass production in a grassland ecosystem. Most analyses of diversity experiments quantify community diversity in terms of species richness, the number of species present. The DI method modifies this presence/absence approach in mixed communities by taking species relative abundance in the community into account. So, instead of ignoring differences in community responses across communities with the same species but with different species relative abundances, the DI approach aims to understand and explain these differences.
What variables will a suitable dataset contain?
The DI approach assesses the effect of species diversity on a community response over a period of time. For data from a biodiversity study with S species to be suited to the DI approach, the variables that are required on each experimental unit are:
A set of proportions that characterise the proportions of each species at a defined starting point in time. These proportions (or relative abundances) of species in the community ( for the ith species) range between 0 (absence) and 1 (monoculture - the only species present) and the sum of all the values for a community is always 1.
A community-level response variable, recorded a period of time after the initial species proportions were recorded.
The dataset may also contain other variables such as a block, density or treatments.
What are Diversity-Interactions models?
A DI model typically has three components and takes the form:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density. In a three species system, with an experimental blocking structure, a possible DI model is:
Where p1, p2 and p3 are the species proportions of species 1, 2 and 3 respectively, and is the effect of block k. The error term is usually assumed to be normally distributed with mean 0 and constant variance (and assumed independent and identically distributed).
In a monoculture of the ith species, the expected performance of y in block k is . In mixture, the Identities component provides the expected performance as a weighted average of monoculture responses, and the Interactions component is added to it to give the overall expected performance of the mixed community. The Interactions component addresses the question: do mixed communities perform differently from what might be expected from the weighted averaging of monoculture performances? Note that in Kirwan et al 2009 the Interactions component is referred to as the diversity effect, however, here we use the more general term "Interactions", since the interpretation of how species diversity affects the response is a combination of both the Identities and Interactions components. The community with the best overall performance may depend on both the Identities and Interactions and will rarely be the community with the largest net interactions effect. The equation above provides an example of a DI model where there is a unique interaction term for each pair of species. It is possible to test various constraints among interactions, some of which may be motivated by the context of the data (Kirwan et al 2009).
The Interactions component may also include an non-linear exponent parameter on each term, where a value different to one allows the importance and impact of interaction terms to be altered (Connolly et al 2013).
What can the DImodels package do?
The DI approach is a full regression method where the response of the community is characterised by the effects of diversity variables such as Identities and Interactions components, and by experimental structural variables such as blocks, density and treatments. All of these may be important determinants of response and so should be included in the analysis of community responses. The DImodels package aims to make it easier to analyse data using DI models.
Currently, the DImodels package contains three main functions: autoDI, DI and DI_data. Here we give a brief overview of each, and link to the respective help files for further information.
auto_DI: This function gives a simple overview of the successive contribution of the Structures, Identities and Interactions components to the model via an automated model selection process. It will identify the 'best' model from a (limited) subset of all possible DI models. However, autoDI may need to be supplemented by more refined analysis, for example, auto_DI does not test for interactions between the terms in the Structures and Identities components. It can also only facilitate one block, one density and one treatment variable. However, it is a very useful starting point for DI model exploration. Further information at: autoDI.
DI: This function can fit a wide range of DI models and includes the flexibility to test for multiple treatments or additional interactions, for example, between terms in the Identities and Structures components. The DI function fits one user-defined DI model at a time, and it allows for flexibility in the form of the model through a combination of in-built argument options and additional user-defined options. Further information at: DI.
DI_data: This function can compute various forms of interactions among the variables. This function is not required when using autoDI, or for the in-built argument options in DI, but may be needed when specifying additional user-defined options in DI. Further information at: DI_data and for examples of when it is required see DI.
There are three other functions:
theta_CI can fit a confidence interval to the parameter theta, when it has been estimated using either autoDI or DI,
DI_compare that can compare a fitted DI model to the 'reference' model (see autoDI) for details about the reference model), and
update_DI that can update a fitted DI model passing different values to one or more arguments (see update_DI) for details).
Challenges with fitting and interpreting Diversity-Interactions models
Analysing data using DI methods can be tricky for people familiar with ordinary regression models and the DImodels package aims to make analysis easier. The difficulties lie in:
The lack of familiarity in dealing with the Identities component variables , which must sum to 1. This constraint can lead to interpretative issues, and estimation problems with some widely used R software.
The novelty of specifying the Interactions components in terms of many pairwise interaction terms whose numbers may greatly increase with S. The number of pairwise terms can often be reduced by identifying biologically meaningful patterns among them, for example, through functional grouping of species. This may greatly reduce the number of coefficients to be estimated and interpreted in the model (Kirwan et al, 2009).
The introduction of a power coefficient theta () for all pairwise interaction terms (Connolly et al, 2013). This parameter can be very useful in describing the effect of community evenness on community response, i.e., whether response changes rapidly or slowly across communities where the relative abundance of species changes from being equal for all species to dominance by one or more species.
Limitations of the DImodels package
Currently, the DImodels package does not support multivariate responses or repeated measurements on the same experimental unit. It also does not currently support the random effects approach to modelling pairwise interactions that was developed by Brophy et al 2017.
Rafael de Andrade Moral [aut, cre], John Connolly [aut], Rishabh Vishwakarma [ctb], Caroline Brophy [aut]
Maintainer: Rafael de Andrade Moral <[email protected]>
Brophy C, A Dooley, L Kirwan, JA Finn, J McDonnell, T Bell, MW Cadotte and J Connolly. (2017) Biodiversity and ecosystem function: Making sense of numerous species interactions in multi-species communities. Ecology 98, 1771-1778.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
Example datasets:
The Bell dataset.
The sim1 dataset.
The sim2 dataset.
The sim3 dataset.
The sim4 dataset.
The sim5 dataset.
The Switzerland dataset.
## Load the Switzerland data data(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Example of autoDI auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", FG = c("G","G","L","L"), data = Switzerland, selection = "Ftest") summary(auto1) ## Example of DI m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) summary(m1) ## Example of DI_data. ## Create interaction variables and incorporate them into a new data frame Switzerland2. ## Switzerland2 will contain the new variables: AV, E, p1_add, p2_add, p3_add, p4_add, ## bfg_G_L, wfg_G, wfg_L. FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland2 <- data.frame(Switzerland, FG_matrix)## Load the Switzerland data data(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Example of autoDI auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", FG = c("G","G","L","L"), data = Switzerland, selection = "Ftest") summary(auto1) ## Example of DI m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) summary(m1) ## Example of DI_data. ## Create interaction variables and incorporate them into a new data frame Switzerland2. ## Switzerland2 will contain the new variables: AV, E, p1_add, p2_add, p3_add, p4_add, ## bfg_G_L, wfg_G, wfg_L. FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland2 <- data.frame(Switzerland, FG_matrix)
This function provides an automated way to fit a (limited) range of Diversity-Interactions (DI) models. Using one of several selection criteria, autoDI will identify the best DI model from the range fitted via a three-step selection process (see Details for more information). While autoDI can be a useful starting point for fitting DI models, its range of models is not exhaustive and additional model fitting or testing via DI is likely to be required. For instance, autoDI does not test for interactions of a treatment with other variables in the model.
autoDI(y, prop, data, block, density, treat, ID, FG = NULL, selection = c("Ftest","AIC","AICc","BIC","BICc"), step0 = FALSE, step4 = TRUE)autoDI(y, prop, data, block, density, treat, ID, FG = NULL, selection = c("Ftest","AIC","AICc","BIC","BICc"), step0 = FALSE, step4 = TRUE)
y |
The column name of the response vector, which must be in quotes, for example, |
block |
The name of the block variable (if present), which must be in quotes, for example, |
density |
The name of the density variable (if present), which must be in quotes, for example, |
prop |
A vector of s column names identifying the species proportions in each row in the dataset. For example, if the species proportions columns are labelled p1 to p4, then |
treat |
The name of a column in the dataset containing the value of a treatment factor or covariate. The treatment name must be included in quotes, for example, |
ID |
This argument takes a text list (of length s) dsecirbing groupings for the identity effects of the species. For example, if there are four species and you wish to group the identity effects all four species into a single term:
|
FG |
If species are classified by g functional groups, this parameter gives a text list (of length s) of the functional group to which each species belongs. For example, for four grassland species with two grasses and two legumes: FG could be |
data |
Specify the dataset, for example, |
selection |
Selection method to be used in the automated model selection process. Options are |
step0 |
By default, |
step4 |
Step 4 performs a lack of fit test for the final model selected by steps 1 - 3. By default, |
What are Diversity-Interactions models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We recommend that users of the DImodels package read the short introduction to DI models (available at: DImodels). Further information on DI models is available in Kirwan et al 2009 and Connolly et al 2013.
Checks on data prior to using autoDI.
Before using autoDI, check that the species proportions for each row in your dataset sum to one. See the 'Examples' section for code to do this. An error message will be generated if the proportions don't sum to one.
How does the autoDI function work?
The autoDI function identifies the 'best' Diversity-Interactions (DI) model from a specific range of proposed models using a three-step process of selection (Steps 1 to 3) and performs a lack of fit test on the selected model (Step 4). Only a limited subset of all possible models are examined under autoDI, for example, interactions involving experiment structural terms (block, density, treat) are not explored. The autoDI function can provide an excellent initial analysis, but often additional modelling and exploration will be required using the DI function.
Steps 1-3 outlined below provide details on the automated model selection process followed by autoDI. Step 4 is a lack of fit test for the selected model. Step 0 may also be included (step0 = TRUE) as an initial step to sequentially test the inclusion of experiment structural variables (block, density, treat), prior to fitting the DI models in Step 1.
The default selection method used is selection = "Ftest", which will return the appropriate F test statistic value(s) and p-value(s) in each step. When any of the information theoretic approaches ("AIC", "AICc", "BIC" or "BICc") are specified, the model with the lowest value is selected in each step, even if it is only the lowest by a tiny margin; therefore, it is recommended to examine the information theoretic values across all models.
Step 1
The AV model is fitted twice: considering the non-linear parameter theta equal to 1, and estimating it by maximising the profile likelihood. For example, for a design with 4 species, for the AV model, the parameter theta enters the model as:
AV_theta = (p1*p2)^theta + (p1*p3)^theta + (p1*p4)^theta + (p2*p3)^theta + (p2*p4)^theta + (p3*p4)^theta
In the profile likelihood estimation, theta is tested across the range 0.01 to 1.5. Then, the two models (with theta estimated and with theta set to 1) are compared using the method specified by the selection argument.
Details on DI models that include theta are described in Connolly et al 2013.
Step 2
Five models are fitted, sequentially, each building on the previous model, and compared. If in Step 1 the conclusion is that theta is not different from 1, all models are fitted without estimating theta. If, however, in Step 1 the conclusion is that theta is different from 1, all models are fitted by fixing theta to its estimate obtained in Step 1.
If the experiment structural variables treat, block or density are specified, they will be included as an additive factor or covariate in each of the five models, but interactions between them and the DI model terms will not be included or tested.
Assume that FG (functional groups) have been specified. The five DI models are:
STR: This model contains an intercept, and block, density and treatment if present; STR stand for 'structural' and represents experiment structural variables.
ID: The species proportions are added to the STR model. The proportions sum to 1, and are included in the model as 0 + p1 + ... + ps, where s is the number of species in the pool, as specified in the prop option.
AV: The terms in the ID model, plus a single 'average' pairwise interaction term. For the Switzerland dataset, the single variable that is added to the model is computed as: AV = p1*p2 + p1*p3 + p1*p4 + p2*p3 + p2*p4 + p3*p4.
FG: The terms in the ID model, plus interaction terms related to functional groups. These terms describe the average effects of interaction between pairs of species within each functional group, and between pairs of species from different functional groups. For example, in the Switzerland dataset there are four species with FG = c("Grass", "Grass", "Legume", "Legume"), and there are six pairwise interactions, one between the two grasses, one between the two legumes, and four between grass and legume species. Grouping these interactions gives three terms, one for interactions between grasses, one for interactions between legumes and one for between a grass and a legume species. The model assumes that any grass will interaction with any legume in the same way and the 'between functional group grass-legume' variable is computed as: bfg_G_L = p1*p3 + p1*p4 + p2*p3 + p2*p4. If there were more than two grasses in the dataset, this model would assume that any pair of grasses interact in the same way, similarly for legumes. If the FG argument is not specified, this model is omitted from Step 1.
FULL: The terms in the ID model, plus an interaction term for each pair of species. When there are s species, there are s*(s-1)/2 pairwise interaction terms, i.e., for the Switzerland dataset with four species, there are six interactions that are each added to the model (in R formula syntax: p1:p2, p1:p3, p1:p4, p2:p3, p2:p4 and p3:p4).
If the FG argument is omitted, the FG model will be replaced by:
ADD: The terms in the ID model, plus a species specific 'additive' interaction term for each species. These terms measure the interactive contribution of each species with any other species and are denoted for the ith species. The interaction between any pair of species i and j is computed as .
If selection by information criteria is specified, both ADD and FG models will be fitted and compared using the selected information criterion.
Step 3
If treat is specified, the selected model in Step 2 includes the treat variable (since all models in Step 2 include treat if present). Here, the selected model from Step 2 is re-fitted without treat and the models with and without the treatment are compared using the method specified by the selection argument.
If treat was not specified, this step is redundant.
Step 4
Step 4 provides a lack of fit test for the model selected by Steps 1 to 3. A factor called 'community' is created that has a level for each unique setting of the species proportions (as specified in the prop argument). The 'reference' model includes all terms in the model that was selected by Steps 1 to 3, plus the factor community. The reference model is compared to the model selected by Steps 1 to 3 via an F-test (regardless of the selection argument value), thus providing a lack of fit test.
Note, that the reference model is never intended to be a candidate model, it is only fitted for the purpose of testing lack of fit. If the test result is significant, it indicates that there is a lack of fit in the Diversity-Interactions model selected by autoDI.
Note also, that the test will not work if all combinations of the species proportions are unique. In this instance, the option Step4 = FALSE should be selected.
The function returns the final selected model, an object of class DI.
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
Other examples using autoDI:
The sim1 dataset examples.
The sim2 dataset examples.
The sim3 dataset examples.
The sim4 dataset examples.
The sim5 dataset examples.
The Switzerland dataset examples.
## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Perform automated model fitting on the Switzerland dataset ## Model selection by F-test auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest") summary(auto1) ## Running autoDI after grouping identity effects using the "ID" argument auto2 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", ID = c("ID1", "ID1", "ID1", "ID1"), FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest") summary(auto2) ## Using column numbers to indicate which columns contain the proportions and including Step 0 auto2 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest", step0 = TRUE) summary(auto2) ## Exclude the FG (functional group) argument to include the additive species "ADD" model in Step 1 auto3 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", data = Switzerland, selection = "Ftest") summary(auto3)## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Perform automated model fitting on the Switzerland dataset ## Model selection by F-test auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest") summary(auto1) ## Running autoDI after grouping identity effects using the "ID" argument auto2 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", ID = c("ID1", "ID1", "ID1", "ID1"), FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest") summary(auto2) ## Using column numbers to indicate which columns contain the proportions and including Step 0 auto2 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest", step0 = TRUE) summary(auto2) ## Exclude the FG (functional group) argument to include the additive species "ADD" model in Step 1 auto3 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", data = Switzerland, selection = "Ftest") summary(auto3)
This dataset comes from a bacterial biodiversity experiment (Bell et al 2005). The bacterial ecosystems used were from semi-permanent rainpools that form in bark-lined depressions near the base of large European beech trees (Fagus sylvatica). Microcosms consisting of sterile beech leaf disks and 10 ml of liquid (phosphate buffer) were inoculated with random combinations of 72 bacterial species isolated from these ecosystems. A total of 1,374 microcosms were constructed at richness levels of 1, 2, 3, 4, 6, 8, 9, 12, 18, 24, 36 and 72 species. The daily respiration rate of the bacterial community in each microcosm was measured over three time intervals (days 0-7, 7-14 and 14-28) and the average over the three time intervals was recorded.
data("Bell")data("Bell")
A data frame with 1374 observations on the following 76 variables:
idA numeric vector uniquely identifying each row of the dataset.
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p72 values.
richnessThe number of species included in the initial composition, i.e., the number of proportions from p1 to p72 that are >0.
p1A numeric vector indicating the initial proportion of species 1 in the community.
p2A numeric vector indicating the initial proportion of species 2 in the community.
p3A numeric vector indicating the initial proportion of species 3 in the community.
p4A numeric vector indicating the initial proportion of species 4 in the community.
p5A numeric vector indicating the initial proportion of species 5 in the community.
p6A numeric vector indicating the initial proportion of species 6 in the community.
p7A numeric vector indicating the initial proportion of species 7 in the community.
p8A numeric vector indicating the initial proportion of species 8 in the community.
p9A numeric vector indicating the initial proportion of species 9 in the community.
p10A numeric vector indicating the initial proportion of species 10 in the community.
p11A numeric vector indicating the initial proportion of species 11 in the community.
p12A numeric vector indicating the initial proportion of species 12 in the community.
p13A numeric vector indicating the initial proportion of species 13 in the community.
p14A numeric vector indicating the initial proportion of species 14 in the community.
p15A numeric vector indicating the initial proportion of species 15 in the community.
p16A numeric vector indicating the initial proportion of species 16 in the community.
p17A numeric vector indicating the initial proportion of species 17 in the community.
p18A numeric vector indicating the initial proportion of species 18 in the community.
p19A numeric vector indicating the initial proportion of species 19 in the community.
p20A numeric vector indicating the initial proportion of species 20 in the community.
p21A numeric vector indicating the initial proportion of species 21 in the community.
p22A numeric vector indicating the initial proportion of species 22 in the community.
p23A numeric vector indicating the initial proportion of species 23 in the community.
p24A numeric vector indicating the initial proportion of species 24 in the community.
p25A numeric vector indicating the initial proportion of species 25 in the community.
p26A numeric vector indicating the initial proportion of species 26 in the community.
p27A numeric vector indicating the initial proportion of species 27 in the community.
p28A numeric vector indicating the initial proportion of species 28 in the community.
p29A numeric vector indicating the initial proportion of species 29 in the community.
p30A numeric vector indicating the initial proportion of species 30 in the community.
p31A numeric vector indicating the initial proportion of species 31 in the community.
p32A numeric vector indicating the initial proportion of species 32 in the community.
p33A numeric vector indicating the initial proportion of species 33 in the community.
p34A numeric vector indicating the initial proportion of species 34 in the community.
p35A numeric vector indicating the initial proportion of species 35 in the community.
p36A numeric vector indicating the initial proportion of species 36 in the community.
p37A numeric vector indicating the initial proportion of species 37 in the community.
p38A numeric vector indicating the initial proportion of species 38 in the community.
p39A numeric vector indicating the initial proportion of species 39 in the community.
p40A numeric vector indicating the initial proportion of species 40 in the community.
p41A numeric vector indicating the initial proportion of species 41 in the community.
p42A numeric vector indicating the initial proportion of species 42 in the community.
p43A numeric vector indicating the initial proportion of species 43 in the community.
p44A numeric vector indicating the initial proportion of species 44 in the community.
p45A numeric vector indicating the initial proportion of species 45 in the community.
p46A numeric vector indicating the initial proportion of species 46 in the community.
p47A numeric vector indicating the initial proportion of species 47 in the community.
p48A numeric vector indicating the initial proportion of species 48 in the community.
p49A numeric vector indicating the initial proportion of species 49 in the community.
p50A numeric vector indicating the initial proportion of species 50 in the community.
p51A numeric vector indicating the initial proportion of species 51 in the community.
p52A numeric vector indicating the initial proportion of species 52 in the community.
p53A numeric vector indicating the initial proportion of species 53 in the community.
p54A numeric vector indicating the initial proportion of species 54 in the community.
p55A numeric vector indicating the initial proportion of species 55 in the community.
p56A numeric vector indicating the initial proportion of species 56 in the community.
p57A numeric vector indicating the initial proportion of species 57 in the community.
p58A numeric vector indicating the initial proportion of species 58 in the community.
p59A numeric vector indicating the initial proportion of species 59 in the community.
p60A numeric vector indicating the initial proportion of species 60 in the community.
p61A numeric vector indicating the initial proportion of species 61 in the community.
p62A numeric vector indicating the initial proportion of species 62 in the community.
p63A numeric vector indicating the initial proportion of species 63 in the community.
p64A numeric vector indicating the initial proportion of species 64 in the community.
p65A numeric vector indicating the initial proportion of species 65 in the community.
p66A numeric vector indicating the initial proportion of species 66 in the community.
p67A numeric vector indicating the initial proportion of species 67 in the community.
p68A numeric vector indicating the initial proportion of species 68 in the community.
p69A numeric vector indicating the initial proportion of species 69 in the community.
p70A numeric vector indicating the initial proportion of species 70 in the community.
p71A numeric vector indicating the initial proportion of species 71 in the community.
p72A numeric vector indicating the initial proportion of species 72 in the community.
responseA numeric vector giving the average daily respiration rate of the bacterial community.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
The Bell dataset is analysed using Diversity-Interactions models in both Brophy et al 2017 and Connolly et al 2013.
Bell T, JA Newman, BW Silverman, SL Turner and AK Lilley (2005) The contribution of species richness and composition to bacterial services. Nature, 436, 1157-1160.
Brophy C, A Dooley, L Kirwan, JA Finn, J McDonnell, T Bell, MW Cadotte and J Connolly (2017) Biodiversity and ecosystem function: Making sense of numerous species interactions in multi-species communities. Ecology, 98, 1771-1778.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## Load the Bell data data(Bell) ## View the first five entries head(Bell) ## Explore the variabes in sim1 str(Bell) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p72 are in the 4th to 75th columns in Bell Bellsums <- rowSums(Bell[4:75]) summary(Bellsums) ## Check characteristics of Bell hist(Bell$response) summary(Bell$response) plot(Bell$richness, Bell$response) ## This code takes around 11 seconds to run ## Fit the average pairwise model using DI and the AV tag, with theta estimated m1 <- DI(y = "response", prop = 4:75, DImodel = "AV", estimate_theta = TRUE, data = Bell) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 plot(m1) library(hnp) hnp(m1) ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta")## Load the Bell data data(Bell) ## View the first five entries head(Bell) ## Explore the variabes in sim1 str(Bell) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p72 are in the 4th to 75th columns in Bell Bellsums <- rowSums(Bell[4:75]) summary(Bellsums) ## Check characteristics of Bell hist(Bell$response) summary(Bell$response) plot(Bell$richness, Bell$response) ## This code takes around 11 seconds to run ## Fit the average pairwise model using DI and the AV tag, with theta estimated m1 <- DI(y = "response", prop = 4:75, DImodel = "AV", estimate_theta = TRUE, data = Bell) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 plot(m1) library(hnp) hnp(m1) ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta")
compare_communities performs comparisons between species and assigns each a compact letter display (CLD) using glht and multcompLetters for a model object created by the DI or autoDI functions.
compare_communities(object, data, ref = NULL, adjust = formals(multcomp::adjusted)$type, verbose = FALSE, alpha.level = 0.05, Letters = c(letters, LETTERS, "."), ...)compare_communities(object, data, ref = NULL, adjust = formals(multcomp::adjusted)$type, verbose = FALSE, alpha.level = 0.05, Letters = c(letters, LETTERS, "."), ...)
object |
|
data |
A data.frame containing the proportions for the species communities which are to be compared. Any additional variables such as specific values for a treatment variable can also be specified. |
ref |
An optional argument specifying a particular to be considered a reference. If specified, all comparisons would be made with respect to this community, as opposed to pairwise comparisons. |
adjust |
A character string specifying the p-value adjustment method to be used to account for multiple comparisons. All arguments supported by the |
verbose |
A boolean (TRUE/FALSE) value indicating whether to print the internally generated contrast matrix. Default is TRUE |
alpha.level |
A value between 0 and 1, specifying the singificance level for the comparisons. Defaults to 0.05. |
Letters |
The characters to be assigned to the communities to indicate significant differences. Defaults to lowercase latin characters (i.e., a, b, c, etc.). |
... |
Additional arguments passed to the |
The contrasts are calculated and tested using the glht function in the multcomp package while the cld letters are assigned using the multcompLetters function from multcompView.
The rownames of the specified data.frame are used to identify the communities and are labelled 1, 2, ..., n, etc. by default, but this can be changed by updating the rownames of the specified data.frame.
Note: If any variables needed for making predictions are missing in 'data', the would be assumed to be at their median value (if numeric) or base level (if categorical). Specify the values for all variables manually to perform comparisons at a specific level of any categorical variable (treatment for example).
An list containing the following objects is returned.
Contrasts |
An object of class |
CLD |
An object of class |
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Piepho, Hans-Peter (2004) "An Algorithm for a Letter-Based Representation of All-Pairwise Comparisons", Journal of Computational and Graphical Statistics, 13(2)456-466.
Torsten Hothorn, Frank Bretz and Peter Westfall (2008). Simultaneous Inference in General Parametric Models. Biometrical Journal 50(3), 346–363.
Frank Bretz, Torsten Hothorn and Peter Westfall (2010), Multiple Comparisons Using R, CRC Press, Boca Raton.
Jason C. Hsu (1996), Multiple Comparisons. Chapman & Hall, London.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
DI
autoDI
contrasts_DI
glht
adjusted
multcompLetters
## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Fit a DI model m1 <- DI(y = "yield", prop = 4:7, treat = 'nitrogen', DImodel = 'AV', density = 'density', estimate_theta = FALSE, data = Switzerland) summary(m1) ## Basic comparison of the first five observations in the Switzeraland data head(Switzerland, 5) compare_communities(m1, data = Switzerland[1:5, ]) ## The pairiwise comparisons between each community as well as the cld letters are returned. ## By default, single-step p-value adjustment are performed to account for multiple comparisons. ## This can be changed using adjust argument. For example "none" would perform no adjustment compare_communities(m1, data = Switzerland[1:5, ], adjust = "none") ## Bonferroni adjustment compare_communities(m1, data = Switzerland[1:5, ], adjust = "bonferroni") ## If any variable needed for predictions is missing in the data, ## a warning will be thrown and sensible values will be used. ## In this example, we only specify the species proportions and nothing for nitrogen and density compare_communities(m1, data = Switzerland[1:5, 4:7], adjust = "bonferroni") ## The alpha.level and symbols to be used for cld can be changed as follows compare_communities(m1, data = Switzerland[1:5, ], adjust = "none", alpha.level = 0.01, Letters = LETTERS) ## Change row.names of data to get informative labels for the tests comp_data <- Switzerland[1:5, ] rownames(comp_data) <- c("p1_dom", "p2_dom", "p3_dom", "p4_dom", "Centroid") compare_communities(m1, data = comp_data, adjust = "none", alpha.level = 0.01, Letters = LETTERS) ## Comparisons can also be performed against a specific community as reference ## using the `ref` argument. Using "Centroid" as ref in this example, ## can also be specified using row-number, i.e., 5 compare_communities(m1, data = comp_data, ref = 5, adjust = "none", Letters = LETTERS)## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Fit a DI model m1 <- DI(y = "yield", prop = 4:7, treat = 'nitrogen', DImodel = 'AV', density = 'density', estimate_theta = FALSE, data = Switzerland) summary(m1) ## Basic comparison of the first five observations in the Switzeraland data head(Switzerland, 5) compare_communities(m1, data = Switzerland[1:5, ]) ## The pairiwise comparisons between each community as well as the cld letters are returned. ## By default, single-step p-value adjustment are performed to account for multiple comparisons. ## This can be changed using adjust argument. For example "none" would perform no adjustment compare_communities(m1, data = Switzerland[1:5, ], adjust = "none") ## Bonferroni adjustment compare_communities(m1, data = Switzerland[1:5, ], adjust = "bonferroni") ## If any variable needed for predictions is missing in the data, ## a warning will be thrown and sensible values will be used. ## In this example, we only specify the species proportions and nothing for nitrogen and density compare_communities(m1, data = Switzerland[1:5, 4:7], adjust = "bonferroni") ## The alpha.level and symbols to be used for cld can be changed as follows compare_communities(m1, data = Switzerland[1:5, ], adjust = "none", alpha.level = 0.01, Letters = LETTERS) ## Change row.names of data to get informative labels for the tests comp_data <- Switzerland[1:5, ] rownames(comp_data) <- c("p1_dom", "p2_dom", "p3_dom", "p4_dom", "Centroid") compare_communities(m1, data = comp_data, adjust = "none", alpha.level = 0.01, Letters = LETTERS) ## Comparisons can also be performed against a specific community as reference ## using the `ref` argument. Using "Centroid" as ref in this example, ## can also be specified using row-number, i.e., 5 compare_communities(m1, data = comp_data, ref = 5, adjust = "none", Letters = LETTERS)
contrast_vars calculates and tests contrasts using glht for a model object created by the DI or autoDI functions.
contrast_matrix is a helper function which can be used to prepare a contrast matrix which contains all the ID effects, interactions and additional model variables. The output of this function can then be passed to contrasts_DI to test contrasts.
contrasts_DI(object, contrast_vars, contrast, verbose, ...) contrast_matrix(object, contrast_vars)contrasts_DI(object, contrast_vars, contrast, verbose, ...) contrast_matrix(object, contrast_vars)
object |
|
contrast_vars |
A matrix or data-frame containing the values of the species prportions in a contrast style (i.e., positive and negative values). The species ID effects and interactions would be calculated automatically (the value of the non-linear parameter theta would also be incorporated) and added to the contrast. The user can also specify any additional variable (like treatment, density) from the model to calculate the contrast at the specific value of that variable(s). Furthermore, it is also possible to manually specify the ID or interaction effects to have a specific value and override those calculated by the function. Any species proportion or additional variable not specified would be assumed to be 0. See examples showing the uses of these options. Will be overridden if |
contrast |
A data-frame or matrix containing the values for generating the contrast. The number of columns in the matrix/data-frame should be the same as the number of linear coefficients in the model. It is assumed that the columns in Recommended only for advanced users who want more control for generating the contrast as most general contrasts can be generated using the Overrides |
verbose |
A boolean (TRUE/FALSE) value indicating whether to print the internally generated contrast matrix. Default is TRUE |
... |
Additional arguments passed to the |
The contrasts are calculated and tested using the glht function in the multcomp package. If the various contrasts are not explicitly named, they'll be given names 'Test 1', 'Test 2', ..., 'Test n'.
An object of class glht is returned
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
Torsten Hothorn, Frank Bretz and Peter Westfall (2008). Simultaneous Inference in General Parametric Models. Biometrical Journal 50(3), 346–363.
## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Fit a DI model m1 <- DI(y = "yield", prop = 4:7, treat = 'nitrogen', DImodel = 'AV', density = 'density', estimate_theta = FALSE, data = Switzerland) summary(m1) ## Create contrasts ## We test the following five sample contrasts ## 1. p1 monoculture against p2 monoculture ## 2. p1 and p2 binary (50-50) mixture against the p4 monoculture ## 3. p1 and p2 binary (50-50) mixture against p3 and p4 (50-50) binary mixture ## 4. Three species equi-proportional mixture (1/3 each) of p2, p3, and p4 against p1 monoculture ## 5. p4 monoculture against a 70-30 mixture of p2 and p3 respectively. contrast_vars <- data.frame("p1" = c(1, 0.5, 0.5, -1, 0), "p2" = c(-1, 0.5, 0.5, 1/3, -0.7), "p3" = c(0, 0, -0.5, 1/3, -0.3), "p4" = c(0, -1, -0.5, 1/3, 1)) con1 <- contrasts_DI(object = m1, contrast_vars = contrast_vars) ## Calling summary on the returned object would show a significane table for the tests summary(con1) ## In the previous example, note how the species ID effects and interaction term ## was added to the contrast matrix. Additionally, the nitrogen50 and densityhigh ## variables were added with a value 0 ## We could also calculate the contrast at a specific value for these additional variables ## Calculate contrast at nitrogen = 50 for the p1 monoculture against p3 monoculture contrast_vars2 <- data.frame("p1" = 1, "p3" = -1, "nitrogen50" = 1) # Name contrast to track it rownames(contrast_vars2) <- "p1 mono vs p3 mono at 50 N" con2 <- contrasts_DI(object = m1, contrast_vars = contrast_vars2) ## Notice that p2 and p4 were added to the contrast with a value 0 summary(con2) ## It is also possible to specify the ID or interactions effects to have a set value contrast_vars3 <- data.frame("p2_ID" = 0.5, "p3_ID" = -0.5, "AV" = 0.375) con3 <- contrasts_DI(object = m1, contrast_vars = contrast_vars3) ## Notice the values for terms specified are preserved while others are calculated or set to 0 summary(con3) ############################################################################# ## Using the `contrast` agrument for creating contrasts ## Contrasts for difference between monocultures of p1 and p2, p3 and p4, p2 and p3 con4 <- contrasts_DI(object = m1, contrast = data.frame('p1vp2 Mono' = c(1, -1, 0, 0, 0, 0, 0), 'p3vp4 Mono' = c(0, 0, 1, -1, 0, 0, 0), 'p2vp3 Mono' = c(0, -1, 1, 0, 0, 0, 0))) summary(con4) ## Contrasts for 50:50 mixture of p1 and p2 vs 50:50 mixture of p3 and p4 con5 <- contrasts_DI(object = m1, contrast = data.frame('p1p2 vs p3p4' = c(0.5, 0.5, -0.5, -0.5, 0, 0, 0))) summary(con5) ## There is also a helper function called `contrast_matrix` to create the ## full contrast matrix with interaction terms and any additional variables ## in the model which can be modified further by the user and passed ## into the contrast parameter ## Compare the three species mixture of p1, p2 and p3 to the centroid mixture mix1 <- contrast_matrix(object = m1, contrast_vars = data.frame("p1" = 1/3, "p2" = 1/3, "p3" = 1/3)) mix2 <- contrast_matrix(object = m1, contrast_vars = data.frame("p1" = 1/4, "p2" = 1/4, "p3" = 1/4, "p4" = 1/4)) ## The interaction terms and nitrogen and density terms are all added ## Subtract the two vectors to get a contrast contr <- mix1 - mix2 con6 <- contrasts_DI(object = m1, contrast = contr) summary(con6) ## Could also modify a variable the returned output to test another contrast ## Suppose we wish to compare the three species mixture at 150 kg N vs the ## centroid at 50 kg N mix1[, "nitrogen50"] <- 0 mix2[, "nitrogen50"] <- 1 contr <- mix1 - mix2 con7 <- contrasts_DI(object = m1, contrast = contr) summary(con7)## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Fit a DI model m1 <- DI(y = "yield", prop = 4:7, treat = 'nitrogen', DImodel = 'AV', density = 'density', estimate_theta = FALSE, data = Switzerland) summary(m1) ## Create contrasts ## We test the following five sample contrasts ## 1. p1 monoculture against p2 monoculture ## 2. p1 and p2 binary (50-50) mixture against the p4 monoculture ## 3. p1 and p2 binary (50-50) mixture against p3 and p4 (50-50) binary mixture ## 4. Three species equi-proportional mixture (1/3 each) of p2, p3, and p4 against p1 monoculture ## 5. p4 monoculture against a 70-30 mixture of p2 and p3 respectively. contrast_vars <- data.frame("p1" = c(1, 0.5, 0.5, -1, 0), "p2" = c(-1, 0.5, 0.5, 1/3, -0.7), "p3" = c(0, 0, -0.5, 1/3, -0.3), "p4" = c(0, -1, -0.5, 1/3, 1)) con1 <- contrasts_DI(object = m1, contrast_vars = contrast_vars) ## Calling summary on the returned object would show a significane table for the tests summary(con1) ## In the previous example, note how the species ID effects and interaction term ## was added to the contrast matrix. Additionally, the nitrogen50 and densityhigh ## variables were added with a value 0 ## We could also calculate the contrast at a specific value for these additional variables ## Calculate contrast at nitrogen = 50 for the p1 monoculture against p3 monoculture contrast_vars2 <- data.frame("p1" = 1, "p3" = -1, "nitrogen50" = 1) # Name contrast to track it rownames(contrast_vars2) <- "p1 mono vs p3 mono at 50 N" con2 <- contrasts_DI(object = m1, contrast_vars = contrast_vars2) ## Notice that p2 and p4 were added to the contrast with a value 0 summary(con2) ## It is also possible to specify the ID or interactions effects to have a set value contrast_vars3 <- data.frame("p2_ID" = 0.5, "p3_ID" = -0.5, "AV" = 0.375) con3 <- contrasts_DI(object = m1, contrast_vars = contrast_vars3) ## Notice the values for terms specified are preserved while others are calculated or set to 0 summary(con3) ############################################################################# ## Using the `contrast` agrument for creating contrasts ## Contrasts for difference between monocultures of p1 and p2, p3 and p4, p2 and p3 con4 <- contrasts_DI(object = m1, contrast = data.frame('p1vp2 Mono' = c(1, -1, 0, 0, 0, 0, 0), 'p3vp4 Mono' = c(0, 0, 1, -1, 0, 0, 0), 'p2vp3 Mono' = c(0, -1, 1, 0, 0, 0, 0))) summary(con4) ## Contrasts for 50:50 mixture of p1 and p2 vs 50:50 mixture of p3 and p4 con5 <- contrasts_DI(object = m1, contrast = data.frame('p1p2 vs p3p4' = c(0.5, 0.5, -0.5, -0.5, 0, 0, 0))) summary(con5) ## There is also a helper function called `contrast_matrix` to create the ## full contrast matrix with interaction terms and any additional variables ## in the model which can be modified further by the user and passed ## into the contrast parameter ## Compare the three species mixture of p1, p2 and p3 to the centroid mixture mix1 <- contrast_matrix(object = m1, contrast_vars = data.frame("p1" = 1/3, "p2" = 1/3, "p3" = 1/3)) mix2 <- contrast_matrix(object = m1, contrast_vars = data.frame("p1" = 1/4, "p2" = 1/4, "p3" = 1/4, "p4" = 1/4)) ## The interaction terms and nitrogen and density terms are all added ## Subtract the two vectors to get a contrast contr <- mix1 - mix2 con6 <- contrasts_DI(object = m1, contrast = contr) summary(con6) ## Could also modify a variable the returned output to test another contrast ## Suppose we wish to compare the three species mixture at 150 kg N vs the ## centroid at 50 kg N mix1[, "nitrogen50"] <- 0 mix2[, "nitrogen50"] <- 1 contr <- mix1 - mix2 con7 <- contrasts_DI(object = m1, contrast = contr) summary(con7)
This function accepts a DImodel (i.e., regression models fit using the DI or autoDI functions) object and returns a short text summary describing the model.
describe_model(model)describe_model(model)
model |
A short text describing the supplied DImodel object.
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## Load the data data(sim2) ## Fit model mod_FG <- DI(y = "response", FG = c("G", "G", "L", "L"), prop = 3:6, data = sim2, DImodel = "FG") ## Describe model describe_model(mod_FG) mod_FULL <- DI(y = "response", estimate_theta = TRUE, prop = 3:6, data = sim2, DImodel = "FULL") describe_model(mod_FULL) mod_AV <- DI(y = "response", ID = c("ID1", "ID1", "ID2", "ID2"), estimate_theta = TRUE, prop = 3:6, data = sim2, DImodel = "AV") describe_model(mod_AV)## Load the data data(sim2) ## Fit model mod_FG <- DI(y = "response", FG = c("G", "G", "L", "L"), prop = 3:6, data = sim2, DImodel = "FG") ## Describe model describe_model(mod_FG) mod_FULL <- DI(y = "response", estimate_theta = TRUE, prop = 3:6, data = sim2, DImodel = "FULL") describe_model(mod_FULL) mod_AV <- DI(y = "response", ID = c("ID1", "ID1", "ID2", "ID2"), estimate_theta = TRUE, prop = 3:6, data = sim2, DImodel = "AV") describe_model(mod_AV)
This dataset contains a set of proportions p1 to p9 where each row sums to 1. It is used as the design matrix for simulating other datasets in the DImodels package.
data("design_a")data("design_a")
A data frame with 206 observations on the following 11 variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p9 values.
richnessA numeric vector indicating the number of species in the initial composition, i.e., the number of proportions from p1 to p9 that are >0.
p1A numeric vector indicating a proportion (of species 1).
p2A numeric vector indicating a proportion (of species 2).
p3A numeric vector indicating a proportion (of species 3).
p4A numeric vector indicating a proportion (of species 4).
p5A numeric vector indicating a proportion (of species 5).
p6A numeric vector indicating a proportion (of species 6).
p7A numeric vector indicating a proportion (of species 7).
p8A numeric vector indicating a proportion (of species 8).
p9A numeric vector indicating a proportion (of species 9).
The columns p1 to p9 form a simplex space.
## Load the design_a data data(design_a) ## View the first five entries head(design_a) ## Explore the variables in design_a str(design_a) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p9 are in the 3rd to 11th columns in design_a design_a_sums <- rowSums(design_a[3:11]) summary(design_a_sums)## Load the design_a data data(design_a) ## View the first five entries head(design_a) ## Explore the variables in design_a str(design_a) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p9 are in the 3rd to 11th columns in design_a design_a_sums <- rowSums(design_a[3:11]) summary(design_a_sums)
This dataset contains a set of proportions p1 to p6 where each row sums to 1. It is used as the design matrix for simulating other datasets in the DImodels package.
data("design_b")data("design_b")
A data frame with 47 observations on the following seven variables:
richnessA numeric vector indicating the number of species in the initial composition, i.e., the number of proportions from p1 to p6 that are >0.
p1A numeric vector indicating a proportion (of species 1).
p2A numeric vector indicating a proportion (of species 2).
p3A numeric vector indicating a proportion (of species 3).
p4A numeric vector indicating a proportion (of species 4).
p5A numeric vector indicating a proportion (of species 5).
p6A numeric vector indicating a proportion (of species 6).
The columns p1 to p6 form a simplex space.
## Load the design_b data data(design_b) ## View the first five entries head(design_b) ## Explore the variables in design_b str(design_b) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p6 are in the 2nd to 7th columns in design_b design_b_sums <- rowSums(design_b[2:7]) summary(design_b_sums)## Load the design_b data data(design_b) ## View the first five entries head(design_b) ## Explore the variables in design_b str(design_b) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p6 are in the 2nd to 7th columns in design_b design_b_sums <- rowSums(design_b[2:7]) summary(design_b_sums)
This function will fit a wide range of Diversity-Interactions (DI) models, one at a time. It provides some assisted automated ways to fit DI models, and includes the flexibility to extend DI models in several directions.
DI(y, prop, DImodel, custom_formula, data, block, density, treat, ID, FG, extra_formula, estimate_theta = FALSE, theta = 1)DI(y, prop, DImodel, custom_formula, data, block, density, treat, ID, FG, extra_formula, estimate_theta = FALSE, theta = 1)
The minimum required arguments to use DI are either:
Argument DImodel with data, y and prop, or
Argument custom_formula with data.
The DImodel argument allows fitting of DI models via a range of 'tag' options that determine the form of the species interactions terms (the tags, described below, are STR, ID, AV, FG, ADD and FULL) and extra terms can be added to the model using the extra_formula argument. Using the argument custom_formula requires full specification of the model to be fitted using standard lm or glm syntax.
y |
The column name of the response vector, which must be in quotes, for example, |
block |
The name of the block variable (if present), which must be in quotes, for example, |
density |
The name of the density variable (if present), which must be in quotes, for example, |
prop |
A vector of s column names identifying the species proportions in each community in the dataset. For example, if the species proportions columns are labelled p1 to p4, then |
treat |
The name of a column in the dataset containing the value of a treatment factor or covariate. The treatment name must be included in quotes, for example,
|
ID |
This argument takes a text list (of length s) dsecirbing groupings for the identity effects of the species. For example, if there are four species and you wish to group the identity effects all four species into a single term:
|
FG |
If species are classified by g functional groups, this argument takes a text list (of length s) of the functional group to which each species belongs. For example, for four grassland species with two grasses and two legumes: FG could be
|
DImodel |
This argument is chosen (over
Each of the following includes the species proportions as specified in
The DImodel tag should appear in quotes, for example, |
extra_formula |
In conjunction with |
custom_formula |
To specify your own DI model, write your own model formula using the |
data |
Specify the dataset, for example, |
estimate_theta |
By default, theta (the power parameter on all |
theta |
Users may specify a value of theta different than 1 to fit the DI model. Note that if |
What are Diversity-Interactions models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We recommend that users of the DImodels package read the short introduction to DI models (available at: DImodels). Further information on DI models is available in Kirwan et al 2009 and Connolly et al 2013.
Checks on data prior to using DI.
Before using DI, check that the species proportions for each row in your dataset sum to one. See the 'Examples' section for code to do this. An error message will be generated if the proportions don't sum to one.
How does the DI function work?
The DI function provides wide flexibility in the types of Diversity-Interactions (DI) models that can be fitted. There are two ways to fit models in DI: 1) using DImodel, possibly augmented by extra_formula, or 2) using custom_formula. Models are estimated using iteratively reweighted least squares, via the glm package, when the option estimate_theta = FALSE.
Consider the following DI model, for example (in R formula syntax):
y ~ p1 + p2 + p3 + treatment + p1:p2 + p1:p3 + p2:p3 + p1:p2:treatment + p1:p3:treatment + p2:p3:treatment
This model can be fitted using DImodel and extra_formula:
DI(y = "y", prop = c("p1", "p2", "p3"), treat = "nitrogen", DImodel = "FULL", extra_formula = ~ p1:p2:treatment + p1:p3:treatment + p2:p3:treatment, data = datasetname)
or, by specifying all of the terms in the model using custom_formula:
DI(custom_formula = y ~ p1 + p2 + p3 + treatment + p1:p2 + p1:p3 + p2:p3 + p1:p2:treatment + p1:p3:treatment + p2:p3:treatment, data = datasetname)
We recommend to use DImodel where possible, to augment with extra_formula where required, and to only use custom_formula when DImodel plus extra_formula is insufficient.
Including theta in DI models
A non-linear parameter can be included in DI models as a power on all components of each pairwise interaction variable. For example (in R formula syntax):
y ~ p1 + p2 + p3 + (p1:p2)^theta + (p1:p3)^theta + (p2:p3)^theta
for the full pairwise interaction model. Including alters the contribution of the interaction term to the response (Connolly et al 2013).
By default, the value of is 1. By specifying estimate_theta = TRUE within DI, a value of will be estimated using profile likelihood over the space = 0.01 to 1.5. The option estimate_theta = TRUE can only be used with DImodel, it is not available when using custom_formula.
As a general guideline to testing if is required, we recommend:
1) finding the best form of the species interaction terms assuming theta = 1, and then,
2) testing if theta differs from 1.
If no species interaction terms are needed, then there is no need to do any testing for theta.
A model object of class "glm" including the components detailed in glm, plus the following:
DIcall |
the call of the |
DIinternalcall |
an internal call made within the DI model fitting process |
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
Other examples using DI:
The Bell dataset examples.
The sim1 dataset examples.
The sim2 dataset examples.
The sim3 dataset examples.
The sim4 dataset examples.
The sim5 dataset examples.
The Switzerland dataset examples.
## Load the Switzerland data data(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Fit the a simple AV DImodel with theta = 1 mod <- DI(y = "yield", prop = c("p1", "p2", "p3", "p4"), DImodel = "AV", data = Switzerland) summary(mod) ## Fit the same model but group the 4 species identity effect into 2 groups mod <- DI(y = "yield", prop = c("p1", "p2", "p3", "p4"), ID = c("ID1", "ID1", "ID2", "ID2"), DImodel = "AV", data = Switzerland) summary(mod) ## Combine the four identity effects into a single term and estimate theta mod <- DI(y = "yield", prop = c("p1", "p2", "p3", "p4"), ID = c("ID1", "ID1", "ID1", "ID1"), DImodel = "AV", estimate_theta = TRUE, data = Switzerland) summary(mod) ## Fit the FG DImodel, with factors density and treatment, and with theta = 1 m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) summary(m1) ## Fit the FG DImodel, with factors density and treatment, and a fixed value of theta not equal to 1 m2 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland, theta = 0.5) summary(m2) ## Test if the identity effects interact with nitrogen (and main effect of nitrogen excluded) m3 <- DI(y = "yield", density = "density", prop = 4:7, FG = c("G", "G", "L", "L"), DImodel = "FG", extra_formula = ~ (p1 + p2 + p3 + p4):nitrogen, data = Switzerland) summary(m3) ## Fit the average pairwise model and check for a quadratic term using extra_formula. ## Need to create AV variable to be included in extra_formula and put in new dataset Switzerland2. AV_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "AV") Switzerland2 <- data.frame(Switzerland, "AV" = AV_variable) m4 <- DI(y = "yield", density = "density", prop = 4:7, DImodel = "AV", extra_formula = ~ I(AV**2), data = Switzerland2) summary(m4) ## Using the custom_formula option to fit some, but not all, of the FG interactions. ## Fit the FG DImodel using custom_formula, with factors density and treatment, and theta = 1. ## Need to create functional group interaction variables for inclusion in custom_formulaand put ## in new dataset Switzerland3. FG_matrix <- DI_data(prop = 4:7, FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland3 <- data.frame(Switzerland, FG_matrix) m5 <- DI(y = "yield", prop = c("p1","p2","p3","p4"), custom_formula = yield ~ 0 + p1 + p2 + p3 + p4 + bfg_G_L + wfg_G + density + nitrogen, data = Switzerland3) summary(m5)## Load the Switzerland data data(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Fit the a simple AV DImodel with theta = 1 mod <- DI(y = "yield", prop = c("p1", "p2", "p3", "p4"), DImodel = "AV", data = Switzerland) summary(mod) ## Fit the same model but group the 4 species identity effect into 2 groups mod <- DI(y = "yield", prop = c("p1", "p2", "p3", "p4"), ID = c("ID1", "ID1", "ID2", "ID2"), DImodel = "AV", data = Switzerland) summary(mod) ## Combine the four identity effects into a single term and estimate theta mod <- DI(y = "yield", prop = c("p1", "p2", "p3", "p4"), ID = c("ID1", "ID1", "ID1", "ID1"), DImodel = "AV", estimate_theta = TRUE, data = Switzerland) summary(mod) ## Fit the FG DImodel, with factors density and treatment, and with theta = 1 m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) summary(m1) ## Fit the FG DImodel, with factors density and treatment, and a fixed value of theta not equal to 1 m2 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland, theta = 0.5) summary(m2) ## Test if the identity effects interact with nitrogen (and main effect of nitrogen excluded) m3 <- DI(y = "yield", density = "density", prop = 4:7, FG = c("G", "G", "L", "L"), DImodel = "FG", extra_formula = ~ (p1 + p2 + p3 + p4):nitrogen, data = Switzerland) summary(m3) ## Fit the average pairwise model and check for a quadratic term using extra_formula. ## Need to create AV variable to be included in extra_formula and put in new dataset Switzerland2. AV_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "AV") Switzerland2 <- data.frame(Switzerland, "AV" = AV_variable) m4 <- DI(y = "yield", density = "density", prop = 4:7, DImodel = "AV", extra_formula = ~ I(AV**2), data = Switzerland2) summary(m4) ## Using the custom_formula option to fit some, but not all, of the FG interactions. ## Fit the FG DImodel using custom_formula, with factors density and treatment, and theta = 1. ## Need to create functional group interaction variables for inclusion in custom_formulaand put ## in new dataset Switzerland3. FG_matrix <- DI_data(prop = 4:7, FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland3 <- data.frame(Switzerland, FG_matrix) m5 <- DI(y = "yield", prop = c("p1","p2","p3","p4"), custom_formula = yield ~ 0 + p1 + p2 + p3 + p4 + bfg_G_L + wfg_G + density + nitrogen, data = Switzerland3) summary(m5)
This function fits the reference model internally and compares a DI model fit to it using anova.
DI_compare(model, ...)DI_compare(model, ...)
model |
A DI model object. |
... |
Further arguments passed to |
This function takes a DI model object as input, fits the reference model internally and compares the two models using anova. The reference model includes an additive effect of community (each unique combination of species proportions) in the linear predictor. For more details on the reference model, see Connolly et al. (2013).
The function returns the reference model, a glm model object.
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
## Load the sim1 data data(sim1) ## Fit the FULL model m1 <- DI(y = "response", block = "block", prop = 3:6, DImodel = "FULL", data = sim1) ## Compare with the reference model DI_compare(m1, test = "F")## Load the sim1 data data(sim1) ## Fit the FULL model m1 <- DI(y = "response", block = "block", prop = 3:6, DImodel = "FULL", data = sim1) ## Compare with the reference model DI_compare(m1, test = "F")
This function can be used to compute additional variables for the various types of interactions among pairs of species proportions. These variables are defined following Kirwan et al 2007 and 2009, and Connolly et al 2013.
The variables are denoted:
AV and E: the average pairwise interaction variable (AV) and a scaled version of it (E), each a single variable. The variable AV is the sum of products of the proportions of each pair of species in the mixture. The variable E is a scaled version of AV that ranges between 0 (for a monoculture community) to 1 for the equi-proportional mixture of all species in the pool.
FG: the functional group interaction variables. There is a variable for (within) each functional group and one for (between) each pair of functional groups, i.e., if there are two functional groups, there will be three functional group interaction variables, while if there are three functional groups, there will be six functional group interaction variables.
ADD: the additive species interaction variables, one for each species.
FULL: all individual pairwise interactions. There will be new variables created, where s is the number of species in the pool.
By default, the interaction variables described above are created with theta = 1, but a different value of theta can also be specified (Connolly et al 2013).
DI_data(prop, FG, data, theta = 1, what = c("E","AV","FG","ADD","FULL"))DI_data(prop, FG, data, theta = 1, what = c("E","AV","FG","ADD","FULL"))
prop |
A vector of column names identifying the species proportions in the dataset. For example, if the species proportions columns are labelled p1 to p4, then |
FG |
If species are classified by g functional groups, this argument takes a text list (of length s) of the functional group to which each species belongs. For example, for four grassland species with two grasses and two legumes, it could be |
data |
Specify the dataset, for example, |
theta |
Interaction variables will be computed with the theta power, equal to the value specified, on all |
what |
The interactions to be computed. By default each set of variables will be computed: AV, E, FG (if |
What are Diversity-Interactions models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We recommend that users of the DImodels package read the short introduction to DI models (available at: DImodels). Further information on DI models is available in Kirwan et al 2009 and Connolly et al 2013.
Checks on data prior to using the DI_data function.
Before using the DI_data function, check that the species proportions in each row of your dataset sum to one. See the 'Examples' section for code to do this. An error message will be generated if the proportions don't sum to one.
When is the DI_data function needed?
It is not required to use the DI_data function if using the autoDI function, or the DImodel option in the DI function, as they will automatically create the species interaction variables needed. If using species interaction variables in the extra_formula or custom_formula options in DI, then it is required to have the variables already in the dataset and DI_data can do that.
Short worked example to illustrate how the DI_data function works
The code to implement this example is provided in the 'Examples' section.
Assume four species with initial proportions in two communities: (0.1, 0.2, 0.3, 0.4) and (0.25, 0.25, 0.25, 0.25), with FG = c("G","G","L","L").
For community 1: (0.1,0.2,0.3,0.4), assuming theta = 1, the data preparation functions will compute the following additional variables (details in Kirwan et al 2007 and 2009):
AV = 0.1*0.2 + 0.1*0.3 + 0.1*0.4 + 0.2*0.3 + 0.2*0.4 + 0.3*0.4 = 0.35
E = (2s/(s-1))*AV = 0.9333
p1_add = 0.1 * (1 - 0.1) = 0.09
p2_add = 0.2 * (1 - 0.2) = 0.16
p3_add = 0.3 * (1 - 0.3) = 0.21
p4_add = 0.4 * (1 - 0.4) = 0.24
bfg_G_L = 0.1*0.3 + 0.1*0.4 + 0.2*0.3 + 0.2*0.4 = 0.21
wfg_G = 0.1*0.2 = 0.02
wfg_L = 0.3*0.4 = 0.12
For community 1: (0.1,0.2,0.3,0.4), assuming theta = 0.5, the data preparation functions will compute the follow additional variables (details in Connolly et al 2013):
AV = (0.1*0.2)^0.5 + (0.1*0.3)^0.5 + (0.1*0.4)^0.5 + (0.2*0.3)^0.5 + (0.2*0.4)^0.5 + (0.3*0.4)^0.5 = 1.3888
E =(2s/(s-1))*AV = 3.7035
p1_add = 0.1^0.5 * (0.2^0.5 + 0.3^0.5 + 0.4^0.5) = 0.5146
p2_add = 0.2^0.5 * (0.1^0.5 + 0.3^0.5 + 0.4^0.5) = 0.6692
p3_add = 0.3^0.5 * (0.1^0.5 + 0.2^0.5 + 0.4^0.5) = 0.7646
p4_add = 0.4^0.5 * (0.1^0.5 + 0.2^0.5 + 0.3^0.5) = 0.8293
bfg_G_L = (0.1*0.3)^0.5 + (0.1*0.4)^0.5 + (0.2*0.3)^0.5 + (0.2*0.4)^0.5 = 0.9010
wfg_G = (0.1*0.2)^0.5 = 0.1414
wfg_L = (0.3*0.4)^0.5 = 0.3464
When using the DI_data function to create interactions for theta values for a value different from 1, it is recommended to rename the new interaction variables to include _theta.
The data manipulation values for community 2 (0.25, 0.25, 0.25,0.25) can be seen when the 'Examples' section code is run.
DI_data returns a named list with one or more the following components (depending on the specification of the what argument:
E |
a numeric vector containing the evenness interaction variable |
AV |
a numeric vector containing the average pairwise interaction variable |
FG |
a matrix containing the functional group-related interaction variables |
ADD |
a matrix containing the additive species contributions interaction variables |
FULL |
a matrix containing all pairwise interactions |
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, A Lüscher, MT Sebastia, JA Finn, RP Collins, C Porqueddu, A Helgadottir, OH Baadshaug, C Brophy, C Coran, S Dalmannsdottir, I Delgado, A Elgersma, M Fothergill, BE Frankow-Lindberg, P Golinski, P Grieu, AM Gustavsson, M Höglind, O Huguenin-Elie, C Iliadis, M Jørgensen, Z Kadziuliene, T Karyotis, T Lunnan, M Malengier, S Maltoni, V Meyer, D Nyfeler, P Nykanen-Kurki, J Parente, HJ Smit, U Thumm, & J Connolly (2007) Evenness drives consistent diversity effects in intensive grassland systems across 28 European sites. Journal of Ecology, 95, 530-539.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
Other examples using the DI_data function:
The sim2 dataset examples.
The sim3 dataset examples.
The sim4 dataset examples.
The sim5 dataset examples.
The Switzerland dataset examples.
################################ #### Data manipulation for the Switzerland dataset ## Load the Switzerland data data(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Create FG interaction variables and incorporate them into a new data frame Switzerland2. ## Switzerland2 will contain the new variables: bfg_G_L, wfg_G and wfg_L. FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland2 <- data.frame(Switzerland, FG_matrix) ## Create FG interaction variables and incorporate them into a new data frame Switzerland3. ## Use theta = 0.5. FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = "FG", theta = 0.5) Switzerland3 <- data.frame(Switzerland, FG_matrix) ## Add "_theta" to the new interaction variables to differentiate from when theta = 1 names(Switzerland3)[9:11] <- paste0(names(Switzerland3)[9:11], "_theta") #### All interactions can be added to a new dataset all together: ## Create all interactions and add them to a new data frame called Switzerland4 newlist <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = c("E", "AV", "FG", "ADD", "FULL")) Switzerland4 <- data.frame(Switzerland, "E" = newlist$E, "AV" = newlist$AV, newlist$FG, newlist$ADD, newlist$FULL) #### Or the various interactions can also be added to a new dataset individually: ## Create the average pairwise interaction and evenness variables ## and store them in a new data frame called Switzerland5. ## Switzerland5 will contain the new variables: AV, E E_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "E") AV_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "AV") Switzerland5 <- data.frame(Switzerland, "AV" = AV_variable, "E" = E_variable) ## Create the functional group variables and add them to Switzerland5. ## In the FG names vector: G stands for grass, L stands for legume. ## Switzerland5 will contain: bfg_G_L, wfg_G and wfg_L FG_matrix <- DI_data(prop = 4:7, FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland5 <- data.frame(Switzerland5, FG_matrix) ## Create the additive species variables and add them to Switzerland5. ## Switzerland5 will contain the new variables: p1_add, p2_add, p3_add and p4_add. ADD_matrix <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "ADD") Switzerland5 <- data.frame(Switzerland5, ADD_matrix) ## Create all pairwise interaction variables and add them to Switzerland5. ## Switzerland5 will contain the new variables: p1.p2, p1.p3, p1.p4, p2.p3, p2.p4, p3.p4. FULL_matrix <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "FULL") Switzerland5 <- data.frame(Switzerland5, FULL_matrix) ################################ ################################ #### Short worked example (as illustrated the Details section) ## Create a dataframe p1 <- c(0.1, 0.25) p2 <- c(0.2, 0.25) p3 <- c(0.3, 0.25) p4 <- c(0.4, 0.25) minidataset1 <- data.frame(p1,p2,p3,p4) ## Check the rows sum to 1 rowSums(minidataset1[1:4]) ## Create the FG variables, assume two functional groups and theta = 1 FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = minidataset1, what = "FG") minidataset2 <- data.frame(minidataset1, FG_matrix) ## Create the FG variables, assume two functional groups and theta = 0.5 FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = minidataset1, what = "FG", theta = 0.5) minidataset3 <- data.frame(minidataset1, FG_matrix) ## Create the ADD variables, assume theta = 0.5 ADD_matrix <- DI_data(prop = c("p1","p2","p3","p4"), data = minidataset1, what = "ADD", theta = 0.5) minidataset3 <- data.frame(minidataset3, ADD_matrix) ## Add "_theta" to the new interaction variables to differentiate from when theta = 1 names(minidataset3)[5:11] <- paste0(names(minidataset3)[5:11], "_theta") ################################################################ #### Data manipulation for the Switzerland dataset ## Load the Switzerland data data(Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Create FG interaction variables and incorporate them into a new data frame Switzerland2. ## Switzerland2 will contain the new variables: bfg_G_L, wfg_G and wfg_L. FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland2 <- data.frame(Switzerland, FG_matrix) ## Create FG interaction variables and incorporate them into a new data frame Switzerland3. ## Use theta = 0.5. FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = "FG", theta = 0.5) Switzerland3 <- data.frame(Switzerland, FG_matrix) ## Add "_theta" to the new interaction variables to differentiate from when theta = 1 names(Switzerland3)[9:11] <- paste0(names(Switzerland3)[9:11], "_theta") #### All interactions can be added to a new dataset all together: ## Create all interactions and add them to a new data frame called Switzerland4 newlist <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = Switzerland, what = c("E", "AV", "FG", "ADD", "FULL")) Switzerland4 <- data.frame(Switzerland, "E" = newlist$E, "AV" = newlist$AV, newlist$FG, newlist$ADD, newlist$FULL) #### Or the various interactions can also be added to a new dataset individually: ## Create the average pairwise interaction and evenness variables ## and store them in a new data frame called Switzerland5. ## Switzerland5 will contain the new variables: AV, E E_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "E") AV_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "AV") Switzerland5 <- data.frame(Switzerland, "AV" = AV_variable, "E" = E_variable) ## Create the functional group variables and add them to Switzerland5. ## In the FG names vector: G stands for grass, L stands for legume. ## Switzerland5 will contain: bfg_G_L, wfg_G and wfg_L FG_matrix <- DI_data(prop = 4:7, FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland5 <- data.frame(Switzerland5, FG_matrix) ## Create the additive species variables and add them to Switzerland5. ## Switzerland5 will contain the new variables: p1_add, p2_add, p3_add and p4_add. ADD_matrix <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "ADD") Switzerland5 <- data.frame(Switzerland5, ADD_matrix) ## Create all pairwise interaction variables and add them to Switzerland5. ## Switzerland5 will contain the new variables: p1.p2, p1.p3, p1.p4, p2.p3, p2.p4, p3.p4. FULL_matrix <- DI_data(prop = c("p1","p2","p3","p4"), data = Switzerland, what = "FULL") Switzerland5 <- data.frame(Switzerland5, FULL_matrix) ################################ ################################ #### Short worked example (as illustrated the Details section) ## Create a dataframe p1 <- c(0.1, 0.25) p2 <- c(0.2, 0.25) p3 <- c(0.3, 0.25) p4 <- c(0.4, 0.25) minidataset1 <- data.frame(p1,p2,p3,p4) ## Check the rows sum to 1 rowSums(minidataset1[1:4]) ## Create the FG variables, assume two functional groups and theta = 1 FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = minidataset1, what = "FG") minidataset2 <- data.frame(minidataset1, FG_matrix) ## Create the FG variables, assume two functional groups and theta = 0.5 FG_matrix <- DI_data(prop = c("p1","p2","p3","p4"), FG = c("G","G","L","L"), data = minidataset1, what = "FG", theta = 0.5) minidataset3 <- data.frame(minidataset1, FG_matrix) ## Create the ADD variables, assume theta = 0.5 ADD_matrix <- DI_data(prop = c("p1","p2","p3","p4"), data = minidataset1, what = "ADD", theta = 0.5) minidataset3 <- data.frame(minidataset3, ADD_matrix) ## Add "_theta" to the new interaction variables to differentiate from when theta = 1 names(minidataset3)[5:11] <- paste0(names(minidataset3)[5:11], "_theta") ################################
DI Objects
Different methods that can be used with objects of class DI.
## S3 method for class 'DI' anova(object, ...) ## S3 method for class 'DI' AIC(object, ...) ## S3 method for class 'DI' BIC(object, ...) ## S3 method for class 'DI' AICc(obj) ## S3 method for class 'DI' BICc(obj) ## S3 method for class 'DI' logLik(object, ...) ## S3 method for class 'DI' fortify(model, data, ...)## S3 method for class 'DI' anova(object, ...) ## S3 method for class 'DI' AIC(object, ...) ## S3 method for class 'DI' BIC(object, ...) ## S3 method for class 'DI' AICc(obj) ## S3 method for class 'DI' BICc(obj) ## S3 method for class 'DI' logLik(object, ...) ## S3 method for class 'DI' fortify(model, data, ...)
object |
a |
obj |
a |
model |
a |
data |
data to which to add model fit statistics. Defaults to the model frame. |
... |
further arguments passed to |
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Internal functions used within the DI and autoDI functions.
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Extracts terms from a DI or autoDI model object that were calculated using the data preparation functions prior to the model fitting procedure.
extract(obj, what = c("FULL","ADD","FG","AV","E")) ## S3 method for class 'DI' extract(obj, what = c("FULL","ADD","FG","AV","E")) ## S3 method for class 'autoDI' extract(obj, what = c("FULL","ADD","FG","AV","E"))extract(obj, what = c("FULL","ADD","FG","AV","E")) ## S3 method for class 'DI' extract(obj, what = c("FULL","ADD","FG","AV","E")) ## S3 method for class 'autoDI' extract(obj, what = c("FULL","ADD","FG","AV","E"))
obj |
a |
what |
the variable(s) to be extracted from the object. |
A list containing one or more of the following elements:
E |
the evenness interaction variable |
AV |
the average pairwise interaction variable |
FG |
the functional group-related interaction variables |
ADD |
the additive species contributions interaction variables |
FULL |
all pairwise interactions |
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## Load the Switzerland data data(Switzerland) ## Fit the FG DImodel, with factors density and treatment, and with theta = 1 m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) ## Extract the FG terms extract(m1, what = "FG")## Load the Switzerland data data(Switzerland) ## Fit the FG DImodel, with factors density and treatment, and with theta = 1 m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) ## Extract the FG terms extract(m1, what = "FG")
Generates predictions for a fitted DI models object and, optionally, the associated standard errors for those predictions.
## S3 method for class 'DI' predict(object, newdata, se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, weights = 1, type = c("link", "response", "terms"), ...)## S3 method for class 'DI' predict(object, newdata, se.fit = FALSE, interval = c("none", "confidence", "prediction"), level = 0.95, weights = 1, type = c("link", "response", "terms"), ...)
object |
|
newdata |
optionally, a data frame in which to look for variables with which to predict. If omitted, predictions will be made for data used to fit the model. |
se.fit |
A logical switch indicating whether to calculate the associated standard errors. |
interval |
The type of interval to be calculated. Accepts one of "none", "confidence" or "prediction", with "none"" being the default. |
level |
The level (1 - |
weights |
The variance weights for calculating the prediction intervals. By default all values get the same weight of 1. Can also specify a numeric vector of same length as number of rows in |
type |
the type of prediction required. One of "link", "response" or "terms". The default "link" is on the scale of the linear predictors while "response" is on the scale of the response variable. The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale. |
... |
further arguments passed to or from other methods. For eg: |
If newdata doesn't contain all predictors from the model, necessary numeric predictors will be added in with a value of 0, while categorical predictors will be added in with their baseline value. See examples for more information.
if se.fit = FALSE, a vector of predictions is returned.
if se.fit = TRUE, a list with the following components is returned.
fit |
Predictions, as for |
se.fit |
Estimated standard error for each prediction |
residual.scale |
A scalar giving the square root of the dispersion used in computing the standard errors. |
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Luescher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Luescher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Fit a DI model m1 <- DI(y = "yield", prop = 4:7, treat = 'nitrogen', DImodel = 'AV', density = 'density', estimate_theta = TRUE, data = Switzerland) summary(m1) ## Prediction without newdata, full dataset from model fit will be used predict(m1, se.fit = TRUE) ## Prediction with newdata newdata1 <- data.frame('p1' = c(1,0), 'p2' = c(0,1), 'p3' = c(0,0), 'p4' = c(0,0), 'nitrogen' = c(50, 150), 'density' = c('low','high')) predict(m1, newdata = newdata1, se.fit = TRUE) ## If any categorical variable is missing, the baseline level ## of the variable is used for prediction newdata2 <- newdata1[, -5] print(newdata2) predict(m1, newdata = newdata2) ## If a numerical variable is missing, the value 0 is used for it newdata3 <- newdata1[, -c(3,4)] print(newdata3) predict(m1, newdata = newdata3)## Load the Switzerland data data(Switzerland) ## Summarise the Switzerland data summary(Switzerland) ## Fit a DI model m1 <- DI(y = "yield", prop = 4:7, treat = 'nitrogen', DImodel = 'AV', density = 'density', estimate_theta = TRUE, data = Switzerland) summary(m1) ## Prediction without newdata, full dataset from model fit will be used predict(m1, se.fit = TRUE) ## Prediction with newdata newdata1 <- data.frame('p1' = c(1,0), 'p2' = c(0,1), 'p3' = c(0,0), 'p4' = c(0,0), 'nitrogen' = c(50, 150), 'density' = c('low','high')) predict(m1, newdata = newdata1, se.fit = TRUE) ## If any categorical variable is missing, the baseline level ## of the variable is used for prediction newdata2 <- newdata1[, -5] print(newdata2) predict(m1, newdata = newdata2) ## If a numerical variable is missing, the value 0 is used for it newdata3 <- newdata1[, -c(3,4)] print(newdata3) predict(m1, newdata = newdata3)
This function provides an automated way to fit the richness model and a (limited) range of Diversity-Interactions (DI) models.
richness_vs_DI(y, prop, data, extra_formula)richness_vs_DI(y, prop, data, extra_formula)
y |
The column name of the response vector, which must be in quotes, for example, |
prop |
A vector of s column names identifying the species proportions in each row in the dataset. For example, if the species proportions columns are labelled p1 to p4, then |
data |
Specify the dataset, for example, |
extra_formula |
Additional terms can be added using |
Connolly et al. (2013; Appendix 1) shows that there is an equivalence between DI models and different types of richness models (linear and nonlinear predictors using richness in different scales).
This function fits four models and compares them using AIC. The four models are:
1. The richness model
2. The average pairwise interactions (AV) DI model with common identity effects and theta equal to 0.5 (which is equivalent to model 1 when communities are all equi-proportional);
3. The average pairwise interactions (AV) DI model with common identity effects and estimating theta;
4. The average pairwise interactions (AV) DI model allowing for unique identity effects, but maintaining theta equal to 0.5;
5. The average pairwise interactions (AV) DI model allowing for unique identity effects, and estimating theta.
The function prints a table with AIC, AICc, BIC, and associated degrees of freedom for each of the four models above, and returns the model with the smallest AIC.
The function returns the final selected model, an object of class DI or lm.
Rafael A. Moral, John Connolly, Rishabh Vishwakarma and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
## Load the Switzerland data data(Switzerland) ## compare the richness model with DI alternatives richness_vs_DI(y = "yield", prop = 4:7, data = Switzerland) ## include the density effects in the linear predictors of the four models richness_vs_DI(y = "yield", prop = 4:7, data = Switzerland, extra_formula = ~ density)## Load the Switzerland data data(Switzerland) ## compare the richness model with DI alternatives richness_vs_DI(y = "yield", prop = 4:7, data = Switzerland) ## include the density effects in the linear predictors of the four models richness_vs_DI(y = "yield", prop = 4:7, data = Switzerland, extra_formula = ~ density)
The sim0 dataset was simulated. There are four replicates and three species that vary in proportions (p1 - p3). There are 16 unique sets of proportions identified by the variable community. The response was simulated assuming that there were species identity effects and separate pairwise interactions effects.
data(sim0)data(sim0)
A data frame with 64 observations on the following six variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p3 values.
richnessA numeric vector indicating the number of species in the initial composition, i.e., the number of proportions from p1 to p3 that are >0.
p1A numeric vector indicating the initial proportion of species 1.
p2A numeric vector indicating the initial proportion of species 2.
p3A numeric vector indicating the initial proportion of species 3.
responseA numeric vector giving the simulated response variable.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Parameter values for the simulation
DI models take the general form of:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density.
The dataset sim0 was simulated with:
identity effects for the four species with values = 25, 20, 15
all 3 pairwise interaction effects with values: 30, 20, 40 (for pairs of species 1-2, 1-3, and 2-3, respectively).
assumed normally distributed with mean 0 and standard deviation 2.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## load the sim0 dataset data(sim0) ## Find the best DI model using autoDI and F-test selection auto1 <- autoDI(y = "response", prop = c("p1","p2","p3"), data = sim0, selection = "Ftest") summary(auto1) ## Fit the FULL model using DI and the ID tag m1 <- DI(y = "response", prop = c("p1","p2","p3"), DImodel = "FULL", data = sim0) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1)## load the sim0 dataset data(sim0) ## Find the best DI model using autoDI and F-test selection auto1 <- autoDI(y = "response", prop = c("p1","p2","p3"), data = sim0, selection = "Ftest") summary(auto1) ## Fit the FULL model using DI and the ID tag m1 <- DI(y = "response", prop = c("p1","p2","p3"), DImodel = "FULL", data = sim0) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1)
This dataset contains a set of proportions p1 to p3 where each row sums to 1. It is used as the design matrix for simulating other datasets in the DImodels package.
data("sim0_design")data("sim0_design")
A data frame with 16 observations on the following 5 variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p3 values.
richnessA numeric vector indicating the number of species in the initial composition, i.e., the number of proportions from p1 to p3 that are >0.
p1A numeric vector indicating a proportion (of species 1).
p2A numeric vector indicating a proportion (of species 2).
p3A numeric vector indicating a proportion (of species 3).
The columns p1 to p3 form a simplex space.
## Load the sim0_design data data(sim0_design) ## View the first five entries head(sim0_design) ## Explore the variables in design_a str(sim0_design) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p3 are in the 3rd to 5th columns in sim0_design sim0_design_sums <- rowSums(sim0_design[3:5]) summary(sim0_design)## Load the sim0_design data data(sim0_design) ## View the first five entries head(sim0_design) ## Explore the variables in design_a str(sim0_design) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p3 are in the 3rd to 5th columns in sim0_design sim0_design_sums <- rowSums(sim0_design[3:5]) summary(sim0_design)
The sim1 dataset was simulated. There are four blocks and four species that vary in proportions (p1 - p4). There are 15 unique sets of proportions identified by the variable community. Each unique community appears once in each block. The response was simulated assuming that there were species identity effects and block effects, but no diversity effects.
data(sim1)data(sim1)
A data frame with 60 observations on the following seven variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p4 values.
blockA factor taking values 1 to 4 indicating block membership.
p1A numeric vector indicating the initial proportion of species 1.
p2A numeric vector indicating the initial proportion of species 2.
p3A numeric vector indicating the initial proportion of species 3.
p4A numeric vector indicating the initial proportion of species 4.
responseA numeric vector giving the simulated response variable.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Parameter values for the simulation
DI models take the general form of:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density.
The dataset sim1 was simulated with:
identity effects for the four species with values = 10, 9, 8, 7
block effects for the four blocks with values = 1, 1.5, 2, 0
no interaction effects
assumed normally distributed with mean 0 and standard deviation 1.1.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
#################################### ## Code to simulate the sim1 dataset ## Simulate dataset sim1 with species identity effects and block effects, but no interaction effect ## Use the proportions from the first fifteen plots in Switzerland data(Switzerland) ## Repeat the 15 plots over four blocks. ## Give each community type a unique (community) number. sim1 <- data.frame(community = rep(1:15, each = 4), block = factor(rep(1:4, times = 15)), p1 = rep(Switzerland$p1[1:15], each = 4), p2 = rep(Switzerland$p2[1:15], each = 4), p3 = rep(Switzerland$p3[1:15], each = 4), p4 = rep(Switzerland$p4[1:15], each = 4)) ## To simulate the response, first create a matrix of predictors that includes ## p1-p4 and the four block dummy variables. X <- model.matrix(~ p1 + p2 + p3 + p4 + block -1, data = sim1) ## Create a vector of 'known' parameter values for simulating the response. ## The first four are the p1-p4 parameters, the second four are the block effects. sim1_coeff <- c(10,9,8,7, 1,1.5,2,0) ## Create response and add normally distributed error sim1$response <- as.numeric(X %*% sim1_coeff) set.seed(2020) r <- rnorm(n = 60, mean = 0, sd = 1.1) sim1$response <- round(sim1$response + r, digits = 3) ########################### ## Analyse the sim1 dataset ## Load the sim1 data data(sim1) ## View the first few entries head(sim1) ## Explore the variables in sim1 str(sim1) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 3rd to 6th columns in sim1 sim1sums <- rowSums(sim1[3:6]) summary(sim1sums) ## Check characteristics of sim1 hist(sim1$response) summary(sim1$response) plot(sim1$p1, sim1$response) plot(sim1$p2, sim1$response) plot(sim1$p3, sim1$response) plot(sim1$p4, sim1$response) ## Find the best DI model using autoDI and F-test selection auto1 <- autoDI(y = "response", prop = c("p1","p2","p3","p4"), block = "block", data = sim1, selection = "Ftest") summary(auto1) ## Fit the identity model using DI and the ID tag m1 <- DI(y = "response", prop = c("p1","p2","p3","p4"), block = "block", DImodel = "ID", data = sim1) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Fit the identity model using DI and custom_formula m2 <- DI(y = "response", custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + block, data = sim1) summary(m2)#################################### ## Code to simulate the sim1 dataset ## Simulate dataset sim1 with species identity effects and block effects, but no interaction effect ## Use the proportions from the first fifteen plots in Switzerland data(Switzerland) ## Repeat the 15 plots over four blocks. ## Give each community type a unique (community) number. sim1 <- data.frame(community = rep(1:15, each = 4), block = factor(rep(1:4, times = 15)), p1 = rep(Switzerland$p1[1:15], each = 4), p2 = rep(Switzerland$p2[1:15], each = 4), p3 = rep(Switzerland$p3[1:15], each = 4), p4 = rep(Switzerland$p4[1:15], each = 4)) ## To simulate the response, first create a matrix of predictors that includes ## p1-p4 and the four block dummy variables. X <- model.matrix(~ p1 + p2 + p3 + p4 + block -1, data = sim1) ## Create a vector of 'known' parameter values for simulating the response. ## The first four are the p1-p4 parameters, the second four are the block effects. sim1_coeff <- c(10,9,8,7, 1,1.5,2,0) ## Create response and add normally distributed error sim1$response <- as.numeric(X %*% sim1_coeff) set.seed(2020) r <- rnorm(n = 60, mean = 0, sd = 1.1) sim1$response <- round(sim1$response + r, digits = 3) ########################### ## Analyse the sim1 dataset ## Load the sim1 data data(sim1) ## View the first few entries head(sim1) ## Explore the variables in sim1 str(sim1) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 3rd to 6th columns in sim1 sim1sums <- rowSums(sim1[3:6]) summary(sim1sums) ## Check characteristics of sim1 hist(sim1$response) summary(sim1$response) plot(sim1$p1, sim1$response) plot(sim1$p2, sim1$response) plot(sim1$p3, sim1$response) plot(sim1$p4, sim1$response) ## Find the best DI model using autoDI and F-test selection auto1 <- autoDI(y = "response", prop = c("p1","p2","p3","p4"), block = "block", data = sim1, selection = "Ftest") summary(auto1) ## Fit the identity model using DI and the ID tag m1 <- DI(y = "response", prop = c("p1","p2","p3","p4"), block = "block", DImodel = "ID", data = sim1) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Fit the identity model using DI and custom_formula m2 <- DI(y = "response", custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + block, data = sim1) summary(m2)
The sim2 dataset was simulated. There are four blocks and four species that vary in proportions (p1 - p4). There are 15 unique sets of proportions identified by the variable community. Each unique community appears once in each block. The response was simulated assuming that there were species identity effects, block effects, an average pairwise interaction effect and a theta value of 0.5.
data(sim2)data(sim2)
A data frame with 60 observations on the following seven variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p4 values.
blockA factor taking values 1 to 4 indicating block membership.
p1A numeric vector indicating the initial proportion of species 1.
p2A numeric vector indicating the initial proportion of species 2.
p3A numeric vector indicating the initial proportion of species 3.
p4A numeric vector indicating the initial proportion of species 4.
responseA numeric vector giving the simulated response variable.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Parameter values for the simulation
DI models take the general form of:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density.
The dataset sim2 was simulated with:
identity effects for the four species with values = 10, 9, 8, 7
block effects for the four blocks with values = 1, 1.5, 2, 0
an average pairwise interaction effect = 8
theta = 0.5 (where is a non-linear parameter included as a power on each product within interaction variables, see Connolly et al 2013 for details)
assumed normally distributed with mean 0 and standard deviation 1.1.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
#################################### ## Code to simulate the sim2 dataset ## Simulate dataset sim2 with species identity effects, block effects and ## an average pairwise interaction effect with theta=0.5. ## Use the proportions from the first fifteen plots in Switzerland data(Switzerland) ## Repeat the 15 plots over four blocks. ## Give each community type a unique (community) number. sim2 <- data.frame(community = rep(1:15, each = 4), block = factor(rep(1:4, times = 15)), p1 = rep(Switzerland$p1[1:15], each = 4), p2 = rep(Switzerland$p2[1:15], each = 4), p3 = rep(Switzerland$p3[1:15], each = 4), p4 = rep(Switzerland$p4[1:15], each = 4)) ## Create the average pairwise interaction variable, with theta = 0.5 AV_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = sim2, theta = 0.5, what = "AV") sim2 <- data.frame(sim2, "AV_theta" = AV_variable) ## To simulate the response, first create a matrix of predictors that includes p1-p4 and ## the four block variables and the average pairwise interaction variable with theta=0.5. X <- model.matrix(~ p1 + p2 + p3 + p4 + block + AV_theta -1, data = sim2) ## Create a vector of 'known' parameter values for simulating the response. ## The first four are the p1-p4 parameters, the second four are the block effects and ## the last one is the interaction parameter. sim2_coeff <- c(10,9,8,7, 1,1.5,2,0, 8) ## Create response and add normally distributed error sim2$response <- as.numeric(X %*% sim2_coeff) set.seed(328781) r <- rnorm(n = 60, mean = 0, sd = 1.1) sim2$response <- round(sim2$response + r, digits = 3) sim2$AV_theta <- NULL ################################################################################################### ################################################################################################### ## sim2 ########################### ## Analyse the sim2 dataset ## Load the sim2 data data(sim2) ## View the first few entries head(sim2) ## Explore the variables in sim2 str(sim2) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 3rd to 6th columns in sim2 sim2sums <- rowSums(sim2[3:6]) summary(sim2sums) ## Check characteristics of sim2 hist(sim2$response) summary(sim2$response) plot(sim2$p1, sim2$response) plot(sim2$p2, sim2$response) plot(sim2$p3, sim2$response) plot(sim2$p4, sim2$response) ## Find the best DI model using autoDI and F-test selection auto1 <- autoDI(y = "response", prop = c("p1", "p2", "p3", "p4"), block = "block", data = sim2, selection = "Ftest") summary(auto1) ## Fit the average pairwise model, including theta, using DI and the AV tag m1 <- DI(y = "response", prop = c("p1","p2","p3","p4"), block = "block", DImodel = "AV", estimate_theta = TRUE, data = sim2) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 plot(m1) library(hnp) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta") ## Fit the average pairwise model, including theta, using DI and custom_formula ## A value of theta must be 'chosen'. Take: 0.4533437 from m1. The 'estimate_theta' option is not ## available with custom_formula. AV_variable <- DI_data(prop = c(3:6), data = sim2, theta = 0.4533437, what = "AV") sim2a <- data.frame(sim2, "AV_theta" = AV_variable) m2 <- DI(y = "response", custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + AV_theta + block, data = sim2a) ## This will adjust the standard errors in m2 for the 'estimation' of theta m2$df.residual <- m2$df.residual - 1 ## This will adjust the AIC in m2 for the 'estimation' of theta m2$aic <- m2$aic + 2 summary(m2)#################################### ## Code to simulate the sim2 dataset ## Simulate dataset sim2 with species identity effects, block effects and ## an average pairwise interaction effect with theta=0.5. ## Use the proportions from the first fifteen plots in Switzerland data(Switzerland) ## Repeat the 15 plots over four blocks. ## Give each community type a unique (community) number. sim2 <- data.frame(community = rep(1:15, each = 4), block = factor(rep(1:4, times = 15)), p1 = rep(Switzerland$p1[1:15], each = 4), p2 = rep(Switzerland$p2[1:15], each = 4), p3 = rep(Switzerland$p3[1:15], each = 4), p4 = rep(Switzerland$p4[1:15], each = 4)) ## Create the average pairwise interaction variable, with theta = 0.5 AV_variable <- DI_data(prop = c("p1","p2","p3","p4"), data = sim2, theta = 0.5, what = "AV") sim2 <- data.frame(sim2, "AV_theta" = AV_variable) ## To simulate the response, first create a matrix of predictors that includes p1-p4 and ## the four block variables and the average pairwise interaction variable with theta=0.5. X <- model.matrix(~ p1 + p2 + p3 + p4 + block + AV_theta -1, data = sim2) ## Create a vector of 'known' parameter values for simulating the response. ## The first four are the p1-p4 parameters, the second four are the block effects and ## the last one is the interaction parameter. sim2_coeff <- c(10,9,8,7, 1,1.5,2,0, 8) ## Create response and add normally distributed error sim2$response <- as.numeric(X %*% sim2_coeff) set.seed(328781) r <- rnorm(n = 60, mean = 0, sd = 1.1) sim2$response <- round(sim2$response + r, digits = 3) sim2$AV_theta <- NULL ################################################################################################### ################################################################################################### ## sim2 ########################### ## Analyse the sim2 dataset ## Load the sim2 data data(sim2) ## View the first few entries head(sim2) ## Explore the variables in sim2 str(sim2) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 3rd to 6th columns in sim2 sim2sums <- rowSums(sim2[3:6]) summary(sim2sums) ## Check characteristics of sim2 hist(sim2$response) summary(sim2$response) plot(sim2$p1, sim2$response) plot(sim2$p2, sim2$response) plot(sim2$p3, sim2$response) plot(sim2$p4, sim2$response) ## Find the best DI model using autoDI and F-test selection auto1 <- autoDI(y = "response", prop = c("p1", "p2", "p3", "p4"), block = "block", data = sim2, selection = "Ftest") summary(auto1) ## Fit the average pairwise model, including theta, using DI and the AV tag m1 <- DI(y = "response", prop = c("p1","p2","p3","p4"), block = "block", DImodel = "AV", estimate_theta = TRUE, data = sim2) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 plot(m1) library(hnp) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta") ## Fit the average pairwise model, including theta, using DI and custom_formula ## A value of theta must be 'chosen'. Take: 0.4533437 from m1. The 'estimate_theta' option is not ## available with custom_formula. AV_variable <- DI_data(prop = c(3:6), data = sim2, theta = 0.4533437, what = "AV") sim2a <- data.frame(sim2, "AV_theta" = AV_variable) m2 <- DI(y = "response", custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + AV_theta + block, data = sim2a) ## This will adjust the standard errors in m2 for the 'estimation' of theta m2$df.residual <- m2$df.residual - 1 ## This will adjust the AIC in m2 for the 'estimation' of theta m2$aic <- m2$aic + 2 summary(m2)
The sim3 dataset was simulated. There are two treatments and nine species that vary in proportions (p1 - p9). It is assumed that species 1 to 5 come from functional group 1, species 6 and 7 from functional group 2 and species 8 and 9 from functional group 3. The response was simulated assuming that there were species identity effects and functional group specific interaction effects.
data(sim3)data(sim3)
A data frame with 412 observations on the following 13 variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p9 values.
richnessA numeric vector identifying the number of species in the initial composition.
treatmentA factor with levels A or B.
p1A numeric vector indicating the initial proportion of species 1.
p2A numeric vector indicating the initial proportion of species 2.
p3A numeric vector indicating the initial proportion of species 3.
p4A numeric vector indicating the initial proportion of species 4.
p5A numeric vector indicating the initial proportion of species 5.
p6A numeric vector indicating the initial proportion of species 6.
p7A numeric vector indicating the initial proportion of species 7.
p8A numeric vector indicating the initial proportion of species 8.
p9A numeric vector indicating the initial proportion of species 9.
responseA numeric vector giving the simulated response variable.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Parameter values for the simulation
DI models take the general form of:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density.
The dataset sim3 was simulated with:
identity effects for the nine species with values = 10, 9, 8, 7, 11, 6, 5, 8, 9
treatment effects = 3, 0
functional group specific interact effects; assume functional groups are labelled FG1, FG2 and FG3, then the interaction parameter values are: between FG1 and FG2 = 4, between FG1 and FG3 = 9, between FG2 and FG3 = 3, within FG1 = 2, within FG2 = 3 and within FG3 = 1
theta = 1 (where is a non-linear parameter included as a power on each product within interaction variables, see Connolly et al 2013 for details)
assumed normally distributed with mean 0 and standard deviation 1.2.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
#################################### ## Code to simulate the sim3 dataset ## Simulate dataset sim3 with 9 species, three functional groups and two levels of a treatment. ## The species 1-5 are FG1, species 6-7 are FG2 and species 8-9 are FG3. ## Assume ID effects and the FG interaction model, with a treatment (factor with two levels). ## Set up proportions data("design_a") sim3a <- design_a # Replicate the design over two treatments sim3b <- sim3a[rep(seq_len(nrow(sim3a)), each = 2), ] sim3c <- data.frame(treatment = factor(rep(c("A","B"), times = 206))) sim3 <- data.frame(sim3b[,1:2], sim3c, sim3b[,3:11]) row.names(sim3) <- NULL ## Create the functional group interaction variables FG_matrix <- DI_data(prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim3, what = "FG") sim3 <- data.frame(sim3, FG_matrix) ## To simulate the response, first create a matrix of predictors that includes p1-p9, the treatment ## and the interaction variables. X <- model.matrix(~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + treatment + bfg_FG1_FG2 + bfg_FG1_FG3 + bfg_FG2_FG3 + wfg_FG1 + wfg_FG2 + wfg_FG3 -1, data=sim3) ## Create a vector of 'known' parameter values for simulating the response. ## The first nine are the p1-p9 parameters, and the second set of two are the treatment effects ## and the third set of six are the interaction parameters. sim3_coeff <- c(10,9,8,7,11, 6,5, 8,9, 3,0, 4,9,3, 2,3,1) ## Create response and add normally distributed error sim3$response <- as.numeric(X %*% sim3_coeff) set.seed(1657914) r <- rnorm(n = 412, mean = 0, sd = 1.2) sim3$response <- round(sim3$response + r, digits = 3) sim3[,13:18] <- NULL ########################### ## Analyse the sim3 dataset ## Load the sim3 data data(sim3) ## View the first few entries head(sim3) ## Explore the variables in sim3 str(sim3) ## Check characteristics of sim3 hist(sim3$response) summary(sim3$response) plot(sim3$richness, sim3$response) plot(sim3$p1, sim3$response) plot(sim3$p2, sim3$response) plot(sim3$p3, sim3$response) plot(sim3$p4, sim3$response) plot(sim3$p5, sim3$response) plot(sim3$p6, sim3$response) plot(sim3$p7, sim3$response) plot(sim3$p8, sim3$response) plot(sim3$p9, sim3$response) ## What model fits best? Selection using F-test in autoDI auto1 <- autoDI(y = "response", prop = 4:12, treat = "treatment", FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim3, selection = "Ftest") summary(auto1) ## Fit the functional group model, with treatment, using DI and the FG tag m1 <- DI(y = "response", prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), treat = "treatment", DImodel = "FG", data = sim3) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Create the functional group interactions and store them in a new dataset FG_matrix <- DI_data(prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim3, what = "FG") sim3a <- data.frame(sim3, FG_matrix) ## Test if the FG interaction variables interact with treatment using 'extra_formula' m2 <- DI(y = "response", prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), treat = "treatment", DImodel = "FG", extra_formula = ~ bfg_FG1_FG2:treatment + bfg_FG1_FG3:treatment + bfg_FG2_FG3:treatment + wfg_FG1:treatment + wfg_FG2:treatment + wfg_FG3:treatment, data = sim3a) summary(m2) ## Fit the functional group model using DI and custom_formula ## Set up a dummy variable for treatment first (required). sim3a$treatmentA <- as.numeric(sim3a$treatment=="A") m3 <- DI(y = "response", custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + treatmentA + bfg_FG1_FG2 + bfg_FG1_FG3 + bfg_FG2_FG3 + wfg_FG1 + wfg_FG2 + wfg_FG3, data = sim3a) summary(m3)#################################### ## Code to simulate the sim3 dataset ## Simulate dataset sim3 with 9 species, three functional groups and two levels of a treatment. ## The species 1-5 are FG1, species 6-7 are FG2 and species 8-9 are FG3. ## Assume ID effects and the FG interaction model, with a treatment (factor with two levels). ## Set up proportions data("design_a") sim3a <- design_a # Replicate the design over two treatments sim3b <- sim3a[rep(seq_len(nrow(sim3a)), each = 2), ] sim3c <- data.frame(treatment = factor(rep(c("A","B"), times = 206))) sim3 <- data.frame(sim3b[,1:2], sim3c, sim3b[,3:11]) row.names(sim3) <- NULL ## Create the functional group interaction variables FG_matrix <- DI_data(prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim3, what = "FG") sim3 <- data.frame(sim3, FG_matrix) ## To simulate the response, first create a matrix of predictors that includes p1-p9, the treatment ## and the interaction variables. X <- model.matrix(~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + treatment + bfg_FG1_FG2 + bfg_FG1_FG3 + bfg_FG2_FG3 + wfg_FG1 + wfg_FG2 + wfg_FG3 -1, data=sim3) ## Create a vector of 'known' parameter values for simulating the response. ## The first nine are the p1-p9 parameters, and the second set of two are the treatment effects ## and the third set of six are the interaction parameters. sim3_coeff <- c(10,9,8,7,11, 6,5, 8,9, 3,0, 4,9,3, 2,3,1) ## Create response and add normally distributed error sim3$response <- as.numeric(X %*% sim3_coeff) set.seed(1657914) r <- rnorm(n = 412, mean = 0, sd = 1.2) sim3$response <- round(sim3$response + r, digits = 3) sim3[,13:18] <- NULL ########################### ## Analyse the sim3 dataset ## Load the sim3 data data(sim3) ## View the first few entries head(sim3) ## Explore the variables in sim3 str(sim3) ## Check characteristics of sim3 hist(sim3$response) summary(sim3$response) plot(sim3$richness, sim3$response) plot(sim3$p1, sim3$response) plot(sim3$p2, sim3$response) plot(sim3$p3, sim3$response) plot(sim3$p4, sim3$response) plot(sim3$p5, sim3$response) plot(sim3$p6, sim3$response) plot(sim3$p7, sim3$response) plot(sim3$p8, sim3$response) plot(sim3$p9, sim3$response) ## What model fits best? Selection using F-test in autoDI auto1 <- autoDI(y = "response", prop = 4:12, treat = "treatment", FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim3, selection = "Ftest") summary(auto1) ## Fit the functional group model, with treatment, using DI and the FG tag m1 <- DI(y = "response", prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), treat = "treatment", DImodel = "FG", data = sim3) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Create the functional group interactions and store them in a new dataset FG_matrix <- DI_data(prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim3, what = "FG") sim3a <- data.frame(sim3, FG_matrix) ## Test if the FG interaction variables interact with treatment using 'extra_formula' m2 <- DI(y = "response", prop = 4:12, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), treat = "treatment", DImodel = "FG", extra_formula = ~ bfg_FG1_FG2:treatment + bfg_FG1_FG3:treatment + bfg_FG2_FG3:treatment + wfg_FG1:treatment + wfg_FG2:treatment + wfg_FG3:treatment, data = sim3a) summary(m2) ## Fit the functional group model using DI and custom_formula ## Set up a dummy variable for treatment first (required). sim3a$treatmentA <- as.numeric(sim3a$treatment=="A") m3 <- DI(y = "response", custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + treatmentA + bfg_FG1_FG2 + bfg_FG1_FG3 + bfg_FG2_FG3 + wfg_FG1 + wfg_FG2 + wfg_FG3, data = sim3a) summary(m3)
The sim4 dataset was simulated. There is a covariate treatment and six species that vary in proportions (p1 - p6). It is assumed that species 1 and 2 come from functional group 1, species 3 and 4 from functional group 2 and species 5 and 6 from functional group 3. The response was simulated assuming that there were species identity effects, separate pairwise interaction effects and a covariate effect.
data(sim4)data(sim4)
A data frame with 141 observations on the following nine variables:
richnessA numeric vector identifying the number of species in the initial composition.
treatmentA covariate taking values 50, 150 or 250.
p1A numeric vector indicating the initial proportion of species 1.
p2A numeric vector indicating the initial proportion of species 2.
p3A numeric vector indicating the initial proportion of species 3.
p4A numeric vector indicating the initial proportion of species 4.
p5A numeric vector indicating the initial proportion of species 5.
p6A numeric vector indicating the initial proportion of species 6.
responseA numeric vector giving the simulated response variable.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Parameter values for the simulation
DI models take the general form of:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density.
The dataset sim4 was simulated with:
identity effects for the six species with values = 25, 16, 18, 20, 10, 12
a covariate effect = 0.03
all 15 pairwise interaction effects with values: 30, 27, 20, 15, 10, 9, 14, 18, 36, 17, 26, 32, 9, 21, 16 (for pairs of species 1-2, 1-3, 1-4, 1-5, 1-6, 2-3, 2-4, ... , 5-6 respectively).
theta = 1 (where is a non-linear parameter included as a power on each product within interaction variables, see Connolly et al 2013 for details)
assumed normally distributed with mean 0 and standard deviation 2.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
#################################### ## Code to simulate the sim4 dataset ## Simulate dataset sim4 with 6 species, three functional groups and three levels of a covariate ## The species 1-2 are FG1, species 3-4 are FG2 and species 5-6 are FG3. ## Assume ID effects and the full pairwise interaction model, with a covariate. ## Set up proportions data("design_b") sim4a <- design_b # Replicate the design for three values of a covariate sim4b <- sim4a[rep(seq_len(nrow(sim4a)), times = 3), ] sim4c <- data.frame(treatment = rep(c(50, 150, 250), each = 47)) sim4 <- data.frame(richness = sim4b[,1], sim4c, sim4b[,2:7]) row.names(sim4) <- NULL ## To simulate the response, first create a matrix of predictors that includes p1-p6, the treatment ## and all pairwise interaction variables X <- model.matrix(~ p1 + p2 + p3 + p4 + p5 + p6 + treatment + (p1 + p2 + p3 + p4 + p5 + p6)^2 -1, data = sim4) ## Create a vector of 'known' parameter values for simulating the response. ## The first six are the p1-p6 parameters, and the second set of one is the treatment parameter ## and the third set of 15 are the interaction parameters. sim4_coeff <- c(25,16,18,20,10,12, 0.03, 30,27,20,15,10,9,14,18,36,17,26,32,9,21,16) ## Create response and add normally distributed error sim4$response <- as.numeric(X %*% sim4_coeff) set.seed(34261) r <- rnorm(n = 141, mean = 0, sd = 2) sim4$response <- round(sim4$response + r, digits = 3) ########################### ## Analyse the sim4 dataset ## Load the sim4 data data(sim4) ## View the first few entries head(sim4) ## Explore the variables in sim4 str(sim4) ## Check characteristics of sim4 hist(sim4$response) summary(sim4$response) plot(sim4$richness, sim4$response) plot(sim4$richness[sim4$treatment==50], sim4$response[sim4$treatment==50], ylim=c(0,40)) plot(sim4$richness[sim4$treatment==150], sim4$response[sim4$treatment==150], ylim=c(0,40)) plot(sim4$richness[sim4$treatment==250], sim4$response[sim4$treatment==250], ylim=c(0,40)) plot(sim4$p1, sim4$response) plot(sim4$p2, sim4$response) plot(sim4$p3, sim4$response) plot(sim4$p4, sim4$response) plot(sim4$p5, sim4$response) plot(sim4$p6, sim4$response) ## What model fits best? Selection using F-test auto1 <- autoDI(y = "response", prop = 3:8, treat = "treatment", FG = c("FG1","FG1","FG2","FG2","FG3","FG3"), data = sim4, selection = "Ftest") summary(auto1) ## Ignore functional groups (will replace FG model with ADD model in Step 1 selection) auto2 <- autoDI(y = "response", prop = 3:8, treat = "treatment", data = sim4, selection = "Ftest") summary(auto2) ## Fit the functional group model using DI and the FG tag m1 <- DI(y = "response", prop = 3:8, treat = "treatment", FG = c("FG1","FG1","FG2","FG2","FG3","FG3"), DImodel = "FG", data = sim4) summary(m1) ## Fit the additive species model using DI and the ADD tag m2 <- DI(y = "response", prop = 3:8, treat = "treatment", DImodel = "ADD", data = sim4) summary(m2) ## Fit the full pairwise model using DI and the FULL tag m3 <- DI(y = "response", prop = 3:8, treat = "treatment", DImodel = "FULL", data = sim4) summary(m3) plot(m3) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m3) ## Create the functional group and additive species interaction variables, ## and store in a new data frame called sim4a newlist <- DI_data(prop = 3:8, FG = c("FG1","FG1","FG2","FG2","FG3","FG3"), data = sim4, what = c("FG", "ADD")) sim4a <- data.frame(sim4, newlist$FG, newlist$ADD) ## Fit the functional group model using DI and custom_formula (equivalent to m1) m4 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + bfg_FG1_FG2 + bfg_FG1_FG3 + bfg_FG2_FG3 + wfg_FG1 + wfg_FG2 + wfg_FG3 + treatment, data = sim4a) summary(m4) ## Fit the additive species model using DI and custom_formula (equivalent to m2) m5 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + p1_add + p2_add + p3_add + p4_add + p5_add + p6_add + treatment, data = sim4a) summary(m5) ## Fit the full pairwise model using DI and custom_formula (equivalent to m3) m6 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + (p1 + p2 + p3 + p4 + p5 + p6)^2 + treatment, data = sim4a) summary(m6)#################################### ## Code to simulate the sim4 dataset ## Simulate dataset sim4 with 6 species, three functional groups and three levels of a covariate ## The species 1-2 are FG1, species 3-4 are FG2 and species 5-6 are FG3. ## Assume ID effects and the full pairwise interaction model, with a covariate. ## Set up proportions data("design_b") sim4a <- design_b # Replicate the design for three values of a covariate sim4b <- sim4a[rep(seq_len(nrow(sim4a)), times = 3), ] sim4c <- data.frame(treatment = rep(c(50, 150, 250), each = 47)) sim4 <- data.frame(richness = sim4b[,1], sim4c, sim4b[,2:7]) row.names(sim4) <- NULL ## To simulate the response, first create a matrix of predictors that includes p1-p6, the treatment ## and all pairwise interaction variables X <- model.matrix(~ p1 + p2 + p3 + p4 + p5 + p6 + treatment + (p1 + p2 + p3 + p4 + p5 + p6)^2 -1, data = sim4) ## Create a vector of 'known' parameter values for simulating the response. ## The first six are the p1-p6 parameters, and the second set of one is the treatment parameter ## and the third set of 15 are the interaction parameters. sim4_coeff <- c(25,16,18,20,10,12, 0.03, 30,27,20,15,10,9,14,18,36,17,26,32,9,21,16) ## Create response and add normally distributed error sim4$response <- as.numeric(X %*% sim4_coeff) set.seed(34261) r <- rnorm(n = 141, mean = 0, sd = 2) sim4$response <- round(sim4$response + r, digits = 3) ########################### ## Analyse the sim4 dataset ## Load the sim4 data data(sim4) ## View the first few entries head(sim4) ## Explore the variables in sim4 str(sim4) ## Check characteristics of sim4 hist(sim4$response) summary(sim4$response) plot(sim4$richness, sim4$response) plot(sim4$richness[sim4$treatment==50], sim4$response[sim4$treatment==50], ylim=c(0,40)) plot(sim4$richness[sim4$treatment==150], sim4$response[sim4$treatment==150], ylim=c(0,40)) plot(sim4$richness[sim4$treatment==250], sim4$response[sim4$treatment==250], ylim=c(0,40)) plot(sim4$p1, sim4$response) plot(sim4$p2, sim4$response) plot(sim4$p3, sim4$response) plot(sim4$p4, sim4$response) plot(sim4$p5, sim4$response) plot(sim4$p6, sim4$response) ## What model fits best? Selection using F-test auto1 <- autoDI(y = "response", prop = 3:8, treat = "treatment", FG = c("FG1","FG1","FG2","FG2","FG3","FG3"), data = sim4, selection = "Ftest") summary(auto1) ## Ignore functional groups (will replace FG model with ADD model in Step 1 selection) auto2 <- autoDI(y = "response", prop = 3:8, treat = "treatment", data = sim4, selection = "Ftest") summary(auto2) ## Fit the functional group model using DI and the FG tag m1 <- DI(y = "response", prop = 3:8, treat = "treatment", FG = c("FG1","FG1","FG2","FG2","FG3","FG3"), DImodel = "FG", data = sim4) summary(m1) ## Fit the additive species model using DI and the ADD tag m2 <- DI(y = "response", prop = 3:8, treat = "treatment", DImodel = "ADD", data = sim4) summary(m2) ## Fit the full pairwise model using DI and the FULL tag m3 <- DI(y = "response", prop = 3:8, treat = "treatment", DImodel = "FULL", data = sim4) summary(m3) plot(m3) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m3) ## Create the functional group and additive species interaction variables, ## and store in a new data frame called sim4a newlist <- DI_data(prop = 3:8, FG = c("FG1","FG1","FG2","FG2","FG3","FG3"), data = sim4, what = c("FG", "ADD")) sim4a <- data.frame(sim4, newlist$FG, newlist$ADD) ## Fit the functional group model using DI and custom_formula (equivalent to m1) m4 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + bfg_FG1_FG2 + bfg_FG1_FG3 + bfg_FG2_FG3 + wfg_FG1 + wfg_FG2 + wfg_FG3 + treatment, data = sim4a) summary(m4) ## Fit the additive species model using DI and custom_formula (equivalent to m2) m5 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + p1_add + p2_add + p3_add + p4_add + p5_add + p6_add + treatment, data = sim4a) summary(m5) ## Fit the full pairwise model using DI and custom_formula (equivalent to m3) m6 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + (p1 + p2 + p3 + p4 + p5 + p6)^2 + treatment, data = sim4a) summary(m6)
The sim5 dataset was simulated. There are nine species that vary in proportions (p1 - p9). It is assumed that species 1 to 5 come from functional group 1, species 6 and 7 from functional group 2 and species 8 and 9 from functional group 3. The response was simulated assuming that there were species identity effects and functional group specific interaction effects, with theta = 0.7.
data(sim5)data(sim5)
A data frame with 206 observations on the following 12 variables:
communityA numeric vector identifying each unique community, i.e., two rows with the same community value also share the same set of p1 to p9 values.
richnessA numeric vector identifying the number of species in the initial composition.
p1A numeric vector indicating the initial proportion of species 1.
p2A numeric vector indicating the initial proportion of species 2.
p3A numeric vector indicating the initial proportion of species 3.
p4A numeric vector indicating the initial proportion of species 4.
p5A numeric vector indicating the initial proportion of species 5.
p6A numeric vector indicating the initial proportion of species 6.
p7A numeric vector indicating the initial proportion of species 7.
p8A numeric vector indicating the initial proportion of species 8.
p9A numeric vector indicating the initial proportion of species 9.
responseA numeric vector giving the simulated response variable.
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Parameter values for the simulation
DI models take the general form of:
where y is a community-level response, the Identities are the effects of species identities and enter the model as individual species proportions at the beginning of the time period, the Interactions are the interactions among the species proportions, while Structures include other experimental structures such as blocks, treatments or density.
The dataset sim5 was simulated with:
identity effects for the nine species with values = 10, 9, 8, 7, 11, 6, 5, 8, 9
functional group specific interaction effects; assume functional groups are labelled FG1, FG2 and FG3, then the interaction parameter values are: between FG1 and FG2 = 8, between FG1 and FG3 = 3, between FG2 and FG3 = 6, within FG1 = 6, within FG2 = 4 and within FG3 = 5
theta = 0.7 (where is a non-linear parameter included as a power on each product within interaction variables, see Connolly et al 2013 for details)
assumed normally distributed with mean 0 and standard deviation 1.2.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
#################################### ## Code to simulate the sim5 dataset ## Simulate dataset sim5 with 9 species and three functional groups. ## The species 1-5 are FG1, species 6-7 are FG2 and species 8-9 are FG3. ## Assume ID effects and the FG interactions model, with theta = 0.7. ## Set up proportions data("design_a") sim5 <- design_a ## Create the functional group interaction variables, with theta = 0.7. FG_matrix <- DI_data(prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim5, theta = 0.7, what = "FG") sim5 <- data.frame(sim5, FG_matrix) names(sim5)[12:17] <- paste0(names(sim5)[12:17], "_theta") ## To simulate the response, first create a matrix of predictors that includes p1-p9, the ## treatment and the interaction variables. X <- model.matrix(~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + bfg_FG1_FG2_theta + bfg_FG1_FG3_theta + bfg_FG2_FG3_theta + wfg_FG1_theta + wfg_FG2_theta + wfg_FG3_theta -1, data = sim5) ## Create a vector of 'known' parameter values for simulating the response. ## The first nine are the p1-p9 parameters, and the second set of six are the interaction ## parameters. sim5_coeff <- c(10,9,8,7,11, 6,5, 8,9, 8,3,6, 6,4,5) ##Create response and add normally distributed error sim5$response <- as.numeric(X %*% sim5_coeff) set.seed(35748) r <- rnorm(n = 206, mean = 0, sd = 1.2) sim5$response <- round(sim5$response + r, digits = 3) sim5[,12:17] <- NULL ########################### ## Analyse the sim5 dataset ## Load the sim5 data data(sim5) ## View the first few entries head(sim5) ## Explore the variables in sim5 str(sim5) ## Check characteristics of sim5 hist(sim5$response) summary(sim5$response) plot(sim5$richness, sim5$response) plot(sim5$p1, sim5$response) plot(sim5$p2, sim5$response) plot(sim5$p3, sim5$response) plot(sim5$p4, sim5$response) plot(sim5$p5, sim5$response) plot(sim5$p6, sim5$response) plot(sim5$p7, sim5$response) plot(sim5$p8, sim5$response) plot(sim5$p9, sim5$response) ## What model fits best? Selection using F-test in autoDI auto1 <- autoDI(y = "response", prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim5, selection = "Ftest") summary(auto1) ## Fit the functional group model, with theta, using DI and the FG tag m1 <- DI(y = "response", prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), DImodel = "FG", estimate_theta = TRUE, data = sim5) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta") ## Fit the functional group model, with theta set equal to the estimate from m1, and custom_formula. ## Note, it is not possible to estimate theta with custom_formula (only select a 'known' value). ## First, create the functional group interactions (theta value as estimated from m1), ## store them in a new dataset and rename them with a theta indicator. FG_matrix <- DI_data(prop = 3:11, FG=c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), theta = 0.7296887, data = sim5, what = "FG") sim5new <- data.frame(sim5, FG_matrix) names(sim5new)[13:18] <- paste0(names(sim5new)[13:18], "_theta") m2 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + bfg_FG1_FG2_theta + bfg_FG1_FG3_theta + bfg_FG2_FG3_theta + wfg_FG1_theta + wfg_FG2_theta + wfg_FG3_theta, data = sim5new) ## This will adjust the standard errors in m2 for the 'estimation' of theta m2$df.residual <- m2$df.residual - 1 ## This will adjust the AIC in m2 for the 'estimation' of theta m2$aic <- m2$aic + 2 summary(m2)#################################### ## Code to simulate the sim5 dataset ## Simulate dataset sim5 with 9 species and three functional groups. ## The species 1-5 are FG1, species 6-7 are FG2 and species 8-9 are FG3. ## Assume ID effects and the FG interactions model, with theta = 0.7. ## Set up proportions data("design_a") sim5 <- design_a ## Create the functional group interaction variables, with theta = 0.7. FG_matrix <- DI_data(prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim5, theta = 0.7, what = "FG") sim5 <- data.frame(sim5, FG_matrix) names(sim5)[12:17] <- paste0(names(sim5)[12:17], "_theta") ## To simulate the response, first create a matrix of predictors that includes p1-p9, the ## treatment and the interaction variables. X <- model.matrix(~ p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + bfg_FG1_FG2_theta + bfg_FG1_FG3_theta + bfg_FG2_FG3_theta + wfg_FG1_theta + wfg_FG2_theta + wfg_FG3_theta -1, data = sim5) ## Create a vector of 'known' parameter values for simulating the response. ## The first nine are the p1-p9 parameters, and the second set of six are the interaction ## parameters. sim5_coeff <- c(10,9,8,7,11, 6,5, 8,9, 8,3,6, 6,4,5) ##Create response and add normally distributed error sim5$response <- as.numeric(X %*% sim5_coeff) set.seed(35748) r <- rnorm(n = 206, mean = 0, sd = 1.2) sim5$response <- round(sim5$response + r, digits = 3) sim5[,12:17] <- NULL ########################### ## Analyse the sim5 dataset ## Load the sim5 data data(sim5) ## View the first few entries head(sim5) ## Explore the variables in sim5 str(sim5) ## Check characteristics of sim5 hist(sim5$response) summary(sim5$response) plot(sim5$richness, sim5$response) plot(sim5$p1, sim5$response) plot(sim5$p2, sim5$response) plot(sim5$p3, sim5$response) plot(sim5$p4, sim5$response) plot(sim5$p5, sim5$response) plot(sim5$p6, sim5$response) plot(sim5$p7, sim5$response) plot(sim5$p8, sim5$response) plot(sim5$p9, sim5$response) ## What model fits best? Selection using F-test in autoDI auto1 <- autoDI(y = "response", prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), data = sim5, selection = "Ftest") summary(auto1) ## Fit the functional group model, with theta, using DI and the FG tag m1 <- DI(y = "response", prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), DImodel = "FG", estimate_theta = TRUE, data = sim5) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta") ## Fit the functional group model, with theta set equal to the estimate from m1, and custom_formula. ## Note, it is not possible to estimate theta with custom_formula (only select a 'known' value). ## First, create the functional group interactions (theta value as estimated from m1), ## store them in a new dataset and rename them with a theta indicator. FG_matrix <- DI_data(prop = 3:11, FG=c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), theta = 0.7296887, data = sim5, what = "FG") sim5new <- data.frame(sim5, FG_matrix) names(sim5new)[13:18] <- paste0(names(sim5new)[13:18], "_theta") m2 <- DI(custom_formula = response ~ 0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9 + bfg_FG1_FG2_theta + bfg_FG1_FG3_theta + bfg_FG2_FG3_theta + wfg_FG1_theta + wfg_FG2_theta + wfg_FG3_theta, data = sim5new) ## This will adjust the standard errors in m2 for the 'estimation' of theta m2$df.residual <- m2$df.residual - 1 ## This will adjust the AIC in m2 for the 'estimation' of theta m2$aic <- m2$aic + 2 summary(m2)
This dataset comes from a grassland biodiversity experiment that was conducted in Switzerland as part of the "Agrodiversity Experiment" (Kirwan et al 2014). A total of 68 grassland plots were established across a gradient of species diversity, and two additional treatments (nitrogen fertiliser and total seed density) were also manipulated. The proportions of four species were varied across the plots: there were plots with 100% of a single species, and 2- and 4-species mixtures with varying proportions (e.g., (0.5, 0.5, 0, 0) and (0.7, 0.1, 0.1, 0.1)). Nitrogen fertiliser was either 50 or 100 kg N per annum and total seed density was either low or high. Total annual yield per plot was recorded for the first year after establishment. An analysis of the Switzerland dataset is presented in Kirwan et al 2009.
data("Switzerland")data("Switzerland")
A data frame with 68 observations on the following 8 variables:
plotA numeric vector uniquely identifying each of the 68 plots.
nitrogenA factor with two levels: "50" or "150" to indicate the level of nitrogen fertiliser (kg N per annum) applied to the plot.
densityA factor with two levels: "low" and "high" to indicate the level of total seed density used when sowing the plot.
p1A numeric vector indicating the proportion of species 1 in the plot. Species 1 was the grass species Lolium perenne.
p2A numeric vector indicating the proportion of species 2 in the plot. Species 2 was the grass species Dactylis glomerata.
p3A numeric vector indicating the proportion of species 3 in the plot. Species 3 was the legume species Trifolium pratense.
p4A numeric vector indicating the proportion of species 4 in the plot. Species 4 was the legume species Trifolium repens.
yieldA numeric vector giving the total dry matter yield for the plot (tonnes per hectare per annum).
What are Diversity-Interactions (DI) models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We strongly recommend that users read the short introduction to Diversity-Interactions models (available at: DImodels). Further information on Diversity-Interactions models is also available in Kirwan et al 2009 and Connolly et al 2013.
Functional groups
In Ecology, species can be categorised into 'functional groups' based on their traits and functions. Here, the four species comprise two grasses (species 1 and 2) and two legumes (species 3 and 4); this is one possible 'functional group' categorisation of the four species.
Kirwan L, J Connolly, C Brophy, O Baadshaug, G Belanger, A Black, T Carnus, R Collins, J Čop, I Delgado, A De Vliegher, A Elgersma, B Frankow-Lindberg, P Golinski, P Grieu, AM Gustavsson, Á Helgadóttir, M Höglind, O Huguenin-Elie, M Jørgensen, Ž Kadžiuliene, T Lunnan, A Lüscher, P Kurki, C Porqueddu, MT Sebastia, U Thumm, D Walmsley and JA Finn (2014) The Agrodiversity Experiment: three years of data from a multisite study in intensively managed grasslands. Ecology, 95, 2680.
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## Load the Switzerland data data(Switzerland) ## View the first few entries head(Switzerland) ## Explore the variables in Switzerland str(Switzerland) ## Histogram of the response variable yield hist(Switzerland$yield) ## Explore the marginal relationship between yield and each predictor plot(Switzerland$p1, Switzerland$yield) plot(Switzerland$p2, Switzerland$yield) plot(Switzerland$p3, Switzerland$yield) plot(Switzerland$p4, Switzerland$yield) boxplot(yield ~ nitrogen, data = Switzerland) boxplot(yield ~ density, data = Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Model selection by F-test auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", FG = c("G","G","L","L"), data = Switzerland, selection = "Ftest") summary(auto1) ## Fit the model chosen by autoDI using DI m1 <- DI(y = "yield", density = "density", prop = 4:7, DImodel = "FG", FG = c("G","G","L","L"), data = Switzerland) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Set up the functional group interactions and add to a new Switzerland2 dataset FG_matrix <- DI_data(prop = 4:7, FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland2 <- data.frame(Switzerland, FG_matrix) ## Additional model testing using DI to test for interactions with nitrogen m2 <- DI(y = "yield", block = "density", prop = 4:7, DImodel = "FG", FG = c("G","G","L","L"), data = Switzerland2, extra_formula = ~ nitrogen:bfg_G_L) summary(m2)## Load the Switzerland data data(Switzerland) ## View the first few entries head(Switzerland) ## Explore the variables in Switzerland str(Switzerland) ## Histogram of the response variable yield hist(Switzerland$yield) ## Explore the marginal relationship between yield and each predictor plot(Switzerland$p1, Switzerland$yield) plot(Switzerland$p2, Switzerland$yield) plot(Switzerland$p3, Switzerland$yield) plot(Switzerland$p4, Switzerland$yield) boxplot(yield ~ nitrogen, data = Switzerland) boxplot(yield ~ density, data = Switzerland) ## Check that the proportions sum to 1 (required for DI models) ## p1 to p4 are in the 4th to 7th columns in Switzerland Switzerlandsums <- rowSums(Switzerland[4:7]) summary(Switzerlandsums) ## Model selection by F-test auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"), treat = "nitrogen", FG = c("G","G","L","L"), data = Switzerland, selection = "Ftest") summary(auto1) ## Fit the model chosen by autoDI using DI m1 <- DI(y = "yield", density = "density", prop = 4:7, DImodel = "FG", FG = c("G","G","L","L"), data = Switzerland) summary(m1) plot(m1) ## Check goodness-of-fit using a half-normal plot with a simulated envelope library(hnp) hnp(m1) ## Set up the functional group interactions and add to a new Switzerland2 dataset FG_matrix <- DI_data(prop = 4:7, FG = c("G","G","L","L"), data = Switzerland, what = "FG") Switzerland2 <- data.frame(Switzerland, FG_matrix) ## Additional model testing using DI to test for interactions with nitrogen m2 <- DI(y = "yield", block = "density", prop = 4:7, DImodel = "FG", FG = c("G","G","L","L"), data = Switzerland2, extra_formula = ~ nitrogen:bfg_G_L) summary(m2)
This function allows the computation of a confidence interval for theta from a model object created from DI that includes the argument estimate_theta = TRUE, or from certain model objects created from autoDI.
A description of the non-linear parameter theta is available in Connolly et al 2013.
theta_CI(obj, conf = .95, n =100)theta_CI(obj, conf = .95, n =100)
obj |
DI model object. |
conf |
Confidence level of the interval. The default is 0.95. |
n |
Number of subintervals in which the root is sought. The default is 100. |
The confidence interval calculated here is based on the values obtained when profiling the log-likelihood function for different values of theta. It is obtained in four steps:
1. define a grid of values for theta ranging from 0.01 to 2.5 of length 101 including the profile likelihood estimate of theta
2. fit the DI model setting theta equal to each value in the grid and obtain the log-likelihood value corresponding to each value of theta
3. obtain linear interpolations between the log-likelihood () values (here we use approxfun)
4. calculate the lower and upper values of the CI by obtaining the values of theta corresponding to a log-likelihood value of , where is the maximum value of the profile log-likelihood obtained in the grid and is the % percentile of the chi-squared distribution with 1 d.f.
When fitting any DI model setting estimate_theta = TRUE, steps 1 and 2 are automatically done within the DI function call. The theta_CI function performs steps 3 and 4 above to return the CI.
Note that when maximising the log-likelihood to find the estimate for theta, the parametric space is limited between 0.01 and 1.5. The larger grid (up to 2.5) is constructed to allow for obtaining the upper bound of the confidence interval in case the estimate of theta is close to 1.5.
The function returns a named numeric vector with two values: the lower and upper limits of the conf*100% CI for theta.
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Other examples using the theta_CI function:
The Bell dataset examples.
The sim2 dataset examples.
The sim5 dataset examples.
## Load the sim5 data data(sim5) ## View the first five entries head(sim5) ## Explore the variables in sim5 str(sim5) ## Fit the functional group model, with theta, using DI and the FG tag m1 <- DI(y = "response", prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), DImodel = "FG", estimate_theta = TRUE, data = sim5) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta")## Load the sim5 data data(sim5) ## View the first five entries head(sim5) ## Explore the variables in sim5 str(sim5) ## Fit the functional group model, with theta, using DI and the FG tag m1 <- DI(y = "response", prop = 3:11, FG = c("FG1","FG1","FG1","FG1","FG1","FG2","FG2","FG3","FG3"), DImodel = "FG", estimate_theta = TRUE, data = sim5) summary(m1) CI_95 <- theta_CI(m1, conf = .95) CI_95 ## Graph the profile likelihood library(ggplot2) ggplot(m1$profile_loglik, aes(x = grid, y = prof)) + theme_bw() + geom_line() + xlim(0,1.5) + xlab(expression(theta)) + ylab("Log-likelihood") + geom_vline(xintercept = CI_95, lty = 3) + labs(title = " Log-likelihood versus theta", caption = "dotted vertical lines are upper and lower bounds of 95% CI for theta")
Update the call to the DI function. This function allows users to update a previous call to the DI function by only updating the relevant arguments from DI, without the need to write out the full DI code again.
update_DI(object, ...)update_DI(object, ...)
object |
a |
... |
other arguments passed to the |
A model object of class "glm" including the components detailed in glm, plus the following:
DIcall |
the call of the |
DIinternalcall |
an internal call made within the DI model fitting process |
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
## Load the Switzerland data data(Switzerland) ## Fit the FG DImodel, with factors density and treatment, and with theta = 1 m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) summary(m1) ## Fit the FG DImodel, with factors density and treatment, and theta estimated m2 <- update_DI(m1, estimate_theta = TRUE) summary(m2) ## Fit the FULL DImodel, with factors density and treatment, and theta estimated m3 <- update_DI(m1, DImodel = "FULL", estimate_theta = TRUE) summary(m3)## Load the Switzerland data data(Switzerland) ## Fit the FG DImodel, with factors density and treatment, and with theta = 1 m1 <- DI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen", FG = c("G","G","L","L"), DImodel = "FG", data = Switzerland) summary(m1) ## Fit the FG DImodel, with factors density and treatment, and theta estimated m2 <- update_DI(m1, estimate_theta = TRUE) summary(m2) ## Fit the FULL DImodel, with factors density and treatment, and theta estimated m3 <- update_DI(m1, DImodel = "FULL", estimate_theta = TRUE) summary(m3)