GxE interaction

The performance of a plant is determined by three major factors: genes, environment and the interaction between genes and environment.

Genes are the building blocks of all living things. The genes present in a plant affect its productivity. A gene may influence how tall or short a plant is, or it may protect the plant from a particular disease. By collecting genetic data on the plants they develop, scientists are able to make predictions about a plant’s productivity.

In addition to genes, a plant’s health and productivity are directly affected by the environment (weather and soil) in which it is grown. Plants need water and sunlight. However, too much rain can cause disease or flooding, and too much heat, especially in the absence of rainfall, can decrease productivity. The type of soil also has an effect on a plant. For example, a plant grown in soil that holds more water than average will better withstand an extended period of low rainfall. By characterizing the environments in which plants are grown, we can better understand how plants react to different environments. Scientists do this by precisely measuring the weather and soil in all growing locations.

A particular plant is adapted to grow best in a particular region due to many factors, including the length of the growing season (determined roughly by the time between the last frost in the spring and the first frost in the fall), expected rainfall, temperature, solar radiation, soil types, and others. Some plants may tolerate drought better than others. Some plants may prefer a soil that is sandy while others prefer clay. When plants with different genes respond differently to the same environment in this way, the effect is called a gene-by-environment (GxE) interaction. The environment activates certain genes that allow the plant to thrive (or not) in that particular environment.
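
One common way to put a number on a GxE interaction is to split each observed yield into a grand mean, a plant (genotype) effect, an environment effect, and whatever is left over. The leftover is the interaction term. The sketch below illustrates that additive decomposition on a small table. The variety names, soil types, and yield values are invented for illustration and do not come from any real trial.

```python
# Hypothetical yields (arbitrary units) for two varieties grown on two soils.
# All names and numbers here are made up for illustration.
yields = {
    "variety_A": {"sandy": 6.0, "clay": 4.0},
    "variety_B": {"sandy": 3.0, "clay": 5.0},
}


def interaction_effects(table):
    """Return the GxE term for each (genotype, environment) cell.

    Assumes the additive decomposition
        yield = grand mean + genotype effect + environment effect + GxE
    so that GxE = cell - genotype mean - environment mean + grand mean.
    """
    genotypes = list(table)
    environments = list(next(iter(table.values())))
    n_cells = len(genotypes) * len(environments)

    grand = sum(table[g][e] for g in genotypes for e in environments) / n_cells
    g_mean = {g: sum(table[g].values()) / len(environments) for g in genotypes}
    e_mean = {
        e: sum(table[g][e] for g in genotypes) / len(genotypes)
        for e in environments
    }
    return {
        (g, e): table[g][e] - g_mean[g] - e_mean[e] + grand
        for g in genotypes
        for e in environments
    }


effects = interaction_effects(yields)
for cell, value in effects.items():
    print(cell, round(value, 2))
```

In this made-up table both soils have the same average yield, yet variety_A does better on sandy soil and variety_B does better on clay. The nonzero interaction terms capture exactly that reversal in ranking, which neither the plant effect nor the environment effect alone can explain.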

These interactions are often quite complex, involving multiple genes and multiple facets of the environment. Successfully modeling these effects requires careful consideration of many topics, including the best way to aggregate the weather and soil data, the possibility of regularizing the genetic data or clustering plants by their genetics, and which machine learning algorithms to use.
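
As one possible illustration of the aggregation step, daily weather records for a growing location can be collapsed into a handful of season-level features that a model can consume. The sketch below is only one option among many. The field names (rain_mm, tmax_c), the 30 degree heat threshold, and the sample records are assumptions made for this example and are not taken from any particular dataset.

```python
from statistics import mean


def aggregate_weather(daily_records, heat_threshold_c=30.0):
    """Collapse daily weather records into season-level summary features.

    Each record is assumed to be a dict with hypothetical keys
    "rain_mm" (daily rainfall) and "tmax_c" (daily maximum temperature).
    """
    rain = [day["rain_mm"] for day in daily_records]
    tmax = [day["tmax_c"] for day in daily_records]
    return {
        "total_rain_mm": sum(rain),                                 # water supply
        "mean_tmax_c": mean(tmax),                                  # typical heat
        "hot_days": sum(1 for t in tmax if t > heat_threshold_c),   # heat stress
        "dry_days": sum(1 for r in rain if r == 0),                 # days without rain
    }


# Invented four-day "season" used only to exercise the function.
season = [
    {"rain_mm": 0.0, "tmax_c": 31.0},
    {"rain_mm": 12.0, "tmax_c": 24.0},
    {"rain_mm": 0.0, "tmax_c": 33.0},
    {"rain_mm": 3.0, "tmax_c": 28.0},
]

features = aggregate_weather(season)
print(features)
```

Totals and counts such as these discard the timing of events within the season. Aggregating over shorter windows, for example by month or by growth stage, is a natural refinement when heat or drought matters more at some stages than others.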