Syngenta Crop Challenge in Analytics

The population of Earth is growing daily and our world is running out of land needed to produce food. Meanwhile, the crops farmers plant face escalating challenges due to increasingly variable growing conditions and climate change.

Accelerating innovation in a changing world

Innovation is driven by passion. At Syngenta, it’s a passion to help farmers grow crops successfully year after year, increasing productivity, producing higher-quality crops and improving the sustainability of agriculture. Syngenta’s scientists are focused on accelerating innovation in plant science. Their goal is to deliver consistent, reliable, high yields to farmers despite ever-changing environments caused by variable weather conditions.

Plant breeders work to maximize the amount of food we gain from crops by breeding plants with the most resilient, highest-yielding genetics, and then providing the seeds from those efforts to farmers around the world.

With the advent of COVID-19, securing the world’s food supply has become even more critical. Planet Earth adds nearly 200,000 new mouths to feed every day. Yet our world is running out of cropland, the land needed to produce food. We’ll add 2 billion more people by the year 2050, but we’re currently using our arable land and water 50 percent faster than the planet can sustain. At the same time, the crops farmers plant face an unprecedented set of obstacles due to increasingly challenging growing conditions driven by climate change.

How will we be able to grow enough food to meet world demand?

We’ve proven that data-driven strategies can help our industry breed more efficient, better seeds that require fewer resources and are adaptable to more diverse and variable environments. Developing models and analytical approaches that identify patterns and insights in our experimental data can help breeders more accurately choose seeds that increase the productivity of the crops we plant within shortened breeding cycles, and ultimately, help address the growing global food demand.


Commercial corn is processed into multiple food and industrial products and is widely known as one of the world’s most important crops. However, it typically requires many years of in-field testing to deliver new products to market. Recently, innovative and novel technologies have shortened the time required to develop new corn hybrids—new products that can deliver higher-yielding, better-adapted seed options for growers at a faster pace. These promising technologies decrease the amount of time needed to create the parents of commercial hybrids. Commercial hybrids are created by crossing two parents together, so by reducing the amount of time to create these parents, scientists can deliver novel products to growers years faster. By continuously optimizing our product development system with these promising technologies, scientists can ensure increased crop yields for global food security.

With the increased rate of producing parental lines come new challenges: increased output (the number of harvested ears) can exceed storage capacity. Our year-round breeding process could be improved by optimizing planting schedules to achieve a consistent output, measured as a weekly harvest quantity (number of ears).

Erratic weekly harvest quantities create logistical and productivity issues. How can we optimally schedule the planting of our seeds to ensure that when ears are harvested, facilities are not over capacity, and that there is a consistent number of ears each week? This issue is the basis for the 2021 Syngenta Crop Challenge in Analytics. Figure 1 provides a diagram of this process.
Figure 1: Process diagram

Can an optimal scheduling model be created to ensure consistent weekly harvest quantities that are below the maximum capacity? Figure 2 illustrates this problem.

Weekly harvest Quantity
Figure 2: Illustration of research question

The objective is to schedule the planting date for each population so that the capacity constraints are met and the weekly harvest quantity is consistent. The following is the desired objective function. In summary, we desire an optimization model that schedules when each seed population should be planted so that, when the ears are harvested, we do not exceed holding capacity.
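The two goals stated above (stay under capacity, keep weekly output even) can be checked for any candidate schedule once its weekly harvest totals are known. The sketch below is illustrative only and is not the challenge's official objective function: the function name, the choice of mean absolute deviation as the consistency measure, and the decision to ignore weeks with no harvest are all assumptions.

```python
from statistics import mean


def evaluate_weekly_harvest(weekly_quantities, capacity):
    """Summarize how well a weekly harvest profile meets the two goals.

    weekly_quantities: ears harvested in each week (hypothetical input).
    capacity: maximum ears the facility can hold in a week (hypothetical input).
    """
    # Assumption: only weeks with a harvest count toward consistency.
    active = [q for q in weekly_quantities if q > 0]
    if not active:
        return {"over_capacity_weeks": 0, "mean_abs_deviation": 0.0, "max_week": 0}

    # Number of weeks that break the capacity constraint.
    over_capacity_weeks = sum(1 for q in active if q > capacity)

    # Mean absolute deviation from the average week; 0 means perfectly even.
    average = mean(active)
    mean_abs_deviation = mean(abs(q - average) for q in active)

    return {
        "over_capacity_weeks": over_capacity_weeks,
        "mean_abs_deviation": mean_abs_deviation,
        "max_week": max(active),
    }
```

A schedule with zero over-capacity weeks and a small deviation would be preferred under this particular measure; other consistency measures are equally plausible.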

The entries will be evaluated based on:
  • Creation of a planting schedule that:
    1. Plants all populations within their available time window,
    2. Ensures maximum capacity is not exceeded, and
    3. Provides consistent weekly harvest quantity.
Additionally, observing the standards for academic publication, entries should include a written report with the following:
  • Quantitative results to justify your modeling techniques
  • A clear description of the methodology and theory
  • References or citations as appropriate

Additional Notes

Submissions must be in MS-Word or LaTeX format using the appropriate submission template. You can download the submission template here (.zip)


You are provided with the following datasets described below.
  1. Dataset #1: This dataset describes the input variables for an optimization model as well as the number of growing degree units (GDUs) in Celsius needed for harvest. Succinctly, GDUs are a measure of heat accumulation and are used to estimate specific stages of a plant’s growth cycle. In our dataset, for a given population the “required_gdus” is the number of heat units required in order for the corn population to be ready for harvesting.
  2. Dataset #2: This dataset describes the growing degree units in Celsius accumulated for each day for sites 0 and 1 over the last 10 years. Note that, because of how GDUs are calculated, the GDUs for a given calendar day differ from year to year. The participant will need to determine the best way to make use of this historical dataset.
  3. Dataset #3 (Output): This dataset is used for evaluation of the optimization model. This is where the planting date will be entered.
  4. Dataset #4 (Output): This dataset is used for evaluation of the optimization model. This is where the weekly harvest quantity and recommended capacity for scenario 2 will be entered.
Key for Datasets: These tables provide the meaning of each variable in the four datasets.

Dataset #1 Description
Population                   Seed population identifier
site                         Planting site either 0 or 1
original_planting_date       Actual planting date of the population
early_planting_date          Earliest the population could have been planted
late_planting_date           Latest the population could have been planted
required_gdus                Number of growing degree units needed for harvest
scenario_1_harvest_quantity  Harvest quantity (number of ears) for each population in scenario 1. The value in this column must be used as the harvest quantity, not just a percentage of this value.
scenario_2_harvest_quantity  Harvest quantity (number of ears) for each population in scenario 2. The value in this column must be used as the harvest quantity, not just a percentage of this value.

Dataset #2 Description
date    Calendar date
site_0  GDUs accumulated for each calendar day at site_0
site_1  GDUs accumulated for each calendar day at site_1
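Since required_gdus is the heat a population must accumulate before harvest, a harvest date can be estimated by summing a site's daily GDUs from the planting date onward. The sketch below assumes accumulation starts on the planting day itself and that daily_gdus is a hypothetical mapping from calendar date to that day's GDUs for one site; how to turn the 10 years of history into such a mapping is left open, as the challenge states.

```python
from datetime import date, timedelta


def estimate_harvest_date(planting_date, required_gdus, daily_gdus):
    """Return the first date on which accumulated GDUs reach required_gdus.

    daily_gdus maps a calendar date to the GDUs accumulated on that day at
    one site (for example, values derived from the site_0 column).
    Returns None if the requirement is not met within the available data.
    """
    total = 0.0
    day = planting_date
    while day in daily_gdus:
        total += daily_gdus[day]  # assumption: the planting day counts
        if total >= required_gdus:
            return day
        day += timedelta(days=1)
    return None


# Usage with made-up data: 10 GDUs on every day of January 2020.
example_gdus = {date(2020, 1, 1) + timedelta(days=i): 10.0 for i in range(31)}
print(estimate_harvest_date(date(2020, 1, 5), 95, example_gdus))  # 2020-01-14
```

Returning None rather than guessing makes it explicit when the available GDU data ends before a population is ready.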

Dataset #3 Description (Planting Schedule Output):
population     Population of seed
scenario       Scenario indicator
site           Planting site either 0 or 1
planting_date  Planting date for the given population – to be completed by participant

Dataset #4 Description (Harvest Quantity Output):
scenario          Scenario indicator
site              Planting site either 0 or 1
week              Week index starting from the first week of January 2020.
harvest_quantity  Harvest quantity for the given week – to be completed by participant
capacity          Capacity for scenario 1 – to be completed by participant for scenario 2
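Filling in harvest_quantity amounts to grouping each population's harvest by site and week and summing the ears. The sketch below is one way to do that under a loudly stated assumption: week 1 is taken to be the seven days starting January 1, 2020, which may not match the convention used in the official output file. The function names and the (site, harvest_date, quantity) input shape are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Assumption: week 1 begins on January 1, 2020 and each week spans 7 days.
WEEK_ONE_START = date(2020, 1, 1)


def week_index(day):
    """Map a calendar date to a 1-based week index."""
    return (day - WEEK_ONE_START).days // 7 + 1


def weekly_harvest(harvests):
    """Sum harvested ears per (site, week).

    harvests: iterable of (site, harvest_date, quantity) tuples.
    """
    totals = defaultdict(int)
    for site, harvest_date, quantity in harvests:
        totals[(site, week_index(harvest_date))] += quantity
    return dict(totals)
```

Each key of the returned dictionary corresponds to one (site, week) row of the output, with the summed value as its harvest_quantity.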

Figure 3: Optimization model representation


January 20, 2021
Deadline for Submissions

Week of March 15, 2021
Finalists Announced

April 11-13, 2021
Finalist presentations.
Winners announced.







Two Q&A webinars, open to all participants, will be available: the first in October and the second in early December. Details, including archives, are available to view HERE.