The BRIDGE (Bristol Research Initiative for the Dynamic Global Environment) research group, together with collaborators at XTBG (Xishuangbanna Tropical Botanical Garden), uses computer models to research past, present and future climate change, to investigate the causes of these changes, and to assess their impact on plants and biodiversity. We combine plant fossil data with computer models of climate and vegetation. Climate models require high performance computing.
Computer climate models work by simulating the day-to-day variations of the weather, and then producing the climate by statistical analysis of the resulting output. Substantial computing resources are needed both to run the models and to store the “weather” output before it is processed into climate.
HadCM3/HadAM3/HadRM3 Software
We use the Hadley Centre climate model for all of our research. This is a relatively old climate model (developed in 1999), but it has the benefit of being computationally cheap, so we can perform long simulations and multiple sensitivity studies.
The software consists of three major components:
a. The scientific code base is about 500,000 lines of Fortran 77 code.
b. Data output and the interface to the parallelisation library is about 20,000 lines of C.
c. A relatively complex set of Korn shell scripts controls the simulation (about 20,000 lines).
The code requires an MPI library (providing mpirun) to be installed.
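As a rough illustration, a model run might be launched through mpirun as sketched below. The executable name and run directory are hypothetical; in practice the Korn shell control scripts set up the run directory, input files and environment before starting the executable.

```bash
#!/bin/bash
# Minimal sketch of launching the model under MPI (names are hypothetical).
cd /work/hadcm3/run001        # hypothetical run directory prepared by the control scripts
mpirun -np 28 ./model.exe     # 28 cores, as in the configuration table below
```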
In addition to the Hadley Centre model itself, a further code base is required to convert the output data to a common format (netCDF) and then process it to produce the climate data. This is about another 200,000 lines of code in a range of languages (C, Fortran and bash). It also uses the NCO and CDO operators (free netCDF processing software). Almost all of this code is single core.
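For readers unfamiliar with these tools, the commands below illustrate the kind of single-core netCDF processing involved; the NCO and CDO operators are real, but the file names are hypothetical.

```bash
# Hypothetical examples of NCO and CDO usage on netCDF output
ncks -M   weather_001.nc                  # NCO: print the metadata of a netCDF file
ncrcat    weather_*.nc  weather_all.nc    # NCO: concatenate files along the time axis
cdo timmean  weather_all.nc  mean.nc      # CDO: average over the time dimension
```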
To perform a simulation requires the following steps:
- Prepare the “job” to be submitted. This uses a software tool called umui, which is run on a UK university computer. Although this could also be copied to other servers, there is no requirement to do so; furthermore, there is benefit in continuing to use a single version so that there remains one unique database of all simulations. The output from the umui is a set of files which need to be copied onto the HPC machine.
- Prepare the input data for the simulation. This typically uses a range of tools, depending on the user and the scientific motivation, and is normally very quick to run. The input data needs to be copied to the HPC machine.
- Compile the Fortran and C code on the HPC machine.
- Run the resulting executable. This may take weeks or more.
- Depending on file storage on the HPC machine, it is likely that data needs to be transferred from the HPC machine to a machine which has significant storage.
- The data also needs to be converted from a non-standard format to netCDF (a widely used format for geographical data). We often do this at the same time as transferring the data (see the sketch after this list).
- Once the whole simulation is completed, a set of programs and scripts needs to be run to calculate the climate (the average of the weather data and other statistics). The climate output needs to be stored for many years.
- Once the climate output has been produced, it is often acceptable to either delete or archive the weather data. Archiving is best for long simulations where it has taken a lot of effort to produce the data, whereas we sometimes delete smaller/shorter simulations where it would be simple to rerun the model if further analysis is required.
- Scientific analysis and discovery from the climate output. This requires a lot of investigation and the production of many different graphics.
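The sketch below illustrates the transfer, conversion, climate-calculation and archiving steps above. The host names, directories, run identifier and the um2nc conversion step are hypothetical placeholders for our own scripts; rsync, the CDO operators and tar are standard tools.

```bash
#!/bin/bash
# Hypothetical post-processing sketch (paths, host names and the "um2nc"
# converter are placeholders for the group's own conversion code).

RUN=xabcd                                   # hypothetical run identifier

# 1. Transfer raw "weather" output from the HPC machine to a storage machine
rsync -av hpc:/work/$RUN/output/  /storage/$RUN/raw/

# 2. Convert from the model's own format to netCDF
#    (in reality this is done by our conversion code, often during transfer)
for f in /storage/$RUN/raw/*; do
    um2nc "$f" "/storage/$RUN/netcdf/$(basename "$f").nc"   # hypothetical converter
done

# 3. Calculate the climate: merge along time, then compute statistics with CDO
cdo mergetime /storage/$RUN/netcdf/*.nc            /storage/$RUN/climate/${RUN}_all.nc
cdo yearmean  /storage/$RUN/climate/${RUN}_all.nc  /storage/$RUN/climate/${RUN}_yearmean.nc
cdo ymonmean  /storage/$RUN/climate/${RUN}_all.nc  /storage/$RUN/climate/${RUN}_climatology.nc

# 4. Archive (or delete) the raw weather data once the climate files exist
tar -czf /archive/${RUN}_raw.tar.gz -C /storage/$RUN raw
```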
The performance of the model depends on the grid size used to represent the Earth system. It also depends on whether the model represents only the atmosphere or is also coupled to the ocean. If the model includes the ocean, then it is likely that longer simulations are required to allow the ocean to come into equilibrium with the rest of the system.
The following table summarises the computing and disk requirements for a range of different model configurations:
| Model Name | Atmosphere Resolution | Ocean Resolution | Typical number of cores | Run Speed | Typical length of run | Typical Duration of run | Total number of core hours | Typical Amount of raw “weather” output | Typical Amount of climate output |
|---|---|---|---|---|---|---|---|---|---|
| HadCM3 | 96 longitudes, 73 latitudes, 19 levels | 288 longitudes, 144 latitudes, 20 levels | 28 | 100 model years per day | 5000 years | 7 weeks | | | |
| HadCM3L | 96 longitudes, 73 latitudes, 19 levels | 96 longitudes, 73 latitudes, 20 levels | 28 | | 5000 years | | | | |
| HadAM3 | 96 longitudes, 73 latitudes, 19 levels | | 28 | | 100 years | | | | |
| HadAM3-N216 | 432 longitudes, 325 latitudes, 30 levels | | 84 | | 100 years | | | | |
| HadRM3_0.44 | 96 longitudes, 73 latitudes, 19 levels | | 28 | | 100 years | | | | |
| HadRM3_0.11 | 96 longitudes, 73 latitudes, 19 levels | | 28 | | 100 years | | | | |
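As an illustrative worked example based on the HadCM3 row above: at 100 model years per wallclock day, a 5000-year simulation takes about 50 days (roughly 7 weeks), and on 28 cores this corresponds to approximately 28 × 50 × 24 ≈ 33,600 core hours. The other configurations scale similarly with core count, run speed and run length.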
Paleoclimate simulations have a number of uncertainties and to fully quantify this uncertainty requires...