ACCESS-S1 Calibrated 5km Gridded Hindcast Archive


The archive provides hindcasts on a 5km grid over Australia (the same grid as the AWAP data; Jones et al. 2009). The hindcasts have been calibrated using the quantile-quantile mapping approach (see Appendix C for full details of the method). Because of this calibration, the hindcasts can be used to generate products or to drive application tools directly, without any further model bias correction.


The archive can be accessed in two ways: directly on the NCI infrastructure, or through the OPeNDAP server (see "Accessing the data").

Archive format

The data is in NetCDF4 format. Each file contains the output at all grid points for one ensemble member, one start date and one variable (e.g. rainfall), and includes all available lead times. For example, a daily file contains a value for each day of the full forecast (217 days), and a monthly mean file contains a value for each month of the forecast (7 months).

The location and name of each output file is as follows:

Descriptor  Description                      Options
system      The component of the model       atmos - atmospheric fields
var         Variable                         pr - rainfall
                                             tasmin - minimum daily surface air temperature
                                             tasmax - maximum daily surface air temperature
                                             vprp_09 - 9am vapour pressure
                                             vprp_15 - 3pm vapour pressure
                                             wind_speed - surface wind speed
                                             rsds - incoming solar radiation
                                             evap - daily evaporation
                                             (full details of the variables are given in Appendix A)
period      The time averaging period        daily - daily data
            of the data                      monthly - monthly average of the daily values
ensemble    Ensemble number                  e01 ... e11 - ensembles 1 to 11
                                             emn - ensemble mean
type        Describes the type of field;     da5 - daily mean atmospheric (containing values each day)
            derived from period and system   ma5 - monthly mean atmospheric (containing values each month)
YYYYMMDD    The start date of the hindcast   Hindcasts are available over the period 1990-2012
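
The relationship between the descriptors can be sketched in a few lines of Python. This is an illustrative sketch only: the names `TYPE_CODES`, `VARIABLES`, `ENSEMBLES` and `type_code` are hypothetical and not part of the archive, and the sketch does not attempt to build file paths.

```python
# Hypothetical lookup: (system, period) -> type code, as listed in the descriptor table.
TYPE_CODES = {
    ("atmos", "daily"): "da5",    # daily mean atmospheric
    ("atmos", "monthly"): "ma5",  # monthly mean atmospheric
}

# Valid options for the "var" and "ensemble" descriptors.
VARIABLES = {"pr", "tasmin", "tasmax", "vprp_09", "vprp_15",
             "wind_speed", "rsds", "evap"}
ENSEMBLES = {f"e{n:02d}" for n in range(1, 12)} | {"emn"}  # e01..e11 plus ensemble mean


def type_code(system: str, period: str) -> str:
    """Return the type code for a system/period pair, or raise ValueError."""
    try:
        return TYPE_CODES[(system, period)]
    except KeyError:
        raise ValueError(f"no type code for system={system!r}, period={period!r}") from None


print(type_code("atmos", "monthly"))
```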

Note that the monthly rainfall is given as the average daily rainfall for the month, not the monthly total.
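
If a monthly total is needed, the monthly mean of the daily values can be multiplied by the number of days in the month. The sketch below assumes a standard Gregorian calendar (the calendar used by the files is not stated here and should be checked in the file metadata); the function name `monthly_total_mm` is hypothetical.

```python
import calendar


def monthly_total_mm(mean_daily_mm: float, year: int, month: int) -> float:
    """Convert a monthly mean of daily rainfall (mm/day) to a monthly total (mm).

    Assumes a Gregorian calendar, including leap years.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return mean_daily_mm * days_in_month


# February 2000 has 29 days, so a mean of 2.0 mm/day gives 58.0 mm.
print(monthly_total_mm(2.0, 2000, 2))
```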

Climatologies corresponding to each file are also available with the format:


The climatology is the average of each hindcast for a given start day and month (e.g. 1st Jan) over the hindcast period (e.g. 1990-2012) and averaged across all 11 ensemble members. This climatology can be used to create anomalies by subtracting the climatology from the forecast.
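
The arithmetic can be illustrated with a small Python sketch on toy numbers. It assumes each hindcast is a list with one value per lead time at a single grid point; the function names and the toy data are hypothetical, and the archive already supplies the climatology files, so this only shows how the quantities relate.

```python
from statistics import fmean


def climatology(hindcasts):
    """Average over all years and ensemble members, separately for each lead time.

    hindcasts: dict mapping (year, member) -> list of values, one per lead time,
    all for the same start day and month.
    """
    series = list(hindcasts.values())
    n_leads = len(series[0])
    return [fmean(s[lead] for s in series) for lead in range(n_leads)]


def anomaly(forecast, clim):
    """Subtract the climatology from a forecast, lead time by lead time."""
    return [f - c for f, c in zip(forecast, clim)]


# Toy data: 2 years x 2 members, 2 lead times (the archive has 11 members).
toy = {
    (1990, "e01"): [1.0, 2.0],
    (1990, "e02"): [3.0, 4.0],
    (1991, "e01"): [5.0, 6.0],
    (1991, "e02"): [7.0, 8.0],
}
clim = climatology(toy)
print(clim)                        # [4.0, 5.0]
print(anomaly([6.0, 4.0], clim))   # [2.0, -1.0]
```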

The data is available on the AWAP data grid (see appendix B for full grid details).

Accessing the data

The data is available directly on the NCI infrastructure in NetCDF format. The data can also be accessed through the OPeNDAP server, which allows small subsets of the data to be accessed efficiently. The OPeNDAP server can be accessed directly using a range of tools, including Python, R and Matlab.

Appendix A: Variables available

Atmospheric Variables (including land surface)

NetCDF name  Description                             Units
pr           Daily rainfall                          mm/day
tasmax       Maximum daily surface temperature       °C
tasmin       Minimum daily surface temperature       °C
rsds         Daily solar radiation                   W/m²
wind_speed   Daily average near surface wind speed   m/s
vprp_09      9am surface vapour pressure             hPa
vprp_15      3pm surface vapour pressure             hPa
evap         Daily evaporation                       mm/day

Appendix B: Spatial Grids

Domain covering Australia (44.5°S - 10°S; 112°E - 156.25°E) on the AWAP analyses grid (Jones et al. 2009).

Latitude: 691 points, 0.05° grid spacing, latitude of first grid box (centre) = 44.5°S
Longitude: 886 points, 0.05° grid spacing, longitude of first grid box (centre) = 112°E
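
When extracting a small subset (for example through OPeNDAP), it is useful to convert a location to grid indices. The sketch below assumes the latitude axis is stored from south to north starting at 44.5°S and the longitude axis from west to east starting at 112°E; the axis order in the actual files should be checked. The function name `nearest_index` is hypothetical.

```python
LAT0, LON0 = -44.5, 112.0   # centre of the first grid box (degrees north, degrees east)
STEP = 0.05                 # grid spacing in degrees
NLAT, NLON = 691, 886       # number of grid points in each direction


def nearest_index(lat: float, lon: float) -> tuple:
    """Return (lat_index, lon_index) of the grid box centre nearest to a point.

    Assumes index 0 is the southernmost latitude and westernmost longitude.
    """
    i = round((lat - LAT0) / STEP)
    j = round((lon - LON0) / STEP)
    if not (0 <= i < NLAT and 0 <= j < NLON):
        raise ValueError(f"point ({lat}, {lon}) is outside the grid domain")
    return i, j


print(nearest_index(-44.5, 112.0))     # first grid box: (0, 0)
print(nearest_index(-37.81, 144.96))   # a point near Melbourne
```

The grid dimensions are consistent with the stated domain: 690 steps of 0.05° from 44.5°S reach 10°S, and 885 steps from 112°E reach 156.25°E.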

Appendix C: Method for quantile-quantile matching

This appendix gives a brief outline of the method, to be used as a reference until a research paper is written.

The calibration of the hindcast data is carried out as follows:
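To calibrate the data for a given variable, location, lead time and year, the quantile of the raw value within the model training data is found, and the observed value at the same quantile is taken. This is the calibrated value.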

If the data to calibrate lies outside the range of the model training data (i.e. it is a new maximum or minimum), then we extrapolate using the following formulae:
calibrated value = raw value * (observed max / model max)   (raw value above the model maximum)
calibrated value = raw value * (observed min / model min)   (raw value below the model minimum)

This is done for all locations and all lead times independently for each variable.
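
As an illustration, quantile-quantile matching with the extrapolation formulae above can be sketched in Python for a single variable, location and lead time. This is a minimal sketch, not the archive's implementation: it assumes empirical quantiles with linear interpolation between sorted training values, and it does not address how the training sample is chosen for each year. The function name `quantile_map` and the toy data are hypothetical.

```python
from bisect import bisect_right


def quantile_map(raw, model_train, obs_train):
    """Map a raw model value onto the observed distribution.

    model_train, obs_train: training samples (at least two values each).
    The ratio formulae are undefined if the model maximum or minimum is zero.
    """
    model = sorted(model_train)
    obs = sorted(obs_train)
    if len(model) < 2 or len(obs) < 2:
        raise ValueError("need at least two training values in each sample")

    # Outside the model training range: extrapolate with the ratio formulae.
    if raw > model[-1]:
        return raw * (obs[-1] / model[-1])
    if raw < model[0]:
        return raw * (obs[0] / model[0])

    # Fractional rank of the raw value within the sorted model sample.
    k = bisect_right(model, raw) - 1          # index of last value <= raw
    if k == len(model) - 1:
        rank = float(k)
    else:
        rank = k + (raw - model[k]) / (model[k + 1] - model[k])
    q = rank / (len(model) - 1)               # quantile in [0, 1]

    # Observed value at the same quantile, by linear interpolation.
    p = q * (len(obs) - 1)
    i = int(p)
    if i >= len(obs) - 1:
        return obs[-1]
    return obs[i] + (p - i) * (obs[i + 1] - obs[i])


model_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_sample = [2.0, 4.0, 6.0, 8.0, 10.0]
print(quantile_map(2.5, model_sample, obs_sample))  # inside the range: 5.0
print(quantile_map(6.0, model_sample, obs_sample))  # new maximum: 12.0
```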