
Create NetCDF Files with Python

Gridded spatial data are commonly stored in NetCDF files. This is especially true for climate data. NetCDF files offer more flexibility and transparency than some traditional raster formats by supporting multiple variables and detailed metadata. Because of the metadata and file structure, however, NetCDF files can be more difficult to access than traditional raster formats. This article addresses the basics of creating a NetCDF file and writing data values in Python. I previously wrote about accessing metadata and variables from a NetCDF file with Python.

Create a NetCDF Dataset

Import the netCDF4 and numpy modules. Then define a file name with the .nc or .nc4 extension. Call Dataset and specify write mode with 'w' to create the NetCDF file. The NetCDF file is now established and can be written to. When finished, be sure to call close() on the dataset.

Add Dimensions

NetCDF files generally contain three dimensions: time, width (x or longitude), and height (y or latitude). The width and height dimensions are usually fixed in size. The time dimension is dynamic (it can grow), which allows time steps to be added to the file. Dynamic, or growing, dimensions are termed ‘unlimited’ in NetCDF.

Unlimited dimensions can be appended to and are specified with a size of None. We’ll use an unlimited dimension for time so that it can grow. In other words, we can keep appending time steps to the file. Also create latitude and longitude dimensions. lat and lon define the geographical extent and dimensions of our file. Here we’re just creating dimensions of size 10, which means the resulting grid will have just 10 rows and 10 columns. The coordinate values (geographic positions) of lat and lon are specified as variables. In fact, each dimension will have a corresponding variable.

Add NetCDF Variables

Variables contain the actual data of the file. They also define the grid the data are referenced to. This file will contain four variables. Latitude and longitude define the grid values and data location. times defines the layers in the data file. value contains the actual data. To create a variable, specify the variable name, data type, and shape. Shape is defined as a tuple by referencing dimension names. Additional metadata are also specified. Here we define the units of value as Unknown.

Assign Latitude and Longitude Values

Create a simple grid with grid cells that measure 1 degree by 1 degree with numpy.arange. Assign y values to lats and x values to lons. Now we just need to assign data values that match the dimensions of the grid we’ve created.

Assign NetCDF Data Values

Add data for two time steps to the value variable that we created. Each time step is represented by a 2D numpy array. The size of each array must match the lat and lon dimensions. Create an array of random numbers ranging from 0 to 100 with numpy.random. This array contains data for the first time step.
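A small sketch of the first array, using numpy.random.uniform as one possible way to draw values between 0 and 100. The variable name random_step is hypothetical.

```python
import numpy as np

# 10 x 10 matches the lat and lon dimensions (rows = lat, columns = lon).
random_step = np.random.uniform(0, 100, size=(10, 10))
print(random_step.shape)
```

Values are drawn from the half-open interval [0, 100), so 100 itself is never produced.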

Next, create an array with values that increase linearly from 0.5 to 5.0. To do this, create two 1D arrays with numpy.linspace and add them together across opposing axes. Close ds after you’ve created the arrays and assigned them to value. Your NetCDF file is now saved and ready. Open the file in QGIS to visualize, or plot the arrays in Python. Images of the result are shown below.

Random values
Linear interpolation

Conclusion

Once you understand the basic structure of a NetCDF file, it can be a very useful way to work with spatial data. In this example we created a file with only one data variable, but multiple variables can be added to a single file, potentially reducing the number of files required to manage your data. One of the most useful aspects of NetCDF files is the documentation and metadata that clarify the data they contain.
