Read and Plot NetCDF Data in Python with xarray and rioxarray



A few months ago I wrote an article about reading NetCDF data with Python. This article still receives many views on my website and on Medium. However, since then, I’ve found an even better way to read NetCDF data with Python.

In the initial article, I used the netCDF4 Python package to access data from NetCDF files. Recently, I’ve started using rioxarray to read NetCDF data into xarray format. If you’re not familiar with the xarray Python package, it’s basically a wrapper (for lack of a better term) around numpy arrays that attaches labels and metadata, such as dimension names, coordinates, and attributes, to the arrays (more on this later with an example).
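To make the "numpy array plus metadata" idea concrete, here is a minimal sketch built from made-up values rather than a real file. The dimension names, coordinate values, variable name, and units attribute are all assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr

# Made-up values standing in for two days of precipitation on a 2 x 3 grid.
values = np.array([
    [[0.0, 1.2, 3.4], [0.0, 0.0, 2.1]],
    [[5.0, 0.3, 0.0], [1.1, 0.0, 0.0]],
])

# Wrap the array with dimension names, coordinates, and attributes.
precip = xr.DataArray(
    values,
    dims=("day", "lat", "lon"),
    coords={
        "day": np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[ns]"),
        "lat": [45.0, 44.0],
        "lon": [-117.0, -116.0, -115.0],
    },
    name="precipitation_amount",  # hypothetical variable name
    attrs={"units": "mm"},        # hypothetical units
)

# The plain numpy array is still there underneath.
raw = precip.values

# Selection can use coordinate labels instead of integer positions.
cell_series = precip.sel(lat=45.0, lon=-117.0)
print(cell_series.values, precip.attrs["units"])
```

The label-based `sel` call is the main convenience here: the code never needs to know which row or column holds latitude 45.0.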

While the netCDF4 package is a complete, powerful package for interacting with NetCDF data, rioxarray provides simpler, more concise data access that most users will prefer.

Why NetCDF

Those just getting started with NetCDF may find it a slightly confusing data format. The current version of the format (NetCDF-4) is built on the HDF5 file format, which allows metadata and different data types to be stored together in the same file. This type of format is very powerful because the data carries its own detailed documentation, and large volumes of data can be stored in an easily accessible way.

The NetCDF file format lends itself to climate data especially well. Datasets of climatic data, like precipitation and temperature, usually have repeated values at each recorded location at equally spaced time intervals. With NetCDF, the spatial structure of the data can be defined once, and then new data layers are added to the existing spatial definition.
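The "define the grid once, then add layers" idea can be sketched in memory with xarray. This is only an illustration with made-up numbers: the helper `make_layer`, the grid coordinates, and the variable name are hypothetical, and nothing here writes an actual NetCDF file.

```python
import numpy as np
import xarray as xr

# The spatial definition is written down once and shared by every layer.
LAT = [45.0, 44.0]
LON = [-117.0, -116.0, -115.0]


def make_layer(day, grid):
    """Wrap one day's 2-D grid as a single-day layer on the shared lat/lon grid."""
    data = np.asarray(grid, dtype=float)[np.newaxis, :, :]
    return xr.DataArray(
        data,
        dims=("day", "lat", "lon"),
        coords={
            "day": np.array([day], dtype="datetime64[ns]"),
            "lat": LAT,
            "lon": LON,
        },
        name="precipitation_amount",  # hypothetical variable name
    )


day1 = make_layer("2020-01-01", [[0.0, 1.2, 3.4], [0.0, 0.0, 2.1]])
day2 = make_layer("2020-01-02", [[5.0, 0.3, 0.0], [1.1, 0.0, 0.0]])

# Adding a new day only extends the time dimension; lat and lon are unchanged.
stack = xr.concat([day1, day2], dim="day")
print(dict(stack.sizes))
```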

You’ll learn more about this as we go through some examples.

Download NetCDF Data

For this example, I’ll be using 2020 daily precipitation data from the GRIDMET dataset. You can download the same data I’m using at the following link: https://www.northwestknowledge.net/metdata/data/pr_2020.nc.

Once downloaded, make note of the file location so you can read it into your Python environment later.

Install rioxarray

I recommend using Anaconda (or miniconda) for your Python environment and package management. I find it just makes things easier. This is especially true if you are newer to Python. For this tutorial, I created a fresh conda environment (conda create --name myenv python=3.9) then installed the required packages.

rioxarray is available from Anaconda’s conda-forge channel. It can be easily installed with the following command.

conda install -c conda-forge rioxarray

Note: If you are using the suggested data for this tutorial, you may get the following error when you try to read the NetCDF file (depending on your Python environment and operating system).

ValueError: unable to decode time units 'days since 1900-01-01 00:00:00' with "calendar 'gregorian'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.

This is easily fixed by installing the cftime package, as indicated in the error message. With Anaconda, install it with the following command.

conda install -c conda-forge cftime
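The error message also mentions opening the dataset with decode_times=False. If you go that route, the time coordinate is left as plain numbers. As a rough sketch, assuming those numbers count days since 1900-01-01 (as the units string in the error says), they can be converted to dates by hand with numpy. The function name and the sample values below are hypothetical.

```python
import numpy as np

# Reference date taken from the units string in the error message.
EPOCH = np.datetime64("1900-01-01")


def days_since_epoch_to_dates(day_numbers, epoch=EPOCH):
    """Convert 'days since <epoch>' numbers into numpy datetime64 dates."""
    # Cast to whole days first so float-valued coordinates are handled too.
    whole_days = np.asarray(day_numbers).astype("int64")
    return epoch + whole_days.astype("timedelta64[D]")


# Made-up raw values standing in for an undecoded time coordinate.
raw_days = [43829.0, 43830.0, 43831.0]
print(days_since_epoch_to_dates(raw_days))
```

Installing cftime remains the simpler fix, since xarray then handles the decoding for you.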

Install matplotlib

We’ll also install the matplotlib package, which will give us the ability to seamlessly plot the NetCDF data we read. We can install matplotlib in the same manner as the other packages.

conda install -c conda-forge matplotlib
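Before moving on, it can be worth confirming that the packages are importable from the environment you plan to use. The snippet below is one possible check using only the standard library. The package list simply mirrors the installs in this tutorial (plus xarray, which rioxarray depends on), and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Packages this tutorial relies on.
REQUIRED = ("xarray", "rioxarray", "cftime", "matplotlib")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```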

Read NetCDF with rioxarray

Here, I’m inserting a Jupyter notebook (in HTML format) to show how to use rioxarray and xarray to read and plot NetCDF data (that’s why the formatting will be a little different).
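As a rough sketch of this kind of workflow (not the notebook's exact code), the script below opens the file with xarray, uses rioxarray for the spatial accessor, and plots one daily map alongside a spatially averaged time series. The file path, the variable name `precipitation_amount`, the time dimension name `day`, and the example date are assumptions; check them against your own file by printing the dataset first.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import xarray as xr

NC_PATH = Path("pr_2020.nc")       # assumed download location
VAR_NAME = "precipitation_amount"  # assumed variable name in the file
TIME_DIM = "day"                   # assumed name of the time dimension


def open_precip(path=NC_PATH, var_name=VAR_NAME):
    """Open the NetCDF file and return the precipitation DataArray."""
    # Imported here so the helpers below also work without rioxarray installed.
    import rioxarray  # noqa: F401  (importing registers the .rio accessor)

    ds = xr.open_dataset(path, decode_coords="all")
    return ds[var_name]


def daily_map(da, date, time_dim=TIME_DIM):
    """Return the 2-D grid for a single date."""
    return da.sel({time_dim: date})


def spatial_mean_series(da, time_dim=TIME_DIM):
    """Average over every dimension except time, giving one value per day."""
    space_dims = [dim for dim in da.dims if dim != time_dim]
    return da.mean(dim=space_dims)


# Only runs once the file has been downloaded to NC_PATH.
if NC_PATH.exists():
    precip = open_precip()
    print(precip.rio.crs)  # coordinate reference system, if the file defines one

    fig, (ax_map, ax_ts) = plt.subplots(1, 2, figsize=(12, 4))
    daily_map(precip, "2020-06-01").plot(ax=ax_map)
    spatial_mean_series(precip).plot(ax=ax_ts)
    plt.show()
```

In a Jupyter notebook the figures render inline, so the `plt.show()` call is optional there.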


Conclusion

As you can see, using rioxarray to read NetCDF data gives you access to the power of xarray objects and numpy. This makes it nearly seamless to plot different slices of a NetCDF file as maps and time series plots. It also makes it easy to calculate basic statistics and plot them through time and space as desired.

At first, I was reluctant to learn rioxarray and xarray, but these powerful packages are definitely worth the time invested in learning them.

Konrad Hafen

Konrad is a natural resources scientist. He develops models and analysis workflows to predict and evaluate changes to landscapes and water resources.
