The simple idea of calculating hourly total precipitation data from MPAS is that, start by calculating the differences in the rainc and rainnc fields, summing them up, and interpolate onto a lon-lat grid. (Assuming the diag files are output hourly)
3C's: Calculate diffs, Concatenate and Convert
1. First, we do the diffs using cdo.
Added parallel running support using python's subprocess.
In <project>/public/precip/call_precip.py:
#!/usr/bin/env python3
import sys
from datetime import datetime, timedelta
import subprocess
from concurrent.futures import ProcessPoolExecutor, as_completed
import os
def run_cdo_task(t1, t2):
file1 = f"diag.{t1.strftime('%Y-%m-%d_%H')}.00.00.nc"
file2 = f"diag.{t2.strftime('%Y-%m-%d_%H')}.00.00.nc"
outfile = f"diff_{t2.strftime('%Y-%m-%d_%H')}.nc"
print(f"[{os.getpid()}] Processing {file1} -> {file2} => {outfile}")
cmd = [
"cdo", "-b", "F32", "-selname,rainc,rainnc",
"-sub", file2, file1, outfile
]
try:
subprocess.run(cmd, check=True)
return f"✓ {outfile} done"
except subprocess.CalledProcessError as e:
return f"✗ Failed: {file1} / {file2} ({e})"
def generate_task_times(start_str, end_str):
start = datetime.strptime(start_str, "%Y-%m-%d_%H")
end = datetime.strptime(end_str, "%Y-%m-%d_%H")
tasks = []
current = start
while current + timedelta(hours=1) <= end:
tasks.append((current, current + timedelta(hours=1)))
current += timedelta(hours=1)
return tasks
def main(start_str, end_str, max_workers=None):
tasks = generate_task_times(start_str, end_str)
with ProcessPoolExecutor(max_workers=max_workers) as executor:
future_to_time = {executor.submit(run_cdo_task, t1, t2): (t1, t2) for t1, t2 in tasks}
for future in as_completed(future_to_time):
print(future.result())
if __name__ == "__main__":
if len(sys.argv) < 3:
print("Usage: ./calc_precip.py YYYY-MM-DD_HH YYYY-MM-DD_HH [max_workers]")
sys.exit(1)
start_time = sys.argv[1]
end_time = sys.argv[2]
max_workers = int(sys.argv[3]) if len(sys.argv) > 3 else None
main(start_time, end_time, max_workers)
After this, we also need to
2. concat them using ncrcat tool in nco.
I've prepared a useful script to look for all the files and concat them.
In <project>/public/precip/concat_precip_v2.py:
#!/usr/bin/env python3
import os
import re
import argparse
from datetime import datetime, timedelta
import subprocess
parser = argparse.ArgumentParser(description="Concatenate NetCDF files with 1-hour intervals using ncrcat.")
parser.add_argument("output", help="Name of the output NetCDF file")
args = parser.parse_args()
output_file = args.output
pattern = re.compile(r"diff_(\d{4}-\d{2}-\d{2})_(\d{2})\.nc") # matches files like diff_2000-01-01_00.nc
files = [f for f in os.listdir('.') if pattern.match(f)]
file_datetimes = []
file_map = {}
for f in files:
match = pattern.match(f)
if match:
date_str, hour_str = match.groups()
dt = datetime.strptime(f"{date_str} {hour_str}", "%Y-%m-%d %H")
file_datetimes.append(dt)
file_map[dt] = f
if not file_datetimes:
print("❌ No matching files found.")
exit(1)
file_datetimes.sort()
missing = []
for i in range(1, len(file_datetimes)):
expected = file_datetimes[i - 1] + timedelta(hours=1)
if file_datetimes[i] != expected:
dt = expected
while dt < file_datetimes[i]:
missing.append(dt)
dt += timedelta(hours=1)
if missing:
print("⚠️ Warning: Missing files for the following datetimes:")
for m in missing:
print(" ", m.strftime("%Y-%m-%d %H:%M"))
else:
print("✅ All hours are continuous.")
input_files = [file_map[dt] for dt in file_datetimes]
try:
subprocess.run(["ncrcat", "-O"] + input_files + [output_file], check=True)
except subprocess.CalledProcessError:
print("❌ ncrcat failed. Make sure NCO is installed and files are valid.")
exit(2)
# Report
print(f"\nStart: {file_datetimes[0]}")
print(f"End: {file_datetimes[-1]}")
print(f"Output file written: {output_file}")
The job scripts to run for 1. and 2., <project>/all_calc_precip.sh and <project>/all_concat_precip.sh are generated using <notebooks>/.../figs/precip/Batch calc precip.ipynb. These are trivial and contain experiment information, thus are not disclosed.
3. convert_mpas
After Calculating the diffs and Concatenating the process, step 3 is to Convert onto lon-lat grid. Use the script <project>/precip/convert-latlon.sh:
#!/bin/bash
# Note: not to be submitted. Load the modules first and run it interactively.
for file in *precip.nc; do
# Skip if no .nc file exists
[ -e "$file" ] || continue
echo "Processing $file..."
convert_mpas ../EXAMPLE.static.nc "$file"
# Check if latlon.nc was created
if [ -f latlon.nc ]; then
# Extract base name without extension
base="${file%.nc}"
# Rename latlon.nc to something like originalfilename_latlon.nc
mv latlon.nc "${base}_latlon.nc"
echo "Output saved as ${base}_latlon.nc"
else
echo "latlon.nc not found for $file"
fi
done
Since convert_mpas is called, don't forget to put copies of include_fields and target_domain in the same directory, if you need them.
Now you should get the hourly data of tp.