To-do List
This page lists the parts of the code under development, the goals for implementing new features, bugs to be fixed, and elements to optimize. Expect the sections below to change constantly. Once a particular point is resolved, it should be deleted from this document and moved to a relevant location.
Items
Features
Regularization
Update examples that I use in the Analysis and Example notebooks to use model runs that utilized regularizers
Generating input files
Can I do this not by year? I would like to be able to specify the start and end date, to allow for more granular control of what time span the input files cover.
I made an attempt at this in the defunct / dead branch refactor_input0
I started by changing the input files, and then couldn’t get stage 2 to work, and couldn’t figure out where I went wrong
That branch still has a bunch of useful bits of code, but should not be used in its entirety
Here’s the plan:
Make a new branch in which to test this functionality
Make a new function in load_input called load_input()
Base this off get_npy_from_netcdf(), but implement it using date ranges
Use arguments start_date and end_date instead of year
In HPC.training.make_predictions(), use load_input()
Keep the same structure, that is, going year-by-year
But now I don’t just specify the year; I specify the start and end dates
If that works, then try running the predictions over the whole verification period (2019 and 2020) using load_input() with different start and end dates
Repeat the above process, replacing get_npy_from_netcdf() with load_input() in HPC.data0.run_functions.prepare_input()
Go year-by-year first
Then try across the whole time period at once
If both of those work, then restructure how I make input files
Generate the input files over the whole time period, not year by year
Don’t use the noleap calendar
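The year-by-year step in the plan above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: split_by_year is a hypothetical helper that breaks an arbitrary start/end date range into per-calendar-year chunks, so that a date-range version of load_input() could still be called one year at a time before trying the whole period at once. It uses the standard calendar rather than noleap, so 2020 counts 366 days.

```python
from datetime import date


def split_by_year(start_date: date, end_date: date) -> list[tuple[date, date]]:
    """Split an inclusive date range into inclusive per-year chunks.

    Hypothetical helper for illustration; it is not part of unox.
    """
    if end_date < start_date:
        raise ValueError("end_date must not be before start_date")
    chunks = []
    for year in range(start_date.year, end_date.year + 1):
        # Clip each calendar year to the requested range.
        chunk_start = max(start_date, date(year, 1, 1))
        chunk_end = min(end_date, date(year, 12, 31))
        chunks.append((chunk_start, chunk_end))
    return chunks


# The verification period (2019 and 2020) given as one range,
# then processed one year at a time.
chunks = split_by_year(date(2019, 1, 1), date(2020, 12, 31))
for chunk_start, chunk_end in chunks:
    n_days = (chunk_end - chunk_start).days + 1
    print(chunk_start, chunk_end, n_days)
```

Each (start, end) pair could then be passed as the start_date and end_date arguments of the planned load_input(), which keeps the existing loop structure while removing the dependence on a single year argument.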
Making time series plots
If I can get rid of the noleap calendar in both the input file and prediction file data, I should be able to
Build
Add the dask package to allow loading multiple files as a dataset using xarray.open_mfdataset()
See dead branch refactor_input0 to see how I did that
Make sure to try recreating an environment from scratch following the installation instructions
Update the Animus environment from Python 3.9 to Python 3.12
The environment on Trillium uses Python 3.12
Make a new test environment to try this out first
Make sure you can run all the parts of the code including the tests in the new environment
Documentation
Installation and setup
Configuring the test environment
Need to show how to set up and run the tests that I’ve made in the tests directory
Generating / copying the ERA5 files
Am I currently having the input.py functions pull from Evelyn’s directory? Make sure I document where the files that are used by default are located.
Generating CO input files
Document how to change the **kwargs given to the input.py functions to create input files for species other than NOx.
Look into the cdo command line tool’s usage in merge_CO.sh and how it is used to merge a bunch of daily HEMCO files
See: https://code.mpimet.mpg.de/projects/cdo/wiki/Cdo#Documentation
References to Workflow in a lot of the setup documentation should probably actually reference run_model
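For the cdo item above, one possible way to document the merge step is a small sketch like the one below. It assumes the daily files are combined along the time axis with the cdo mergetime operator; whether merge_CO.sh actually uses mergetime or different options still needs to be checked. The helper name and the file names are placeholders made up for illustration.

```python
import shlex
from pathlib import Path


def build_mergetime_command(input_files, output_file):
    """Return the argument list for a `cdo mergetime` call.

    Hypothetical helper; merge_CO.sh may use a different operator or options.
    """
    # Sort by file name so the daily files are listed in a stable order.
    files = sorted(str(Path(f)) for f in input_files)
    if not files:
        raise ValueError("no input files given")
    return ["cdo", "mergetime", *files, str(output_file)]


# Placeholder file names, for illustration only.
cmd = build_mergetime_command(["day_02.nc", "day_01.nc"], "merged.nc")
print(shlex.join(cmd))
# To actually run the command (requires cdo to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list and printing it first makes it easy to check which files will be merged before anything is written.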
Documenting how to update the documentation
How did I set up the way it auto updates?
Links between internal pages.
Auto API and why writing good docstrings is important.
I have the docs_dev/write_docs.md file where I am trying to document how I update these docs.
Documentation of stuff I’ve figured out, kinda like some results?
Results of using a regularizer
Results from running ZFI across the different input variables
Results from investigating the match outside where input values of nox are available
Do the spatial patterns of nox_pred match up with the spatial patterns of no2?
The Example Usage notebook docs/example.ipynb
I refer to this throughout the documentation, but pretty much all of the code in it is not up to date
I think I could rethink this notebook
It could be used as a very brief demonstration of what the code can do, like the “Basic Analysis” notebook docs/analysis.ipynb, but with just the flashy stuff to show off.
Cleaning up the repository
Many items in the repository are probably not needed any more
Here are some which I believe could be just deleted, but should probably be reviewed beforehand:
All the notebooks in the analysis_examples/ directory. I believe none of these are still relevant as they were mostly from before I (Mikhail) took over the project
The code inside the src/unox/HPC/legacy/ directory. I believe all the functions in there are from when the input / output files were .npy
Would also need to remove the option to use these functions within run_model.py
To be categorized
Explaining **kwargs and how they’re used in functions.
input_metadata.json files: created only to be easier to look at, not to be used by code.
Scale factors in input files: when are they applied? Upon creating the input file or upon plotting?
Should we be shifting just the mean of the values? Or also the standard deviation?
plot_var_maps() bug in choosing the start and end date to average over; the title is wrong.
Emphasize that changing part of unox requires restarting the kernel when testing new plotting functions in a Jupyter notebook.
ZFI runs
Don’t actually use the zfi_vars attribute of configuration .json files
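For the **kwargs item in this list, an explanation could start from a minimal sketch like the following, which shows extra keyword arguments passing through a wrapper to an inner function. The function names and the species and scale_factor parameters are hypothetical and are not taken from input.py.

```python
def make_input_file(year, species="nox", scale_factor=1.0):
    """Stand-in for a function that builds one input file (hypothetical)."""
    return {"year": year, "species": species, "scale_factor": scale_factor}


def prepare_inputs(years, **kwargs):
    """Call make_input_file for each year, forwarding extra keyword arguments."""
    # kwargs is a plain dict holding the extra keyword arguments the caller passed.
    return [make_input_file(year, **kwargs) for year in years]


# No extra keywords: the defaults of make_input_file apply.
print(prepare_inputs([2019]))
# Extra keywords are collected into kwargs and unpacked into the inner call.
print(prepare_inputs([2019, 2020], species="co"))
```

The wrapper never needs to know which options the inner function accepts, which is convenient but also means a misspelled keyword only fails once the inner function is called.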