# To-do List

This describes the parts of the code under development, the goals for implementing new features, bugs to be fixed, and elements to optimize. The sections below should be expected to change constantly. Once a particular point is resolved, it should be deleted from this document and moved to a relevant location.

## Items

- Features
    - Regularization
        - Update the examples that I use in the Analysis and Example notebooks to use model runs that utilized regularizers
    - Generating input files
        - Can I do this not by year? I would like to be able to specify the start and end date, to allow for more granular control of what time span the input files cover.
            - I made an attempt at this in the defunct / dead branch `refactor_input0`
                - I started by changing the input files, then couldn't get stage 2 to work and couldn't figure out where I went wrong
                - That branch still has a bunch of useful bits of code, but should not be used in its entirety
            - Here's the plan:
                - Make a new branch in which to test this functionality
                - Make a new function in `load_input` called `load_input()`
                    - Base this on `get_npy_from_netcdf()`, but implement it using date ranges
                    - Use arguments `start_date` and `end_date` instead of `year`
                - In `HPC.training.make_predictions()`, use `load_input()`
                    - Keep the same structure, that is, going year-by-year
                    - But now I don't just specify the year, I specify the start and end dates
                - If that works, then try running the predictions over the whole verification period (2019 and 2020) using `load_input()` with different start and end dates
                - Repeat the above process, replacing `get_npy_from_netcdf()` with `load_input()` in `HPC.data0.run_functions.prepare_input()`
                    - Go year-by-year first
                    - Then try across the whole time period at once
                - If both of those work, then restructure how I make input files
                    - Generate the input files over the whole time period, not year by year
                    - Don't use the `noleap` calendar
    - Making time series plots
        - If I can get rid of the `noleap` calendar in both the input file and prediction file data, I should be able to make these plots
- Build
    - Add the `dask` package to allow loading multiple files as a dataset using `xarray.open_mfdataset()`
        - See dead branch `refactor_input0` to see how I did that
        - Make sure to try recreating an environment from scratch following the installation instructions
    - Update the Animus environment from Python 3.9 to Python 3.12
        - The environment on Trillium uses Python 3.12
        - Make a new test environment to try this out first
        - Make sure you can run all the parts of the code, including the tests, in the new environment
- Documentation
    - Installation and setup
        - Configuring the test environment
            - Need to show how to set up and run the tests that I've made in the `tests` directory
        - Generating / copying the ERA5 files
            - Am I currently having the `input.py` functions pull from Evelyn's directory? Make sure I document where the files that are used by default are located.
        - Generating CO input files
            - Document how to change the `**kwargs` given to the `input.py` functions to create input files for species other than NOx.
            - Look into the `cdo` command line tool's usage in `merge_CO.sh` and how it is used to merge a bunch of daily HEMCO files
                - See: https://code.mpimet.mpg.de/projects/cdo/wiki/Cdo#Documentation
        - References to `Workflow` in a lot of the setup documentation should probably reference `run_model` instead
    - Documenting how to update the documentation
        - How did I set up the way it auto updates?
        - Links between internal pages.
        - Auto API and why writing good docstrings is important.
        - I have the `docs_dev/write_docs.md` file where I am trying to document how I update these docs.
    - Documentation of stuff I've figured out, kinda like some results?
        - Results of using a regularizer
        - Results from running ZFI across the different input variables
        - Results from investigating the match outside where input values of `nox` are available
            - Do the spatial patterns of `nox_pred` match up with the spatial patterns of `no2`?
    - The Example Usage notebook `docs/example.ipynb`
        - I refer to this throughout the documentation, but pretty much all of the code in it is out of date
        - I think I could rethink this notebook
            - It could be used as a very brief demonstration of what the code can do, like the "Basic Analysis" notebook `docs/analysis.ipynb`, but with just the flashy stuff to show off.
- Cleaning up the repository
    - There are many items in the repository which are probably not needed any more
    - Here are some which I believe could just be deleted, but should probably be reviewed beforehand:
        - All the notebooks in the `analysis_examples/` directory. I believe none of these are still relevant, as they were mostly from before I (Mikhail) took over the project
        - The code inside the `src/unox/HPC/legacy/` directory. I believe all the functions in there are from when the input / output files were `.npy`
            - Would also need to remove the option to use these functions within `run_model.py`
- To be categorized
    - Explaining `**kwargs` and how they're used in functions.
    - `input_metadata.json` files, created only to make the inputs easier to inspect, not to be used by code.
    - Scale factors in input files: when are they applied? Upon creating the input file or upon plotting?
        - Should we be shifting just the mean of the values? Or also the standard deviation?
    - `plot_var_maps()` bug in choosing the start and end date for averaging over: the title is wrong.
    - Emphasize that changing part of `unox` requires restarting the kernel when testing new plotting functions in a Jupyter notebook.
    - ZFI runs
        - Don't actually use the `zfi_vars` attribute of configuration `.json` files
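
A possible starting point for the date-range plan under "Generating input files": the sketch below splits an arbitrary start / end date range into per-year chunks, so the existing year-by-year structure can be kept while the inputs are specified as dates. The helper name `split_by_year()` is hypothetical and not part of `unox`; it only uses the standard library and assumes both dates are inclusive.

```python
from datetime import date


def split_by_year(start_date: date, end_date: date) -> list[tuple[date, date]]:
    """Split an inclusive date range into per-calendar-year chunks.

    Hypothetical helper for illustration only; it is not part of `unox`.
    """
    if end_date < start_date:
        raise ValueError("end_date must not be earlier than start_date")
    chunks = []
    for year in range(start_date.year, end_date.year + 1):
        # Clip each calendar year to the requested range
        chunk_start = max(start_date, date(year, 1, 1))
        chunk_end = min(end_date, date(year, 12, 31))
        chunks.append((chunk_start, chunk_end))
    return chunks


# Example: the verification period (2019 and 2020)
for chunk_start, chunk_end in split_by_year(date(2019, 1, 1), date(2020, 12, 31)):
    n_days = (chunk_end - chunk_start).days + 1
    print(chunk_start.isoformat(), chunk_end.isoformat(), n_days)
# prints:
# 2019-01-01 2019-12-31 365
# 2020-01-01 2020-12-31 366
```

Each `(chunk_start, chunk_end)` pair could then be passed to the planned `load_input()` as `start_date` and `end_date`. Note that 2020 has 366 days in a standard calendar, whereas a `noleap` calendar has 365 days in every year, so day counts will differ once the `noleap` calendar is dropped.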