unox.data
Attributes
Functions
| generate_lats_lons | Generate latitude and longitude arrays from the given dataset. |
| get_extent | Get the latitude and longitude extent of the given xarray dataset. |
| get_lats_lons | Get the latitude and longitude values from the given dataset. |
| get_latlon_resolution | Get the latitude and longitude resolution of the given dataset. |
| print_latlon_info | Print information about the latitude and longitude values. |
| clean_num_list | Clean the list of values that cannot be converted to a number. |
| verify_lat | Verify that the given latitude value is valid. |
| verify_lon | Verify that the given longitude value is valid. |
| get_vminmax | Get the minimum and maximum values across the given arrays. |
| get_max_abs_val | Get the maximum absolute value from the given list. |
| restrict_domain | Restrict the domain of the given arrays. |
| match_domains | Restrict the domain of the given xarray Datasets to match each other. |
| verify_npy | Determine if a variable or file holds a valid numpy array. |
| get_num_from_string | Extract numbers from a string. |
| get_DOY | Get the day of the year from a date. |
| increment_month | Increment the month by a given number of months. |
| get_YMD_from_date | Get the year, month, and day from a date. |
| get_increment_info | Get the increment value and unit from a string. |
| add_amount_to_date | Add an amount of time to a date. |
Module Contents
- unox.data.DEFAULT_LAT_MIN = 11
- unox.data.DEFAULT_LAT_MAX = 75
- unox.data.DEFAULT_LON_MIN = -175
- unox.data.DEFAULT_LON_MAX = -39
- unox.data.DEFAULT_EXTENT
- unox.data.generate_lats_lons(dataset='datafiles/sample_data/2019u10.nc', output_dir='datafiles/')
Generate latitude and longitude arrays from the given dataset.
Create the lats.npy and lons.npy files from the latitude and longitude values in the given dataset. They were originally generated from the ERA5 concatenated data files created by the download_era5 and concatenate scripts in the datafiles directory.
- Parameters:
dataset (str or xarray.Dataset, optional) – The filepath to the dataset or an xarray Dataset object from which to extract latitude and longitude values.
output_dir (str, optional) – The directory in which to save the generated lats.npy and lons.npy files.
- Returns:
lats (numpy.ndarray) – The latitude values extracted from the dataset.
lons (numpy.ndarray) – The longitude values extracted from the dataset.
- unox.data.get_extent(xr_dataset=None, lats=None, lons=None, shift_lons=False, **kwargs)
Get the latitude and longitude extent of the given xarray dataset.
Find the maximum and minimum latitude and longitude values in the given dataset.
- Parameters:
xr_dataset (xarray.Dataset or xarray.DataArray, optional) – The xarray data of which to find the extent.
lats (numpy.ndarray, optional) – The latitude values to use instead of those in the dataset.
lons (numpy.ndarray, optional) – The longitude values to use instead of those in the dataset.
shift_lons (bool, optional) – If True, shift the longitude values based on the PM_centered kwarg.
**kwargs (keyword arguments) – Additional keyword arguments to pass to verify_dataset() and shift_lon_arr().
- Returns:
extent – A tuple of np.float64 in the form (lat_min, lat_max, lon_min, lon_max).
- Return type:
tuple
Examples
>>> nox = xr.open_dataset('datafiles/nox_2019_t106_US.nc')
>>> get_extent(nox)
(24.112, 58.878, -126.0, -59.625)
>>> lats, lons = get_lats_lons(nox)
>>> get_extent(lats=lats, lons=lons)
(24.112, 58.878, -126.0, -59.625)
- unox.data.get_lats_lons(xr_dataset, **kwargs)
Get the latitude and longitude values from the given dataset.
Load the latitude and longitude values from the given dataset and return them as numpy arrays.
- Parameters:
xr_dataset (xarray.Dataset or xarray.DataArray) – The xarray data from which to extract the latitude and longitude values.
**kwargs (keyword arguments) – Additional keyword arguments to pass to verify_dataset().
- Returns:
lats (numpy.ndarray) – Array of latitude values.
lons (numpy.ndarray) – Array of longitude values.
Examples
>>> nox = xr.open_dataset('datafiles/nox_2019_t106_US.nc')
>>> lats, lons = get_lats_lons(nox)
- unox.data.get_latlon_resolution(xr_dataset=None, lats=None, lons=None, **kwargs)
Get the latitude and longitude resolution of the given dataset.
Calculate the resolution of the latitude and longitude coordinate values separately.
- Parameters:
xr_dataset (xarray.Dataset or xarray.DataArray, optional) – The xarray data of which to find the resolution.
lats (numpy.ndarray, optional) – The latitude values to use instead of those in the dataset.
lons (numpy.ndarray, optional) – The longitude values to use instead of those in the dataset.
**kwargs (keyword arguments) – Additional keyword arguments to pass to verify_dataset() and get_lats_lons().
- Returns:
lat_res (str) – The resolution in latitude.
lon_res (str) – The resolution in longitude.
Examples
>>> nox = xr.open_dataset('datafiles/nox_2019_t106_US.nc')
>>> get_latlon_resolution(nox)
(0.25, 0.25)
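One way to think about the resolution is as the spacing between adjacent coordinate values. The sketch below is not the unox implementation; it assumes a regularly spaced grid and uses a hypothetical helper, grid_resolution, built on numpy.diff.

```python
import numpy as np


def grid_resolution(coords):
    """Return the median absolute spacing of a 1-D coordinate array.

    Hypothetical helper for illustration; not part of unox.data.
    """
    coords = np.asarray(coords, dtype=float)
    if coords.size < 2:
        raise ValueError("At least two coordinate values are needed.")
    # Absolute differences handle both ascending and descending coordinates.
    return float(np.median(np.abs(np.diff(coords))))


lats = np.arange(24.0, 26.0, 0.25)     # ascending latitudes
lons = np.arange(-60.0, -62.0, -0.25)  # descending longitudes
lat_res, lon_res = grid_resolution(lats), grid_resolution(lons)
```

Using the median rather than the first difference makes the sketch less sensitive to a single irregular step in the coordinates.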
- unox.data.print_latlon_info(xr_dataset=None, lats=None, lons=None, **kwargs)
Print information about the latitude and longitude values.
Print the extent and resolution of the latitude and longitude values in the given dataset or arrays.
- Parameters:
xr_dataset (str or xarray.Dataset or xarray.DataArray, optional) – The filepath to, or the xarray data for which to print the latitude and longitude information.
lats (numpy.ndarray, optional) – The latitude values to use instead of those in the dataset.
lons (numpy.ndarray, optional) – The longitude values to use instead of those in the dataset.
**kwargs (keyword arguments) – Additional keyword arguments to pass to verify_dataset(), get_extent() and get_latlon_resolution().
- unox.data.clean_num_list(val_list)
Clean the list of values that cannot be converted to a number.
For each value in the list, if it cannot be converted to a number, all instances of that value are removed from the list.
- Parameters:
val_list (list) – The list of values to clean.
- Returns:
return_list – The cleaned list of values.
- Return type:
list
Examples
>>> clean_num_list([1, 2, 3, "4", 5])
[1, 2, 3, 5]
>>> clean_num_list([1, 2, 3, np.nan, None, np.inf, -np.inf])
[1, 2, 3]
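The examples above suggest that strings, None, NaN, and infinite values are all dropped. A minimal sketch of that filtering rule, assuming "valid" means a finite real number, might look like the following; keep_finite_numbers is a hypothetical name and this is not the library's own code.

```python
import math
from numbers import Real


def keep_finite_numbers(values):
    """Return only the finite, real-valued entries of values.

    Hypothetical helper for illustration; not part of unox.data.
    """
    cleaned = []
    for val in values:
        # bool is a subclass of int, so it is excluded explicitly here
        # (an assumption of this sketch).
        if isinstance(val, Real) and not isinstance(val, bool) and math.isfinite(val):
            cleaned.append(val)
    return cleaned


result = keep_finite_numbers([1, 2, 3, "4", None, float("nan"), float("inf")])
```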
- unox.data.verify_lat(lat_val)
Verify that the given latitude value is valid.
If the given latitude value is within the range [-90, 90], return that value. Otherwise, raise a ValueError.
- Parameters:
lat_val (float) – The latitude value to verify.
- Returns:
lat_val – The verified latitude value.
- Return type:
float
Examples
>>> verify_lat(45.0)
45.0
>>> verify_lat(-100.0)
Traceback (most recent call last):
    ...
ValueError: Latitude value must be in the range [-90, 90].
- unox.data.verify_lon(lon_val, PM_centered=None)
Verify that the given longitude value is valid.
If the given longitude value is within the valid range, return that value. Otherwise, raise a ValueError. The valid range is [-180, 360] by default and can be narrowed with PM_centered.
- Parameters:
lon_val (float) – The longitude value to verify.
PM_centered (bool, optional) – If None, verify that the longitude value is in the range [-180, 360]. If True, verify that the longitude value is in the range [-180, 180]. If False, verify that the longitude value is in the range [0, 360].
- Returns:
lon_val – The verified longitude value.
- Return type:
float
Examples
>>> verify_lon(45.0)
45.0
>>> verify_lon(-200.0)
Traceback (most recent call last):
    ...
ValueError: Longitude value must be in the range [-180, 180].
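The three ranges described for PM_centered can be sketched as a simple bounds check. This is an illustration of the documented behaviour only; check_lon is a hypothetical function, and the wording of its error message is an assumption.

```python
def check_lon(lon_val, pm_centered=None):
    """Return lon_val if it lies in the range selected by pm_centered.

    Hypothetical helper for illustration; not part of unox.data.
    """
    if pm_centered is None:
        lower, upper = -180.0, 360.0   # accept either longitude convention
    elif pm_centered:
        lower, upper = -180.0, 180.0   # prime meridian at the centre
    else:
        lower, upper = 0.0, 360.0      # prime meridian at the left edge
    if not lower <= lon_val <= upper:
        raise ValueError(
            f"Longitude value must be in the range [{lower:g}, {upper:g}]."
        )
    return lon_val


checked = check_lon(270.0)  # valid under the default, permissive range
```

With pm_centered=True the same value, 270.0, would be rejected because it lies outside [-180, 180].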
- unox.data.get_vminmax(arrays)
Get the minimum and maximum values across the given arrays.
Flatten and concatenate the given arrays and return the minimum and maximum values, ignoring NaN values.
- Parameters:
arrays (list of numpy.ndarray) – The arrays to get the minimum and maximum values from.
- Returns:
vmin (float) – The minimum value across the arrays.
vmax (float) – The maximum value across the arrays.
Examples
>>> arrays = [np.array([1, 2, 3]), np.array([4, 5, 6])]
>>> get_vminmax(arrays)
(1, 6)
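The flatten, concatenate, and ignore-NaN steps described above map directly onto NumPy calls. The following is a minimal sketch under that reading, with a hypothetical helper name, nan_safe_min_max; the actual function may differ in details such as return types.

```python
import numpy as np


def nan_safe_min_max(arrays):
    """Return (min, max) over all arrays, ignoring NaN values.

    Hypothetical helper for illustration; not part of unox.data.
    """
    # ravel() flattens each array so arrays of different shapes can be combined.
    flat = np.concatenate([np.asarray(a, dtype=float).ravel() for a in arrays])
    return float(np.nanmin(flat)), float(np.nanmax(flat))


vmin, vmax = nan_safe_min_max(
    [np.array([1.0, np.nan, 3.0]), np.array([[4.0, -2.0], [5.0, 6.0]])]
)
```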
- unox.data.get_max_abs_val(val_list)
Get the maximum absolute value from the given list.
Remove invalid numbers from the given list of values, then take the absolute value of the remaining values, and return the largest.
- Parameters:
val_list (list of numbers or numpy.ndarray) – The list of values to get the maximum absolute value from.
- Returns:
max_abs – The maximum absolute value of the given values.
- Return type:
float
Examples
>>> get_max_abs_val([-11, 6])
11
>>> vmin, vmax = get_vminmax([np.array([1, 2, -3]), np.array([4, 5, -6])])
>>> get_max_abs_val([vmin, vmax])
6
- unox.data.restrict_domain(arrs_to_restrict, lats, lons, restricting_data)
Restrict the domain of the given arrays.
Restrict the domain of the given arrays to the same extent as that in the restricting data. The values of lats, lons are the latitude and longitude values of the arrays to restrict.
- Parameters:
arrs_to_restrict (list of numpy.ndarray) – The arrays to restrict in latitude and longitude.
lats (numpy.ndarray) – The latitude values of the arrays to restrict.
lons (numpy.ndarray) – The longitude values of the arrays to restrict.
restricting_data (xarray.Dataset or xarray.DataArray) – The dataset to restrict the arrays to.
- Returns:
arrs_to_return (list of numpy.ndarray) – The restricted arrays.
lat_r (numpy.ndarray) – The latitude values of the restricting data.
lon_r (numpy.ndarray) – The longitude values of the restricting data.
Examples
>>> stage1 = np.load(get_pred_data(stage=1, HPC_run='no2_example_run', year=2019))
>>> lats, lons = load_lats_lons()
>>> nox = xr.open_dataset('datafiles/nox_2019_t106_US.nc')
>>> stage1_restricted, lat_r, lon_r = restrict_domain([stage1], lats, lons, nox)
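To make the idea of restricting an array to another dataset's extent concrete, here is a small NumPy sketch. It assumes a 2-D array laid out as (lat, lon) and an extent tuple in the (lat_min, lat_max, lon_min, lon_max) form returned by get_extent; crop_to_extent is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def crop_to_extent(arr, lats, lons, extent):
    """Crop a (lat, lon) array to extent = (lat_min, lat_max, lon_min, lon_max).

    Hypothetical helper for illustration; not part of unox.data.
    """
    lat_min, lat_max, lon_min, lon_max = extent
    lat_mask = (lats >= lat_min) & (lats <= lat_max)
    lon_mask = (lons >= lon_min) & (lons <= lon_max)
    # np.ix_ builds an open mesh so rows and columns are selected together.
    return arr[np.ix_(lat_mask, lon_mask)], lats[lat_mask], lons[lon_mask]


lats = np.array([20.0, 30.0, 40.0, 50.0])
lons = np.array([-120.0, -110.0, -100.0])
data = np.arange(12).reshape(4, 3)
cropped, lat_r, lon_r = crop_to_extent(data, lats, lons, (25.0, 45.0, -115.0, -95.0))
```

Here the crop keeps the two middle latitudes and the two easternmost longitudes, so cropped has shape (2, 2).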
- unox.data.match_domains(xr_a, xr_b, require_equal=True, require_len_gt_1=True)
Restrict the domain of the given xarray Datasets to match each other.
Find the maximum extent covered by both given datasets and restrict both to match. Requires that at least some of the actual latitude and longitude values are present in both datasets.
- Parameters:
xr_a (xarray.Dataset or xarray.DataArray) – The first dataset.
xr_b (xarray.Dataset or xarray.DataArray) – The second dataset.
require_equal (bool, optional) – Whether to check that the latitude and longitude values in the two datasets are exactly the same after trimming. Default is True.
require_len_gt_1 (bool, optional) – Whether to check to make sure that the trimmed datasets have more than 1 value in each of the lat and lon dimensions, to catch cases where the datasets only overlap at a single point, resulting in either the lat or lon dimension being dropped. Default is True.
- Returns:
xr_a (xarray.Dataset or xarray.DataArray) – The first dataset, with the latitude and longitude extents trimmed to match xr_b.
xr_b (xarray.Dataset or xarray.DataArray) – The second dataset, with the latitude and longitude extents trimmed to match xr_a.
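The requirement that some coordinate values appear in both datasets can be illustrated on plain coordinate arrays. The sketch below uses numpy.intersect1d on one dimension and mimics the require_len_gt_1 check; shared_coords is a hypothetical helper, and exact floating-point equality is an assumption that real data may not satisfy.

```python
import numpy as np


def shared_coords(coords_a, coords_b, require_len_gt_1=True):
    """Return the coordinate values present in both arrays.

    Hypothetical helper for illustration; not part of unox.data.
    """
    common = np.intersect1d(coords_a, coords_b)  # sorted, unique shared values
    if require_len_gt_1 and common.size <= 1:
        raise ValueError("The two grids share at most one coordinate value.")
    return common


lats_a = np.array([20.0, 22.5, 25.0, 27.5])
lats_b = np.array([25.0, 27.5, 30.0])
common_lats = shared_coords(lats_a, lats_b)
```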
- unox.data.verify_npy(array)
Determine if a variable or file holds a valid numpy array.
If a numpy array or a path to a file containing a numpy array was passed, return it as a numpy array. Otherwise, raise a TypeError, ValueError or FileNotFoundError.
- Parameters:
array (numpy.ndarray or str) – A numpy array or a path to a file containing a numpy array.
- Returns:
nparray – The array being passed or pointed to as a np.ndarray.
- Return type:
np.ndarray
Examples
>>> import numpy as np
>>> from tempfile import NamedTemporaryFile
>>> arr = np.array([1, 2, 3])
>>> verify_npy(arr)
array([1, 2, 3])
>>> with NamedTemporaryFile(suffix=".npy", delete=False) as f:
...     np.save(f.name, arr)
...     verify_npy(f.name)
array([1, 2, 3])
>>> with NamedTemporaryFile(suffix=".txt", mode="w", delete=False) as f:
...     _ = f.write("1,2,3\n4,5,6")
>>> loaded = verify_npy(f.name)
>>> isinstance(loaded, np.ndarray)
True
- unox.data.get_num_from_string(str)
Extract numbers from a string.
If the string contains numbers, return those numbers in a list. Otherwise, raise a ValueError.
- Parameters:
str (str) – The string to extract the number from.
- Returns:
nums – A list of numbers extracted from the string.
- Return type:
list of int or float
Examples
>>> get_num_from_string("There are 42.0 apples and 3 oranges.")
[42, 3]
>>> get_num_from_string("No number here")
Traceback (most recent call last):
    ...
ValueError: No number found in the string.
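Extracting numbers from free text is commonly done with a regular expression. The sketch below shows one possible pattern; find_numbers and NUMBER_PATTERN are hypothetical names, and the rule for choosing int versus float is an assumption rather than the documented behaviour.

```python
import re

# Optional sign, digits, optional decimal part. A hyphen directly before a
# digit is read as a minus sign, which may be unwanted in strings like dates.
NUMBER_PATTERN = re.compile(r"[-+]?\d+(?:\.\d+)?")


def find_numbers(text):
    """Return the numbers found in text as int or float.

    Hypothetical helper for illustration; not part of unox.data.
    """
    matches = NUMBER_PATTERN.findall(text)
    if not matches:
        raise ValueError("No number found in the string.")
    # Tokens with a decimal point become float; the rest become int.
    return [float(m) if "." in m else int(m) for m in matches]


numbers = find_numbers("There are 42.5 apples and 3 oranges.")
```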
- unox.data.get_DOY(date)
Get the day of the year from a date.
Extract the day of the year from a given date and return it as an integer.
- Parameters:
date (np.datetime64 or str) – The date to extract the day of the year from.
- Returns:
doy – The day of the year of the date.
- Return type:
int
Examples
>>> get_DOY('2019-12-20')
354
>>> get_DOY(np.datetime64('2020-01-01'))
1
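For string dates, the same day-of-year value can be cross-checked with the standard library. This sketch handles only 'YYYY-MM-DD' strings, not np.datetime64 inputs, and day_of_year is a hypothetical helper.

```python
from datetime import date


def day_of_year(iso_date):
    """Return the 1-based day of the year for a 'YYYY-MM-DD' string.

    Hypothetical helper for illustration; not part of unox.data.
    """
    return date.fromisoformat(iso_date).timetuple().tm_yday


doy = day_of_year("2019-12-20")
```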
- unox.data.increment_month(month, increment)
Increment the month by a given number of months.
Increment the month by the given number of months, wrapping around if the increment goes beyond December (12).
- Parameters:
month (int or str) – The month to increment (1 for January, 2 for February, …, 12 for December).
increment (int or str) – The number of months to increment by.
- Returns:
new_month (int or str) – The new month after incrementing. The type will match the type of month.
increment_year (bool) – Whether the increment caused a year change. True if the new month wraps past December, for example when incrementing November (11) by 3.
Examples
>>> increment_month(1, 2)
(3, False)
>>> increment_month(11, 3)
(2, True)
>>> increment_month('5', '7')
('12', False)
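The wrap-around can be expressed with divmod on a zero-based month index. The following sketch reproduces the three examples above; wrap_month is a hypothetical name, and the handling of negative increments is an assumption that the documentation does not cover.

```python
def wrap_month(month, increment):
    """Return (new_month, year_changed) for a 1-based month.

    Hypothetical helper for illustration; not part of unox.data.
    """
    # Shift to a 0-based index so divmod handles the wrap-around.
    years, index = divmod(int(month) - 1 + int(increment), 12)
    new_month = index + 1
    # Mirror the documented behaviour of returning the same type as the input.
    if isinstance(month, str):
        new_month = str(new_month)
    return new_month, years != 0


wrapped = wrap_month(11, 3)
```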
- unox.data.get_YMD_from_date(this_date)
Get the year, month, and day from a date.
Extract the year, month, and day from a given date and return them as integers.
- Parameters:
this_date (np.datetime64 or str) – The date to extract the year, month, and day from.
- Returns:
year (int) – The year of the date.
month (int) – The month of the date.
day (int) – The day of the date.
Examples
>>> get_YMD_from_date('2019-12-20')
(2019, 12, 20)
>>> get_YMD_from_date(np.datetime64('2020-01-01'))
(2020, 1, 1)
- unox.data.get_increment_info(increment)
Get the increment value and unit from a string.
Parse a string that represents an increment in the format ‘XD’, ‘XM’, or ‘XY’, where X is an integer and D, M, or Y are the units for days, months, or years respectively.
- Parameters:
increment (np.timedelta64 or str) – The increment to parse. If a string, it should be in the format ‘XD’, ‘XM’, or ‘XY’ where X is an integer and D, M, or Y are the units for days, months, or years respectively.
- Returns:
value (int) – The numeric value of the increment.
unit (str) – The unit of the increment (‘D’, ‘M’, or ‘Y’).
- Raises:
ValueError – If the increment string is not in the expected format.
TypeError – If the increment is not a np.timedelta64 or str.
Examples
>>> get_increment_info('20D')
(20, 'D')
>>> get_increment_info(np.timedelta64(20, 'D'))
(20, 'D')
>>> get_increment_info('3M')
(3, 'M')
>>> get_increment_info(np.timedelta64(2, 'Y'))
(2, 'Y')
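The string format described above is simple enough to validate with a single regular expression. This sketch covers only the string case, not np.timedelta64 inputs; parse_increment is a hypothetical helper and its error messages are assumptions.

```python
import re


def parse_increment(increment):
    """Split a string such as '20D' into (20, 'D').

    Hypothetical helper for illustration; not part of unox.data.
    """
    if not isinstance(increment, str):
        raise TypeError("increment must be a string in this sketch.")
    # One or more digits followed by exactly one of the unit letters D, M, Y.
    match = re.fullmatch(r"(\d+)([DMY])", increment.strip())
    if match is None:
        raise ValueError(f"Expected 'XD', 'XM', or 'XY', got {increment!r}.")
    return int(match.group(1)), match.group(2)


value, unit = parse_increment("20D")
```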
- unox.data.add_amount_to_date(this_date, increment, keep_within_year=False)
Add an amount of time to a date.
Add the given amount of time to the given date and return the new date.
- Parameters:
this_date (np.datetime64 or str) – The date to add the time to.
increment (np.timedelta64 or str) – The amount of time to add to the date. If a string, it should be in the format ‘XD’, ‘XM’, or ‘XY’ where X is an integer and D, M, or Y are the units for days, months, or years respectively.
keep_within_year (bool, optional) – If True, the new date will be kept within the same year as this_date.
- Returns:
new_date – The new date after adding the time.
- Return type:
np.datetime64 or str
Examples
>>> add_amount_to_date('2019-12-20', '20D')
'2020-01-09'
>>> add_amount_to_date(np.datetime64('2019-12-25'), np.timedelta64(20, 'D'))
np.datetime64('2020-01-14')
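Day increments are plain date arithmetic, while month and year increments need a rule for days that do not exist in the target month. The sketch below uses only the standard library and clamps to the last valid day, which is a design choice of this example and not necessarily what add_amount_to_date does. It also leaves out keep_within_year. The name add_to_date is hypothetical.

```python
import calendar
from datetime import date, timedelta


def add_to_date(iso_date, value, unit):
    """Add value days ('D'), months ('M'), or years ('Y') to a 'YYYY-MM-DD' string.

    Hypothetical helper for illustration; not part of unox.data.
    """
    start = date.fromisoformat(iso_date)
    if unit == "D":
        return (start + timedelta(days=value)).isoformat()
    if unit not in ("M", "Y"):
        raise ValueError(f"Unknown unit {unit!r}.")
    months = value if unit == "M" else 12 * value
    years, index = divmod(start.month - 1 + months, 12)
    year, month = start.year + years, index + 1
    # Clamp the day so that, e.g., 31 January plus one month gives the
    # last day of February (an assumption of this sketch).
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day).isoformat()


new_date = add_to_date("2019-12-20", 20, "D")
```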