Quickstart
The following functions outline the main functionality of this Python package. The workflow that is recommended is :
graph LR
A[(Import Scatter Xarray)] --> B[function :
extract_scatter];
B -->|use output| C[function:
extract_goes];
C --> D[function:
package_data];
B --> D;
D --> E[function:
save_data]
E --> F{Data ready
For training model}
extract_scatter
extract_scatter(polar_data,start_datetime,end_datetime,lat_range,lon_range)
Description
This function extracts the scatterometer data from the polar_data dataset. The function extracts the scatterometer data for the given time range, latitude range and longitude range. The function then saves the data into 4 numpy files : time of observation, latitude, longitude and main variable.
Parameters
polar_data (xarray.Dataset): The scatterometer dataset (ASCAT, HYSCAT etc).
start_datetime (str): The start time of the data extraction in the format 'YYYY-MM-DD HH:MM:SS'.
end_datetime (str): The end time of the data extraction in the format 'YYYY-MM-DD HH:MM:SS'.
lat_range (list): The latitude range of the data extraction in the format [min_lat, max_lat].
lon_range (list): The longitude range of the data extraction in the format [min_lon, max_lon].
main_variable (str): The main variable to be extracted. This can be wind_speed, wind_direction etc.
returns
observation_times (numpy.ndarray): The time of observation of the scatterometer data. observation_lats (numpy.ndarray): The latitude of the scatterometer data. observation_lons (numpy.ndarray): The longitude of the scatterometer data. observation_main_parameter (numpy.ndarray): The wind speed of the scatterometer data.
extract_goes
extract_goes(observation_times,observation_lats,obseration_lons,channels,polar_data)
Description
This function extracts the GOES data from the observation data provided by the extract_scatter function. These are observation times, latitudes, longitudes. The function will download the GOES data from the AWS S3 bucket and extract the data for the given observation times. The function will then subset the GOES data to the observation latitudes and longitudes. This will be done for all the channels provided in the channels list. The function will return a numpy array containing all channel images corresponding to the observation data.
Parameters
observation_times (numpy.ndarray): The times of observation of the scatterometer data.
observation_lats (numpy.ndarray): The latitudes of the scatterometer data.
observation_lons (numpy.ndarray): The longitudes of the scatterometer data.
channels (list): The channels of interest. multiple channels can be given in form ['C01', 'C02', etc]
polar (xarray.Dataset): The scatterometer dataset (ASCAT, HYSCAT etc).
Returns
images (numpy.ndarray): The GOES images corresponding to the observation data.
package_data
package_data(images, observation_lats, observation_lons, observation_times, observation_wind_speeds, filter=True, solar_conversion=True)
Description
This function packages the images and numerical data into a format that can be used for training a machine learning model. The function will filter out invalid images and fill in any NaN values. (Invalid images = empty images from GOES data) The function will also convert the observation times, latitudes and longitudes to solar angles (sza, saa) if solar_conversion is set to True. The function will return the images and numerical data in a numpy array format.
Parameters
images (numpy.ndarray): The GOES images corresponding to the observation data.
observation_lats (numpy.ndarray): The latitudes of the scatterometer data.
observation_lons (numpy.ndarray): The longitudes of the scatterometer data.
observation_times (numpy.ndarray): The times of observation of the scatterometer data.
observation_main_parameter (numpy.ndarray): the main parameter of the scatterometer data (the target of the model).
Returns
images (numpy.ndarray): The GOES images corresponding to the observation data.
numerical_data (numpy.ndarray): The numerical data corresponding to the observation data. (sza, saa, main_parameter if solar_conversion is set to True or lat, lon, time, wind_speeds if solar_conversion is set to False)
save_data
save_data(images, numerical_data, polar_data, start_datetime, end_datetime,channels)
Description
The function will save the images and numerical data in a compressed numpy file format. The function will save the data in a folder called output_processed_data. The function will also save the data with the satellite name, channels, start and end datetime in the filename.
Parameters
images (numpy.ndarray): The GOES images corresponding to the observation data.
numerical_data (numpy.ndarray): The numerical data corresponding to the observation data.
polar_data (xarray.Dataset): The scatterometer dataset (ASCAT, HYSCAT etc). Used for metadata.
start_datetime (str): The start time of the data extraction in the format 'YYYY-MM-DD HH:MM:SS'. Used for metadata
end_datetime (str): The end time of the data extraction in the format 'YYYY-MM-DD HH:MM:SS'. Used for metadata
channels (list): The channels of interest. multiple channels can be given in form ['C01', 'C02', etc]. Used for metadata
Returns
None