dianna.utils.maskers
Functions
generate_tabular_masks | Generator function to create masks for tabular data.
generate_time_series_masks | Generate masks for time series data given a probability of keeping any time step or channel unmasked.
generate_channel_masks | Generate masks that mask one or multiple channels independently at a time.
mask_data_tabular | Mask tabular data using a given set of masks.
mask_data | Mask data using a given set of masks.
_get_mask_value | Calculates a masking value of the given type for the data.
_determine_number_masked | Determine the number of time steps that need to be masked.
generate_time_step_masks | Generate masks that mask all channels simultaneously for clusters of time steps.
_mask_bottom_ratio | Return a bool mask given a mask of floats and a ratio.
generate_interpolated_float_masks_for_image | Generates a set of random masks of float values to mask image data.
generate_interpolated_float_masks_for_timeseries | Generates a set of random masks to mask time-series data.
_project_grids_to_masks | Projects a set of (low-resolution) grids onto masks of a target resolution.
Upsamples and crops the grid to result in an array with size up_size.
Module Contents
- dianna.utils.maskers.generate_tabular_masks(input_data_shape: tuple[int], number_of_masks: int, p_keep: float = 0.5)[source]
Generator function to create masks for tabular data.
- Parameters:
input_data_shape – Shape of the tabular data to be masked.
number_of_masks – Number of masks to generate.
p_keep – probability that any value should remain unmasked.
Returns: Single array containing all masks where the first dimension represents the batch.
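The idea of a batch of random keep/mask decisions can be illustrated with a minimal sketch. The function name `sketch_tabular_masks`, the `seed` argument, and the convention that True means "keep" are assumptions made for illustration; this is not the library's implementation.

```python
import numpy as np

def sketch_tabular_masks(input_data_shape, number_of_masks, p_keep=0.5, seed=0):
    """Hypothetical stand-in for illustration, not DIANNA's implementation."""
    rng = np.random.default_rng(seed)
    # Draw one uniform number per value and per mask.
    # Assumed convention: True means "keep this value", False means "mask it".
    return rng.random((number_of_masks, *input_data_shape)) < p_keep

masks = sketch_tabular_masks((4,), number_of_masks=1000, p_keep=0.5)
print(masks.shape)  # (1000, 4): the first dimension is the batch of masks
```

With enough masks, the fraction of True entries approaches p_keep.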
- dianna.utils.maskers.generate_time_series_masks(input_data_shape: tuple[int], number_of_masks: int, feature_res: int = 8, p_keep: float = 0.5)[source]
Generate masks for time series data given a probability of keeping any time step or channel unmasked.
Note that, for multivariate data, the resulting masks will be an evenly distributed sample of the following 3 kinds of masks:
- Channel masks: These mask a whole channel at a time, masking all its time steps simultaneously.
- Time step masks: These mask all channels simultaneously for selected time steps. Masked time steps are selected randomly, while grouping adjacent time steps. For a complete description of how time step masks are generated, see our blog post: https://medium.com/escience-center/masking-time-series-for-explainable-ai-90247ac252b4
- Combination masks: These masks are a combination of the above 2 types.
For univariate data, only time step masks are returned.
- Parameters:
input_data_shape – Shape of the time series data to be masked.
number_of_masks – Number of masks to generate.
p_keep – the probability that any value remains unmasked.
feature_res – Resolution of features in masks.
Returns: Single array containing all masks where the first dimension represents the batch.
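The three kinds of masks described for multivariate data can be pictured with hand-built arrays. This sketch assumes a (time_steps, channels) layout, that True means "keep", and that a combination mask is the logical AND of the other two; none of these details are confirmed by this page.

```python
import numpy as np

n_steps, n_channels = 6, 3

# Channel mask: channel 1 is masked for every time step.
channel_mask = np.ones((n_steps, n_channels), dtype=bool)
channel_mask[:, 1] = False

# Time step mask: adjacent steps 2 and 3 are masked for every channel.
time_step_mask = np.ones((n_steps, n_channels), dtype=bool)
time_step_mask[2:4, :] = False

# Combination mask (assumed to be a logical AND):
# a value stays unmasked only if both masks keep it.
combination_mask = channel_mask & time_step_mask
print(combination_mask.astype(int))
```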
- dianna.utils.maskers.generate_channel_masks(input_data_shape: tuple[int], number_of_masks: int, p_keep: float)[source]
Generate masks that mask one or multiple channels independently at a time.
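A minimal sketch of channel-wise masking follows. It assumes input of shape (time_steps, channels) and that True means "keep"; `sketch_channel_masks` and its `seed` argument are hypothetical names, not part of the library.

```python
import numpy as np

def sketch_channel_masks(input_data_shape, number_of_masks, p_keep, seed=0):
    """Hypothetical sketch; assumes input shape (time_steps, channels)."""
    n_steps, n_channels = input_data_shape
    rng = np.random.default_rng(seed)
    # One independent keep/mask decision per channel and per mask.
    keep_channel = rng.random((number_of_masks, 1, n_channels)) < p_keep
    # Repeat each decision along the time axis so a channel is masked as a whole.
    full_shape = (number_of_masks, n_steps, n_channels)
    return np.broadcast_to(keep_channel, full_shape).copy()

masks = sketch_channel_masks((10, 3), number_of_masks=5, p_keep=0.5)
print(masks.shape)  # (5, 10, 3)
```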
- dianna.utils.maskers.mask_data_tabular(data: numpy.array, masks: numpy.array, training_data: numpy.array, mask_type: object | str) numpy.array[source]
Mask tabular data using a given set of masks.
- Parameters:
data – Input data.
masks – an array with shape [number_of_masks] + data.shape
mask_type – Masking strategy. Can be ‘most_frequent’, ‘mean’ or a function f(data, masks, training_data).
training_data – Data used to sample from for imputation of masked values.
- Returns:
Single array containing all masked input where the first dimension represents the batch.
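As an illustration of the 'mean' strategy, the sketch below replaces masked values with per-column means taken from the training data. The helper name `sketch_mask_tabular_mean` and the convention that True means "keep the original value" are assumptions; the library's own behaviour may differ.

```python
import numpy as np

def sketch_mask_tabular_mean(data, masks, training_data):
    """Hypothetical sketch of 'mean' imputation; not DIANNA's code."""
    column_means = training_data.mean(axis=0)
    # masks has shape (number_of_masks, *data.shape).
    # Where a mask is True the original value is kept, elsewhere the mean is used.
    return np.where(masks, data, column_means)

training_data = np.array([[1.0, 10.0], [3.0, 30.0]])
data = np.array([5.0, 50.0])
masks = np.array([[True, False], [False, True]])

masked = sketch_mask_tabular_mean(data, masks, training_data)
print(masked)  # one masked copy of the data per mask
```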
- dianna.utils.maskers.mask_data(data: numpy.array, masks: numpy.array, mask_type: object | str)[source]
Mask data using a given set of masks.
- Parameters:
data – Input data.
masks – an array with shape [number_of_masks] + data.shape
mask_type – Masking strategy.
Returns: Single array containing all masked input where the first dimension represents the batch.
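The shape contract ([number_of_masks] + data.shape in, one masked copy per mask out) can be sketched with NumPy broadcasting. Filling masked entries with the mean of the data is only one possible strategy, chosen here as an assumption; `sketch_mask_data` is a hypothetical name.

```python
import numpy as np

def sketch_mask_data(data, masks, mask_value):
    """Hypothetical sketch: replace masked entries with one scalar value."""
    # Broadcasting pairs the single data sample with every mask in the batch.
    return np.where(masks, data, mask_value)

data = np.arange(6, dtype=float).reshape(3, 2)  # 3 time steps, 2 channels
masks = np.ones((2, 3, 2), dtype=bool)
masks[0, 0, :] = False  # first mask hides time step 0
masks[1, :, 1] = False  # second mask hides channel 1

masked = sketch_mask_data(data, masks, mask_value=data.mean())
print(masked.shape)  # (2, 3, 2)
```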
- dianna.utils.maskers._get_mask_value(data: numpy.array, mask_type: object) int[source]
Calculates a masking value of the given type for the data.
- dianna.utils.maskers._determine_number_masked(p_keep: float, series_length: int, element_name='feature') int[source]
Determine the number of time steps that need to be masked.
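One plausible way to turn p_keep into a count is to round the masked fraction and keep the result strictly between zero and the series length. The rounding and clamping rules below are assumptions for illustration, and `sketch_number_masked` is a hypothetical name.

```python
def sketch_number_masked(p_keep, series_length):
    """Hypothetical sketch; the rounding and clamping rules are assumptions."""
    requested = int(round(series_length * (1 - p_keep)))
    # Assumed safeguard: mask at least one element and leave at least one unmasked.
    return max(1, min(requested, series_length - 1))

print(sketch_number_masked(0.5, 10))   # 5
print(sketch_number_masked(0.99, 10))  # 1
print(sketch_number_masked(0.0, 10))   # 9
```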
- dianna.utils.maskers.generate_time_step_masks(input_data_shape: tuple[int], number_of_masks: int, p_keep: float, number_of_features: int)[source]
Generate masks that mask all channels simultaneously for clusters of time steps.
For a conceptual description see: https://medium.com/escience-center/masking-time-series-for-explainable-ai-90247ac252b4.
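A rough sketch of clustered time step masking: a coarse random series is interpolated to the full length, and the lowest values are masked, so that masked steps tend to be adjacent. This is a simplification built on assumptions (a (time_steps, channels) layout, True meaning "keep", linear interpolation), not the procedure from the blog post; `sketch_time_step_masks` is a hypothetical name.

```python
import numpy as np

def sketch_time_step_masks(input_data_shape, number_of_masks, p_keep,
                           number_of_features, seed=0):
    """Hypothetical sketch; assumes input shape (time_steps, channels)."""
    n_steps, n_channels = input_data_shape
    rng = np.random.default_rng(seed)
    masks = np.empty((number_of_masks, n_steps, n_channels), dtype=bool)
    # Position of every time step on the coarse feature axis.
    positions = np.linspace(0, number_of_features - 1, n_steps)
    for i in range(number_of_masks):
        coarse = rng.random(number_of_features)
        # Linear interpolation makes neighbouring steps take similar values.
        smooth = np.interp(positions, np.arange(number_of_features), coarse)
        # Keep the steps with the highest values, mask the rest.
        keep_step = smooth >= np.quantile(smooth, 1 - p_keep)
        # The same decision is applied to all channels of a time step.
        masks[i] = keep_step[:, None]
    return masks

masks = sketch_time_step_masks((20, 3), number_of_masks=4, p_keep=0.5,
                               number_of_features=5)
print(masks[0, :, 0].astype(int))
```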
- dianna.utils.maskers._mask_bottom_ratio(float_mask: numpy.ndarray, p_keep: float) numpy.ndarray[source]
Return a bool mask given a mask of floats and a ratio.
Return a mask containing bool values where the cells holding the top p_keep fraction of values in the float mask remain unmasked and the rest are masked.
- Parameters:
float_mask – a mask containing float values
p_keep – the ratio of keeping cells unmasked
- Returns:
a mask containing bool values
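The thresholding described here can be sketched with a quantile: the (1 - p_keep) quantile of the float mask separates the values to mask from the values to keep. Using `np.quantile` is an assumption about how such a threshold might be computed, and `sketch_mask_bottom_ratio` is a hypothetical name.

```python
import numpy as np

def sketch_mask_bottom_ratio(float_mask, p_keep):
    """Hypothetical sketch; True marks cells that remain unmasked."""
    # Values at or above this threshold form the top p_keep share.
    threshold = np.quantile(float_mask, 1 - p_keep)
    return float_mask >= threshold

float_mask = np.array([0.1, 0.9, 0.4, 0.7])
bool_mask = sketch_mask_bottom_ratio(float_mask, p_keep=0.5)
print(bool_mask)  # [False  True False  True]
```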
- dianna.utils.maskers.generate_interpolated_float_masks_for_image(image_shape: Iterable[int], p_keep: float, number_of_masks: int, number_of_features: int)[source]
Generates a set of random masks of float values to mask image data.
- Parameters:
image_shape (int) – Size of a single sample of input data, for images without the channel axis.
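p_keep – Presumably the probability that any value remains unmasked, as for the other functions in this module (left undescribed in the source).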
number_of_masks – Number of masks
number_of_features – Number of features (or blobs) in both dimensions
- Returns:
The generated masks (np.ndarray)
- dianna.utils.maskers.generate_interpolated_float_masks_for_timeseries(time_series_shape: Iterable[int], number_of_masks: int, number_of_features: int) numpy.ndarray[source]
Generates a set of random masks to mask time-series data.
- Parameters:
time_series_shape (int) – Size of a single sample of input time series.
number_of_masks – Number of masks
number_of_features – Number of features in the time dimension
- Returns:
The generated masks (np.ndarray)
- dianna.utils.maskers._project_grids_to_masks(grids: numpy.ndarray, masks_shape: tuple) numpy.ndarray[source]
Projects a set of (low-resolution) grids onto masks of a target resolution.
- Parameters:
grids – Set of grids with a pattern for each resulting mask
masks_shape – Resolution of the resulting masks
- Returns:
Set of masks with specified shape based on the grids
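To make the projection step concrete, the sketch below enlarges each low-resolution grid by block repetition and crops the result to the target shape. Nearest-neighbour repetition via `np.kron` is a deliberate simplification and an assumption; the library may interpolate and shift the grids instead. `sketch_project_grids` is a hypothetical name.

```python
import numpy as np

def sketch_project_grids(grids, masks_shape):
    """Hypothetical nearest-neighbour projection for 2D grids."""
    _, grid_h, grid_w = grids.shape
    # Ceiling division: each grid cell must be large enough
    # for the upscaled grid to cover the target shape.
    cell_h = -(-masks_shape[0] // grid_h)
    cell_w = -(-masks_shape[1] // grid_w)
    # Repeat every grid cell as a (cell_h, cell_w) block, per mask.
    upscaled = np.kron(grids, np.ones((1, cell_h, cell_w)))
    # Crop the surplus rows and columns.
    return upscaled[:, :masks_shape[0], :masks_shape[1]]

grids = np.array([[[0.0, 1.0], [1.0, 0.0]]])  # one 2x2 grid
masks = sketch_project_grids(grids, (5, 5))
print(masks.shape)  # (1, 5, 5)
```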