dianna.utils.maskers
====================

.. py:module:: dianna.utils.maskers


Functions
---------

.. autoapisummary::

   dianna.utils.maskers.generate_tabular_masks
   dianna.utils.maskers.generate_time_series_masks
   dianna.utils.maskers.generate_channel_masks
   dianna.utils.maskers.mask_data_tabular
   dianna.utils.maskers.mask_data
   dianna.utils.maskers._get_mask_value
   dianna.utils.maskers._determine_number_masked
   dianna.utils.maskers.generate_time_step_masks
   dianna.utils.maskers._mask_bottom_ratio
   dianna.utils.maskers.generate_interpolated_float_masks_for_image
   dianna.utils.maskers.generate_interpolated_float_masks_for_timeseries
   dianna.utils.maskers._project_grids_to_masks
   dianna.utils.maskers._upscale


Module Contents
---------------

.. py:function:: generate_tabular_masks(input_data_shape: tuple[int], number_of_masks: int, p_keep: float = 0.5)

   Generate masks for tabular data.

   :param input_data_shape: Shape of the tabular data to be masked.
   :param number_of_masks: Number of masks to generate.
   :param p_keep: Probability that any value remains unmasked.

   :returns: Single array containing all masks, where the first dimension represents the batch.


.. py:function:: generate_time_series_masks(input_data_shape: tuple[int], number_of_masks: int, feature_res: int = 8, p_keep: float = 0.5)

   Generate masks for time series data given a probability of keeping any time step or channel unmasked.

   Note that, for multivariate data, the resulting masks will be an evenly distributed sample of the
   following 3 kinds of masks:

   - Channel masks: These mask a whole channel at a time, masking all its time steps simultaneously.
   - Time step masks: These mask all channels simultaneously for selected time steps. Masked time steps
     are selected randomly, while grouping adjacent time steps. For a complete description of how we
     generate time step masks, see our blog post:
     https://medium.com/escience-center/masking-time-series-for-explainable-ai-90247ac252b4
   - Combination masks: These masks are a combination of the above 2 types.

   For univariate data, only time step masks are returned.

   :param input_data_shape: Shape of the time series data to be masked.
   :param number_of_masks: Number of masks to generate.
   :param p_keep: Probability that any value remains unmasked.
   :param feature_res: Resolution of features in masks.

   :returns: Single array containing all masks, where the first dimension represents the batch.


.. py:function:: generate_channel_masks(input_data_shape: tuple[int], number_of_masks: int, p_keep: float)

   Generate masks that mask one or multiple channels independently at a time.


.. py:function:: mask_data_tabular(data: numpy.array, masks: numpy.array, training_data: numpy.array, mask_type: Union[object, str]) -> numpy.array

   Mask tabular data using a set of masks.

   :param data: Input data.
   :param masks: An array with shape ``[number_of_masks] + data.shape``.
   :param mask_type: Masking strategy. Can be 'most_frequent', 'mean' or a function
                     f(data, masks, training_data).
   :param training_data: Data to sample from when imputing masked values.

   :returns: Single array containing all masked input, where the first dimension represents the batch.


.. py:function:: mask_data(data: numpy.array, masks: numpy.array, mask_type: Union[object, str])

   Mask data using a set of masks.

   :param data: Input data.
   :param masks: An array with shape ``[number_of_masks] + data.shape``.
   :param mask_type: Masking strategy.

   :returns: Single array containing all masked input, where the first dimension represents the batch.


.. py:function:: _get_mask_value(data: numpy.array, mask_type: object) -> int

   Calculates a masking value of the given type for the data.


.. py:function:: _determine_number_masked(p_keep: float, series_length: int, element_name='feature') -> int

   Determine the number of time steps that need to be masked.


.. py:function:: generate_time_step_masks(input_data_shape: tuple[int], number_of_masks: int, p_keep: float, number_of_features: int)

   Generate masks that mask all channels simultaneously for clusters of time steps.

   For a conceptual description see:
   https://medium.com/escience-center/masking-time-series-for-explainable-ai-90247ac252b4.


.. py:function:: _mask_bottom_ratio(float_mask: numpy.ndarray, p_keep: float) -> numpy.ndarray

   Return a bool mask given a mask of floats and a ratio.

   Return a mask of bool values in which the top p_keep fraction of values in the float mask remain
   unmasked and the rest are masked.

   :param float_mask: A mask containing float values.
   :param p_keep: The ratio of cells to keep unmasked.

   :returns: A mask containing bool values.


.. py:function:: generate_interpolated_float_masks_for_image(image_shape: Iterable[int], p_keep: float, number_of_masks: int, number_of_features: int)

   Generates a set of random masks of float values to mask image data.

   :param image_shape: Size of a single sample of input data, for images without the channel axis.
   :type image_shape: int
   :param p_keep: ?
   :param number_of_masks: Number of masks
   :param number_of_features: Number of features (or blobs) in both dimensions

   :returns: The generated masks (np.ndarray)


.. py:function:: generate_interpolated_float_masks_for_timeseries(time_series_shape: Iterable[int], number_of_masks: int, number_of_features: int) -> numpy.ndarray

   Generates a set of random masks to mask time-series data.

   :param time_series_shape: Size of a single sample of input time series.
   :type time_series_shape: int
   :param number_of_masks: Number of masks
   :param number_of_features: Number of features in the time dimension

   :returns: The generated masks (np.ndarray)


.. py:function:: _project_grids_to_masks(grids: numpy.ndarray, masks_shape: tuple) -> numpy.ndarray

   Projects a set of (low resolution) grids onto masks of a target resolution.

   :param grids: Set of grids with a pattern for each resulting mask
   :param masks_shape: Resolution of the resulting masks

   :returns: Set of masks with specified shape based on the grids


.. py:function:: _upscale(grid_i, up_size)

   Upsamples and crops the grid to produce an array of size up_size.
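
The upsample-and-crop idea described for ``_upscale`` can be illustrated with a minimal, standard-library sketch. This is not the dianna implementation: the function name ``upscale_and_crop`` is hypothetical, nearest-neighbour repetition is used here in place of whatever interpolation the library applies, and the random crop offset is an assumption made for illustration.

```python
import math
import random


def upscale_and_crop(grid, up_size, rng=random):
    """Nearest-neighbour upsample a 2D grid, then crop a window of shape up_size.

    Hypothetical helper for illustration only; not part of dianna.
    """
    target_rows, target_cols = up_size
    # Choose an integer factor so the upsampled grid is strictly larger than
    # the target in both dimensions, leaving room for a shifted crop.
    factor = max(
        math.ceil(target_rows / len(grid)),
        math.ceil(target_cols / len(grid[0])),
    ) + 1
    # Repeat every cell `factor` times along both axes.
    upsampled = [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]
    # Assumption: the crop window starts at a random top-left corner.
    top = rng.randrange(len(upsampled) - target_rows + 1)
    left = rng.randrange(len(upsampled[0]) - target_cols + 1)
    return [row[left:left + target_cols] for row in upsampled[top:top + target_rows]]


rng = random.Random(0)
# A 3x3 low-resolution grid of random floats.
grid = [[rng.random() for _ in range(3)] for _ in range(3)]
# Project it onto an 8x10 float mask.
mask = upscale_and_crop(grid, (8, 10), rng)
```

The result always has exactly the requested shape, and every cell is copied from the low-resolution grid. An interpolating resize would instead produce smooth transitions between neighbouring cells.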