_model_utils
============

.. py:module:: _model_utils


Classes
-------

.. autoapisummary::

   _model_utils.StatementClassifierEUlaw


Functions
---------

.. autoapisummary::

   _model_utils.load_data
   _model_utils.preprocess_function
   _model_utils.fill_segmentation
   _model_utils.load_model
   _model_utils.load_labels
   _model_utils.load_training_data
   _model_utils.load_sunshine
   _model_utils.load_penguins
   _model_utils.features_eulaw
   _model_utils.classify_texts_eulaw


Module Contents
---------------

.. py:function:: load_data(file)

   Open data from a file and return it as a pandas DataFrame.


.. py:function:: preprocess_function(image)

   For LIME: the input data was divided by 256 for the model (binary MNIST), whereas LIME needs RGB values.


.. py:function:: fill_segmentation(values, segmentation)

   For KernelSHAP: fill each pixel with SHAP values.


.. py:function:: load_model(file)

.. py:function:: load_labels(file)

.. py:function:: load_training_data(file)

.. py:function:: load_sunshine(file)

   Tabular sunshine example.

   Load the CSV file into a pandas DataFrame and split the data into a train and a test set.


.. py:function:: load_penguins(penguins)

   Prepare the data for the penguin model example, as in the notebook.


.. py:function:: features_eulaw(texts: list[str], model_tag='law-ai/InLegalBERT')

   Create features for a list of texts.


.. py:function:: classify_texts_eulaw(texts: list[str], model_path, return_proba: bool = False)

   Classify every text in a list of texts using the xgboost model stored in ``model_path``.

   The texts are first processed by a large language model, which extracts the features for every text.
   The xgboost model is then loaded and used to classify the texts, and its classifications are returned.

   For training the xgboost model, see ``train_legalbert_xgboost.py``.

   :param texts: A list of strings, each of which needs to be classified.
   :param model_path: The path to a stored xgboost model.
   :param return_proba: Return the probabilities of the model.
   :returns: List of classifications, one for every text in the list.


.. py:class:: StatementClassifierEUlaw(model_path)

   .. py:attribute:: tokenizer

   .. py:attribute:: model_path

   .. py:method:: __call__(sentences)
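The one-line description of ``fill_segmentation`` ("fill each pixel with SHAP values") can be illustrated with a minimal sketch. This is not the module's actual implementation. It assumes that ``segmentation`` is an integer array of segment labels ``0 .. len(values) - 1`` and that ``values`` holds one SHAP value per segment. The function name ``fill_segmentation_sketch`` and the sample data are hypothetical.

```python
import numpy as np


def fill_segmentation_sketch(values, segmentation):
    """Return an array shaped like `segmentation` in which every pixel
    holds the value assigned to its segment.

    Assumption (not taken from the module): segment labels are the
    integers 0 .. len(values) - 1.
    """
    values = np.asarray(values, dtype=float)
    segmentation = np.asarray(segmentation)
    out = np.zeros(segmentation.shape, dtype=float)
    for segment_id, value in enumerate(values):
        # Paint all pixels belonging to this segment with its value.
        out[segmentation == segment_id] = value
    return out


# Hypothetical 2x3 image split into three segments.
segmentation = np.array([[0, 0, 1],
                         [2, 2, 1]])
shap_values = [0.5, -0.25, 0.1]
heatmap = fill_segmentation_sketch(shap_values, segmentation)
```

The resulting ``heatmap`` has the same shape as the segmentation mask, so it can be overlaid on the original image to show which regions contributed to a prediction.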