_model_utils
Classes
Functions
|
Opens data from a file and returns it as a pandas DataFrame. |
|
For LIME: we divided the input data by 256 for the model (binary mnist) and LIME needs RGB values. |
|
For KernelSHAP: fill each pixel with SHAP values. |
|
Tabular sunshine example. |
|
Prep the data for the penguin model example as per notebook. |
|
Create features for a list of texts. |
|
Classifies every text in a list of texts using the xgboost model stored in model_path. |
Module Contents
- _model_utils.preprocess_function(image)[source]
For LIME: we divided the input data by 256 for the model (binary mnist) and LIME needs RGB values.
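A minimal sketch of what such a preprocessing step could look like, assuming LIME hands over an array of 0-255 pixel values and the model expects those values divided by 256. The name `preprocess_sketch` and the 28x28x3 input shape are hypothetical and only illustrate the scaling; they are not taken from the module source.

```python
import numpy as np


def preprocess_sketch(image: np.ndarray) -> np.ndarray:
    """Scale 0-255 pixel values to the range the model was trained on."""
    # Assumption: the model saw inputs divided by 256 during training,
    # so the same scaling is applied to the images LIME generates.
    return (np.asarray(image) / 256).astype(np.float32)


# Hypothetical input: a uniform grey RGB image of MNIST size.
rgb = np.full((28, 28, 3), 128, dtype=np.uint8)
scaled = preprocess_sketch(rgb)
```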
- _model_utils.fill_segmentation(values, segmentation)[source]
For KernelSHAP: fill each pixel with SHAP values.
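The idea can be sketched as follows, assuming `segmentation` is an integer array of superpixel labels running from 0 to n-1 and `values` holds one SHAP value per label, in label order. The name `fill_segmentation_sketch` and the toy inputs are illustrative, not the module's own code.

```python
import numpy as np


def fill_segmentation_sketch(values, segmentation):
    """Return an array shaped like `segmentation` holding each pixel's segment value."""
    out = np.zeros(segmentation.shape, dtype=float)
    # Assumption: values[i] belongs to the segment labelled i.
    for segment_id, value in enumerate(values):
        out[segmentation == segment_id] = value
    return out


# Toy example: three segments on a 2x3 image.
segmentation = np.array([[0, 0, 1],
                         [2, 2, 1]])
shap_values = [0.5, -1.0, 2.0]
heatmap = fill_segmentation_sketch(shap_values, segmentation)
```

The resulting `heatmap` has the same shape as the segmentation map, so it can be overlaid on the original image.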
- _model_utils.load_sunshine(file)[source]
Tabular sunshine example.
Load the csv file in a pandas dataframe and split the data in a train and test set.
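A rough sketch of loading a CSV file and splitting it into train and test sets with pandas alone. The function name `load_and_split_sketch`, the 80/20 split, the fixed seed, and the column names in the toy data are all assumptions for illustration; the actual function may split differently and may return other objects.

```python
import io

import pandas as pd


def load_and_split_sketch(file, test_fraction=0.2, seed=0):
    """Read a CSV into a DataFrame and return (train, test) DataFrames."""
    data = pd.read_csv(file)
    # Draw the test rows at random; everything else becomes the training set.
    test = data.sample(frac=test_fraction, random_state=seed)
    train = data.drop(test.index)
    return train, test


# Hypothetical in-memory CSV standing in for the sunshine data file.
csv_text = "temp,humidity,sunshine\n" + "\n".join(
    f"{i},{50 + i},{i % 5}" for i in range(10)
)
train, test = load_and_split_sketch(io.StringIO(csv_text))
```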
- _model_utils.load_penguins(penguins)[source]
Prep the data for the penguin model example as per notebook.
- _model_utils.features_eulaw(texts: list[str], model_tag='law-ai/InLegalBERT')[source]
Create features for a list of texts.
- _model_utils.classify_texts_eulaw(texts: list[str], model_path, return_proba: bool = False)[source]
Classifies every text in a list of texts using the xgboost model stored in model_path.
The texts are first processed by a large language model, which extracts features for every text. The stored xgboost model is then loaded and used to classify the texts based on those features, and its classifications are returned. For training the xgboost model, see train_legalbert_xgboost.py.
- Parameters:
texts – A list of strings, each of which is classified.
model_path – The path to a stored xgboost model.
return_proba – If True, return the probabilities of the model.
- Return type:
List of classifications, one for every text in the list