Modules¶

fife.base_modelers module¶

fife.lgb_modelers module¶

FIFE modelers based on LightGBM, which trains gradient-boosted trees.

class fife.lgb_modelers.GradientBoostedTreesModeler(**kwargs)¶

Bases: fife.lgb_modelers.LGBSurvivalModeler

Deprecated alias for LGBSurvivalModeler

class fife.lgb_modelers.LGBExitModeler(exit_col, **kwargs)¶

Bases: fife.lgb_modelers.LGBModeler, fife.base_modelers.ExitModeler

Use LightGBM to forecast the circumstance of exit conditional on exit.

class fife.lgb_modelers.LGBModeler(config: Union[None, dict] = {}, data: Union[None, pandas.core.frame.DataFrame] = None, duration_col: str = '_duration', event_col: str = '_event_observed', predict_col: str = '_predict_obs', test_col: str = '_test', validation_col: str = '_validation', period_col: str = '_period', max_lead_col: str = '_maximum_lead', spell_col: str = '_spell', weight_col: Union[None, str] = None, allow_gaps: bool = False)¶

Bases: fife.base_modelers.Modeler

Train a gradient-boosted tree model for each lead length using LightGBM.

config¶

User-provided configuration parameters.

Type: dict

data¶

User-provided panel data.

Type: pd.core.frame.DataFrame

categorical_features¶

Column names of categorical features.

Type: list

duration_col¶

Name of the column representing the number of future periods observed for the given individual.

Type: str

event_col¶

Name of the column indicating whether the individual is observed to exit the dataset.

Type: str

predict_col¶

Name of the column indicating whether the observation will be used for prediction after training.

Type: str

test_col¶

Name of the column indicating whether the observation will be used for testing model performance after training.

Type: str

validation_col¶

Name of the column indicating whether the observation will be used for evaluating model performance during training.

Type: str

period_col¶

Name of the column representing the number of periods since the earliest period in the data.

Type: str

max_lead_col¶

Name of the column representing the number of observable future periods.

Type: str

spell_col¶

Name of the column representing the number of previous spells of consecutive observations of the same individual.

Type: str

weight_col¶

Name of the column representing observation weights.

Type: str

reserved_cols¶

Column names of non-features.

Type: list

numeric_features¶

Column names of numeric features.

Type: list

n_intervals¶

The largest number of periods ahead to forecast.

Type: int

model¶

A trained LightGBM model (lgb.basic.Booster) for each lead length.

Type: list

objective¶

The LightGBM model objective appropriate for the outcome type, which is “binary” for binary classification.

Type: str

num_class¶

The num_class LightGBM parameter, which is 1 for binary classification.

Type: int

build_model(n_intervals: Union[None, int] = None, params: dict = None, parallelize: bool = True) → None¶: Train and store a sequence of gradient-boosted tree models.

compute_shap_values(subset: Union[None, pandas.core.series.Series] = None) → dict¶: Compute SHAP values by lead length, observation, and feature.

hyperoptimize(n_trials: int = 64, rolling_validation: bool = True, subset: Union[None, pandas.core.series.Series] = None) → dict¶

Search for hyperparameters with greater out-of-sample performance.

Parameters

n_trials – The number of hyperparameter sets to evaluate for each time horizon. Return None if non-positive.
rolling_validation – Whether or not to evaluate performance on the most recent possible periods instead of the validation set labeled by self.validation_col. Ignored for a given time horizon if there is only one possible period for training and evaluation.
subset – A Boolean Series that is True for observations on which to train and validate. If None, default to all observations not flagged by self.test_col or self.predict_col.

Returns

A dictionary containing the best-performing parameter dictionary for each time horizon.

predict(subset: Union[None, pandas.core.series.Series] = None, cumulative: bool = True) → numpy.ndarray¶

Use trained LightGBM models to predict the outcome for each observation and time horizon.

Parameters

subset – A Boolean Series that is True for observations for which predictions will be produced. If None, default to all observations.
cumulative – If True, produce cumulative survival probabilities. If False, produce marginal survival probabilities (i.e., one minus the hazard rate).

Returns

A numpy array of predictions by observation and lead length.
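
The relationship between the two output types can be illustrated with a minimal sketch. It assumes the marginal probabilities for one observation are given as a plain list ordered by lead length; the helper name to_cumulative is hypothetical and not part of FIFE.

```python
from itertools import accumulate
from operator import mul


def to_cumulative(marginal_probs):
    """Turn per-period survival probabilities into cumulative ones.

    Hypothetical helper for illustration; not part of the FIFE API.
    """
    # The probability of surviving through period k is the product of the
    # marginal (one minus hazard) probabilities for periods 1 through k.
    return list(accumulate(marginal_probs, mul))


# Marginal survival probabilities for lead lengths 1, 2, and 3.
marginal = [0.9, 0.8, 0.5]
cumulative = to_cumulative(marginal)
print(cumulative)
```

Cumulative probabilities never increase with lead length, whereas marginal probabilities can move in either direction.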

save_model(file_name: str = 'GBT_Model', path: str = '') → None¶: Save the horizon-specific LightGBM models that comprise the model to disk.

train(params: Union[None, dict] = None, subset: Union[None, pandas.core.series.Series] = None, validation_early_stopping: bool = True, parallelize: bool = True) → List[lightgbm.basic.Booster]¶: Train a LightGBM model for each lead length.

train_single_model(time_horizon: int, params: Union[None, dict] = None, subset: Union[None, pandas.core.series.Series] = None, validation_early_stopping: bool = True) → lightgbm.basic.Booster¶: Train a LightGBM model for a single lead length.

transform_features() → pandas.DataFrame¶: Transform features to suit model training.

class fife.lgb_modelers.LGBStateModeler(state_col, **kwargs)¶

Bases: fife.lgb_modelers.LGBModeler, fife.base_modelers.StateModeler

Use LightGBM to forecast the future value of a feature conditional on survival.

class fife.lgb_modelers.LGBSurvivalModeler(**kwargs)¶

Bases: fife.lgb_modelers.LGBModeler, fife.base_modelers.SurvivalModeler

Use LightGBM to forecast probabilities of being observed in future periods.

fife.nnet_survival module¶

FIFE uses the nnet_survival module of Gensheimer, M.F., and Narasimhan, B., “A scalable discrete-time survival model for neural networks,” PeerJ 7 (2019): e6257. The nnet_survival version packaged with FIFE is GitHub commit d5a8f26 on Nov 18, 2018 posted to https://github.com/MGensheimer/nnet-survival/blob/master/nnet_survival.py. nnet_survival is licensed under the MIT License. The FIFE development team modified lines 12 and 13 of nnet_survival for compatibility with TensorFlow 2.0.

class fife.nnet_survival.PropHazards(output_dim, **kwargs)¶

Bases: tensorflow.keras.layers.Layer

build(input_shape)¶

call(x)¶

compute_output_shape(input_shape)¶

get_config()¶

fife.nnet_survival.make_surv_array(t, f, breaks)¶

Transforms censored survival data into vector format that can be used in Keras.

Parameters

t – Array of failure/censoring times.
f – Censoring indicator: 1 if failed, 0 if censored.
breaks – Locations of breaks between time intervals for the discrete-time survival model (always includes 0).

Returns

A two-dimensional array of survival data with one row per individual and two columns per time interval.
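
The output layout can be sketched for a single individual in pure Python. The encoding rule below (survival indicators in the first half, a failure indicator in the second half, and midpoint credit for censored records) is an assumption based on Gensheimer and Narasimhan (2019), not a copy of the packaged code; the name make_surv_row is hypothetical.

```python
def make_surv_row(t, f, breaks):
    """Encode one survival record as a vector of length 2 * n_intervals.

    Hypothetical single-row helper for illustration only.
    """
    uppers = breaks[1:]
    n_intervals = len(uppers)
    survived = [0.0] * n_intervals
    failed = [0.0] * n_intervals
    if f:
        # Failure observed: credit every interval that was fully survived.
        for j, upper in enumerate(uppers):
            survived[j] = 1.0 if t >= upper else 0.0
        # Mark the first interval whose upper break exceeds the failure time.
        for j, upper in enumerate(uppers):
            if t < upper:
                failed[j] = 1.0
                break
    else:
        # Censored: credit an interval only if followed past its midpoint.
        for j in range(n_intervals):
            midpoint = 0.5 * (breaks[j] + breaks[j + 1])
            survived[j] = 1.0 if t >= midpoint else 0.0
    return survived + failed


breaks = [0, 1, 2, 3]
failed_row = make_surv_row(1.5, 1, breaks)    # failure at time 1.5
censored_row = make_surv_row(1.5, 0, breaks)  # censored at time 1.5
print(failed_row)
print(censored_row)
```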

fife.nnet_survival.nnet_pred_surv(y_pred, breaks, fu_time)¶

fife.nnet_survival.surv_likelihood(n_intervals)¶

Create custom Keras loss function for neural network survival model.

Arguments: n_intervals: the number of survival time intervals
Returns: Custom loss function that can be used with Keras
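
A minimal pure-Python sketch of the kind of loss this function returns, for one individual. It assumes y_true uses the two-halves layout produced by make_surv_array and y_pred holds the predicted conditional survival probability for each interval; the formula follows my reading of Gensheimer and Narasimhan (2019), and the function name and eps clipping value are hypothetical.

```python
import math


def surv_negative_log_likelihood(y_true, y_pred, eps=1e-7):
    """Negative log likelihood of one survival record (illustrative only)."""
    n_intervals = len(y_pred)
    survived = y_true[:n_intervals]
    failed = y_true[n_intervals:]
    loss = 0.0
    for s, d, p in zip(survived, failed, y_pred):
        # A survived interval contributes log(p); otherwise log(1) = 0.
        loss -= math.log(max(1.0 + s * (p - 1.0), eps))
        # The failure interval contributes log(1 - p); otherwise log(1) = 0.
        loss -= math.log(max(1.0 - d * p, eps))
    return loss


# Survived interval 1, failed in interval 2, never reached interval 3.
y_true = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
y_pred = [0.9, 0.8, 0.7]
loss = surv_negative_log_likelihood(y_true, y_pred)
print(loss)
```

Intervals after the failure or censoring time contribute nothing to the loss, so predictions for those intervals are not penalized.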

fife.nnet_survival.surv_likelihood_rnn(n_intervals)¶: Create custom Keras loss function for neural network survival model. Used for recurrent neural networks with time-distributed output. This function is very similar to surv_likelihood but deals with the extra dimension of y_true and y_pred that exists because of the time-distributed output.

fife.pd_modelers module¶

FIFE modeler based on Pandas, which tabulates interacted fixed effects.

class fife.pd_modelers.IFEExitModeler(exit_col, **kwargs)¶

Bases: fife.pd_modelers.IFEModeler, fife.base_modelers.ExitModeler

Forecast the circumstance of exit conditional on exit using the mean of observations with the same values.

class fife.pd_modelers.IFEModeler(config: Union[None, dict] = {}, data: Union[None, pandas.core.frame.DataFrame] = None, duration_col: str = '_duration', event_col: str = '_event_observed', predict_col: str = '_predict_obs', test_col: str = '_test', validation_col: str = '_validation', period_col: str = '_period', max_lead_col: str = '_maximum_lead', spell_col: str = '_spell', weight_col: Union[None, str] = None, allow_gaps: bool = False)¶

Bases: fife.base_modelers.Modeler

Predict with mean of training observations with same values.

config¶

User-provided configuration parameters.

Type: dict

data¶

User-provided panel data.

Type: pd.core.frame.DataFrame

categorical_features¶

Column names of categorical features.

Type: list

reserved_cols¶

Column names of non-features.

Type: list

numeric_features¶

Column names of numeric features.

Type: list

n_intervals¶

The largest number of one-period intervals any individual is observed to survive.

Type: int

model¶

Survival rates for each combination of categorical values in the training data.

Type: pd.core.frame.DataFrame

hyperoptimize(**kwargs) → dict¶: Returns None for InteractedFixedEffectsModeler, which does not have hyperparameters

predict(subset: Union[None, pandas.core.series.Series] = None, cumulative: bool = True) → numpy.ndarray¶

Map observations to outcome means from their categorical values.

Map observations with a combination of categorical values not seen in the training data to the mean of the outcome in the training set.

Parameters

subset – A Boolean Series that is True for observations for which predictions will be produced. If None, default to all observations.
cumulative – If True, will produce cumulative survival probabilities. If False, will produce marginal survival probabilities (i.e., one minus the hazard rate).

Returns

A numpy array of outcome means by observation and lead length.

save_model(file_name: str = 'IFE_Model', path: str = '') → None¶: Save the pandas DataFrame model to disk.

train() → pandas.core.frame.DataFrame¶: Compute the mean of the outcome for each combination of categorical values.

class fife.pd_modelers.IFEStateModeler(state_col, **kwargs)¶

Bases: fife.pd_modelers.IFEModeler, fife.base_modelers.StateModeler

Forecast the future value of a feature conditional on survival using the mean of observations with the same values.

class fife.pd_modelers.IFESurvivalModeler(**kwargs)¶

Bases: fife.pd_modelers.IFEModeler, fife.base_modelers.SurvivalModeler

Predict with survival rate of training observations with same values.

class fife.pd_modelers.InteractedFixedEffectsModeler(**kwargs)¶

Bases: fife.pd_modelers.IFESurvivalModeler

Deprecated alias for IFESurvivalModeler

fife.processors module¶

Data processing functions and classes for FIFE.

class fife.processors.DataProcessor(config: Union[None, dict] = {}, data: Union[None, pandas.core.frame.DataFrame] = None)¶

Bases: object

Prepare data by identifying features as degenerate or categorical.

is_categorical(col: str) → bool¶: Determine if the given feature should be processed as categorical, as opposed to numeric.

is_degenerate(col: str) → bool¶: Determine if a feature is constant or has too many missing values.

class fife.processors.PanelDataProcessor(config: Union[None, dict] = {}, data: Union[None, pandas.core.frame.DataFrame] = None)¶

Bases: fife.processors.DataProcessor

Ready panel data for modelling.

config¶

User-provided configuration parameters.

Type: dict

data¶

Processed panel data.

Type: pd.core.frame.DataFrame

raw_subset¶

An unprocessed sample from the final period of data. Useful for displaying meaningful values in SHAP plots.

Type: pd.core.frame.DataFrame

categorical_maps¶

Contains for each categorical feature a map from each unique value to a whole number.

Type: dict

numeric_ranges¶

Contains for each numeric feature the maximum and minimum value in the training set.

Type: pd.core.frame.DataFrame

build_processed_data(parallelize: bool = True) → None¶

Clean, augment, and store a panel dataset and related information.

Sort data by individual and time.
Drop degenerate features.
Label subsets for prediction, validation, and testing.
Compute survival duration and whether departure is observed.
Store a subset of the raw input data from the final period.
Map categorical features to unsigned integers.
Scale numeric features.
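
The outcome-labeling step can be sketched as follows. This is an illustration under stated assumptions, not FIFE's implementation: records are (individual, period) tuples with no gaps, duration is the number of future periods observed for the individual, max_lead is the number of observable future periods, and an exit is treated as observed when duration is less than max_lead. The function name is hypothetical.

```python
def label_survival_outcomes(records):
    """Attach duration, maximum lead, and exit flag to each record.

    Hypothetical helper; assumes each individual has no gaps in observation.
    """
    final_period = max(period for _, period in records)
    last_seen = {}
    for individual, period in records:
        last_seen[individual] = max(period, last_seen.get(individual, period))
    labeled = []
    # Sort by individual, then by period, before labeling.
    for individual, period in sorted(records):
        duration = last_seen[individual] - period
        max_lead = final_period - period
        labeled.append(
            {
                "individual": individual,
                "period": period,
                "duration": duration,
                "max_lead": max_lead,
                "event_observed": duration < max_lead,
            }
        )
    return labeled


records = [("b", 0), ("a", 0), ("a", 1), ("a", 2), ("b", 1)]
labeled = label_survival_outcomes(records)
for row in labeled:
    print(row)
```

Individual "a" is still present in the final period, so no exit is observed; individual "b" disappears before the final period, so the exit is observed.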

build_reserved_cols()¶: Add data split and outcome-related columns to the data.

check_panel_consistency() → None¶: Ensure observations have unique individual-period combinations.

flag_validation_individuals() → pandas.core.series.Series¶: Flag observations from a random share of individuals.

process_all_columns(parallelize: bool = True) → None¶: Split, process, and merge all data columns.

process_single_column(colname: str) → Union[None, pandas.core.series.Series]¶: Apply data cleaning functions to an individual data column.

sort_panel_data() → pandas.core.frame.DataFrame¶: Sort the data by individual, then by period.

fife.processors.check_column_consistency(data: pandas.core.frame.DataFrame, colname: str) → None¶: Assert column exists, has no missing values, and is not constant.

fife.processors.deduplicate_column_values(data: pandas.core.frame.DataFrame, reserved_cols: List[str] = [], max_obs: int = 65536) → pandas.core.frame.DataFrame¶

Delete columns with the same values as a later column.

Parameters

data – A DataFrame.
reserved_cols – Names of columns to exclude from deduplication.
max_obs – The number of observations to sample if data has more than that many observations.

Returns

A DataFrame containing only the last instance of each unique column.
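
A minimal sketch of the deduplication rule, using a plain dict of column lists in place of a DataFrame. It assumes reserved columns are neither dropped nor used as the later match, and it omits the max_obs sampling; the function name is hypothetical.

```python
def drop_duplicate_columns(columns, reserved_cols=()):
    """Keep only the last instance of each distinct column.

    Hypothetical helper; `columns` maps column names to lists of values,
    in column order.
    """
    names = list(columns)
    kept = {}
    for i, name in enumerate(names):
        # A column is dropped if any later non-reserved column is identical.
        is_duplicate = name not in reserved_cols and any(
            other not in reserved_cols and columns[other] == columns[name]
            for other in names[i + 1:]
        )
        if not is_duplicate:
            kept[name] = columns[name]
    return kept


columns = {
    "id": [1, 2, 3],
    "x": [1, 2, 3],
    "y": [4, 5, 6],
    "z": [1, 2, 3],
}
deduplicated = drop_duplicate_columns(columns, reserved_cols=("id",))
print(list(deduplicated))
```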

fife.processors.factorize_categorical_feature(col: pandas.core.series.Series, excluded_obs: Union[None, pandas.core.series.Series] = None) → Tuple[pandas.core.series.Series, dict]¶

Map categorical values to unsigned integers.

Parameters

col – A Series.
excluded_obs – True for observations to exclude from creating the map.

Returns

A pandas Series of unsigned integers.
A dictionary mapping each unique value among the included observations and np.nan to an integer and any other value not in the Series to 0.

fife.processors.normalize_numeric_feature(col: pandas.core.series.Series, excluded_obs: Union[None, pandas.core.series.Series] = None) → Tuple[pandas.core.series.Series, List[float]]¶

Scale numeric values to their empirical range.

Parameters

col – A numeric Series.
excluded_obs – True for observations to exclude when computing min/max.

Returns

A Series of floats.
A list containing the minimum and maximum values among the included observations.

fife.processors.process_categorical_feature(col: pandas.core.series.Series, cat_map: dict) → pandas.core.frame.DataFrame¶

Map categorical values to unsigned integers.

Parameters

col – A pandas Series.
cat_map – A dict containing a key for each unique value in col.

Returns

A pandas Series of unsigned integers.

fife.processors.process_numeric_feature(col: pandas.core.series.Series, minimum: float, maximum: float) → pandas.core.series.Series¶

Scale numeric values to a given range.

Parameters

col – A numeric Series.
minimum – The value to map to -0.5.
maximum – The value to map to 0.5.

Returns

A Series of floats.
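
The scaling described above amounts to a linear map that sends minimum to -0.5 and maximum to 0.5. A minimal sketch on a plain list, with a hypothetical function name and an assumed error for a zero-width range:

```python
def scale_to_half_range(values, minimum, maximum):
    """Linearly map `minimum` to -0.5 and `maximum` to 0.5.

    Hypothetical helper for illustration; not part of the FIFE API.
    """
    span = maximum - minimum
    if span == 0:
        # Assumption: a constant feature cannot be scaled this way.
        raise ValueError("minimum and maximum must differ")
    return [(value - minimum) / span - 0.5 for value in values]


scaled = scale_to_half_range([10, 15, 20, 25], minimum=10, maximum=20)
print(scaled)
```

Values outside the given range, such as 25 here, fall outside the interval from -0.5 to 0.5 rather than being clipped in this sketch.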

fife.processors.produce_categorical_map(col: pandas.core.series.Series) → dict¶

Return a map from categorical values to unsigned integers.

Zero is reserved for values not seen in data used to create map.

Parameters: col – A Series.
Returns: A dictionary mapping each unique value in the Series and np.nan to an integer and any other value not in the Series to zero.
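
A minimal sketch of such a map in pure Python. It uses None as a stand-in for np.nan, and the specific code order (sorted values first, then the missing marker) is an assumption made for illustration; only the reservation of zero for unseen values comes from the description above. Both function names are hypothetical.

```python
MISSING = None  # stand-in for np.nan in this pure-Python sketch


def build_category_map(values):
    """Map each observed value and the missing marker to a positive integer."""
    observed = sorted({value for value in values if value is not MISSING})
    mapping = {value: code for code, value in enumerate(observed, start=1)}
    # The missing marker always gets its own code, even if never observed.
    mapping[MISSING] = len(observed) + 1
    return mapping


def encode(values, mapping):
    """Apply the map, sending values absent from the map to zero."""
    return [mapping.get(value, 0) for value in values]


category_map = build_category_map(["b", "a", "b", None])
encoded = encode(["a", "c", None, "b"], category_map)
print(category_map)
print(encoded)
```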

fife.tf_modelers module¶

FIFE modelers based on TensorFlow, which trains neural networks.

class fife.tf_modelers.CumulativeProduct(*args: Any, **kwargs: Any)¶

Bases: tensorflow.keras.layers.Layer

Transform an array into its row-wise cumulative products.

call(inputs, **kwargs)¶: Multiply each value with all previous values in the same row.

fife.tf_modelers.FeedForwardNeuralNetworkModeler¶: alias of fife.tf_modelers.FeedforwardNeuralNetworkModeler

class fife.tf_modelers.FeedforwardNeuralNetworkModeler(**kwargs)¶

Bases: fife.tf_modelers.TFSurvivalModeler

Deprecated alias for TFSurvivalModeler

class fife.tf_modelers.ProportionalHazardsEncodingModeler(**kwargs)¶

Bases: fife.tf_modelers.TFSurvivalModeler

Train a proportional hazards model with binary-encoded categorical features using Keras.

build_model(n_intervals: Union[None, int] = None) → None¶: Train and store a neural network with a proportional hazards restriction.

construct_network() → tensorflow.keras.Model¶

Set all features to feed directly into a single node.

The single node feeds into a proportional hazards layer.

Returns: An untrained Keras model.

format_input_data(data: Union[None, pandas.core.frame.DataFrame] = None, subset: Union[None, pandas.core.series.Series] = None) → List[Union[pandas.core.series.Series, pandas.core.frame.DataFrame]]¶: Keep only the features and observations desired for model input.

hyperoptimize(**kwargs) → dict¶: Returns None for ProportionalHazardsEncodingModeler, which does not have hyperparameters

save_model(file_name: str = 'PH_Encoded_Model', path: str = '') → None¶: Save the TensorFlow model to disk.

class fife.tf_modelers.ProportionalHazardsModeler(**kwargs)¶

Bases: fife.tf_modelers.TFSurvivalModeler

Train a proportional hazards model with embeddings for categorical features using Keras.

construct_embedding_network() → tensorflow.keras.Model¶

Set embedding layers that feed into a single node.

Each categorical feature passes through its own embedding layer, which maps whole numbers to the real line.

The embedded values and numeric features feed directly into a single node. The single node feeds into a proportional hazards layer.

Returns: An untrained Keras model.

hyperoptimize(**kwargs) → dict¶: Returns None for ProportionalHazardsModeler, which does not have hyperparameters

save_model(file_name: str = 'PH_Model', path: str = '') → None¶: Save the TensorFlow model to disk.

class fife.tf_modelers.TFModeler(config: Union[None, dict] = {}, data: Union[None, pandas.core.frame.DataFrame] = None, duration_col: str = '_duration', event_col: str = '_event_observed', predict_col: str = '_predict_obs', test_col: str = '_test', validation_col: str = '_validation', period_col: str = '_period', max_lead_col: str = '_maximum_lead', spell_col: str = '_spell', weight_col: Union[None, str] = None, allow_gaps: bool = False)¶

Bases: fife.base_modelers.Modeler

Train a neural network model using Keras with TensorFlow backend.

config¶

User-provided configuration parameters.

Type: dict

data¶

User-provided panel data.

Type: pd.core.frame.DataFrame

categorical_features¶

Column names of categorical features.

Type: list

reserved_cols¶

Column names of non-features.

Type: list

numeric_features¶

Column names of numeric features.

Type: list

n_intervals¶

The largest number of one-period intervals any individual is observed to survive.

Type: int

model¶

A trained neural network.

Type: keras.Model

build_model(n_intervals: Union[None, int] = None, params: dict = None) → None¶: Train and store a neural network, freezing embeddings midway.

compute_model_uncertainty(subset: Union[None, pandas.core.series.Series] = None, n_iterations: int = 200) → numpy.ndarray¶

Predict with dropout as proposed by Gal and Ghahramani (2015).

See https://arxiv.org/abs/1506.02142.

Parameters

subset – A Boolean Series that is True for observations for which predictions will be produced. If None, default to all observations.
n_iterations – Number of random dropout specifications to obtain predictions from.

Returns

A numpy array of predictions by observation, lead length, and iteration.
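
The idea of predicting with dropout can be sketched on a toy model without TensorFlow. The sketch assumes a single sigmoid output (one lead length), dropout applied to the inputs with inverted scaling, and fixed weights; none of the names below come from FIFE.

```python
import math
import random


def predict_with_dropout(features, weights, dropout_share, rng):
    """One stochastic forward pass of a toy one-layer sigmoid model."""
    keep_prob = 1.0 - dropout_share
    total = 0.0
    for x, w in zip(features, weights):
        # Keep each input with probability keep_prob and rescale it so the
        # expected pre-activation matches the no-dropout pass.
        if rng.random() < keep_prob:
            total += w * x / keep_prob
    return 1.0 / (1.0 + math.exp(-total))


def monte_carlo_dropout(observations, weights, dropout_share=0.25,
                        n_iterations=200, seed=9999):
    """Return predictions indexed by observation, then by iteration."""
    rng = random.Random(seed)
    return [
        [predict_with_dropout(obs, weights, dropout_share, rng)
         for _ in range(n_iterations)]
        for obs in observations
    ]


observations = [[1.0, 2.0, -1.0], [0.5, -0.5, 3.0]]
weights = [0.8, -0.4, 0.3]
draws = monte_carlo_dropout(observations, weights, n_iterations=50)
# The spread of each observation's draws indicates model uncertainty.
print(min(draws[0]), max(draws[0]))
```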

compute_shap_values(subset: Union[None, pandas.core.series.Series] = None) → dict¶

Compute SHAP values by lead length, observation, and feature.

SHAP values for networks with embedding layers are not supported as of 9 Jun 2020.

Compute SHAP values for restricted mean survival time in addition to each lead length.

Parameters: subset – A Boolean Series that is True for observations for which SHAP values will be computed. If None, default to all observations.
Returns: A dictionary of numpy arrays, each of which contains SHAP values for the outcome given by its key.

construct_embedding_network(dense_layers: int = 2, nodes_per_dense_layer: int = 512, dropout_share: float = 0.25, embed_exponent: float = 0, embed_L2_reg: float = 2.0) → tensorflow.keras.Model¶

Set embedding layers followed by alternating dropout/dense layers.

Each categorical feature passes through its own embedding layer, which maps whole numbers to the real line.

Each dense layer has a sigmoid activation function. The output layer has one node for each lead length.

Parameters

dense_layers – The number of dense layers in the neural network.
nodes_per_dense_layer – The number of nodes per dense layer in the neural network.
dropout_share – The probability of a densely connected node of the neural network being set to zero weight during training.
embed_exponent – The ratio of the natural logarithm of the number of embedded values to the natural logarithm of the number of unique categories for each categorical feature.
embed_L2_reg – The L2 regularization coefficient for each embedding layer.

Returns

An untrained Keras model.
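
The definition of embed_exponent implies that the number of embedded values equals the number of unique categories raised to that exponent. A small sketch of the arithmetic follows; the function name is hypothetical, and any rounding to a whole number of embedded values is left out because it is not specified here.

```python
def embedded_value_count(n_unique_categories, embed_exponent):
    """Solve ln(embedded) / ln(unique) = embed_exponent for embedded."""
    return n_unique_categories ** embed_exponent


for exponent in (0, 0.25, 0.5):
    print(exponent, embedded_value_count(256, exponent))
```

With the default exponent of 0, each categorical feature maps to a single embedded value regardless of how many categories it has.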

format_input_data(data: Union[None, pandas.core.frame.DataFrame] = None, subset: Union[None, pandas.core.series.Series] = None) → List[Union[pandas.core.series.Series, pandas.core.frame.DataFrame]]¶: List each categorical feature for input to own embedding layer.

hyperoptimize(n_trials: int = 64, subset: Union[None, pandas.core.series.Series] = None, max_epochs: int = 128) → dict¶

Search for hyperparameters with greater out-of-sample performance.

Parameters

n_trials – The number of hyperparameter sets to evaluate for each time horizon. Return None if non-positive.
subset – A Boolean Series that is True for observations on which to train and validate. If None, default to all observations not flagged by self.test_col or self.predict_col.

Returns

A dictionary containing the best-performing parameters.

predict(subset: Union[None, pandas.core.series.Series] = None, custom_data: Union[None, pandas.core.frame.DataFrame] = None, cumulative: bool = True) → numpy.ndarray¶

Use trained Keras model to predict observation survival rates.

Parameters

subset – A Boolean Series that is True for observations for which predictions will be produced. If None, default to all observations.
custom_data – A DataFrame in the same format as the input data for which predictions will be produced. If None, default to the assigned input data.
cumulative – If True, produce cumulative survival probabilities. If False, produce marginal survival probabilities (i.e., one minus the hazard rate).

Returns

A numpy array of survival probabilities by observation and lead length.

save_model(file_name: str = 'FFNN_Model', path: str = '') → None¶: Save the TensorFlow model to disk.

train(params: Union[None, dict] = None, subset: Union[None, pandas.core.series.Series] = None, validation_early_stopping: bool = True) → tensorflow.keras.Model¶

Train with the survival loss function of Gensheimer and Narasimhan (2019).

See https://peerj.com/articles/6257/.

Use the AMSGrad variant of the Adam optimizer. See Reddi, Kale, and Kumar (2018) at https://openreview.net/forum?id=ryQu7f-RZ.

Train until validation set performance has not improved for the given number of epochs or until the given maximum number of epochs is reached.

Returns: A trained Keras model.

transform_features() → pandas.DataFrame¶: Transform features to suit model training.

class fife.tf_modelers.TFSurvivalModeler(**kwargs)¶

Bases: fife.tf_modelers.TFModeler, fife.base_modelers.SurvivalModeler

Use TensorFlow to forecast probabilities of being observed in future periods.

fife.tf_modelers.binary_encode_feature(col: pandas.core.series.Series) → pandas.core.frame.DataFrame¶

Map whole numbers to bits.

Parameters: col – a pandas Series of whole numbers.
Returns: A pandas DataFrame of Boolean values in which each distinct value in the given Series has its own distinct combination of column values.
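
A minimal sketch of binary encoding on a plain list of whole numbers. The bit order (least significant bit first) and the number of columns are assumptions for illustration; the function name is hypothetical.

```python
def binary_encode(values):
    """Encode each whole number as a row of Boolean bits.

    Hypothetical helper; uses as many bits as the largest value needs.
    """
    n_bits = max(max(values).bit_length(), 1)
    return [
        [bool((value >> bit) & 1) for bit in range(n_bits)]
        for value in values
    ]


encoded = binary_encode([0, 1, 2, 3, 5])
for row in encoded:
    print(row)
```

Binary encoding needs only about log2(k) columns for k categories, compared with k columns for one-hot encoding.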

fife.tf_modelers.freeze_embedding_layers(model: tensorflow.keras.Model) → tensorflow.keras.Model¶: Prevent embedding layers of the given neural network from training.

fife.tf_modelers.make_predictions_cumulative(model: tensorflow.keras.Model) → tensorflow.keras.Model¶: Append a cumulative product layer to a neural network.

fife.tf_modelers.make_predictions_marginal(model: tensorflow.keras.Model) → tensorflow.keras.Model¶: Remove final layer of a neural network if a cumulative product layer.

fife.tf_modelers.split_categorical_features(data: pandas.core.frame.DataFrame, categorical_features: List[str], numeric_features: List[str]) → List[Union[pandas.core.series.Series, pandas.core.frame.DataFrame]]¶

Split each categorical column in a DataFrame into its own list item.

Necessary to specify inputs to a neural network with embeddings.

Parameters

data – A DataFrame.
categorical_features – A list of column names to split out.
numeric_features – A list of column names to keep in a single DataFrame.

Returns

A list where each element but the last is a Series and the last element is a DataFrame of the given numeric features.
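
The shape of the returned list can be sketched with a dict of column lists standing in for a DataFrame. The function name and the sample column names are hypothetical.

```python
def split_features(data, categorical_features, numeric_features):
    """Return one item per categorical column, then all numeric columns.

    Hypothetical helper; `data` maps column names to lists of values.
    """
    # Each categorical column becomes its own input, one per embedding layer.
    inputs = [data[name] for name in categorical_features]
    # All numeric columns stay together as the final input.
    inputs.append({name: data[name] for name in numeric_features})
    return inputs


data = {
    "region": [1, 2, 1],
    "grade": [3, 3, 4],
    "age": [0.1, -0.2, 0.4],
    "tenure": [-0.5, 0.0, 0.5],
}
inputs = split_features(data, ["region", "grade"], ["age", "tenure"])
print(len(inputs))
```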

fife.utils module¶

I/O, logging, calculation, and plotting functions for FIFE.

class fife.utils.FIFEArgParser¶

Bases: argparse.ArgumentParser

Argument parser for the FIFE command-line interface.

fife.utils.compute_aggregation_uncertainty(individual_probabilities: pandas.core.frame.DataFrame, percent_confidence: float = 0.95) → pandas.core.frame.DataFrame¶

Statistically bound number of events given each of their probabilities.

Parameters

individual_probabilities – A DataFrame of probabilities where each row represents an individual and each column represents an event.
percent_confidence – The percent confidence of the two-sided intervals defined by the computed bounds.

Raises

ValueError – If percent_confidence is outside of the interval (0, 1).

Returns

A DataFrame containing, for each column in individual_probabilities, the expected number of events and interval bounds on the number of events.
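
To illustrate what such bounds look like, the sketch below computes the expected number of events for one column of probabilities and a two-sided interval from a normal approximation to the sum of independent Bernoulli outcomes. The normal approximation is a stand-in chosen for brevity and is not necessarily the method FIFE uses; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def bound_event_count(probabilities, confidence=0.95):
    """Return (expected, lower, upper) for the number of events.

    Hypothetical helper; assumes events are independent across individuals.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be in the interval (0, 1)")
    expected = sum(probabilities)
    # Variance of a sum of independent Bernoulli outcomes.
    variance = sum(p * (1 - p) for p in probabilities)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(variance)
    # The count can never be negative or exceed the number of individuals.
    lower = max(expected - margin, 0.0)
    upper = min(expected + margin, float(len(probabilities)))
    return expected, lower, upper


expected, lower, upper = bound_event_count([0.5] * 100)
print(expected, lower, upper)
```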

fife.utils.create_example_data(n_persons: int = 8192, n_periods: int = 12) → pandas.core.frame.DataFrame¶: Fabricate an unbalanced panel dataset suitable as FIFE input.

fife.utils.ensure_folder_existence(path: str = '') → None¶: Create a directory if it doesn’t already exist.

fife.utils.import_data_file(file_path: str = 'Input_Data') → pandas.core.frame.DataFrame¶: Return the data stored in given file in given folder.

fife.utils.make_results_reproducible(seed: int = 9999) → None¶: Ensure executing from a fresh state produces identical results.

fife.utils.plot_binary_prediction_errors(errors: dict, width: float = 8, height: float = 1, alpha: float = 0.00390625, color: str = 'black', center_tick_color: str = 'green', center_tick_height: float = 0.125, path: str = '') → None¶

Make a rug plot of binary prediction errors.

Parameters

errors – A dictionary of numpy arrays, each of which contains error values for the outcome given by its key.
width – Width of the rug plot.
height – Height of the rug plot.
alpha – The opacity of plotted ticks, from 2^-8 (nearly transparent) to 1 (opaque).
color – The color of plotted ticks.
center_tick_color – The color of the ticks marking the center of the plot.
center_tick_height – The height of the ticks marking the center of the plot.
path – The path preceding the Output folder in which the plots will be saved.

fife.utils.plot_shap_values(shap_values: dict, raw_data: pandas.core.frame.DataFrame, processed_data: Union[None, pandas.core.frame.DataFrame] = None, no_summary_col: str = typing.Union[NoneType, str], alpha: float = 0.5, path: str = '') → None¶

Make plots of SHAP values.

SHAP values quantify feature contributions to predictions.

Parameters

shap_values – A dictionary of numpy arrays, each of which contains SHAP values for the outcome given by its key.
raw_data – Feature values prior to processing into model input.
processed_data – Feature values used as model input.
no_summary_col – The name of a column to never use for summary plots.
alpha – The opacity of plotted points, from 2^-8 (nearly transparent) to 1 (opaque).
path – The path preceding the Output folder in which the plots will be saved.

fife.utils.print_config(config: dict) → None¶: Neatly print given dictionary of config parameters.

fife.utils.print_copyright() → None¶

fife.utils.redirect_output_to_log(path: str = '') → None¶: Send future output to a text file instead of console.

fife.utils.save_intermediate_data(data: pandas.core.frame.DataFrame, file_name: str, file_format: str = 'pickle', path: str = '') → None¶: Save given DataFrame in Intermediate folder in given format.

fife.utils.save_maps(obj: Union[pandas.core.series.Series, dict], file_name: str, path: str = '') → None¶: Save a map from values to other values to Intermediate folder.

fife.utils.save_output_table(data: pandas.core.frame.DataFrame, file_name: str, index: bool = True, path: str = '') → None¶: Save given DataFrame in the Output folder as a csv file.

fife.utils.save_plot(file_name: str, path: str = '') → None¶: Save the most recently plotted plot in high resolution.