store_predictions_df
improvelib.utils.store_predictions_df(y_pred, y_col_name, stage, output_dir, input_dir = None, y_true = None, round_decimals = 4)
Save predictions together with their accompanying dataframe.
If the corresponding dataframe is available (the output of save_stage_ydf in preprocess), the original data that were evaluated (e.g. drug and cell pairs) can be traced: the whole dataframe structure is stored along with the model predictions.
If the dataframe is not available, only ground truth and model predictions are stored.
Used in train and infer.
Parameters:
- y_pred : np.array
Model predictions.
- y_col_name : str
Name of the column in the y_data predicted on (e.g. ‘auc’, ‘ic50’).
- stage : str
Whether the evaluation is with respect to the val or test set (‘val’ or ‘test’).
- output_dir : str
The output directory where the results should be saved. Should be params['output_dir'].
- y_true : np.array, optional
Ground truth, if available.
- input_dir : str, optional
Directory where the dataframe with ground truth and metadata is stored.
- round_decimals : int, optional
Number of decimals in output (default is 4).
Returns:
None
Example
To store validation predictions in train:
frm.store_predictions_df(
    y_true=val_true,
    y_pred=val_pred,
    stage="val",
    y_col_name=params["y_col_name"],
    output_dir=params["output_dir"],
    input_dir=params["input_dir"]
)
To store inference predictions in infer, when ground truth is available:
frm.store_predictions_df(
    y_true=test_true,
    y_pred=test_pred,
    stage="test",
    y_col_name=params["y_col_name"],
    output_dir=params["output_dir"],
    input_dir=params["input_data_dir"]
)
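The two storage modes described above (with and without the accompanying dataframe) can be illustrated with a small stand-alone sketch. This is not the improvelib implementation: the function name sketch_store_predictions, the file names (`{stage}_y_data.csv`, `{stage}_y_data_predicted.csv`), and the column suffixes (`_true`, `_pred`) are assumptions made for illustration only, and unlike the documented function, which returns None, the sketch returns the output path so that the result is easy to inspect.

```python
from pathlib import Path
from typing import Optional

import numpy as np
import pandas as pd


def sketch_store_predictions(
    y_pred: np.ndarray,
    y_col_name: str,
    stage: str,
    output_dir: str,
    input_dir: Optional[str] = None,
    y_true: Optional[np.ndarray] = None,
    round_decimals: int = 4,
) -> Path:
    """Illustrative stand-in only, NOT the improvelib implementation."""
    # Hypothetical naming scheme, assumed for this sketch.
    pred_col = f"{y_col_name}_pred"
    true_col = f"{y_col_name}_true"
    ydf_path = Path(input_dir) / f"{stage}_y_data.csv" if input_dir else None

    pred = np.round(np.asarray(y_pred, dtype=float).ravel(), round_decimals)

    if ydf_path is not None and ydf_path.exists():
        # Dataframe available: keep its whole structure and append predictions.
        df = pd.read_csv(ydf_path)
        df[pred_col] = pred
    else:
        # No dataframe: store only ground truth (if given) and predictions.
        data = {}
        if y_true is not None:
            true = np.asarray(y_true, dtype=float).ravel()
            data[true_col] = np.round(true, round_decimals)
        data[pred_col] = pred
        df = pd.DataFrame(data)

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{stage}_y_data_predicted.csv"
    df.to_csv(out_path, index=False)
    return out_path
```

In this sketch, passing input_dir with a matching dataframe preserves metadata columns such as drug and cell identifiers next to the predictions, while omitting it yields a file with only the ground truth and prediction columns.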