transform_data
improvelib.applications.drug_response_prediction.drp_utils.transform_data(df, transform_file_name, preprocess_dir)
Transforms (imputes, scales, and/or subsets) features based on the transformations determined on the training set with determine_transform. Reads the saved dictionary containing the details needed to perform the specified transformations on all sets, and applies those transformations to the given data.
Used in preprocess.
Parameters:
- df : pd.DataFrame
The input feature DataFrame. Column names must be feature IDs (e.g. gene names) and the index must be IDs (e.g. cell line names).
- transform_file_name : str
Name of the file used in determine_transform().
- preprocess_dir : str
The directory where the transformation dictionary was saved. Should be set to params['output_dir'].
Returns:
- df : pd.DataFrame
The transformed DataFrame.
Example
After determine_transform has computed the transformation values from the training-set features, apply them to each feature set as follows (drp is assumed here to be the drp_utils module imported under that alias):
omics_stage = drp.transform_data(omics_stage, 'omics_transform', params['output_dir'])
drugs_stage = drp.transform_data(drugs_stage, 'drugs_transform', params['output_dir'])
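The pattern behind this pair of functions (learn the values on the training set, save them, then apply them unchanged to every set) can be illustrated with a minimal, self-contained sketch. This is not improvelib's implementation: the helper names fit_transform_params and apply_transform_params, the JSON file format, and the choice of median imputation with standard scaling are all assumptions made for illustration only.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def fit_transform_params(train_df, file_name, out_dir):
    """Hypothetical stand-in for determine_transform: learn values on training data only."""
    params = {
        "columns": list(train_df.columns),        # feature subset to keep
        "median": train_df.median().to_dict(),    # per-feature imputation values
        "mean": train_df.mean().to_dict(),        # per-feature scaling centre
        "std": train_df.std(ddof=0).to_dict(),    # per-feature scaling spread
    }
    with open(Path(out_dir) / f"{file_name}.json", "w") as fh:
        json.dump(params, fh)


def apply_transform_params(df, file_name, out_dir):
    """Hypothetical stand-in for transform_data: reuse the saved training-set values."""
    with open(Path(out_dir) / f"{file_name}.json") as fh:
        params = json.load(fh)
    out = df[params["columns"]]                   # subset to the training features
    out = out.fillna(params["median"])            # impute with training medians
    mean = pd.Series(params["mean"])
    std = pd.Series(params["std"]).replace(0, 1.0)  # avoid division by zero
    return (out - mean) / std                     # scale with training statistics


nan = float("nan")
train = pd.DataFrame(
    {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [10.0, nan, 30.0]},
    index=["CL1", "CL2", "CL3"],
)
# The validation set has a missing value and an extra feature not seen in training.
val = pd.DataFrame(
    {"GENE_A": [2.0], "GENE_B": [nan], "GENE_C": [5.0]},
    index=["CL4"],
)

with tempfile.TemporaryDirectory() as out_dir:
    fit_transform_params(train, "omics_transform", out_dir)
    val_t = apply_transform_params(val, "omics_transform", out_dir)
```

Because the statistics come only from the training data, the validation and test sets are transformed without leaking information from their own distributions, which is why the same saved file name and directory are passed to every call.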