get_response_with_features

improvelib.applications.drug_response_prediction.drp_utils.get_response_with_features(response_df, feature_df, column_name)

Takes a response DataFrame and one or more feature DataFrames, and returns a response DataFrame that contains only the rows for which features are available. All feature DataFrames in a list must use the same ID type (e.g. drug or cell). If a list is given, a row is retained only if its ID is present in every feature DataFrame in the list.

Used in preprocess.

Parameters:

response_df : pd.DataFrame

Response DataFrame.

feature_df : pd.DataFrame or list of pd.DataFrame

Feature DataFrame, or a list of feature DataFrames that share the same ID type (drug or cell). The ID must be the index, as with all improvelib functions.

column_name : str

Name of the ID column for the x (feature) data.

Returns:

response_df : pd.DataFrame

Response DataFrame containing only the rows with features available.
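The filtering described above can be illustrated with plain pandas. This is only a minimal sketch of the documented behavior, not improvelib's actual implementation: the function name filter_response_by_features, the toy DataFrames, and the column names cell_id, drug_id and auc are all hypothetical. It assumes that column_name refers to a column of the response DataFrame and that feature IDs are stored in the index.

```python
import pandas as pd


def filter_response_by_features(response_df, feature_df, column_name):
    """Hypothetical stand-in (not improvelib code): keep rows whose ID has features."""
    # Accept either a single DataFrame or a list of DataFrames.
    feature_dfs = feature_df if isinstance(feature_df, list) else [feature_df]
    # IDs live in the index; an ID qualifies only if it is in every index.
    common_ids = set(feature_dfs[0].index)
    for df in feature_dfs[1:]:
        common_ids &= set(df.index)
    # Assumption: column_name is a column of the response DataFrame.
    return response_df[response_df[column_name].isin(common_ids)]


# Toy data; every ID and column name below is made up for illustration.
response = pd.DataFrame({
    "cell_id": ["c1", "c2", "c3", "c1"],
    "drug_id": ["d1", "d1", "d2", "d3"],
    "auc": [0.5, 0.7, 0.6, 0.9],
})
expression = pd.DataFrame({"gene_a": [1.0, 2.0]}, index=["c1", "c2"])
mutation = pd.DataFrame({"gene_a": [0, 1]}, index=["c1", "c3"])
fingerprints = pd.DataFrame({"bit_0": [1, 0]}, index=["d1", "d2"])

# Single feature DataFrame: rows for c1 and c2 are kept.
single = filter_response_by_features(response, expression, "cell_id")
# List of feature DataFrames: only c1 appears in both, so only c1 rows are kept.
both = filter_response_by_features(response, [expression, mutation], "cell_id")
# Chaining a second call on the drug ID keeps rows with both cell and drug features.
chained = filter_response_by_features(both, fingerprints, "drug_id")
```

Chaining one call per ID type, as in the last line, mirrors how the function is applied once for cell features and once for drug features.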

Example

Before determining the transformations using the training set, it is important to use only responses that have features for both drug and cell, and only features that appear in the training set. This can be done by calling get_response_with_features and get_features_in_response. The get_response_with_features calls look like this:

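import improvelib.applications.drug_response_prediction.drp_utils as drp

# Keep only training responses that have both cell (omics) and drug features.
response_train = drp.get_response_with_features(response_train, omics, params['canc_col_name'])
response_train = drp.get_response_with_features(response_train, drugs, params['drug_col_name'])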