get_features_in_response

improvelib.applications.drug_response_prediction.drp_utils.get_features_in_response(feature_df, response_df, column_name)

Takes a feature DataFrame and a response DataFrame and returns the feature DataFrame restricted to the features that are present in the given response DataFrame.

Used in preprocess.

Parameters:

feature_df : pd.DataFrame

Feature DataFrame. The ID must be the index, as with all improvelib functions.

response_df : pd.DataFrame

Response DataFrame.

column_name : str

Name of the ID column for the x (feature) data.

Returns:

feature_df : pd.DataFrame

Feature DataFrame containing only the rows for features that are present in the response DataFrame.

Example

Before determining the transformations using the training set, it is important to keep only the responses that have features for both drug and cell, and only the features that appear in the training set. This can be done by calling get_response_with_features and get_features_in_response like so:

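import improvelib.applications.drug_response_prediction.drp_utils as drp

print("Find intersection of training data.")
# Keep only responses that have both omics (cell) and drug features.
response_train = drp.get_response_with_features(response_train, omics, params['canc_col_name'])
response_train = drp.get_response_with_features(response_train, drugs, params['drug_col_name'])
# Keep only the features that appear in the filtered training responses.
omics_train = drp.get_features_in_response(omics, response_train, params['canc_col_name'])
drugs_train = drp.get_features_in_response(drugs, response_train, params['drug_col_name'])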
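
To make the effect of this intersection concrete, the following is a minimal sketch on toy data using plain pandas. The two helper functions are hypothetical stand-ins written for illustration, not the improvelib implementations; they assume that features are matched by comparing the feature DataFrame's index against the ID column of the response DataFrame. The column names cell_id, drug_id, and auc and all values are made up.

```python
import pandas as pd


# Hypothetical stand-in for get_response_with_features (an assumption,
# not the improvelib source): keep response rows whose ID has features.
def filter_response_by_features(response_df, feature_df, column_name):
    return response_df[response_df[column_name].isin(feature_df.index)]


# Hypothetical stand-in for get_features_in_response (an assumption,
# not the improvelib source): keep feature rows whose index ID appears
# in the response DataFrame's ID column.
def filter_features_by_response(feature_df, response_df, column_name):
    return feature_df[feature_df.index.isin(response_df[column_name])]


# Toy feature tables; the ID is the index, as the parameter notes require.
omics = pd.DataFrame({"gene_a": [0.1, 0.2, 0.3]}, index=["c1", "c2", "c3"])
drugs = pd.DataFrame({"desc_a": [1.0, 2.0]}, index=["d1", "d2"])

# Toy training responses: c4 has no omics features, d3 has no drug features.
response_train = pd.DataFrame(
    {
        "cell_id": ["c1", "c2", "c4"],
        "drug_id": ["d1", "d3", "d1"],
        "auc": [0.5, 0.6, 0.7],
    }
)

# Drop responses that lack cell or drug features.
response_train = filter_response_by_features(response_train, omics, "cell_id")
response_train = filter_response_by_features(response_train, drugs, "drug_id")

# Drop features that no remaining training response uses.
omics_train = filter_features_by_response(omics, response_train, "cell_id")
drugs_train = filter_features_by_response(drugs, response_train, "drug_id")

print(list(omics_train.index))  # ['c1']
print(list(drugs_train.index))  # ['d1']
```

Only the c1/d1 response survives both filters, so the training feature tables shrink to the single cell and single drug that the response actually references.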