get_features_in_response
improvelib.applications.drug_response_prediction.drp_utils.get_features_in_response(feature_df, response_df, column_name)
Takes a feature DataFrame and a response DataFrame and returns the feature DataFrame restricted to the features that are present in the given response DataFrame.
Used in preprocess.
Parameters:
- feature_df : pd.DataFrame
Feature DataFrame. ID must be index, as with all improvelib functions.
- response_df : pd.DataFrame
Response DataFrame.
- column_name : str
Name of the ID column for the x (feature) data.
Returns:
- feature_df : pd.DataFrame
Feature DataFrame containing only the rows with features that are used in the response.
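The filtering described above can be illustrated with plain pandas. This is a minimal sketch, not the improvelib implementation: the helper name features_in_response_sketch and the toy data are hypothetical, and it assumes the IDs in response_df[column_name] are matched against the index of feature_df.

```python
import pandas as pd


def features_in_response_sketch(feature_df, response_df, column_name):
    """Hypothetical stand-in: keep feature rows whose index ID occurs in the response."""
    # Assumption: IDs live in the feature index and in response_df[column_name].
    ids_in_response = response_df[column_name].unique()
    return feature_df.loc[feature_df.index.isin(ids_in_response)]


# Toy data, invented for illustration only.
omics = pd.DataFrame(
    {"gene_a": [0.1, 0.2, 0.3]},
    index=pd.Index(["C1", "C2", "C3"], name="cell_id"),
)
response = pd.DataFrame(
    {
        "cell_id": ["C1", "C3", "C3"],
        "drug_id": ["D1", "D1", "D2"],
        "auc": [0.5, 0.7, 0.9],
    }
)

omics_used = features_in_response_sketch(omics, response, "cell_id")
print(list(omics_used.index))  # ['C1', 'C3']
```

Cell C2 has features but no response rows, so it is dropped from the returned DataFrame.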
Example
Before determining the transformations from the training set, restrict the training data to responses that have features for both the drug and the cell, and to features that appear in those responses.
This can be done by calling get_response_with_features and get_features_in_response like so:
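import improvelib.applications.drug_response_prediction.drp_utils as drp

print("Find intersection of training data.")
# Keep only responses that have both cell (omics) and drug features.
response_train = drp.get_response_with_features(response_train, omics, params['canc_col_name'])
response_train = drp.get_response_with_features(response_train, drugs, params['drug_col_name'])
# Keep only features that appear in the filtered training responses.
omics_train = drp.get_features_in_response(omics, response_train, params['canc_col_name'])
drugs_train = drp.get_features_in_response(drugs, response_train, params['drug_col_name'])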
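To see why the response is filtered before the features, the same four steps can be traced on toy data. This is a sketch under stated assumptions: both helper functions are hypothetical pandas stand-ins (the behavior of get_response_with_features is inferred from its name and usage here, not taken from improvelib), and the column names cell_id and drug_id are invented.

```python
import pandas as pd


def response_with_features_sketch(response_df, feature_df, column_name):
    # Assumed behavior: keep response rows whose ID has a row in feature_df.
    return response_df[response_df[column_name].isin(feature_df.index)]


def features_in_response_sketch(feature_df, response_df, column_name):
    # Assumed behavior: keep feature rows whose index ID occurs in the response.
    return feature_df.loc[feature_df.index.isin(response_df[column_name])]


# Toy data, invented for illustration only.
omics = pd.DataFrame({"gene_a": [0.1, 0.2, 0.3]}, index=["C1", "C2", "C3"])
drugs = pd.DataFrame({"desc_a": [1.0, 2.0, 3.0]}, index=["D1", "D3", "D4"])
response_train = pd.DataFrame(
    {
        "cell_id": ["C1", "C1", "C2", "C9"],
        "drug_id": ["D1", "D2", "D3", "D1"],
        "auc": [0.5, 0.6, 0.7, 0.8],
    }
)

# Step 1: drop responses without cell features (C9) or drug features (D2).
response_train = response_with_features_sketch(response_train, omics, "cell_id")
response_train = response_with_features_sketch(response_train, drugs, "drug_id")

# Step 2: drop features that no remaining response uses (C3 and D4).
omics_train = features_in_response_sketch(omics, response_train, "cell_id")
drugs_train = features_in_response_sketch(drugs, response_train, "drug_id")
```

Filtering the response first means the feature tables are reduced against the final set of usable responses, so no feature row survives only because of a response that was later discarded.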