Experiment
- class pyprophet.scoring.data_handling.Experiment(df)[source]
Bases:
object
Encapsulates data operations for peak groups, decoys, and targets.
- df
The underlying data.
- Type:
pd.DataFrame
- filter_(idx)[source]
Filters the data based on the given index.
- Parameters:
idx (array-like) – Boolean index for filtering.
- Returns:
A new Experiment containing the filtered data.
- Return type:
Experiment
- get_decoy_peaks()[source]
Retrieves the decoy peaks.
- Returns:
A new Experiment containing the decoy peaks.
- Return type:
Experiment
- get_feature_matrix(use_main_score)[source]
Retrieves the feature matrix for scoring.
- Parameters:
use_main_score (bool) – Whether to include the main score.
- Returns:
The feature matrix.
- Return type:
np.ndarray
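As an illustration of what assembling a feature matrix can look like, here is a minimal standalone sketch in pandas. It does not call pyprophet. The function name `feature_matrix`, the `main_score` column, and the `var_` prefix for auxiliary score columns are assumptions made for this example, not the documented behaviour of `get_feature_matrix`.

```python
import numpy as np
import pandas as pd


def feature_matrix(df: pd.DataFrame, use_main_score: bool) -> np.ndarray:
    """Hypothetical helper: collect score columns into a 2-D float array."""
    # Assumption: auxiliary scores are stored in columns prefixed with "var_".
    var_cols = [c for c in df.columns if c.startswith("var_")]
    # Assumption: the main score lives in a column named "main_score".
    cols = (["main_score"] if use_main_score else []) + var_cols
    return df[cols].to_numpy(dtype=float)


df = pd.DataFrame(
    {
        "main_score": [1.0, 2.0],
        "var_a": [0.1, 0.2],
        "var_b": [3.0, 4.0],
        "is_decoy": [False, True],
    }
)
with_main = feature_matrix(df, use_main_score=True)
without_main = feature_matrix(df, use_main_score=False)
```

In this sketch, non-score columns such as `is_decoy` are never part of the matrix; only the flag decides whether the main score is the first column.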
- get_target_peaks()[source]
Retrieves the target peaks.
- Returns:
A new Experiment containing the target peaks.
- Return type:
Experiment
- get_top_decoy_peaks()[source]
Retrieves the top decoy peaks.
- Returns:
A new Experiment containing the top decoy peaks.
- Return type:
Experiment
- get_top_target_peaks()[source]
Retrieves the top target peaks.
- Returns:
A new Experiment containing the top target peaks.
- Return type:
Experiment
- get_top_test_peaks()[source]
Retrieves the top test peaks.
- Returns:
A new Experiment containing the top test peaks.
- Return type:
Experiment
- get_train_peaks()[source]
Retrieves the training peaks.
- Returns:
A new Experiment containing the training peaks.
- Return type:
Experiment
- log_summary()[source]
Logs a summary of the input data, including the number of peak groups, group IDs, and scores.
- normalize_score_by_decoys(score_col_name)[source]
Normalizes the decoy scores to mean 0 and standard deviation 1, and scales the target scores accordingly.
- Parameters:
score_col_name (str) – Name of the score column to normalize.
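The normalization described above can be sketched in plain pandas as follows. This is a standalone illustration, not pyprophet's code: the function name `normalize_by_decoys`, the boolean `is_decoy` column, and the use of the sample standard deviation (`ddof=1`) are assumptions.

```python
import pandas as pd


def normalize_by_decoys(
    df: pd.DataFrame, score_col: str, decoy_col: str = "is_decoy"
) -> pd.DataFrame:
    """Hypothetical helper: center and scale a score column using decoy statistics."""
    # Estimate mean and spread from the decoy rows only.
    decoy_scores = df.loc[df[decoy_col], score_col]
    mu = decoy_scores.mean()
    sigma = decoy_scores.std(ddof=1)  # assumption: sample standard deviation
    out = df.copy()
    # Apply the same shift and scale to every row, targets included.
    out[score_col] = (df[score_col] - mu) / sigma
    return out


scores = pd.DataFrame(
    {
        "d_score": [1.0, 2.0, 3.0, 5.0, 8.0],
        "is_decoy": [True, True, True, False, False],
    }
)
normalized = normalize_by_decoys(scores, "d_score")
```

After this transform the decoy scores have mean 0 and standard deviation 1, and each target score is expressed in units of decoy standard deviations above the decoy mean.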
- rank_by(score_col_name)[source]
Ranks the data by the specified score column.
- Parameters:
score_col_name (str) – Name of the score column to rank by.
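One plausible reading of ranking by a score column is sketched below with pandas. The details are assumptions for illustration only: that ranks are assigned within each group, that rank 1 is the highest score, and that the group column is called `group_id` and the output column `peak_group_rank`. The function `rank_within_groups` is hypothetical and not part of pyprophet.

```python
import pandas as pd


def rank_within_groups(
    df: pd.DataFrame, score_col: str, group_col: str = "group_id"
) -> pd.DataFrame:
    """Hypothetical helper: rank rows inside each group, best score first."""
    out = df.copy()
    # method="first" breaks ties by row order, so ranks are unique integers.
    ranks = out.groupby(group_col)[score_col].rank(method="first", ascending=False)
    out["peak_group_rank"] = ranks.astype(int)
    return out


peaks = pd.DataFrame(
    {
        "group_id": ["a", "a", "a", "b", "b"],
        "d_score": [0.2, 0.9, 0.5, 1.0, 0.3],
    }
)
ranked = rank_within_groups(peaks, "d_score")
```

With ranks like these, selecting the top peak per group reduces to filtering on `peak_group_rank == 1`.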
- scale_features(score_columns)[source]
Scales the features to the [0, 1] range.
- Parameters:
score_columns (list) – List of columns to be scaled.
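Scaling columns to the [0, 1] range is commonly done with min-max scaling. The sketch below shows that technique in pandas. It is an assumption that `scale_features` works this way, and the function name `min_max_scale` and the handling of constant columns are choices made for this example.

```python
import pandas as pd


def min_max_scale(df: pd.DataFrame, score_columns: list) -> pd.DataFrame:
    """Hypothetical helper: rescale each listed column to the [0, 1] range."""
    out = df.copy()
    for col in score_columns:
        lo = out[col].min()
        span = out[col].max() - lo
        if span > 0:
            out[col] = (out[col] - lo) / span
        else:
            # Assumption: a constant column carries no information, so map it to 0.
            out[col] = 0.0
    return out


features = pd.DataFrame(
    {
        "var_a": [2.0, 4.0, 6.0],
        "var_b": [7.0, 7.0, 7.0],
        "other": [10.0, 20.0, 30.0],
    }
)
scaled = min_max_scale(features, ["var_a", "var_b"])
```

Columns not listed in `score_columns` (here `other`) are left unchanged.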