StandardSemiSupervisedLearner
- class pyprophet.scoring.semi_supervised.StandardSemiSupervisedLearner(inner_learner, xeval_fraction, xeval_num_iter, ss_initial_fdr, ss_iteration_fdr, parametric, pfdr, pi0_lambda, pi0_method, pi0_smooth_df, pi0_smooth_log_pi0, test, main_score_selection_report, outfile, level, ss_use_dynamic_main_score)
Bases:
AbstractSemiSupervisedLearner
Implements a standard semi-supervised learning workflow.
- inner_learner
The base learner used for training.
- Type:
AbstractLearner
- ss_initial_fdr
Initial FDR threshold for training.
- Type:
float
- ss_iteration_fdr
FDR threshold for iterative learning.
- Type:
float
- parametric
Whether to use parametric FDR estimation.
- Type:
bool
- pfdr
Whether to estimate the positive FDR (pFDR) instead of the FDR.
- Type:
bool
- pi0_lambda
Lambda values for pi0 estimation.
- Type:
list
- pi0_method
Method for pi0 estimation.
- Type:
str
- pi0_smooth_df
Degrees of freedom for pi0 smoothing.
- Type:
int
- pi0_smooth_log_pi0
Whether to fit the smoother to log-transformed pi0 estimates rather than to pi0 directly.
- Type:
bool
- ss_use_dynamic_main_score
Whether to dynamically select the main score.
- Type:
bool
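As a rough illustration of what the pi0 attributes control, the sketch below computes a Storey-style estimate of pi0 (the proportion of true null hypotheses) for each lambda value. The function name `estimate_pi0` is hypothetical and this is not pyprophet's implementation; it only shows the basic formula, without the smoothing or bootstrap step that would pick a single value from the per-lambda estimates.

```python
def estimate_pi0(p_values, lambdas):
    """Return one Storey-style pi0 estimate per lambda (illustrative sketch).

    Assumes p_values is non-empty and every lambda lies in [0, 1).
    """
    m = len(p_values)
    estimates = []
    for lam in lambdas:
        # p-values above lambda are assumed to come mostly from true nulls,
        # which are uniform on [0, 1], so rescale by the width (1 - lambda).
        above = sum(1 for p in p_values if p > lam)
        estimates.append(min(1.0, above / (m * (1.0 - lam))))
    return estimates


pi0_per_lambda = estimate_pi0([0.01, 0.02, 0.03, 0.6, 0.8], [0.5])
```

Larger lambda values reduce bias but increase variance, which is why a grid of lambda values is usually combined rather than relying on a single one.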
- __init__(inner_learner, xeval_fraction, xeval_num_iter, ss_initial_fdr, ss_iteration_fdr, parametric, pfdr, pi0_lambda, pi0_method, pi0_smooth_df, pi0_smooth_log_pi0, test, main_score_selection_report, outfile, level, ss_use_dynamic_main_score)
- averaged_learner(params, **kwargs)
Creates an averaged learner from multiple parameter sets.
- Parameters:
params (list) – List of parameter sets.
kwargs – Additional arguments.
- Returns:
The averaged learner.
- Return type:
- classmethod from_config(config: RunnerIOConfig, base_learner)
Creates a StandardSemiSupervisedLearner instance from a configuration object.
- Parameters:
config (RunnerIOConfig) – The configuration object.
base_learner (AbstractLearner) – The base learner used for training.
- Returns:
The initialized learner.
- Return type:
StandardSemiSupervisedLearner
- get_delta_td_bt_feature_size(train, col, mapper, working_thread_number)
Calculates the difference in feature size between top decoy peaks and best target peaks.
- Parameters:
train (Experiment) – Training data.
col (str) – Column used for selection.
mapper (dict) – Mapping of column aliases to feature names.
working_thread_number (int) – Number of threads to use.
- Returns:
The absolute difference in feature size.
- Return type:
int
- iter_semi_supervised_learning(train, score_columns, working_thread_number)
Performs iterative semi-supervised learning.
- Parameters:
train (Experiment) – Training data.
score_columns (list) – List of score column names.
working_thread_number (int) – Number of threads to use.
- Returns:
Model parameters and classifier scores.
- Return type:
tuple
- score(df, params)
Scores the given data using the trained model.
- Parameters:
df (pd.DataFrame) – Input data.
params (dict) – Model parameters.
- Returns:
Classifier scores.
- Return type:
np.ndarray
- select_train_peaks(train, sel_column, cutoff_fdr, parametric, pfdr, pi0_lambda, pi0_method, pi0_smooth_df, pi0_smooth_log_pi0, mapper=None, main_score_selection_report=False, outfile=None, level=None, working_thread_number=None)
Selects the best target peaks and top decoy peaks based on FDR thresholds.
- Parameters:
train (Experiment) – Training data.
sel_column (str) – Column used for selection.
cutoff_fdr (float) – FDR threshold for selection.
parametric (bool) – Whether to use parametric FDR estimation.
pfdr (bool) – Whether to estimate the positive FDR (pFDR) instead of the FDR.
pi0_lambda (list) – Lambda values for pi0 estimation.
pi0_method (str) – Method for pi0 estimation.
pi0_smooth_df (int) – Degrees of freedom for pi0 smoothing.
pi0_smooth_log_pi0 (bool) – Whether to fit the smoother to log-transformed pi0 estimates rather than to pi0 directly.
mapper (dict, optional) – Mapping of column aliases to feature names.
main_score_selection_report (bool, optional) – Whether to generate a score selection report.
outfile (str, optional) – Path to the output file.
level (str, optional) – Analysis level (e.g., peptide, protein).
working_thread_number (int, optional) – Number of threads to use.
- Returns:
Top decoy peaks and best target peaks.
- Return type:
tuple
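The selection step described for select_train_peaks can be pictured with a small standalone sketch. It is not the library's code: the names `q_values` and `select_train_peaks_sketch` are hypothetical, plain lists stand in for the Experiment object, each row is assumed to be the top-ranked peak of its group already, and a simple decoy-counting FDR replaces the parametric and pi0-based estimation the real method is configured with.

```python
def q_values(scores, is_decoy):
    """Decoy-counting q-values: lowest FDR at which each row is accepted."""
    # Rank rows from highest to lowest score (sorted() is stable for ties).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    decoys = targets = 0
    fdrs = []
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # Make the FDR monotone by taking the running minimum from the bottom up.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for i, fdr in zip(reversed(order), reversed(fdrs)):
        running_min = min(running_min, fdr)
        q[i] = running_min
    return q


def select_train_peaks_sketch(scores, is_decoy, cutoff_fdr):
    """Return (decoy row indices, target row indices passing the cutoff)."""
    q = q_values(scores, is_decoy)
    top_decoys = [i for i, d in enumerate(is_decoy) if d]
    best_targets = [
        i for i, d in enumerate(is_decoy) if not d and q[i] <= cutoff_fdr
    ]
    return top_decoys, best_targets


top_decoys, best_targets = select_train_peaks_sketch(
    scores=[5.0, 4.0, 3.0, 2.0, 1.0],
    is_decoy=[False, False, True, False, True],
    cutoff_fdr=0.1,
)
```

The return order mirrors the documented one: decoys first, then targets. All decoys are kept as negative examples, while only confidently identified targets serve as positives.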
- set_learner(model)
Sets the parameters of the inner learner.
- Parameters:
model (object) – The model parameters.
- start_semi_supervised_learning(train, score_columns, working_thread_number)
Starts the semi-supervised learning process.
- Parameters:
train (Experiment) – Training data.
score_columns (list) – List of score column names.
working_thread_number (int) – Number of threads to use.
- Returns:
Model parameters, classifier scores, and selected main score column.
- Return type:
tuple
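The overall loop behind the start and iteration steps can be sketched as follows. This is a simplified illustration under several assumptions, not pyprophet's implementation: all function names are hypothetical, a mean-difference linear scorer stands in for the inner learner, decoy counting stands in for the configured FDR estimation, and the first round is assumed to use the initial FDR cutoff on a single main score column while later rounds use the iteration cutoff on the classifier score.

```python
def q_values(scores, is_decoy):
    """Decoy-counting q-values (same simplification as a plain target-decoy FDR)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    decoys = targets = 0
    fdrs = []
    for i in order:
        decoys += 1 if is_decoy[i] else 0
        targets += 0 if is_decoy[i] else 1
        fdrs.append(decoys / max(targets, 1))
    q = [0.0] * len(scores)
    running_min = float("inf")
    for i, fdr in zip(reversed(order), reversed(fdrs)):
        running_min = min(running_min, fdr)
        q[i] = running_min
    return q


def fit_mean_difference(rows, is_decoy):
    """Stand-in inner learner: weights = mean(target rows) - mean(decoy rows)."""
    targets = [r for r, d in zip(rows, is_decoy) if not d]
    decoys = [r for r, d in zip(rows, is_decoy) if d]
    return [
        sum(r[j] for r in targets) / len(targets)
        - sum(r[j] for r in decoys) / len(decoys)
        for j in range(len(rows[0]))
    ]


def apply_weights(rows, weights):
    """Linear classifier score for every row."""
    return [sum(x * w for x, w in zip(r, weights)) for r in rows]


def semi_supervised_sketch(rows, is_decoy, main_col, initial_fdr,
                           iteration_fdr, n_iter=3):
    """Return (weights, scores) after n_iter rounds of select-then-train."""
    scores = [r[main_col] for r in rows]  # round 1 ranks by the main score
    cutoff = initial_fdr
    weights = None
    for _ in range(n_iter):
        q = q_values(scores, is_decoy)
        # Keep every decoy plus the targets that pass the current cutoff.
        keep = [d or qi <= cutoff for d, qi in zip(is_decoy, q)]
        if not any(k and not d for k, d in zip(keep, is_decoy)):
            raise ValueError("no targets pass the FDR cutoff")
        train_rows = [r for r, k in zip(rows, keep) if k]
        train_labels = [d for d, k in zip(is_decoy, keep) if k]
        weights = fit_mean_difference(train_rows, train_labels)
        scores = apply_weights(rows, weights)  # re-score all rows
        cutoff = iteration_fdr  # later rounds use the iteration cutoff
    return weights, scores


rows = [[4.0, 3.0], [3.0, 4.0], [3.0, 3.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
is_decoy = [False, False, False, True, True, True]
weights, scores = semi_supervised_sketch(rows, is_decoy, 0, 0.15, 0.05)
```

Each round retrains on a positive set chosen by the previous round's scores, so the positive set can grow as the classifier separates targets from decoys better than the single main score did.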
- tune_semi_supervised_learning(train)
Tunes the semi-supervised learning model.
- Parameters:
train (Experiment) – Training data.
- Returns:
Model parameters and classifier scores.
- Return type:
tuple