Semi-Supervised Scoring Documentation

This module provides the main tools for statistical scoring, error estimation, and hypothesis testing in targeted proteomics and glycoproteomics workflows. It includes submodules for semi-supervised learning, feature scaling, classifier integration, and context-specific inference.

Submodules:

data_handling: Utilities for handling and processing data, including feature scaling, ranking, and validation.
classifiers: Implements various classifiers (e.g., LDA, SVM, XGBoost) for scoring.
semi_supervised: Implements semi-supervised learning workflows for iterative scoring.
runner: Defines workflows for running PyProphet, including learning and weight application.
pyprophet: Core functionality for orchestrating scoring and error estimation workflows.

Dependencies:

numpy
pandas
scikit-learn
xgboost
loguru
click

scoring

Runner

`PyProphetRunner`	Base class for running PyProphet workflows.
`PyProphetLearner`	Implements the learning and scoring workflow for PyProphet.
`PyProphetWeightApplier`	Applies pre-trained weights to full/new datasets.

PyProphet

`PyProphet`	Orchestrates the semi-supervised learning and scoring workflow.
`Scorer`	Handles scoring, error estimation, and hypothesis testing for experiments.
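
To make "error estimation" concrete, here is a minimal sketch of one common target-decoy approach: empirical p-values computed against the decoy score distribution, followed by Benjamini-Hochberg q-values. It is an illustration only and does not reproduce the `Scorer` API or its exact statistics. The function names `empirical_p_values` and `bh_q_values` are hypothetical.

```python
import numpy as np


def empirical_p_values(target_scores, decoy_scores):
    """Empirical p-value per target: fraction of decoys scoring at least as high.

    Hypothetical helper; a pseudocount of 1 keeps p-values strictly positive.
    """
    targets = np.asarray(target_scores, dtype=float)
    decoys = np.sort(np.asarray(decoy_scores, dtype=float))
    # searchsorted(side="left") gives the index of the first decoy >= score,
    # so the number of decoys >= score is len(decoys) minus that index.
    n_greater_equal = len(decoys) - np.searchsorted(decoys, targets, side="left")
    return (n_greater_equal + 1) / (len(decoys) + 1)


def bh_q_values(p_values):
    """Benjamini-Hochberg q-values (assumes the null proportion pi0 = 1)."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(ranked, 1.0)
    return q


p = empirical_p_values([4.0, 1.5], [0.0, 1.0, 2.0, 3.0])
q = bh_q_values(p)
```

Setting pi0 to 1 is the conservative choice; a workflow that estimates pi0 from the p-value distribution would scale these q-values down accordingly.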

Semi-Supervised

`AbstractSemiSupervisedLearner`	Abstract base class for semi-supervised learning workflows.
`StandardSemiSupervisedLearner`	Implements a standard semi-supervised learning workflow.
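
The iterative idea behind semi-supervised scoring can be sketched as follows: start from an initial score, keep only the targets that clearly outscore the decoys, train a classifier on those targets versus all decoys, rescore everything, and repeat. The code below is a simplified sketch under stated assumptions, not the `StandardSemiSupervisedLearner` implementation. The function name, the decoy-quantile cutoff (a stand-in for a proper FDR cutoff), and the fixed iteration count are all hypothetical.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def semi_supervised_scores(features, is_decoy, initial_scores,
                           n_iter=3, decoy_quantile=0.95):
    """Iteratively refine scores with LDA (illustrative sketch only)."""
    features = np.asarray(features, dtype=float)
    is_decoy = np.asarray(is_decoy, dtype=bool)
    scores = np.asarray(initial_scores, dtype=float)

    for _ in range(n_iter):
        # Assumption: a high decoy quantile stands in for an FDR-based cutoff.
        cutoff = np.quantile(scores[is_decoy], decoy_quantile)
        confident_targets = (~is_decoy) & (scores > cutoff)
        if not confident_targets.any():
            break  # nothing to train on; keep the current scores

        # Train on all decoys (label 0) versus confident targets (label 1).
        train = is_decoy | confident_targets
        labels = (~is_decoy[train]).astype(int)
        model = LinearDiscriminantAnalysis()
        model.fit(features[train], labels)

        # Rescore every peak group with the updated discriminant.
        scores = model.decision_function(features)
    return scores


# Synthetic example: half of the targets are true signals, half are noise.
rng = np.random.default_rng(0)
decoy_x = rng.normal(0.0, 1.0, size=(200, 2))
false_target_x = rng.normal(0.0, 1.0, size=(100, 2))
true_target_x = rng.normal(3.0, 1.0, size=(100, 2))
X = np.vstack([decoy_x, false_target_x, true_target_x])
decoy_flags = np.array([True] * 200 + [False] * 200)
final_scores = semi_supervised_scores(X, decoy_flags, initial_scores=X[:, 0])
```

Training only on confident targets matters because unfiltered targets are a mixture of true and false signals, and the false ones would blur the decision boundary.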

Classifiers

`AbstractLearner`	Abstract base class for defining a learner interface.
`LinearLearner`	Implements a linear classifier for scoring.
`LDALearner`	Implements a Linear Discriminant Analysis (LDA) learner.
`SVMLearner`	Implements a linear Support Vector Machine (SVM) learner.
`XGBLearner`	Implements an XGBoost-based learner for scoring.
`HistGBCLearner`	Implements a scikit-learn HistGradientBoostingClassifier-based learner for scoring.
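
The relationship between an abstract learner interface and a concrete learner can be sketched as below. This is a hypothetical illustration built on scikit-learn's `LinearDiscriminantAnalysis`. The class names `SketchLearner` and `SketchLDALearner` and the method names `learn` and `score` are assumptions made for the example and may not match the signatures of `AbstractLearner` or `LDALearner`.

```python
from abc import ABC, abstractmethod

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


class SketchLearner(ABC):
    """Hypothetical learner interface: fit on decoys vs. targets, then score."""

    @abstractmethod
    def learn(self, decoy_features, target_features):
        """Fit the model and return self."""

    @abstractmethod
    def score(self, features):
        """Return one discriminant score per row of `features`."""


class SketchLDALearner(SketchLearner):
    """Hypothetical LDA-backed learner wrapping scikit-learn."""

    def __init__(self):
        self.model = LinearDiscriminantAnalysis()

    def learn(self, decoy_features, target_features):
        X = np.vstack([decoy_features, target_features])
        # Decoys are labelled 0 and targets 1, so higher scores mean target-like.
        y = np.concatenate([
            np.zeros(len(decoy_features), dtype=int),
            np.ones(len(target_features), dtype=int),
        ])
        self.model.fit(X, y)
        return self

    def score(self, features):
        return self.model.decision_function(features)


rng = np.random.default_rng(1)
decoys = rng.normal(0.0, 1.0, size=(100, 3))
targets = rng.normal(2.0, 1.0, size=(100, 3))
learner = SketchLDALearner().learn(decoys, targets)
target_scores = learner.score(targets)
decoy_scores = learner.score(decoys)
```

Keeping the interface this small lets a scoring loop swap an LDA, SVM, or gradient-boosting backend without changing the surrounding code.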

Data Handling

`Experiment`	Encapsulates data operations for peak groups, decoys, and targets.
`prepare_data_table`	Prepares the input data table for scoring and analysis.
`cleanup_and_check`	Cleans up the input DataFrame and validates its structure.
`check_for_unique_blocks`	Checks if group IDs form unique blocks.
`update_chosen_main_score_in_table`	Updates the main score column in the feature table.
`use_metabolomics_scores`	Returns a list of metabolomics-specific score columns.
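
As an illustration of the "unique blocks" check, the sketch below tests whether every group ID occupies a single contiguous run of rows. It is a standalone sketch of that idea, not the `check_for_unique_blocks` implementation. The function name `ids_form_unique_blocks` and its return convention are hypothetical.

```python
import numpy as np


def ids_form_unique_blocks(group_ids):
    """Return True if each distinct ID appears in exactly one contiguous run."""
    ids = np.asarray(group_ids)
    if ids.size == 0:
        return True
    # Mark positions where a new run starts (the first row always does).
    run_starts = np.concatenate(([True], ids[1:] != ids[:-1]))
    run_labels = ids[run_starts]
    # If an ID starts more than one run, it is split across separate blocks.
    return len(run_labels) == len(np.unique(run_labels))


blocked = ids_form_unique_blocks(["g1", "g1", "g2", "g2", "g3"])
interleaved = ids_form_unique_blocks(["g1", "g2", "g1"])
```

A check like this is useful before any per-group operation that assumes rows of the same group are adjacent, such as ranking peak groups within a group.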