BaseOSWReader

class pyprophet.io._base.BaseOSWReader(config: BaseIOConfig)[source]

Bases: BaseReader

Class for reading and processing data from an SQLite-based OSW file produced by an OpenSWATH workflow.

The BaseOSWReader class provides methods to read different levels of data from the file and process them accordingly. It supports reading data for semi-supervised learning, IPF analysis, and context-level analysis.

infile

Input file path.

Type:

str

outfile

Output file path.

Type:

str

classifier

Classifier used for semi-supervised learning.

Type:

str

level

Level used in semi-supervised learning (e.g., ‘ms1’, ‘ms2’, ‘ms1ms2’, ‘transition’, ‘alignment’), or context level used for peptide/protein/gene inference (e.g., ‘global’, ‘experiment-wide’, ‘run-specific’).

Type:

str

glyco

Flag indicating whether analysis is glycoform-specific.

Type:

bool

read()[source]

Read data from the input file based on the algorithm.

_init_duckdb_views()[source]

Initialize DuckDB views with optional subsampling.

__eq__(other)

Return self==value.

__hash__ = None
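
For readers unfamiliar with the `__eq__` / `__hash__ = None` pairing listed here: in Python, a class that defines `__eq__` without also defining `__hash__` has `__hash__` set to `None`, so its instances compare by value but cannot be used as dictionary keys or set members. The sketch below illustrates this with a hypothetical `ToyReader` class. It is not the pyprophet implementation, and comparing on `infile` alone is an assumption made purely for illustration.

```python
class ToyReader:
    """Hypothetical stand-in class; not part of pyprophet."""

    def __init__(self, infile):
        self.infile = infile

    def __eq__(self, other):
        # Assumed comparison rule for this sketch: equal when infile matches.
        if not isinstance(other, ToyReader):
            return NotImplemented
        return self.infile == other.infile


a = ToyReader("example.osw")
b = ToyReader("example.osw")

# Value-based equality works because __eq__ is defined.
equal = a == b

# Hashing fails because Python set __hash__ to None for this class.
try:
    hash(a)
    hashable = True
except TypeError:
    hashable = False
```
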
__init__(config: BaseIOConfig)[source]

Initialize the reader with a given configuration.

Parameters:

config (BaseIOConfig) – Configuration object containing input details and module-specific parameters for reading.

__repr__()

Return repr(self).

_init_duckdb_views(con)[source]

Initialize DuckDB views for the OSW file with optional subsampling support.

Creates a TEMP table of sampled precursor IDs if subsample_ratio < 1.0, which can be used by subclasses to filter feature queries.

Subclasses should call this method before creating feature views and then filter views with: WHERE PRECURSOR_ID IN (SELECT PRECURSOR_ID FROM sampled_precursor_ids) when self.subsample_ratio < 1.0.

Parameters:

con – DuckDB connection with OSW database attached as ‘osw’
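
The subsampling pattern described for `_init_duckdb_views` can be sketched as follows. This is a minimal illustration, not the pyprophet code: it uses Python's built-in `sqlite3` module as a stand-in for a DuckDB connection so that it runs without extra dependencies, the toy `PRECURSOR` and `FEATURE` tables are assumptions, and the random sampling strategy (`ORDER BY RANDOM() LIMIT n`) is one possible choice rather than the one the library necessarily uses. The function name `load_features` is hypothetical.

```python
import sqlite3


def load_features(subsample_ratio):
    """Return feature rows, filtered to sampled precursors when ratio < 1.0."""
    con = sqlite3.connect(":memory:")

    # Toy schema (assumed for illustration): 10 precursors, 2 features each.
    con.execute("CREATE TABLE PRECURSOR (ID INTEGER PRIMARY KEY)")
    con.execute(
        "CREATE TABLE FEATURE (ID INTEGER PRIMARY KEY, PRECURSOR_ID INTEGER)"
    )
    con.executemany(
        "INSERT INTO PRECURSOR (ID) VALUES (?)", [(i,) for i in range(1, 11)]
    )
    con.executemany(
        "INSERT INTO FEATURE (PRECURSOR_ID) VALUES (?)",
        [(i,) for i in range(1, 11) for _ in range(2)],
    )

    feature_sql = "SELECT ID, PRECURSOR_ID FROM FEATURE"

    if subsample_ratio < 1.0:
        # Step 1: build a TEMP table holding the sampled precursor IDs.
        n_total = con.execute("SELECT COUNT(*) FROM PRECURSOR").fetchone()[0]
        n_keep = max(1, int(n_total * subsample_ratio))
        con.execute(
            "CREATE TEMP TABLE sampled_precursor_ids "
            "(PRECURSOR_ID INTEGER PRIMARY KEY)"
        )
        con.execute(
            "INSERT INTO sampled_precursor_ids "
            "SELECT ID FROM PRECURSOR ORDER BY RANDOM() LIMIT ?",
            (n_keep,),
        )
        # Step 2: restrict the feature query with the documented WHERE clause.
        feature_sql += (
            " WHERE PRECURSOR_ID IN "
            "(SELECT PRECURSOR_ID FROM sampled_precursor_ids)"
        )

    rows = con.execute(feature_sql).fetchall()
    con.close()
    return rows


half = load_features(0.5)
full = load_features(1.0)
```

Sampling at the precursor level, rather than sampling feature rows directly, keeps all features of a selected precursor together, which is why the filter is expressed as a membership test on `PRECURSOR_ID`.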

read() → DataFrame[source]

Abstract method to be implemented by subclasses to read data from OSW format for a specific algorithm.
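
As a rough illustration of how a subclass might fulfil this contract, the sketch below uses hypothetical stand-in classes (`ToyBaseReader`, `ToyScoringReader`) rather than the real pyprophet classes. It assumes the abstract method is declared with the standard `abc` module, which the documentation above does not confirm, and it returns a list of dictionaries instead of a DataFrame to stay dependency-free.

```python
from abc import ABC, abstractmethod


class ToyBaseReader(ABC):
    """Hypothetical stand-in for a base reader; not the pyprophet class."""

    def __init__(self, config):
        # In this sketch the config is a plain dict (an assumption).
        self.config = config

    @abstractmethod
    def read(self):
        """Subclasses return the data needed by one specific algorithm."""


class ToyScoringReader(ToyBaseReader):
    """Hypothetical subclass that implements read() for one algorithm."""

    def read(self):
        # A real reader would query the OSW file here; this returns a
        # placeholder record built from the configuration.
        return [{"infile": self.config["infile"], "level": self.config["level"]}]


reader = ToyScoringReader({"infile": "example.osw", "level": "ms2"})
records = reader.read()

# The base class itself cannot be instantiated while read() is abstract.
try:
    ToyBaseReader({"infile": "example.osw", "level": "ms2"})
    base_instantiable = True
except TypeError:
    base_instantiable = False
```
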