OSWReader

class pyprophet.io.scoring.osw.OSWReader(config: RunnerIOConfig)[source]

Bases: BaseOSWReader

Class for reading and processing data from an OpenSWATH workflow (OSW) file, which is SQLite-based.

The OSWReader class provides methods to read different levels of data from the file and process them accordingly. It supports reading data for semi-supervised learning, IPF analysis, and context-level analysis.

infile

Input file path.

Type:

str

outfile

Output file path.

Type:

str

classifier

Classifier used for semi-supervised learning.

Type:

str

level

Level used in semi-supervised learning (e.g., ‘ms1’, ‘ms2’, ‘ms1ms2’, ‘transition’, ‘alignment’), or context level used for peptide/protein/gene inference (e.g., ‘global’, ‘experiment-wide’, ‘run-specific’).

Type:

str

glyco

Flag indicating whether analysis is glycoform-specific.

Type:

bool

read()[source]

Read data from the input file based on the algorithm.

__init__(config: RunnerIOConfig)[source]

Initialize the reader with a given configuration.

Parameters:

config (BaseIOConfig) – Configuration object containing the input details and the module-specific parameters used for reading.

_create_indexes()[source]

Always use a temporary SQLite connection to create indexes directly on the .osw file, because DuckDB does not currently appear to support creating indexes on attached SQLite databases.
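The temporary-connection pattern described above can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the method's actual body: the helper name create_indexes, the FEATURE table, and the index name are assumptions made for the example.

```python
import sqlite3
from typing import Iterable


def create_indexes(osw_path: str, statements: Iterable[str]) -> None:
    """Open a short-lived SQLite connection, run the index DDL, then close it."""
    con = sqlite3.connect(osw_path)
    try:
        for stmt in statements:
            con.execute(stmt)
        con.commit()
    finally:
        # The connection exists only for index creation.
        con.close()


# Hypothetical index definition; the table and column names are assumed.
# IF NOT EXISTS makes the call safe to repeat on the same file.
EXAMPLE_INDEXES = [
    "CREATE INDEX IF NOT EXISTS idx_feature_precursor_id "
    "ON FEATURE (PRECURSOR_ID)",
]
```

Because the indexes are written into the .osw file itself, they remain available to any connection opened afterwards.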

_get_precursor_filter_clause()[source]

Return a WHERE/AND clause fragment for filtering by sampled precursor IDs when subsampling is enabled. Returns empty string if no subsampling, otherwise returns a clause like: “ AND f.PRECURSOR_ID IN (SELECT PRECURSOR_ID FROM sampled_precursor_ids)”
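A fragment like this is meant to be appended to a query that already has a WHERE condition. The sketch below shows the idea under stated assumptions: the function name, the subsample_enabled flag, and the base query are hypothetical, and only the returned clause text is taken from the description above.

```python
def precursor_filter_clause(subsample_enabled: bool) -> str:
    """Return an AND fragment when subsampling is on, else an empty string."""
    if not subsample_enabled:
        return ""
    return " AND f.PRECURSOR_ID IN (SELECT PRECURSOR_ID FROM sampled_precursor_ids)"


# Hypothetical base query; the fragment assumes the feature table is aliased as "f".
base_query = "SELECT f.ID FROM FEATURE f WHERE f.RUN_ID = 1"
filtered_query = base_query + precursor_filter_clause(True)
unfiltered_query = base_query + precursor_filter_clause(False)
```

Returning an empty string in the unsampled case lets callers concatenate unconditionally.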

_read_using_duckdb(con)[source]

Read features from SQLite using DuckDB based on the specified level.

Parameters:

con – Connection to the DuckDB database.

Returns:

Features based on the specified level.

Raises:

click.ClickException – If the specified level is unsupported.

_read_using_sqlite(con)[source]

Read features from SQLite database based on the specified level.

Parameters:

con – SQLite connection object.

Returns:

Features based on the specified level.

Raises:

click.ClickException – If the specified level is unsupported.
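Both readers select what to load from the configured level and reject unknown levels. One way such a dispatch might look is sketched below. The handler functions and their return values are placeholders, and a plain ValueError stands in for click.ClickException so that the example needs only the standard library.

```python
from typing import Callable, Dict


# Placeholder handlers; the real readers would run level-specific queries.
def _read_ms1() -> str:
    return "ms1 features"


def _read_ms2() -> str:
    return "ms2 features"


def _read_transition() -> str:
    return "transition features"


LEVEL_HANDLERS: Dict[str, Callable[[], str]] = {
    "ms1": _read_ms1,
    "ms2": _read_ms2,
    "transition": _read_transition,
}


def read_for_level(level: str) -> str:
    """Dispatch to the handler for `level`, failing loudly on unknown levels."""
    handler = LEVEL_HANDLERS.get(level)
    if handler is None:
        # Stand-in for click.ClickException in this sketch.
        raise ValueError(f"Unsupported level: {level}")
    return handler()
```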

read() → DataFrame[source]

Reads the data for scoring from the specified file, using DuckDB if it is available and falling back to SQLite otherwise.

Returns: pd.DataFrame: The data read from the file.
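The DuckDB-first, SQLite-fallback behaviour can be illustrated roughly as follows. This is a sketch under assumptions, not the library's implementation: read_rows and duckdb_available are hypothetical helpers, and the sketch returns plain row tuples rather than a pd.DataFrame so that the fallback path needs only the standard library. The DuckDB branch relies on DuckDB's sqlite extension, which may need to be downloaded on first use.

```python
import importlib.util
import sqlite3
from typing import Optional


def duckdb_available() -> bool:
    """Report whether the optional duckdb package can be imported."""
    return importlib.util.find_spec("duckdb") is not None


def read_rows(path: str, query: str, use_duckdb: Optional[bool] = None) -> list:
    """Run `query` against the SQLite file at `path` and return all rows."""
    if use_duckdb is None:
        use_duckdb = duckdb_available()

    if use_duckdb:
        import duckdb  # imported lazily because it is optional

        con = duckdb.connect()  # in-memory DuckDB database
        try:
            con.execute("INSTALL sqlite")  # may require network access once
            con.execute("LOAD sqlite")
            # Assumes `path` contains no single quotes.
            con.execute(f"ATTACH '{path}' AS osw (TYPE sqlite)")
            con.execute("USE osw")
            return con.execute(query).fetchall()
        finally:
            con.close()

    # Fallback: query the file directly with the standard library.
    con = sqlite3.connect(path)
    try:
        return con.execute(query).fetchall()
    finally:
        con.close()
```

Passing use_duckdb=False forces the SQLite path, which is convenient for testing the fallback on a machine where DuckDB is installed.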