demultiplex_helper_funcs module
- auto_read(fname, lev=1, **kwargs) DataFrame
- demux_by_calico_solo(bcs: Series, df_s: DataFrame, samp: str, sep: str, col_list: list[str, str], dem_cs: Series, donor_convert: bool, hto_count: int, multi_hto_sep: str = '') list[Series, Series, list[str]]
Main function for classification by calico_solo.
This function assigns the calico_solo classification by calling
demultiplex_helper_funcs.ret_htos_calico_solo().
- demux_by_vireo(bcs: Series, vir_out_file: str, conv_df: DataFrame | None = None, donor_col: str | None = None, conv_col: str | None = None, pool_col: str | None = None, pool_name: str | None = None) tuple[Series, list[str], Series | None]
Main function for classification by vireoSNP.
This function assigns vireo classification.
Parameters
- bcs
A pd series of cell barcodes from gene count matrix.
- vir_out_file
Path to donor_ids.tsv file produced by vireo.
- conv_df
A converter file to change donor names in the vireo output, if needed.
- donor_col
Column in the converter file containing the donor names that match the vireo output.
- conv_col
Column, in the converter file, containing the converted names.
- pool_col
Column, in the converter file, containing the pool names.
- pool_name
Pool name.
- returns:
pd.Series – Classification by vireo per cell
list – Demux stats
pd.Series | None – Converted donor names of the vireo classification per cell (None is allowed by the signature)
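As a rough illustration of the inputs and outputs described above, the sketch below aligns a vireo donor_ids.tsv table to a Series of barcodes and applies an optional converter table. It is not the module's implementation: the helper name `classify_by_vireo_sketch`, the converter column names `vireo_id` and `SubID`, and the `Not Present` placeholder for barcodes missing from the vireo output are all assumptions made for the example. The `cell` and `donor_id` columns follow vireo's usual output layout.

```python
import io
import pandas as pd

def classify_by_vireo_sketch(bcs, vir_out, conv_df=None, donor_col=None, conv_col=None):
    """Hypothetical helper: per-barcode vireo classification, optionally converted."""
    vir = pd.read_csv(vir_out, sep="\t", usecols=["cell", "donor_id"])
    # Index by barcode so the calls can be aligned to the count-matrix barcodes.
    donors = vir.set_index("cell")["donor_id"]
    # Barcodes absent from the vireo output get a placeholder label (assumption).
    classified = donors.reindex(bcs.values).fillna("Not Present")
    converted = None
    if conv_df is not None:
        mapping = conv_df.set_index(donor_col)[conv_col]
        # Labels without a converter entry (e.g. "doublet") are kept unchanged.
        converted = classified.map(mapping).fillna(classified)
    return classified, converted

# Toy inputs standing in for donor_ids.tsv, the barcodes, and the converter file.
tsv = io.StringIO("cell\tdonor_id\nAAAC-1\tdonor0\nAAAG-1\tdoublet\n")
bcs = pd.Series(["AAAC-1", "AAAG-1", "AAAT-1"])
conv = pd.DataFrame({"vireo_id": ["donor0"], "SubID": ["S001"]})

classified, converted = classify_by_vireo_sketch(bcs, tsv, conv, "vireo_id", "SubID")
print(converted.tolist())
```

When no converter table is passed, the sketch returns None in place of the converted Series, mirroring the optional return value in the signature.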
- demux_stats(demux_freq: Series, demux_name: str) list[str]
- get_donor_info(hto_df: DataFrame, pool_info_df: DataFrame, sep: str, col_list: list)
Return donor information for each cell barcode for multi-HTO setup.
This function returns a pandas Series of demultiplexed donor information, derived from the wet lab info file. It is intended specifically for multi-HTO setups.
- Parameters:
hto_df – A series of cell barcodes from gene count matrix
pool_info_df – Subset of wet lab file containing multi-HTO information and SubID (donor IDs)
col_list – List of column names in the wet lab file in the sequence: pool name, HTO names (separated by ‘sep’), HTO barcode, donor info
- Returns:
Contains donor IDs with cell barcodes as index
- Return type:
pd.Series
- parse_file(wet_lab_df, cols, s_name, hs, d_con) list[str] | tuple[list[str], list[str]]
- ret_htos_calico_solo(bcs: Series, df_s: DataFrame, samp: str, sep: str | None, col_list: list[str, str], dem_cs: Series, donor_convert: bool, hto_count: int, multi_hto_setp: bool) list[Series, Series, int, int]
Return HTO information and classification for each cell barcode.
This function returns two pandas Series, holding the donor IDs and the HTO names (used for calico_solo), together with the number of doublets and the number of negatives identified.
- Parameters:
bcs – A pd series of cell barcodes from gene count matrix
df_s – Wet lab file containing HTO information and SubID (donor IDs) for each pool
samp – Pool name (present in df_s file)
sep – Separator used if all HTOs and donors are present in one row, or in a multi-HTO setup; otherwise None
col_list – List of column names (first val HTO, second val SubID)
dem_cs – A pd series with cell barcodes as index and “HTO classification” (solo output)
donor_convert – Whether donor names have to be converted from the names used by the calico_solo (hashsolo) demultiplexing method
hto_count – In a multi-HTO setup, the position of the HTO in the sequence
multi_hto_setp – True for multi-HTO setup
- Returns:
pd.Series – Contains donor IDs with cell barcodes as index
pd.Series – Contains HTO name with cell barcodes as index
int – number of doublets
int – number of negatives.
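The mapping described above can be sketched as follows. This is only an illustration under stated assumptions, not the module's code: the helper name `map_htos_to_donors_sketch` and the `pool` column name are hypothetical, and the sketch assumes the hashsolo output labels unresolved cells as `Doublet` and `Negative`. It covers the single-HTO case only and ignores `sep`, `donor_convert`, and `hto_count`.

```python
import pandas as pd

def map_htos_to_donors_sketch(bcs, df_s, samp, col_list, dem_cs, pool_col="pool"):
    """Hypothetical helper: map hashsolo HTO calls to donor IDs for one pool."""
    hto_col, donor_col = col_list
    # Keep only the wet lab rows that belong to the requested pool.
    pool_rows = df_s[df_s[pool_col] == samp]
    hto_to_donor = pool_rows.set_index(hto_col)[donor_col]
    # Align the hashsolo calls to the count-matrix barcodes.
    htos = dem_cs.reindex(bcs.values)
    # "Doublet", "Negative", and unknown labels keep their original value.
    donors = htos.map(hto_to_donor).fillna(htos)
    n_doublets = int((htos == "Doublet").sum())
    n_negatives = int((htos == "Negative").sum())
    return donors, htos, n_doublets, n_negatives

# Toy wet lab table and hashsolo classification for one pool.
df_s = pd.DataFrame({"pool": ["P1", "P1"],
                     "HTO": ["HTO1", "HTO2"],
                     "SubID": ["S001", "S002"]})
bcs = pd.Series(["AAAC-1", "AAAG-1", "AAAT-1", "AACC-1"])
dem_cs = pd.Series(["HTO1", "Doublet", "HTO2", "Negative"], index=bcs.values)

donors, htos, n_doublets, n_negatives = map_htos_to_donors_sketch(
    bcs, df_s, "P1", ["HTO", "SubID"], dem_cs)
print(donors.tolist(), n_doublets, n_negatives)
```

Both returned Series are indexed by cell barcode, which matches the return description above.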
- ret_subj_ids(ser: list, t_df: DataFrame) DataFrame
Return vireo demux stats.
This function returns extra stats from the vireo demux output.
- Parameters:
ser – A pd series of cell barcodes from gene count matrix
t_df – Vireo output (donor_ids.tsv)
- Returns:
A dataframe with extra stats
- Return type:
pd.DataFrame
- set_don_ids(x: str) str
Change the naming conventions of vireo.
Use this function to change the naming convention used in the vireo output (donor_ids.tsv), generally to make it similar to that of calico_solo/hashsolo. It can also be used to tidy donor names so that they can be classified using a converter file.
- Parameters:
x – A string representing a donor classification from vireo
- Returns:
The ‘changed’ classification
- Return type:
str
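A renaming of this kind might look like the sketch below. The actual rules applied by set_don_ids are not documented here, so the mapping is an assumption: it takes vireo's `doublet` and `unassigned` labels and rewrites them to the hashsolo-style `Doublet` and `Negative`, leaving donor names untouched. The function name `rename_vireo_label_sketch` is hypothetical.

```python
def rename_vireo_label_sketch(x: str) -> str:
    """Hypothetical renaming: map vireo's special labels to hashsolo-style ones."""
    # Assumed target convention; the real function may apply different rules.
    special = {"doublet": "Doublet", "unassigned": "Negative"}
    # Any label that is not a special case (e.g. a donor name) is returned as-is.
    return special.get(x, x)

labels = ["donor0", "doublet", "unassigned"]
renamed = [rename_vireo_label_sketch(v) for v in labels]
print(renamed)
```

A function with this shape can be applied element-wise to a classification column, for example with pandas `Series.map`.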