demultiplex_helper_funcs module
- auto_read(fname, **kwargs) DataFrame
- demux_by_calico_solo(bcs: Series, df_s: DataFrame, samp: str, sep: str, col_list: list[str, str], dem_cs: Series, donor_convert: bool) list[pandas.core.series.Series, pandas.core.series.Series, list[str]]
Main function for classification by calico_solo.
This function assigns the calico_solo classification by calling demultiplex_helper_funcs.ret_htos_calico_solo().
- demux_by_vireo(bcs: Series, vir_out_file: str, conv_df: Optional[DataFrame] = None) list[pandas.core.series.Series, list[str]]
Main function for classification by vireoSNP.
This function assigns vireo classification.
- Parameters
bcs – A pd series of cell barcodes from gene count matrix
vir_out_file – Path to donor_ids.tsv file produced by vireo
conv_df – A converter file to change donor names in the vireo output, if needed
- Returns
pd.Series – Classification by vireo per cell
list – Demux stats
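The alignment step described above can be sketched as follows. This is a minimal illustration, not the module's implementation: classify_by_vireo_sketch is a hypothetical helper, the barcodes are made up, and the in-memory text stands in for a real donor_ids.tsv (only its cell and donor_id columns are used).

```python
import io
import pandas as pd

# Tiny stand-in for vireo's donor_ids.tsv; a real file has more columns.
donor_ids_tsv = io.StringIO(
    "cell\tdonor_id\n"
    "AAAC-1\tdonor0\n"
    "AAAG-1\tdoublet\n"
    "AAAT-1\tunassigned\n"
)

def classify_by_vireo_sketch(bcs, vir_out):
    """Hypothetical helper: align vireo calls to the barcodes in `bcs`."""
    vir = pd.read_csv(vir_out, sep="\t", usecols=["cell", "donor_id"])
    calls = vir.set_index("cell")["donor_id"]
    # Barcodes absent from the vireo output become NaN after reindexing.
    return calls.reindex(bcs.to_numpy())

# Barcodes as they would come from the gene count matrix (made-up values).
bcs = pd.Series(["AAAC-1", "AAAG-1", "CCCC-1"])
vireo_calls = classify_by_vireo_sketch(bcs, donor_ids_tsv)
```

How the real function treats barcodes missing from the vireo output is not stated here; leaving them as NaN is only one possible choice.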
- demux_stats(demux_freq: Series, demux_name: str) list[str]
- get_don_ids(x: str, t_df: DataFrame) str
Converts the donor names in the vireo output
- Parameters
x – A string representing a donor
t_df – A converter file that contains the donor and its converted name
- Returns
String representing converted donor name
- Return type
str
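A lookup of this kind might look like the sketch below. The column names primary_genotype and SubID, the function name get_don_ids_sketch, and the choice to return unmatched labels unchanged are all assumptions made for illustration; the converter file's actual layout is not given in this reference.

```python
import pandas as pd

# Hypothetical converter table; real column names may differ.
t_df = pd.DataFrame(
    {"primary_genotype": ["donor0", "donor1"], "SubID": ["S001", "S002"]}
)

def get_don_ids_sketch(x, t_df, from_col="primary_genotype", to_col="SubID"):
    """Return the converted name for donor `x`, or `x` itself if not listed."""
    match = t_df.loc[t_df[from_col] == x, to_col]
    # Assumption: labels without a match (e.g. "doublet") pass through unchanged.
    return match.iloc[0] if not match.empty else x
```

Such a function would typically be applied per cell, for example with Series.map over the vireo classification.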
- parse_file(wet_lab_df, cols, s_name, hs, d_con) Union[list[str], list[list[str], list[str]]]
- ret_htos_calico_solo(bcs: Series, df_s: DataFrame, samp: str, sep: Optional[str], col_list: list[str, str], dem_cs: Series, donor_convert: bool) list[pandas.core.series.Series, pandas.core.series.Series, int, int]
Return HTO information and classification for each cell barcode.
This function returns two pandas series holding the donor IDs and the HTO names (used for calico_solo), along with the number of doublets and the number of negatives identified.
- Parameters
bcs – A pd series of cell barcodes from gene count matrix
df_s – Wet lab file containing HTO information and SubID (donor IDs) for each pool
samp – Pool name (present in df_s file)
sep – Separator used if all HTOs and donors are present in one row, otherwise None
col_list – List of column names (first val HTO, second val SubID)
dem_cs – A pd series with cell barcodes as index and “HTO classification” (solo output)
donor_convert – Whether donor names have to be converted from the names used in the calico_solo (hashsolo) demultiplexing method
- Returns
pd.Series – Contains donor IDs with cell barcodes as index
pd.Series – Contains HTO name with cell barcodes as index
int – number of doublets
int – number of negatives.
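The mapping from HTO calls to donor IDs can be sketched for the one-row case, where sep is not None. Everything specific here is assumed for illustration: the function name hto_to_donor_sketch, the pool column name, the sample values, and the "Doublet" and "Negative" labels in the classification series. The donor_convert step is omitted.

```python
import pandas as pd

def hto_to_donor_sketch(df_s, samp, sep, col_list, dem_cs, pool_col="pool"):
    """Hypothetical sketch: map per-cell HTO calls to donor IDs for one pool."""
    hto_col, don_col = col_list
    # Take the single wet lab row describing this pool.
    row = df_s.loc[df_s[pool_col] == samp].iloc[0]
    hto_map = dict(zip(row[hto_col].split(sep), row[don_col].split(sep)))
    # Labels that are not HTO names (doublets, negatives) pass through unchanged.
    donors = dem_cs.map(lambda h: hto_map.get(h, h))
    n_doublets = int((dem_cs == "Doublet").sum())
    n_negatives = int((dem_cs == "Negative").sum())
    return donors, dem_cs, n_doublets, n_negatives

# Made-up wet lab row with all HTOs and donors of a pool in one row.
df_s = pd.DataFrame({"pool": ["poolA"], "HTO": ["HTO1,HTO2"], "SubID": ["S001,S002"]})
# Made-up per-cell classification, indexed by cell barcode.
dem_cs = pd.Series(
    ["HTO1", "Doublet", "HTO2", "Negative"],
    index=["AAAC-1", "AAAG-1", "AAAT-1", "CCCC-1"],
)
donors, htos, n_dbl, n_neg = hto_to_donor_sketch(
    df_s, "poolA", ",", ["HTO", "SubID"], dem_cs
)
```

When each HTO and donor pair sits on its own row (sep is None), the dictionary would instead be built directly from the two columns.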
- ret_subj_ids(ser: list, t_df: DataFrame) DataFrame
Returns vireo demux stats
This function returns extra stats from vireo demux output
- Parameters
ser – A pd series of cell barcodes from gene count matrix
t_df – Vireo output (donor_ids.tsv)
- Returns
A dataframe with extra stats
- Return type
pd.DataFrame
- set_don_ids(x: str) str
Change naming conventions of Vireo
Use this function to change the naming convention used in the vireo output (donor_ids.tsv), generally to make it similar to that of calico_solo/hashsolo. It can also be used to tidy donor names so that they can be matched using a converter file.
- Parameters
x – A string representing a donor classification from vireo
- Returns
The ‘changed’ classification
- Return type
str
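One possible renaming rule is sketched below. The target labels "Doublet" and "Negative" and the function name set_don_ids_sketch are assumptions chosen for illustration; the actual convention applied by set_don_ids is not specified in this reference.

```python
def set_don_ids_sketch(x):
    """Hypothetical relabelling of vireo calls to hashsolo-style labels."""
    # vireo writes "doublet" and "unassigned" in the donor_id column;
    # the right-hand labels are assumed, not taken from the module.
    relabel = {"doublet": "Doublet", "unassigned": "Negative"}
    # Donor names themselves are returned unchanged in this sketch.
    return relabel.get(x, x)
```

Applied with Series.map, this would bring the vireo labels in line with the assumed calico_solo labels before any converter file is used.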