harreman.tools.apply_gene_filtering
- harreman.tools.apply_gene_filtering(adata, layer_key=None, cell_type_key=None, model=None, feature_elimination=False, threshold=0.2, autocorrelation_filt=False, expression_filt=False, de_filt=False, umi_counts_obs_key=None, device=device(type='cpu'), verbose=False)
Applies multi-step gene filtering to an AnnData object.
- Parameters:
adata (AnnData) – Annotated data object.
layer_key (str, optional) – Key in adata.layers to use, or "use_raw" to use adata.raw.X.
cell_type_key (str, optional) – Key in adata.obs containing cell type annotations.
model (str, optional) – Model name for autocorrelation computation.
feature_elimination (bool, optional (default: False)) – If True, filters genes based on sparsity across all cells.
threshold (float, optional (default: 0.2)) – Minimum fraction of cells in which the gene must be expressed.
autocorrelation_filt (bool, optional (default: False)) – If True, filters genes based on spatial autocorrelation significance.
expression_filt (bool, optional (default: False)) – If True, filters genes based on expression in each cell type.
de_filt (bool, optional (default: False)) – If True, filters genes based on differential expression between each cell type and the rest.
umi_counts_obs_key (str, optional) – Key in adata.obs with total UMI counts per cell. If None, inferred from the expression matrix.
device (torch.device, optional) – Device to use for computation (e.g., CUDA or CPU). Defaults to GPU if available.
verbose (bool, optional (default: False)) – Whether to print progress and status messages.
- Return type:
None
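The sparsity criterion described for feature_elimination and threshold can be illustrated with a minimal, library-independent sketch. It is not harreman's implementation: the helper names fraction_expressing and keep_genes, the toy count matrix, and the gene names are all hypothetical, and the sketch assumes that "expressed" means a nonzero count and that the threshold comparison is inclusive (>=).

```python
# Illustrative sketch only; NOT harreman's code.
# Assumptions: "expressed" = count > 0, and the threshold is inclusive.

def fraction_expressing(counts):
    """For each gene (column), return the fraction of cells (rows) with a nonzero count."""
    n_cells = len(counts)
    n_genes = len(counts[0])
    return [
        sum(1 for cell in counts if cell[g] > 0) / n_cells
        for g in range(n_genes)
    ]


def keep_genes(counts, gene_names, threshold=0.2):
    """Return the names of genes expressed in at least `threshold` of all cells."""
    fractions = fraction_expressing(counts)
    return [name for name, frac in zip(gene_names, fractions) if frac >= threshold]


# Toy matrix: 5 cells (rows) x 3 genes (columns); values are made up.
counts = [
    [3, 0, 0],
    [1, 0, 2],
    [0, 0, 5],
    [2, 0, 1],
    [4, 1, 0],
]
gene_names = ["geneA", "geneB", "geneC"]

# geneA is expressed in 4/5 cells, geneB in 1/5, geneC in 3/5.
kept = keep_genes(counts, gene_names, threshold=0.5)
print(kept)
```

With threshold=0.5 only geneA and geneC pass, since geneB is detected in a single cell. In the documented function, the corresponding behaviour would presumably be requested with feature_elimination=True and a chosen threshold; the exact comparison it applies is not stated in this reference.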