FACSPy.equalize_groups

FACSPy.equalize_groups(adata, fraction=None, n_obs=None, on='sample_ID', random_state=187, as_view=False, copy=False)

Equalizes the cell count between groups. If cell numbers differ between samples or conditions, this function subsamples each group to the same cell count so that no sample is over- or underrepresented. Subsampling is done by random selection of cell indices per group.
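The per-group random selection described above can be sketched with pandas and NumPy. This is only an illustration of the idea, not FACSPy's actual implementation: the helper name sample_indices_per_group and the toy table are hypothetical, and the real function operates on an AnnData object rather than a bare DataFrame.

```python
import numpy as np
import pandas as pd


def sample_indices_per_group(obs, on, n_per_group, random_state=187):
    """Hypothetical helper: pick at most n_per_group row positions per group."""
    rng = np.random.RandomState(random_state)
    kept = []
    # GroupBy.indices maps each group label to the integer positions of its rows
    for _, positions in obs.groupby(on).indices.items():
        # Never ask for more cells than the group contains
        n_keep = min(n_per_group, len(positions))
        # Draw without replacement so no cell is selected twice
        kept.append(rng.choice(positions, size=n_keep, replace=False))
    return np.sort(np.concatenate(kept))


# Toy stand-in for adata.obs: sample "a" has 5 cells, sample "b" has 2 cells
obs = pd.DataFrame({"sample_ID": ["a"] * 5 + ["b"] * 2})
idx = sample_indices_per_group(obs, on="sample_ID", n_per_group=2)
counts = obs.iloc[idx]["sample_ID"].value_counts().to_dict()
```

With a fixed random_state, repeated calls select the same cells, which is what makes the subsampling reproducible.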

Parameters:
  • adata (AnnData) – The anndata object of shape n_obs x n_vars where rows correspond to cells and columns to the channels.

  • fraction (Optional[float]) – Fraction of cells to be kept. The group with the smallest cell count serves as the reference: the final cell number per group is calculated as n_cells * fraction, where n_cells is the cell count of that smallest group.

  • n_obs (Optional[int]) – Absolute number of cells per group to be kept. If this number is greater than the cell count in one group, a warning will be issued and all cells of that group are kept.

  • on (Union[list[str], str]) – The group variable, i.e. the column (or columns) defining the groups to equalize. Defaults to sample_ID, but can be any column in the .obs slot.

  • random_state (int) – Controls the random state for reproducible analysis.

  • as_view (bool) – If True, returns an AnnDataView object.

  • copy (bool) – Whether to copy the dataset.
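How fraction and n_obs translate into a per-group cell count can be sketched as follows. This is a reading of the parameter descriptions above, not the library's code: the helper target_cells_per_group is hypothetical, truncating n_cells * fraction to an integer is an assumption, and so is requiring exactly one of the two parameters.

```python
import warnings


def target_cells_per_group(group_sizes, fraction=None, n_obs=None):
    """Hypothetical helper: number of cells to keep for each group."""
    # Assumption: exactly one of fraction / n_obs is given
    if (fraction is None) == (n_obs is None):
        raise ValueError("Provide exactly one of fraction or n_obs.")
    if fraction is not None:
        # The smallest group is the reference; truncation is an assumption
        target = int(min(group_sizes.values()) * fraction)
    else:
        target = n_obs
    result = {}
    for group, size in group_sizes.items():
        if target > size:
            # Documented behaviour: warn and keep all cells of that group
            warnings.warn(f"Group {group!r} has only {size} cells; keeping all.")
        result[group] = min(target, size)
    return result


sizes = {"healthy": 1000, "disease": 400}
by_fraction = target_cells_per_group(sizes, fraction=0.5)
by_n_obs = target_cells_per_group(sizes, n_obs=500)
```

In this toy case, fraction=0.5 yields 200 cells for both groups, while n_obs=500 keeps 500 healthy cells and all 400 disease cells, so the groups are only fully equalized when n_obs does not exceed the smallest group.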

Return type:

AnnData or None, depending on copy.

Examples

>>> import FACSPy as fp
>>> dataset = fp.create_dataset(...)
>>> fp.equalize_groups(dataset, n_obs=300_000, on="disease_group")