FACSPy.sync.synchronize_dataset
- FACSPy.sync.synchronize_dataset(adata, recalculate=False, copy=False)
Synchronizes the unstructured metadata with the underlying data, so that the unstructured data are updated whenever the dataset is subset or otherwise changed.
To detect a changed dataset, we generate a hash that is based on the individual entries that we want to compare. If this hash changes, the dataset has been modified in some way.
The function only triggers modifications if the stored and the current hash differ, which avoids many unnecessary lookups.
The hash is generated upon dataset creation and stored in adata.uns["dataset_status_hashs"], a dictionary that can hold multiple entries.
Currently, adata.obs_names, adata.var_names, the unique sample_ID values in .obs and in .uns["metadata"], as well as the columns of .obs and of .uns["metadata"] are hashed.
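The hash comparison described above can be illustrated with a minimal sketch. It is not FACSPy's implementation: the helpers hash_entries and changed_keys, the choice of SHA-256, and the example cell and channel names are all assumptions made for illustration. It only shows how storing one hash per entry lets a later check find which entries changed.

```python
import hashlib


def hash_entries(entries):
    """Return a SHA-256 hex digest of an iterable of string-like entries.

    Hypothetical helper; the hash function FACSPy actually uses is not
    stated in this documentation.
    """
    digest = hashlib.sha256()
    for entry in entries:
        # A separator after each entry keeps ["ab", "c"] distinct from ["a", "bc"].
        digest.update(str(entry).encode("utf-8"))
        digest.update(b"\x00")
    return digest.hexdigest()


def changed_keys(stored_hashes, current_entries):
    """Return the keys whose current hash differs from the stored hash."""
    return [
        key
        for key, entries in current_entries.items()
        if stored_hashes.get(key) != hash_entries(entries)
    ]


# Hashes recorded when the (hypothetical) dataset was created.
obs_names = ["cell_0", "cell_1", "cell_2"]
var_names = ["FSC-A", "SSC-A", "CD3"]
stored = {
    "obs_names": hash_entries(obs_names),
    "var_names": hash_entries(var_names),
}

# After subsetting the cells, only the obs_names hash differs.
subset_obs_names = obs_names[:2]
modified = changed_keys(
    stored, {"obs_names": subset_obs_names, "var_names": var_names}
)
print(modified)  # ['obs_names']
```

Because an empty list means nothing changed, a caller following this pattern can return early without touching the unstructured data.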
- Parameters:
adata (AnnData) – The anndata object of shape n_obs x n_vars where rows correspond to cells and columns to the channels
recalculate (bool) – If True, recalculates data stored in adata.uns based on the settings as stored in adata.uns["settings"]
copy (bool) – Whether to copy the dataset.
- Return type:
AnnData or None, depending on copy.
Examples
>>> import FACSPy as fp
>>> dataset = fp.create_dataset(...)
>>> dataset = dataset[dataset.obs["sample_ID"].isin(["1", "2"]),:]
>>> fp.sync.synchronize_dataset(dataset)