scBiMapping

Fast and Accurate Non-linear Dimensionality Reduction and Cell Annotation for Large and High-dimensional Single-Cell Datasets

Install

pip install scBiMapping

note 1: the source code has not yet been uploaded (we will upload it once our paper is published). The currently uploaded code was compiled with Python 3.11, so the package must be run under Python 3.11.

note 2: if you are a BGIer, you can directly use the public image (named scBiMapping) on the cloud platform.
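
The Python 3.11 requirement in note 1 can be checked before importing the package. The snippet below is a small sketch that uses only the standard library; the helper name is_supported_python is hypothetical and is not part of scBiMapping.

```python
import sys

# Version required by the compiled package, as stated in note 1.
REQUIRED = (3, 11)


def is_supported_python(version_info=sys.version_info):
    """Return True when the major.minor version is exactly 3.11."""
    return tuple(version_info[:2]) == REQUIRED


if not is_supported_python():
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Warning: scBiMapping was compiled for Python 3.11, found Python {found}.")
```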

How to use

There are two major functions in scBiMapping, scBiMapping_DR and scBiMapping_annotation, corresponding to the following two tasks.

Task 1: Dimension reduction

scBiMapping_DR(adata, n_embedding = 30, normalization = True)

Input:
- adata: AnnData object (the cell-by-feature sparse matrix is stored in adata.X);
- n_embedding: an integer, the number of embedding dimensions (defaults to 30; slight adjustment may improve performance in practice);
- normalization: whether to normalize each embedded vector to unit norm (defaults to True).
Output:
- The embedded matrix is stored in adata.obsm['U'], where each row is the embedded vector of one cell.
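
A minimal usage sketch for Task 1. It assumes the function can be imported with "from scBiMapping import scBiMapping_DR" (this README only names the function, so the import path is an assumption) and that the call modifies adata in place, as the Output description suggests. The wrapper embed_cells and its dr_func parameter are hypothetical; dr_func only exists so the wrapper can be exercised without the package installed.

```python
def embed_cells(adata, n_embedding=30, normalization=True, dr_func=None):
    """Run scBiMapping_DR on adata and return the embedding from adata.obsm['U'].

    dr_func is a hypothetical hook: pass a stand-in callable to try the
    wrapper without scBiMapping installed.
    """
    if dr_func is None:
        # Assumed import path; adjust if the package exposes it differently.
        from scBiMapping import scBiMapping_DR as dr_func

    # Per the Output description, the result is written into adata.obsm['U'].
    dr_func(adata, n_embedding=n_embedding, normalization=normalization)
    return adata.obsm['U']


# Usage (not executed here; requires anndata and scBiMapping):
# import anndata as ad
# adata = ad.read_h5ad("my_cells.h5ad")   # hypothetical file name
# U = embed_cells(adata, n_embedding=30)  # one row per cell
```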

Task 2: Reference-based cell type annotation

scBiMapping_annotation(adata_ref, adata_query, n_embedding = 30, K = 30, K_majority = 10, CellType_Key_for_ref = 'cell_annotation', knnMethod = 'HNSW', normalization = True, reduction_method_on_cells_only = 'BiMapping', metric = 'euclidean', n_embedding_2nd = None)

Input:
- adata_ref: reference dataset (AnnData format);
- adata_query: query dataset (AnnData format). Note: the reference and query datasets must share the same feature set, which can be ensured as follows, for instance:
  - intersection_feature = list(set(adata_ref.var_names) & set(adata_query.var_names))
  - adata_ref = adata_ref[:,intersection_feature]
  - adata_query = adata_query[:,intersection_feature]
- n_embedding: an integer, the number of embedding dimensions (defaults to 30; slight adjustment may improve performance in practice);
- K: an integer, the number of features used as the new vector representation of each cell in the embedding (defaults to 30; adjustment may be needed in practice);
- K_majority: an integer, the number of reference cells used for majority voting (defaults to 10; adjustment may be needed in practice);
- CellType_Key_for_ref: key in adata_ref.obs that stores the cell type labels of the reference cells (defaults to 'cell_annotation'; IMPORTANT: make sure it matches your reference data);
- knnMethod: fast k-nearest-neighbor search method: 'HNSW' (default) or 'NNDescent' (also recommended);
- normalization: whether to normalize each embedded vector to unit norm (defaults to True);
- reduction_method_on_cells_only: dimension reduction applied to the new representation in the embedded space: 'BiMapping' (default) or 'None';
- metric: distance metric in the embedded space: 'euclidean' (default), 'cosine', or 'ip' (inner product);
- n_embedding_2nd: number of embedding dimensions in the second round of dimension reduction: None (n_embedding is used) or a user-specified value.
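
The set intersection in the note on adata_query returns features in arbitrary order. Both datasets are subset with the same list, so their columns stay aligned either way, but a fixed order can make runs easier to compare. The helper below is a sketch of an order-preserving alternative; shared_features is a hypothetical name, and it assumes feature names are unique within each dataset.

```python
def shared_features(ref_names, query_names):
    """Return the features present in both datasets, in reference order."""
    query_set = set(query_names)  # set lookup keeps this linear in size
    return [name for name in ref_names if name in query_set]


# Toy example with plain lists of gene names.
features = shared_features(["g1", "g2", "g3"], ["g3", "g1", "g9"])
print(features)  # ['g1', 'g3']

# Usage with AnnData objects (not executed here):
# features = shared_features(adata_ref.var_names, adata_query.var_names)
# adata_ref = adata_ref[:, features].copy()      # .copy() turns the view into a real object
# adata_query = adata_query[:, features].copy()
```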
Output:
- the predicted cell types for all query cells are stored in adata_query.obs['cell_type_predicted']
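
A minimal usage sketch for Task 2. It assumes the function can be imported with "from scBiMapping import scBiMapping_annotation" (the import path is an assumption) and that predictions are written into adata_query in place, as described in the Output above. The wrapper annotate_query and its annotate_func parameter are hypothetical; the wrapper checks the label key first, since a wrong CellType_Key_for_ref is an easy mistake to make.

```python
def annotate_query(adata_ref, adata_query, cell_type_key="cell_annotation",
                   annotate_func=None, **kwargs):
    """Annotate adata_query from adata_ref and return the predicted labels.

    annotate_func is a hypothetical hook: pass a stand-in callable to try
    the wrapper without scBiMapping installed. Extra keyword arguments
    (K, K_majority, knnMethod, ...) are forwarded unchanged.
    """
    # Fail early if the reference labels are missing.
    if cell_type_key not in adata_ref.obs:
        raise KeyError(f"'{cell_type_key}' not found in adata_ref.obs")

    if annotate_func is None:
        # Assumed import path; adjust if the package exposes it differently.
        from scBiMapping import scBiMapping_annotation as annotate_func

    annotate_func(adata_ref, adata_query,
                  CellType_Key_for_ref=cell_type_key, **kwargs)
    return adata_query.obs["cell_type_predicted"]


# Usage (not executed here; requires anndata and scBiMapping):
# predicted = annotate_query(adata_ref, adata_query, K=30, K_majority=10)
# print(predicted.value_counts())
```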

Tutorials for tasks 1 and 2

We provide eight demos showing how to perform dimension reduction and reference-based cell type annotation with scBiMapping; see https://cloud.stomics.tech/library/#/tool/detail/workspace_notebook/NB0120241204Yng3Pn/--?zone=sz

Scripts to reproduce primary experimental results

See the corresponding files in this GitHub repository. A reproducible capsule is also available on Code Ocean: https://codeocean.com/capsule/3904732/tree.
