How to generate embedding
Suhas Srinivasan edited this page Dec 29, 2018 · 3 revisions
An additional script, visualizer.py, is provided to generate a 2D embedding of either the scRNA-seq data or the latent features from DAWN. The steps to create the visualizations are listed below.
- The scRNA-seq data should be in a Cell x Gene matrix, where Cells are the rows and Genes are the columns.
- The matrix values can be counts or one of the four RNA-seq expression units (RPM, TPM, FPKM or RPKM).
Note: Log-normalized values should not be used.
- Perform any necessary filtering of cells based on your quality criteria.
- Remove all row and column labels; only the numerical matrix should be present, saved as a comma-separated values (CSV) file.
- Generate the embedding:
python visualizer.py <path to data csv file>
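The label-removal step can be sketched with pandas. This is a minimal illustration, not part of DAWN: the function name `write_unlabeled_matrix` and the `genes_in_rows` flag are hypothetical, and the sketch assumes the filtered expression table is already loaded as a DataFrame with cell and gene labels.

```python
import pandas as pd


def write_unlabeled_matrix(labeled: pd.DataFrame, out_path: str,
                           genes_in_rows: bool = False) -> pd.DataFrame:
    """Write a label-free Cell x Gene CSV from a labeled expression table.

    Hypothetical helper: `labeled` is assumed to hold only expression values,
    with labels stored in the index and column names.
    """
    # Transpose a Gene x Cell table so that cells become the rows.
    matrix = labeled.T if genes_in_rows else labeled

    # Fail early if a non-numeric column slipped in (e.g. a gene-symbol column).
    non_numeric = matrix.select_dtypes(exclude="number").columns
    if len(non_numeric) > 0:
        raise ValueError(f"non-numeric columns present: {list(non_numeric)}")

    # header=False and index=False drop the column and row labels,
    # leaving only the numerical matrix in the CSV file.
    matrix.to_csv(out_path, header=False, index=False)
    return matrix
```

Calling `write_unlabeled_matrix(table, "matrix.csv")` on a Cell x Gene table would produce a file that can then be passed to visualizer.py; the file name here is only an example.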
- For the latent features from DAWN, unlike scRNA-seq data, no additional preparation steps are required.
- To generate the embedding:
python visualizer.py <path to latent features csv file>
Two files are created after the visualizer completes.
- A CSV file that contains the (X, Y) coordinates for the samples. This file is named similarly to the input file but has the suffix: 2d_coord.
- A TIF image containing the plot of the embedding. This file is also named similarly to the input file but has the suffix: 2d_viz.
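For downstream checks, the coordinate file can be read back with NumPy. The sketch below is an assumption-laden illustration: the page does not state whether the CSV has a header row, so the hypothetical `has_header` flag is left to the reader, and the function name `load_embedding` is not part of DAWN.

```python
import numpy as np


def load_embedding(coord_csv: str, has_header: bool = False) -> np.ndarray:
    """Load (X, Y) coordinates into an array of shape (n_samples, 2).

    Hypothetical helper: set has_header=True if the coordinate CSV
    turns out to start with a header row (not specified in this page).
    """
    coords = np.loadtxt(
        coord_csv,
        delimiter=",",
        skiprows=1 if has_header else 0,
        ndmin=2,  # keep a 2D shape even when the file has a single row
    )
    # One row per sample, one column each for X and Y.
    if coords.shape[1] != 2:
        raise ValueError(f"expected 2 columns (X, Y), found {coords.shape[1]}")
    return coords
```

A quick sanity check is that the number of rows returned equals the number of cells in the input matrix.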
Typically, a 2D embedding contains some cell clusters with good separation. The number of clusters in the embedding can be used as numClusters for EM clustering.