inquiry about Loading scATAC-seq matrices into R #6

Open
jiangzh-coder opened this issue Nov 11, 2022 · 2 comments

@jiangzh-coder

Hi,

I want to analyze some scATAC-seq data. After unzipping, I got 10 folders (one patient per folder). Each folder contains 2 subfolders, in which the scRNA-seq data and the scATAC-seq data are stored separately. These 2 subfolders contain files generated by Cell Ranger (I attached 2 pictures).
How can I load the scATAC-seq matrices, as well as the scRNA-seq data, into R? Could you kindly provide some code?

image

@jiangzh-coder

jzhou@jiang:/mnt/d/##files/#atac/HD2$ tail atac_peaks.bed
GL000218.1 83275 84106
KI270726.1 27131 28058
KI270726.1 41490 42368
KI270711.1 7979 8731
KI270711.1 8887 9380
KI270713.1 15800 16459
KI270713.1 17290 18022
KI270713.1 21445 22340
KI270713.1 32734 33486
KI270713.1 36862 37780

jzhou@jiang:/mnt/d/##files/#atac/HD2$ head atac_peak_annotation.tsv
chrom start end gene distance peak_type
chr1 9778 10667 MIR1302-2HG -18887 distal
chr1 180732 181004 AL627309.5 -6871 distal
chr1 181116 181809 AL627309.5 -7255 distal
chr1 183935 184770 AL627309.5 -10074 distal
chr1 191103 192028 AL627309.5 -17242 distal
chr1 267627 268482 AP006222.2 773 distal
chr1 629498 630379 AC114498.1 41870 distal
chr1 633577 634503 AC114498.1 45949 distal
chr1 778280 779196 LINC01409 0 promoter

jzhou@jiang:/mnt/d/##files/#atac/HD2$ tail -5 atac_fragments.tsv
KI270713.1 39073 39449 TAAGTAGCACAGGATG-1 1
KI270713.1 39075 39208 GCTTAACAGTTCCCGT-1 2
KI270713.1 39697 39805 TGTTATGAGGGCTTTG-1 2
KI270713.1 40366 40677 GACCTAAGTTCCGGCT-1 2
KI270713.1 40639 40708 GGATGGCCACACCAAC-1 2
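
As a quick sanity check on files like these before loading them into R, the fragments file can be summarized per cell barcode. This is only a minimal sketch, assuming the standard 10x layout of tab-separated columns (chromosome, start, end, barcode, read count) with optional header lines starting with `#`. The function name `fragments_per_barcode` is hypothetical and not part of any tool.

```shell
# Hypothetical helper: count fragments per cell barcode in a 10x fragments file.
# Assumes tab-separated columns: chrom, start, end, barcode, read count.
fragments_per_barcode() {
    # $1: path to an uncompressed fragments file (e.g. atac_fragments.tsv)
    # Skip '#' header lines, tally column 4 (the barcode), then sort by
    # descending count and break ties by barcode so the order is stable.
    awk -F '\t' '!/^#/ { n[$4]++ } END { for (bc in n) print bc "\t" n[bc] }' "$1" |
        sort -t "$(printf '\t')" -k2,2nr -k1,1
}
```

For example, `fragments_per_barcode atac_fragments.tsv | head` would list the barcodes with the most fragments. Barcodes with very few fragments are usually background rather than cells.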

@jiangzh-coder

Hello,

I processed the upstream data for this dataset (GSE199994). I downloaded the SRA files from EBI, converted them to FASTQ files, renamed them to the standard naming scheme, and ran cellranger-atac. I got the following error:

2.7% (< 10%) of read pairs have a valid 10x barcode. This could be a result of poor sequencing quality,
a sample mixup, or running the wrong pipeline, for example, running cellranger-atac on Multiome ATAC + GEX data, or vice versa.

The full set of commands is as follows:

ascp -QT -l 300m -P33001 -i ~/miniconda3/envs/my10x/etc/asperaweb_id_dsa.openssh \
  era-fasp@fasp.sra.ebi.ac.uk:/vol1/srr/SRR186/006/SRR18613306 .

mv SRR18613306 SRR18613306.sra
parallel-fastq-dump -t 12 -O ./ --split-files --gzip -s SRR18613306.sra

mv *.fastq.gz /home/ubuntu/GSE199994/scATAC/2.raw_fastq

mv SRR18613295_1.fastq.gz SRR18613295-P5_S9_L001_I1_001.fastq.gz
mv SRR18613295_2.fastq.gz SRR18613295-P5_S9_L001_R1_001.fastq.gz
mv SRR18613295_3.fastq.gz SRR18613295-P5_S9_L001_R2_001.fastq.gz
mv SRR18613295_4.fastq.gz SRR18613295-P5_S9_L001_R3_001.fastq.gz

cellranger-atac count --id=SRR18613295-P5 \
  --reference=/home/ubuntu/biosoftware/refdata-cellranger-arc-GRCh38-2020-A-2.0.0 \
  --fastqs=/home/ubuntu/GSE199994/scATAC/2.raw_fastq \
  --sample=SRR18613295-P5 \
  --localcores=24 \
  --localmem=96
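
A low valid-barcode rate after renaming files dumped from SRA often comes down to which file actually holds the barcode read. A minimal diagnostic sketch follows, assuming gzip-compressed FASTQ files. The helper name `read_length_summary` is hypothetical. It prints the read lengths found among the first reads of a file, so that the `_1` to `_4` files can be matched to I1/R1/R2/R3 by length rather than by position. The expected lengths mentioned below are assumptions to check against the 10x documentation.

```shell
# Hypothetical helper: summarize read lengths in the first N reads of a FASTQ.
read_length_summary() {
    # $1: path to a gzip-compressed FASTQ file
    # $2: number of reads to sample (default 1000)
    local fastq="$1" n="${2:-1000}"
    # A FASTQ record is 4 lines; the sequence is the 2nd line of each record.
    gzip -dc "$fastq" | head -n "$((n * 4))" |
        awk 'NR % 4 == 2 { count[length($0)]++ }
             END { for (len in count) print len, count[len] }' |
        sort -n
}
```

Running `read_length_summary SRR18613295_3.fastq.gz` on each of the four dumped files prints one `length count` pair per line. For a standalone scATAC library, the file named R2 should be the 16 bp barcode read and I1 the 8 bp sample index. A barcode read of about 24 bp would instead suggest a Multiome ATAC library, which is the case the error message warns about.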

Could you help me?
