|
1 |
| -# PacBio Amplicon Analysis (_pbaa_) |
| 1 | +<p align="center"> |
| 2 | + <img src="img/pbaa_logo_transparent.png" alt="pbaa logo" width="250px"/> |
| 3 | +</p> |
| 4 | +<h1 align="center"><i>pbaa</i></h1> |
| 5 | +<p align="center">PacBio Amplicon Analysis</p> |
2 | 6 |
|
| 7 | +*** |
3 | 8 |
|
4 | 9 | PacBio Amplicon Analysis (_pbaa_) separates complex mixtures of amplicon targets from genomic samples. The _pbaa_ application is designed to cluster HiFi reads and generate high-quality consensus sequences from them. It works only on HiFi amplicon data: the code makes several assumptions that hold only for high-quality reads (>QV20), so it will not work on CLR data. _pbaa_ is a reference-aided (pseudo de novo) method.
|
5 | 10 |
|
6 |
| -Typical use cases involve multi-allelic samples where the sample-specific ploidy or copy number is unknown. _pbaa_ can effectively separate alleles with one to many variants, including SNVs and large indels contained within the target region. _pbaa_ has been optimized and tested for datasets with a moderate (<10) cluster count. Feedback for higher cluster density is welcome and may be addressed in future releases. |
| 11 | +Typical use cases involve multi-allelic samples where the sample-specific ploidy or copy number is unknown. _pbaa_ can effectively separate alleles with one to many variants, including SNVs and large indels contained within the target region. _pbaa_ has been optimized and tested for datasets with a moderate (<10) cluster count. Feedback for higher cluster density is welcome and may be addressed in future releases. |
7 | 12 |
|
8 | 13 | ## Workflow
|
9 | 14 | 
|
@@ -105,7 +110,7 @@ _pbaa_ supports batching of samples via the FOFN (file of file name[s]) format.
|
105 | 110 |
|
106 | 111 | Guide/reference sequence choice affects read grouping/placement. It is important to choose guides that are sufficiently divergent. If too many similar alleles are used for the same locus, the fraction of unplaced reads will increase because the number of informative k-mers within the locus decreases. Too few guides can also cause cluster dropout; it is a Goldilocks problem.
|
107 | 112 |
|
108 |
| -Guide sequences should be grouped into locus assignments. For example if multiple HLA-A alleles are used in the guide sequence, they should be grouped, so clustering will be performed at the locus level. |
| 113 | +Guide sequences should be grouped into locus assignments. For example if multiple HLA-A alleles are used in the guide sequence, they should be grouped, so clustering will be performed at the locus level. |
109 | 114 |
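Before running a batch, it can help to check how guides are distributed across loci. The sketch below is not part of _pbaa_. It assumes the guides are in a FASTA file whose headers follow the `sequence name|group name` convention described here. The function name `group_guides_by_locus`, the `UNGROUPED` label, the extra allele names, and the placeholder sequences are all hypothetical.

```python
from collections import defaultdict

# Hypothetical label for headers without a "|group" suffix (not a pbaa concept).
UNGROUPED = "UNGROUPED"


def group_guides_by_locus(fasta_text):
    """Map each group (locus) name to the guide sequence names assigned to it."""
    groups = defaultdict(list)
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence lines and blank lines
        header = line[1:].strip()
        # Split on the last "|" so the text before it is kept whole as the name.
        name, sep, group = header.rpartition("|")
        if not sep:
            # No "|" in the header: flag the guide so it can be reviewed.
            name, group = header, UNGROUPED
        groups[group].append(name)
    return dict(groups)


# Placeholder guide set; only the "Allele_1|HLA-A" header comes from the README.
example = """>Allele_1|HLA-A
ACGT
>Allele_2|HLA-A
ACGA
>Allele_3|HLA-B
TTGA
"""

loci = group_guides_by_locus(example)
for locus, names in loci.items():
    print(locus, len(names), ", ".join(names))
```

A locus with many near-identical guides, or any guide reported under `UNGROUPED`, is worth reviewing before clustering.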
|
110 | 115 | ```
|
111 | 116 | Allele_1|HLA-A (sequence name | group name)
|
@@ -177,7 +182,7 @@ m54043_190914_194303/4195156/ccs HLA-B + HLA00158_B_14-02-01-01_4070_bp|HLA-B 0.
|
177 | 182 |
|
178 | 183 | ## Best practices
|
179 | 184 |
|
180 |
| -### Sample preparation and sequencing |
| 185 | +### Sample preparation and sequencing |
181 | 186 |
|
182 | 187 | [Targeted Sequencing For Amplicons Document](https://www.pacb.com/wp-content/uploads/Application-Brief-Targeted-sequencing-Best-Practices.pdf)
|
183 | 188 |
|
|