|
| 1 | +# Fastq Sync Service |
| 2 | + |
| 3 | +The fastq sync service is a simple service that allows step functions with fastq set ids as inputs to 'hang' |
| 4 | +until the requirements of the fastq set have been met. |
| 5 | + |
| 6 | +This is useful for workflow-glue and data sharing services that have fastq set ids but need to wait for one of the following: |
| 7 | + |
| 8 | +1. The fastq set readsets to be created |
| 9 | +2. The fastq set to have been qc'd AND to have a fingerprint file and compression information |
| 10 | +3. The fastqs to be unarchived, which data sharing services require before the fastqs can be shared |
| 11 | + |
| 12 | +The step function registers a task token and hangs at that step until the token has been 'unlocked' by the fastq sync service. |
| 13 | + |
| 14 | +## Registering task tokens |
| 15 | + |
| 16 | +Workflow-glue services can use the fastq sync service by generating the following event: |
| 17 | + |
| 18 | +```json5 |
| 19 | +{ |
| 20 | + "EventBusName": "OrcaBusMain", |
| 21 | + "Source": "doesnt matter", |
| 22 | + "DetailType": "FastqSync", |
| 23 | + "Detail": { |
| 24 | + "taskToken": "uuid", |
| 25 | + "fastqSetId": "fqs.123456", |
| 26 | + // Then one or more of the following |
| 27 | + // Requirements can be left out if not needed |
| 28 | + "requirements": { |
| 29 | + // Do all fastq list rows in the set contain readsets? |
| 30 | + "hasActiveReadSet": true, |
| 31 | + // Do all fastq list rows in the set contain an ntsm uri? |
| 32 | + "hasFingerprint": true, |
| 33 | + // Do all fastq list rows in the set contain compression information? |
| 34 | + // Useful if the fastq list rows are in ora format. |
| 35 | + // Some pipelines require the gzip file size in bytes in order |
| 36 | + // to stream the gzip file from ora back into s3 |
| 37 | + "hasFileCompressionInformation": true, |
| 38 | + // Do all fastq list rows in the set contain qc information? |
| 39 | + // We don't use this for anything yet but we may use this in the future |
| 40 | + // to ensure that a fastq set has met the ideal coverage levels |
| 41 | + "hasQc": true, |
| 42 | + }, |
| 43 | + "forceUnarchiving": true, // Force unarchiving of a fastq file if necessary, will fail if not set and fastq is in archive |
| 44 | + } |
| 45 | +} |
| 46 | +``` |
| 47 | + |
| 48 | +The fastq sync service will also trigger the qc, fingerprint or compression information services if the corresponding information does not yet exist. |
| 49 | + |
| 50 | +If any of the fastq list rows are in archive, the fastq sync service will also trigger the fastq unarchiving service to thaw out these fastq list rows |
| 51 | +and place them into the 'byob' bucket. |
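As a rough illustration of the registration event above, the following Python sketch builds a matching EventBridge entry. The helper name `build_fastq_sync_entry` and the `Source` value are hypothetical and not part of the service. In a Step Functions callback task the token would typically come from the `$$.Task.Token` context value rather than being hard-coded.

```python
import json


def build_fastq_sync_entry(task_token, fastq_set_id, requirements=None, force_unarchiving=False):
    """Build one EventBridge entry in the FastqSync shape shown in the README (hypothetical helper)."""
    detail = {
        "taskToken": task_token,
        "fastqSetId": fastq_set_id,
    }
    # Requirements can be left out if not needed
    if requirements:
        detail["requirements"] = requirements
    if force_unarchiving:
        detail["forceUnarchiving"] = True
    return {
        "EventBusName": "OrcaBusMain",
        "Source": "my.workflow.glue",  # hypothetical value; the README says the source does not matter
        "DetailType": "FastqSync",
        # EventBridge expects Detail as a JSON-encoded string
        "Detail": json.dumps(detail),
    }


entry = build_fastq_sync_entry(
    task_token="uuid",
    fastq_set_id="fqs.123456",
    requirements={"hasActiveReadSet": True, "hasFingerprint": True},
    force_unarchiving=True,
)
print(json.dumps(entry, indent=2))

# With boto3 installed and AWS credentials configured, the entry could be sent with:
#   boto3.client("events").put_events(Entries=[entry])
```

Building the entry separately from sending it keeps the payload easy to inspect and test without AWS access.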
| 52 | + |
| 53 | + |
| 54 | +## Unlocking task tokens |
| 55 | + |
| 56 | +The fastq sync service will then also listen for the following event types: |
| 57 | + |
| 58 | +1. FastqListRowUpdated (from the fastq management service) |
| 59 | +2. UnarchivingJobUpdated (from the fastq unarchiving service, where the status is 'SUCCEEDED') |
| 60 | + |
| 61 | +Every time one of these events is received, the fastq sync service will check if the fastq list row or fastq set has met the requirements. |
| 62 | +If all requirements are met for the fastq set, the fastq sync service will unlock the task token. |
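The check described above could look roughly like the sketch below. It is only an illustration: the fastq list row field names in `REQUIREMENT_FIELDS` and both function names are assumptions, since this README does not describe the actual data model or implementation.

```python
# Hypothetical mapping from requirement name to the fastq list row field
# that must be populated; the real field names are not given in this README.
REQUIREMENT_FIELDS = {
    "hasActiveReadSet": "readSet",
    "hasFingerprint": "ntsmUri",
    "hasFileCompressionInformation": "compressionInformation",
    "hasQc": "qc",
}


def unmet_requirements(fastq_list_rows, requirements):
    """Return the sorted names of requested requirements not yet met by every row."""
    unmet = []
    for name, wanted in requirements.items():
        if not wanted:
            continue  # requirement listed but not requested
        field = REQUIREMENT_FIELDS[name]
        if not all(row.get(field) for row in fastq_list_rows):
            unmet.append(name)
    return sorted(unmet)


def can_unlock(fastq_list_rows, requirements):
    """A task token is unlockable only if the set has rows and nothing is unmet."""
    return bool(fastq_list_rows) and not unmet_requirements(fastq_list_rows, requirements)


rows = [
    {"readSet": {"id": "rs.1"}, "ntsmUri": "s3://example-bucket/a.ntsm"},
    {"readSet": {"id": "rs.2"}, "ntsmUri": None},
]
requirements = {"hasActiveReadSet": True, "hasFingerprint": True}

print(unmet_requirements(rows, requirements))  # ['hasFingerprint']
print(can_unlock(rows, requirements))          # False

# Simulate a FastqListRowUpdated event that fills in the missing fingerprint
rows[1]["ntsmUri"] = "s3://example-bucket/b.ntsm"
print(can_unlock(rows, requirements))          # True

# Once can_unlock(...) is True, the token could be released with boto3, e.g.:
#   boto3.client("stepfunctions").send_task_success(taskToken=token, output="{}")
```

Re-running the full check on every event keeps the handler stateless: it does not need to remember which requirement a given update was supposed to satisfy.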