feat: update tutorials for aws batch #312

Merged
merged 5 commits into from
May 17, 2024
2 changes: 2 additions & 0 deletions tutorials/_config.yml
@@ -9,6 +9,8 @@ logo: ../docs_old/figures/logo.png
 # See https://jupyterbook.org/content/execute.html
 execute:
   execute_notebooks: force
+  exclude_patterns:
+    - noisepy_aws_batch.ipynb
   timeout: 360

 only_build_toc_files: true
6 changes: 2 additions & 4 deletions tutorials/_toc.yml
@@ -2,14 +2,12 @@
 # Learn more at https://jupyterbook.org/customize/toc.html

 format: jb-book
-root: noise_configuration.md
+root: noisepy_configuration.md
 chapters:
 - file: get_started.ipynb
 - file: noisepy_datastore.ipynb
 - file: noisepy_scedc_tutorial.ipynb
 - file: noisepy_ncedc_tutorial.ipynb
 - file: noisepy_compositestore_tutorial.ipynb
 - file: CLI.md
-- file: cloud/checklist.md
-- file: cloud/aws-ec2.md
-- file: cloud/aws-batch.md
+- file: cloud/noisepy_aws_batch.ipynb
28 changes: 28 additions & 0 deletions tutorials/cloud/README.md
@@ -0,0 +1,28 @@
# Running NoisePy with AWS

## EC2 and Jupyter Lab
Please refer to the [SCOPED HPS Book](https://seisscoped.org/HPS-book/chapters/cloud/AWS_101.html) for detailed instructions on launching an AWS EC2 instance and/or running the notebooks within a containerized environment.

## Submit Batch Job
For large workloads, please refer to the [notebook tutorial](./noisepy_aws_batch.ipynb) for further instructions.

## Command Line Interface
You may create or edit the [config.yml](../config.yml) file with the appropriate parameters. The cross-correlation functions are written to the location given by `ccf_path`.

```bash
noisepy cross_correlate --format numpy --raw_data_path s3://scedc-pds/continuous_waveforms/ \
--xml_path s3://scedc-pds/FDSNstationXML/CI/ \
--ccf_path s3://<S3_BUCKET>/<CC_PATH> \
--stations=SBC,RIO,DEV \
--start=2022-02-02 \
--end=2022-02-03
```
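
When the command is launched from a notebook or a script, it can help to assemble the argument list in one place. The following is a minimal sketch, assuming the `noisepy` executable is on the `PATH` and accepts exactly the flags shown above. `build_cross_correlate_cmd` is a hypothetical helper, not part of NoisePy, and `my-bucket` and `ccf` are placeholder values.

```python
import shlex


def build_cross_correlate_cmd(bucket, cc_path, stations, start, end):
    """Return the argv list for the cross_correlate command shown above.

    Hypothetical helper: only the flags that appear in this README are used.
    """
    return [
        "noisepy", "cross_correlate",
        "--format", "numpy",
        "--raw_data_path", "s3://scedc-pds/continuous_waveforms/",
        "--xml_path", "s3://scedc-pds/FDSNstationXML/CI/",
        "--ccf_path", f"s3://{bucket}/{cc_path}",
        f"--stations={','.join(stations)}",
        f"--start={start}",
        f"--end={end}",
    ]


cmd = build_cross_correlate_cmd(
    "my-bucket", "ccf", ["SBC", "RIO", "DEV"], "2022-02-02", "2022-02-03"
)
# Print a shell-quoted version for inspection before running anything.
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems with station lists or paths that contain special characters.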

This toy example gathers all the calculated cross-correlations and stacks them, writing the result in NumPy format to the S3 location specified by `stack_path`.

```bash
noisepy stack \
--format numpy \
--ccf_path s3://<S3_BUCKET>/<CC_PATH> \
--stack_path s3://<S3_BUCKET>/<STACK_PATH>
```
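
The `<S3_BUCKET>`, `<CC_PATH>` and `<STACK_PATH>` tokens must be replaced before either command is run. A small pre-flight check along these lines can catch a forgotten substitution. This is a sketch only: `unresolved_placeholders` is a hypothetical helper, and it assumes placeholders always take the `<UPPER_CASE>` form used in this README.

```python
import re

# Matches tokens such as <S3_BUCKET> or <CC_PATH>.
PLACEHOLDER = re.compile(r"<[A-Z0-9_]+>")


def unresolved_placeholders(args):
    """Return every <UPPER_CASE> placeholder still present in the arguments."""
    found = []
    for arg in args:
        found.extend(PLACEHOLDER.findall(arg))
    return found


stack_cmd = [
    "noisepy", "stack",
    "--format", "numpy",
    "--ccf_path", "s3://<S3_BUCKET>/<CC_PATH>",
    "--stack_path", "s3://<S3_BUCKET>/<STACK_PATH>",
]

leftover = unresolved_placeholders(stack_cmd)
if leftover:
    print("Replace these placeholders first:", sorted(set(leftover)))
```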
63 changes: 0 additions & 63 deletions tutorials/cloud/aws-batch.md

This file was deleted.

100 changes: 0 additions & 100 deletions tutorials/cloud/aws-ec2.md

This file was deleted.

75 changes: 0 additions & 75 deletions tutorials/cloud/checklist.md

This file was deleted.

10 changes: 5 additions & 5 deletions tutorials/cloud/compute_environment.yaml
@@ -1,10 +1,10 @@
-computeEnvironmentName: '' # [REQUIRED] The name for your compute environment.
+computeEnvironmentName: '' # [REQUIRED] Specify a name for your compute environment.
 type: MANAGED
 state: ENABLED
 computeResources: # Details about the compute resources managed by the compute environment.
   type: FARGATE
   maxvCpus: 256 # [REQUIRED] The maximum number of Amazon EC2 vCPUs that a compute environment can reach.
-  subnets: # [REQUIRED] The VPC subnets where the compute resources are launched.
-    - ''
-  securityGroupIds: # [REQUIRED] The Amazon EC2 security groups that are associated with instances launched in the compute environment.
-    - ''
+  subnets:
+    - '' # [REQUIRED] The VPC subnets where the compute resources are launched.
+  securityGroupIds:
+    - '' # [REQUIRED] The Amazon EC2 security groups that are associated with instances launched in the compute environment.
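
The template leaves every `[REQUIRED]` value as an empty string, so a quick check for blanks before creating the environment may save a failed request. The sketch below mirrors the new version of the file as a plain dictionary (in practice it would be loaded with a YAML parser). `empty_fields` is a hypothetical helper, not part of NoisePy or the AWS tooling.

```python
def empty_fields(node, path=""):
    """Yield the dotted path of every empty-string value in a nested dict/list."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from empty_fields(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from empty_fields(value, f"{path}[{i}]")
    elif node == "":
        yield path


# Same structure as compute_environment.yaml, with the template's blank values.
compute_environment = {
    "computeEnvironmentName": "",
    "type": "MANAGED",
    "state": "ENABLED",
    "computeResources": {
        "type": "FARGATE",
        "maxvCpus": 256,
        "subnets": [""],
        "securityGroupIds": [""],
    },
}

missing = list(empty_fields(compute_environment))
print(missing)
```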
40 changes: 40 additions & 0 deletions tutorials/cloud/config.yaml
@@ -0,0 +1,40 @@
acorr_only: false
cc_len: 1800
cc_method: xcorr
channels: [BHE, BHN, BHZ]
client_url_key: SCEDC
correction: false
correction_csv: null
down_list: false
start_date: '2004-01-01T00:00:00Z'
end_date: '2004-01-03T00:00:00Z'
freq_norm: rma
freqmax: 2.0
freqmin: 0.05
inc_hours: 24
keep_substack: false
lamax: 36.0
lamin: 31.0
lomax: -115.0
lomin: -122.0
max_over_std: 10
maxlag: 200
ncomp: 3
net_list: [CI]
respdir: null
rm_resp: inv
rm_resp_out: VEL
rotation: true
samp_freq: 20.0
single_freq: true
smooth_N: 10
smoothspect_N: 10
stack_method: linear
stations: ["*"]
stationxml: false
step: 450.0
storage_options: {}
substack: false
substack_len: 1800
time_norm: no
xcorr_only: true
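
A few of these parameters interact, and some quick arithmetic makes that visible. The sketch below assumes that `cc_len`, `step` and `maxlag` are in seconds, that windows of length `cc_len` slide by `step` within each `inc_hours` chunk, and that the stored lag axis spans `-maxlag` to `+maxlag` at `samp_freq`. These are readings of the parameter names, not behaviour confirmed by this diff.

```python
# Values copied from the config.yaml above.
cfg = {
    "cc_len": 1800,
    "step": 450.0,
    "inc_hours": 24,
    "samp_freq": 20.0,
    "freqmin": 0.05,
    "freqmax": 2.0,
    "maxlag": 200,
}

chunk_seconds = cfg["inc_hours"] * 3600

# Assumed sliding-window count: first window at t=0, then one every `step` seconds
# until a full `cc_len` window no longer fits in the chunk.
n_windows = int((chunk_seconds - cfg["cc_len"]) // cfg["step"]) + 1

# Fraction of each window shared with the next one.
overlap = 1.0 - cfg["step"] / cfg["cc_len"]

# The pass band must sit below the Nyquist frequency of the resampled data.
nyquist = cfg["samp_freq"] / 2.0
assert 0 < cfg["freqmin"] < cfg["freqmax"] < nyquist

# Assumed number of samples on a two-sided lag axis of +/- maxlag seconds.
lag_samples = int(2 * cfg["maxlag"] * cfg["samp_freq"]) + 1

print(n_windows, overlap, nyquist, lag_samples)  # 189 0.75 10.0 8001
```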
6 changes: 3 additions & 3 deletions tutorials/cloud/job.yaml
@@ -1,13 +1,13 @@
-jobName: ''
-jobQueue: ''
+jobName: '' # [REQUIRED] Specify a name for the job.
+jobQueue: '' # [REQUIRED] The job queue into which the job is submitted.
 jobDefinition: '' # [REQUIRED] The job definition used by this job.
 containerOverrides: # An object with various properties that override the defaults for the job definition that specify the name of a container in the specified job definition and the overrides it should receive.
   command: # The command to send to the container that overrides the default command from the Docker image or the job definition.
     - cross_correlate
     - --format=numpy
     - --raw_data_path=s3://scedc-pds/continuous_waveforms/
     - --xml_path=s3://scedc-pds/FDSNstationXML/CI/
-    - --ccf_path=s3://<YOUR_S3_BUCKET>/<CC_PATH>
+    - --ccf_path=s3://<S3_BUCKET>/<PATH>/<CC_PATH>
     - --net_list=CI
     - --stations=*
     - --start=2022-02-02
8 changes: 4 additions & 4 deletions tutorials/cloud/job_cc.yaml
@@ -1,5 +1,5 @@
-jobName: 'noisepy-cross-correlate'
-jobQueue: ''
+jobName: '' # [REQUIRED] Specify a name for the cross-correlation job.
+jobQueue: '' # [REQUIRED] The job queue into which the job is submitted.
 jobDefinition: '' # [REQUIRED] The job definition used by this job.
 # Uncomment to run a job across multiple nodes. The days in the time range will be split across the nodes.
 # arrayProperties:
@@ -12,7 +12,7 @@ containerOverrides: # An object with various properties that override the defaul
     - cross_correlate
     - --raw_data_path=s3://scedc-pds/continuous_waveforms/
     - --xml_path=s3://scedc-pds/FDSNstationXML/CI/
-    - --ccf_path=s3://<YOUR_S3_BUCKET>/<CC_PATH>
-    - --config=s3://<YOUR_S3_BUCKET>/<CONFIG_PATH>/config.yaml
+    - --ccf_path=s3://<S3_BUCKET>/<PATH>/<CC_PATH>
+    - --config=s3://<S3_BUCKET>/<PATH>/config.yaml
 timeout:
   attemptDurationSeconds: 36000 # 10 hrs
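
The comment on `arrayProperties` says the days in the time range are split across the nodes. AWS Batch exposes each child's position in an array job through the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable, so one way such a split could work is sketched below. The round-robin assignment and the `days_for_node` helper are assumptions for illustration; NoisePy's actual partitioning may differ.

```python
import os
from datetime import date, timedelta


def days_for_node(start, end, array_size, index):
    """Return the days in [start, end) assigned to one array child, round-robin.

    Hypothetical helper: day d goes to the child whose index equals d % array_size.
    """
    n_days = (end - start).days
    return [
        start + timedelta(days=d)
        for d in range(n_days)
        if d % array_size == index
    ]


# AWS Batch sets this variable for each child of an array job; default to 0 locally.
index = int(os.environ.get("AWS_BATCH_JOB_ARRAY_INDEX", "0"))

# Date range taken from the start_date/end_date values in config.yaml, split over 2 nodes.
mine = days_for_node(date(2004, 1, 1), date(2004, 1, 3), array_size=2, index=index)
print(f"node {index} processes: {[d.isoformat() for d in mine]}")
```

With one day per child, each attempt stays well inside the 36000 s (10 h) `attemptDurationSeconds` limit only if a single day of data can be processed in that time, so the array size and the timeout are worth choosing together.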