Add schedule command to CLI docs (#118)
- Add `schedule` command docs to `CLI.md`
navarone-feekery authored Aug 29, 2024
1 parent 1dccc23 commit 84f6f00
Showing 1 changed file with 21 additions and 1 deletion.
22 changes: 21 additions & 1 deletion docs/CLI.md
@@ -41,14 +41,15 @@ $ bin/crawler --help
> Commands:
> crawler crawl CRAWL_CONFIG # Run a crawl of the site
> crawler schedule CRAWL_CONFIG # Schedule a recurrent crawl of the site
> crawler validate CRAWL_CONFIG # Validate crawler configuration
> crawler version # Print version
```

### Commands


- [`crawler crawl`](#crawler-crawl)
- [`crawler schedule`](#crawler-schedule)
- [`crawler validate`](#crawler-validate)
- [`crawler version`](#crawler-version)

@@ -68,6 +69,25 @@ $ bin/crawler crawl config/examples/parks-australia.yml
$ bin/crawler crawl config/examples/parks-australia.yml --es-config=config/es.yml
```

#### `crawler schedule`

Creates a schedule to recurringly crawl the domains configured in the provided config file.
The schedule is defined by a cron expression set in the `schedule.pattern` field of the crawler configuration file.
See [scheduling recurring crawl jobs](../README.md#scheduling-recurring-crawl-jobs) for details on scheduling.
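As a rough illustration, a crawler config with a schedule might look like the sketch below. Only the `schedule.pattern` field is documented here; the `domains` section and the example URL are assumptions standing in for a typical crawl config (see [CONFIG.md](./CONFIG.md) for the authoritative format):

```yaml
# Hypothetical crawler config sketch -- only `schedule.pattern`
# is documented in this section; everything else is illustrative.
domains:
  - url: https://www.example.com
schedule:
  pattern: "0 4 * * *"   # standard 5-field cron: every day at 04:00
```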

Can optionally take a second configuration file for Elasticsearch settings.
See [CONFIG.md](./CONFIG.md) for details on the configuration files.

```bash
# schedule crawls using only crawler config
$ bin/crawler schedule config/examples/parks-australia.yml
```

```bash
# schedule crawls using crawler config and optional --es-config
$ bin/crawler schedule config/examples/parks-australia.yml --es-config=config/es.yml
```

#### `crawler validate`

Checks the configured domains in `domain_allowlist` to see if they can be crawled.
