add documentation and volume for custom csv2bufr-mapping-templates #818

Closed
wants to merge 9 commits into from
1 change: 1 addition & 0 deletions .zap/rules.tsv
@@ -21,3 +21,4 @@
10036 IGNORE "Server Leaks Version Information via ""Server"" HTTP Response Header Field" Low
10110 IGNORE Dangerous JS Functions Low
10105 IGNORE Authentication Credentials Captured Medium
10003 IGNORE Vulnerable JS Library Medium
15 changes: 14 additions & 1 deletion docker-compose.yml
@@ -40,6 +40,8 @@ services:
test: ["CMD", "curl", "-f", "http://localhost/oapi/admin/resources"]
interval: 5s
retries: 100
volumes:
- ${WIS2BOX_HOST_DATADIR}/mappings:/data/wis2box/mappings:ro

minio:
container_name: wis2box-minio
@@ -51,7 +53,9 @@ services:
- wis2box.env
environment:
- MINIO_BROWSER_LOGIN_ANIMATION=off
command: server --console-address ":9001" /data
- MINIO_BROWSER_REDIRECT=false
- MINIO_UPDATE=off
command: server --quiet --console-address ":9001" /data
# in a production-setup minio needs to be
volumes:
- minio-data:/data
@@ -76,6 +80,12 @@ services:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
- cluster.name=es-wis2box
- xpack.security.enabled=false
- ingest.geoip.downloader.enabled=false
- xpack.ml.enabled=false
- xpack.watcher.enabled=false
- xpack.graph.enabled=false
- xpack.monitoring.templates.enabled=false
- cluster.routing.allocation.disk.threshold_enabled=false
mem_limit: 1.5g
memswap_limit: 1.5g
volumes:
@@ -100,6 +110,8 @@ services:
context: ./wis2box-broker
env_file:
- wis2box.env
volumes:
- mosquitto-config:/mosquitto/config

wis2box-management:
container_name: wis2box-management
@@ -150,3 +162,4 @@ volumes:
minio-data:
auth-data:
htpasswd:
mosquitto-config:
18 changes: 17 additions & 1 deletion docs/source/reference/running/data-pipeline-plugins.rst
@@ -32,7 +32,21 @@ A typical csv2bufr plugin workflow definition would be defined as follows:

csv:
- plugin: wis2box.data.csv2bufr.ObservationDataCSV2BUFR
template: /data/wis2box/synop_bufr.json # locally created csv2bufr mapping (located in $WIS2BOX_HOST_DATADIR)
template: aws-template # using one of the built-in templates
notify: true # trigger GeoJSON publishing for API and UI
file-pattern: '^.*\.csv$'

The default templates are defined by the `csv2bufr-templates`_ repository.

To use a custom template, place the template file in the ``$WIS2BOX_HOST_DATADIR/mappings`` directory.

The plugin configuration would then be defined as follows:

.. code-block:: yaml

csv:
- plugin: wis2box.data.csv2bufr.ObservationDataCSV2BUFR
template: /data/wis2box/mappings/my_own_template.json # locally created csv2bufr mapping (located in $WIS2BOX_HOST_DATADIR/mappings)
notify: true # trigger GeoJSON publishing for API and UI
file-pattern: '^.*\.csv$'

@@ -146,5 +160,7 @@ For example, to publish GRIB2 data matching the file-pattern ``^.*_(\d{8})\d{2}.
See :ref:`data-mappings` for a full example data mapping configuration.

.. _`csv2bufr`: https://csv2bufr.readthedocs.io
.. _`csv2bufr-templates`: https://github.com/wmo-im/csv2bufr-templates
.. _`bufr2geojson`: https://github.com/wmo-im/bufr2geojson
.. _`synop2bufr`: https://synop2bufr.readthedocs.io
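
The documentation change above distinguishes two forms of the ``template`` setting: a bare name for a built-in template and an absolute path for a custom one. A minimal sketch of that distinction follows. It assumes custom templates are always referenced by their container-side path under ``/data/wis2box/mappings`` (the mount target from the docker-compose change in this pull request); ``classify_template`` is a hypothetical helper for illustration and is not part of wis2box.

```python
from pathlib import PurePosixPath

# Assumption: container-side mount point of $WIS2BOX_HOST_DATADIR/mappings,
# taken from the docker-compose volume added in this pull request.
MAPPINGS_DIR = PurePosixPath('/data/wis2box/mappings')


def classify_template(template: str) -> str:
    """Return 'custom' for a path inside MAPPINGS_DIR, 'built-in' otherwise.

    Hypothetical helper for illustration only; not a wis2box function.
    """
    path = PurePosixPath(template)
    if path.is_absolute():
        # A custom template is expected inside the mounted mappings directory
        if MAPPINGS_DIR not in path.parents:
            raise ValueError(f'{template} is outside {MAPPINGS_DIR}')
        return 'custom'
    # Any non-absolute value is assumed to name a built-in template
    return 'built-in'


print(classify_template('aws-template'))
print(classify_template('/data/wis2box/mappings/my_own_template.json'))
```

A check like this would reject a template path that points outside the read-only volume, which is the most likely misconfiguration when moving from built-in to custom templates.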

2 changes: 1 addition & 1 deletion docs/source/user/data-ingest.rst
@@ -40,7 +40,7 @@

* `bufr2bufr` : the input is received in BUFR format and split by subset, where each subset is published as a separate bufr message
* `synop2bufr` : the input is received in `FM-12 SYNOP format <https://library.wmo.int/idviewer/35713/33>`_ and converted to BUFR format. The year and month are extracted from the file pattern
* `csv2bufr` : the input is received in csv format and converted to BUFR format
* `csv2bufr` : the input is received in csv format and converted to BUFR format. A mapping template is used to convert the csv columns to BUFR-encoded values. Custom mapping templates need to be placed in the ``$WIS2BOX_HOST_DATADIR/mappings`` directory. See :ref:`csv2bufr-templates` for examples of mapping templates.

Check warning on line 43 in docs/source/user/data-ingest.rst (GitHub Actions / main): undefined label: csv2bufr-templates (if the link has no caption the label must precede a section header)

To publish data for other data formats you can use the 'Universal' plugin, which will pass through the data without any conversion.
Please note that you will need to ensure that the date timestamp can be extracted from the file pattern when using this plugin.
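
The requirement above can be checked before configuring the plugin. The sketch below assumes the date is held in the first capturing group of the file-pattern as ``YYYYMMDD``; the pattern, the filename and the ``date_from_filename`` helper are all hypothetical and only illustrate the idea.

```python
import re
from datetime import date, datetime

# Hypothetical file-pattern: the first capturing group holds the date (YYYYMMDD)
FILE_PATTERN = r'^.*_(\d{8})\d{2}.*\.grib2$'


def date_from_filename(filename: str, pattern: str = FILE_PATTERN) -> date:
    """Extract the date from a filename using the pattern's first group."""
    match = re.match(pattern, filename)
    if match is None:
        raise ValueError(f'{filename} does not match {pattern}')
    # strptime raises ValueError if the captured digits are not a valid date
    return datetime.strptime(match.group(1), '%Y%m%d').date()


print(date_from_filename('model_output_2021070700_run.grib2'))
```

If this helper raises for a representative filename, the file-pattern needs adjusting before data in that format can be published.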
14 changes: 8 additions & 6 deletions tests/integration/test_workflow.py
@@ -214,15 +214,17 @@ def test_metadata_discovery_publish():
def test_data_ingest():
"""Test data ingest/process publish"""

item_api_url = f'{API_URL}/collections/{ID}/items/WIGOS_0-454-2-AWSNAMITAMBO_20210707T145500-82' # noqa
item_api_url = f'{API_URL}/collections/{ID}/items/0-454-2-AWSNAMITAMBO-202107071455-15' # noqa

item_api = SESSION.get(item_api_url).json()

assert item_api['reportId'] == 'WIGOS_0-454-2-AWSNAMITAMBO_20210707T145500'
assert item_api['properties']['resultTime'] == '2021-07-07T14:55:00Z' # noqa
item_source = f'2021-07-07/wis/{ID}/{item_api["reportId"]}.bufr4' # noqa
r = SESSION.get(f'{URL}/data/{item_source}') # noqa
assert r.status_code == codes.ok
assert item_api['properties']['reportId'] == '0-454-2-AWSNAMITAMBO-202107071455' # noqa
assert item_api['properties']['reportTime'] == '2021-07-07T14:55:00Z' # noqa
assert item_api['properties']['wigos_station_identifier'] == '0-454-2-AWSNAMITAMBO' # noqa
assert item_api['properties']['name'] == 'global_solar_radiation_integrated_over_period_specified' # noqa
assert item_api['properties']['value'] == 0.0
assert item_api['properties']['units'] == 'J m-2'
assert item_api['properties']['phenomenonTime'] == '2021-07-06T14:55:00Z/2021-07-07T14:55:00Z' # noqa


def test_data_api():
2 changes: 0 additions & 2 deletions wis2box-broker/Dockerfile
@@ -24,10 +24,8 @@ FROM eclipse-mosquitto:2.0.20
RUN mkdir -p /data/wis2box/mosquitto
RUN ln -s /mosquitto /data/wis2box/mosquitto

COPY mosquitto-ssl.conf /mosquitto/config/mosquitto-ssl.conf
COPY mosquitto.conf /mosquitto/config/mosquitto.conf

COPY acl.conf /mosquitto/config/acl.conf
COPY entrypoint.sh /docker-entrypoint.sh

RUN chmod +x /docker-entrypoint.sh
8 changes: 0 additions & 8 deletions wis2box-broker/acl.conf

This file was deleted.

69 changes: 52 additions & 17 deletions wis2box-broker/entrypoint.sh
@@ -1,28 +1,49 @@
#!/bin/sh

if [ -f /tmp/wis2box.crt ]; then
echo "SSL enabled"
echo "setup /mosquitto/certs"
mkdir -p /mosquitto/certs
cp /tmp/wis2box.crt /mosquitto/certs
cp /tmp/wis2box.key /mosquitto/certs
chown -R mosquitto:mosquitto /mosquitto/certs
cp -f /mosquitto/config/mosquitto-ssl.conf /mosquitto/config/mosquitto.conf
else
echo "SSL disabled"
fi

echo "Setting mosquitto authentication"
if [ ! -e "/mosquitto/config/password.txt" ]; then
echo "Adding wis2box users to mosquitto password file"
mosquitto_passwd -b -c /mosquitto/config/password.txt $WIS2BOX_BROKER_USERNAME $WIS2BOX_BROKER_PASSWORD
mosquitto_passwd -b /mosquitto/config/password.txt everyone everyone
else
echo "Mosquitto password file already exists. Skipping wis2box user addition."
echo "Mosquitto password file already exists. Update it if needed"
mosquitto_passwd -b /mosquitto/config/password.txt everyone everyone
mosquitto_passwd -b /mosquitto/config/password.txt $WIS2BOX_BROKER_USERNAME $WIS2BOX_BROKER_PASSWORD
fi

# add max_queued_messages to mosquitto.conf if not already there
if ! grep -q "max_queued_messages" /mosquitto/config/mosquitto.conf; then
echo "max_queued_messages $WIS2BOX_BROKER_QUEUE_MAX" >> /mosquitto/config/mosquitto.conf
fi

sed -i "s#_WIS2BOX_BROKER_QUEUE_MAX#$WIS2BOX_BROKER_QUEUE_MAX#" /mosquitto/config/mosquitto.conf
sed -i "s#_WIS2BOX_BROKER_USERNAME#$WIS2BOX_BROKER_USERNAME#" /mosquitto/config/acl.conf
# prepare the acl.conf file
if [ ! -e "/mosquitto/config/acl.conf" ]; then
echo "Creating mosquitto acl file"
echo "user everyone" >> /mosquitto/config/acl.conf
echo "topic read origin/#" >> /mosquitto/config/acl.conf
echo " " >> /mosquitto/config/acl.conf
echo "user $WIS2BOX_BROKER_USERNAME" >> /mosquitto/config/acl.conf
echo "topic readwrite origin/#" >> /mosquitto/config/acl.conf
echo "topic readwrite wis2box/#" >> /mosquitto/config/acl.conf
echo "topic readwrite data-incoming/#" >> /mosquitto/config/acl.conf
echo "topic read \$SYS/#" >> /mosquitto/config/acl.conf
else
echo "Mosquitto acl file already exists. Update it if needed"
# add user everyone to acl.conf if not already there
if ! grep -q "user everyone" /mosquitto/config/acl.conf; then
echo "user everyone" >> /mosquitto/config/acl.conf
echo "topic read origin/#" >> /mosquitto/config/acl.conf
echo " " >> /mosquitto/config/acl.conf
fi
# add user $WIS2BOX_BROKER_USERNAME to acl.conf if not already there
if ! grep -q "user $WIS2BOX_BROKER_USERNAME" /mosquitto/config/acl.conf; then
echo "user $WIS2BOX_BROKER_USERNAME" >> /mosquitto/config/acl.conf
echo "topic readwrite origin/#" >> /mosquitto/config/acl.conf
echo "topic readwrite wis2box/#" >> /mosquitto/config/acl.conf
echo "topic readwrite data-incoming/#" >> /mosquitto/config/acl.conf
echo "topic read \$SYS/#" >> /mosquitto/config/acl.conf
fi
fi

for i in `env | grep -Ee "\<WIS2BOX_BROKER_USERNAME_[[:alnum:]]+"`; do
NAME_TAIL=`echo $i | awk -FWIS2BOX_BROKER_USERNAME_ '{print $2}' | awk -F= '{print $1}'`
@@ -35,8 +56,22 @@ for i in `env | grep -Ee "\<WIS2BOX_BROKER_USERNAME_[[:alnum:]]+"`; do
echo "topic readwrite ${!topic}" >> /mosquitto/config/acl.conf
done

# set ownership of mosquitto files
chown -R mosquitto:mosquitto /mosquitto
if [ -f /tmp/wis2box.crt ]; then
echo "SSL enabled"
echo "setup /mosquitto/certs"
mkdir -p /mosquitto/certs
cp /tmp/wis2box.crt /mosquitto/certs
cp /tmp/wis2box.key /mosquitto/certs
chown -R mosquitto:mosquitto /mosquitto/certs
# add listener 8883 block to mosquitto.conf, if not already there
if ! grep -q "listener 8883" /mosquitto/config/mosquitto.conf; then
echo "listener 8883" >> /mosquitto/config/mosquitto.conf
echo "certfile /mosquitto/certs/wis2box.crt" >> /mosquitto/config/mosquitto.conf
echo "keyfile /mosquitto/certs/wis2box.key" >> /mosquitto/config/mosquitto.conf
fi
else
echo "SSL disabled"
fi

# set permission of acl.conf to 0700
chmod 0700 /mosquitto/config/acl.conf
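
The entrypoint changes above make the ACL setup idempotent: a ``user`` block is appended only when the file does not already contain one, so the configuration can live on a persistent volume. The same logic is sketched in Python below purely for illustration. ``ensure_acl_user`` and the ``wis2box`` username are hypothetical, and the sketch compares whole lines where the script uses a substring ``grep``, so the two can differ for usernames that share a prefix.

```python
def ensure_acl_user(acl_text: str, user: str, topic_rules: list[str]) -> str:
    """Append a 'user' block to ACL text unless that user already has one."""
    header = f'user {user}'
    # Exact-line comparison; the entrypoint script uses a substring grep instead
    if any(line.strip() == header for line in acl_text.splitlines()):
        return acl_text
    # A block is the user line, its topic rules, and a blank separator line
    block = [header] + [f'topic {rule}' for rule in topic_rules] + ['']
    prefix = acl_text if (not acl_text or acl_text.endswith('\n')) else acl_text + '\n'
    return prefix + '\n'.join(block) + '\n'


# 'wis2box' stands in for the value of $WIS2BOX_BROKER_USERNAME
broker_rules = [
    'readwrite origin/#',
    'readwrite wis2box/#',
    'readwrite data-incoming/#',
    'read $SYS/#',
]
acl = ensure_acl_user('', 'everyone', ['read origin/#'])
acl = ensure_acl_user(acl, 'wis2box', broker_rules)
acl = ensure_acl_user(acl, 'wis2box', broker_rules)  # second call changes nothing
print(acl)
```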
24 changes: 0 additions & 24 deletions wis2box-broker/mosquitto-ssl.conf

This file was deleted.

1 change: 0 additions & 1 deletion wis2box-broker/mosquitto.conf
@@ -4,7 +4,6 @@ log_dest file /mosquitto/log/mosquitto.log
log_dest stdout
log_timestamp_format %Y-%m-%dT%H:%M:%S
password_file /mosquitto/config/password.txt
max_queued_messages _WIS2BOX_BROKER_QUEUE_MAX

# ACLs
acl_file /mosquitto/config/acl.conf
11 changes: 10 additions & 1 deletion wis2box-management/wis2box/api/__init__.py
@@ -30,6 +30,8 @@
from wis2box import cli_helpers
from wis2box.api.backend import load_backend
from wis2box.api.config import load_config
from wis2box.data_mappings import get_plugins

from wis2box.env import (DOCKER_API_URL, API_URL)

LOGGER = logging.getLogger(__name__)
@@ -258,10 +260,17 @@ def setup(ctx, verbosity):
except Exception as err:
click.echo(f'Issue loading discovery-metadata: {err}')
return False
# loop over records and add data-collection when bufr2geojson is used
for record in records['features']:
metadata_id = record['id']
plugins = get_plugins(record)
LOGGER.info(f'Plugins used by {metadata_id}: {plugins}')
# check if any plugin name contains '2geojson'
has_2geojson = any('2geojson' in plugin for plugin in plugins)
if has_2geojson is False:
continue
if metadata_id not in api_collections:
click.echo(f'Adding collection: {metadata_id}')
click.echo(f'Adding data-collection for: {metadata_id}')
from wis2box.data import gcm
meta = gcm(record)
setup_collection(meta=meta)
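
The filter added to ``setup`` above can be exercised in isolation. The sketch below assumes ``get_plugins`` returns a list of plugin names as strings; ``needs_data_collection`` is a hypothetical wrapper, and the plugin names other than the csv2bufr one shown in the documentation diff are illustrative.

```python
def needs_data_collection(plugins: list[str]) -> bool:
    """Return True if any plugin name contains the substring '2geojson'."""
    return any('2geojson' in plugin for plugin in plugins)


# Illustrative plugin lists; only the csv2bufr name appears in this pull request
bufr_only = ['wis2box.data.csv2bufr.ObservationDataCSV2BUFR']
with_geojson = bufr_only + ['wis2box.data.bufr2geojson.ObservationDataBUFR2GeoJSON']

print(needs_data_collection(bufr_only))
print(needs_data_collection(with_geojson))
print(needs_data_collection([]))
```

Because the check is a substring match, any plugin whose name contains ``2geojson`` triggers creation of a data-collection, not only ``bufr2geojson``.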