Code development workflow

Jorge Samuel Mendes de Jesus edited this page Dec 6, 2020 · 17 revisions

The code workflow is the sequence of steps and procedures that the pygeoapi project follows for code development, testing and implementation. It is a more structured, step-by-step description than the contribution guidelines.

The pygeoapi contribution guidelines, and instructions on how to submit tickets, are found in CONTRIBUTING.md.

Read them first to understand the development workflow.

1. Forking

Developers work on their own fork of the project. A fork is a personal sandbox where code is developed and implemented by its author. Over time, the code on the main project and on the fork will start to diverge, since code from other branches and forks gets merged into the project. It is therefore important to sync code from the project into the fork and the working branch regularly.

Check the GitHub tutorial on how to fork and sync: fork-a-repo

pygeoapi-master ---FORK---> pygeoapi-user001-master     

2. Issues and branches

GitHub issues should relate to bugs, new feature requests, blue-sky research, etc. For bug reporting, please follow the guideline on what to put in your bug report.

Code development should be organized so that each piece of work solves (or deals with) one issue only. Each issue is normally associated with a branch, and code commits go into that specific branch. This also makes the Pull Request review process easier.

pygeoapi-master ---FORK---> pygeoapi-user001-master 
                                                \----------- pygeoapi-user001-issue4456

Don't forget to sync/merge the main pygeoapi-master into your fork's master, and then merge (or rebase) master into your working branch (see version-control-branching). This requires that you first configure a remote for the fork, pointing at the upstream location of the main code:

git remote add upstream https://github.com/geopython/pygeoapi.git

3. Code development

A good programmer writes clear, easy-to-understand code based on well-established guidelines, not clever code.

3.1 PEP8 - Python code style

pygeoapi follows PEP 8, the Style Guide for Python Code, and the Python naming conventions. In a nutshell:

  • snake_case for variables
  • lower case for modules and packages
  • upper case for CONSTANTS
  • UpperCaseCamel for classes
  • Methods can be marked protected with a leading _ or private with a leading __
  • Name collisions are avoided by appending an extra _, e.g. use csv_ instead of csv
  • English words only, with proper description of functionality and/or content.
  • Follow OGC standard names (See: 4.1 pygeoapi API)
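
The conventions listed above can be illustrated with a minimal sketch. The class, constant and function names here are hypothetical and are not taken from the pygeoapi code base; the snippet is assumed to live in a lower-case module such as csv_reader.py.

```python
import csv
import io

# CONSTANTS use upper case (hypothetical constant)
DEFAULT_DELIMITER = ','


class CsvReader:
    """Classes use UpperCaseCamel (hypothetical class)"""

    def __init__(self, text):
        # a leading underscore marks a protected attribute
        self._text = text

    def read_rows(self):
        """Methods and variables use snake_case"""
        # the trailing underscore avoids shadowing the imported csv module
        csv_ = csv.reader(io.StringIO(self._text), delimiter=DEFAULT_DELIMITER)
        return [row for row in csv_]


rows = CsvReader('id,name\n1,lake').read_rows()
print(rows)
```

The local name csv_ is the case described in the list: without the underscore, the variable would hide the csv module inside the method.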

3.2 Understandable code

The PEP 8 style convention helps readability, but code should also be understandable. This can be achieved with simple English variable names, good comments, and consistency.

How to write code that everyone can read

code quality

Source: https://xkcd.com/1513/ and Geo-python

3.4 Code documentation

Documentation is what makes or breaks a project. Thou shalt not say: "The code is self-explanatory". Readable code does explain itself, BUT you still have to state what it does, how it can be run and, more importantly, its inputs and outputs, since Python is a dynamically typed language. **Any pull request without proper code docstrings is automatically rejected.**

3.4.1 Docstrings and reStructuredText

pygeoapi uses Python docstrings and reStructuredText.

A good introduction to pydocs can be found in the links below:

Every single method/function/class should be documented using docstrings following reStructuredText (reST) syntax, for example:

class RasterioProvider(BaseProvider):
    """Rasterio Provider"""

    def __init__(self, provider_def):
        """
        Initialize object

        :param provider_def: provider definition

        :returns: pygeoapi.provider.rasterio_.RasterioProvider
        """

Python packages should also have a basic description in their __init__.py file, e.g.:

"""OGC process package, each process is an independent module"""

Note: Type hints are not yet supported
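
Since type hints are not used, the docstring is where the types of inputs and outputs get stated. The following is a minimal sketch of a function documented this way; bbox_area is a hypothetical helper made up for illustration and is not part of pygeoapi.

```python
def bbox_area(bbox):
    """
    Calculate the area of a bounding box

    :param bbox: list of minx, miny, maxx, maxy coordinates

    :returns: float area in squared coordinate units
    """

    # unpack the four coordinates named in the docstring
    minx, miny, maxx, maxy = bbox
    return float((maxx - minx) * (maxy - miny))


print(bbox_area([0, 0, 2, 3]))
```

The blank lines before the :param and :returns fields keep the summary sentence and the field list as separate reST blocks.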

3.4.2 Read the docs

Read the Docs (RTD) is a very popular documentation platform and is used for the pygeoapi documentation. If pydocs are properly written, RTD will build the content automatically. This step should be done before a pull request.

The RTD code is located in the folder docs, with folder organization and file content laid out as a generic Python RTD project. Before you proceed, please read "How to set up your python project docs for success" (https://towardsdatascience.com/how-to-set-up-your-python-project-docs-for-success-aab613f79626) to get an idea of how things work.

The pygeoapi RTD content is in the folder pygeoapi/docs/source. The *.rst files are the sources where documentation should be written or updated.

The Table of Contents (TOC) is defined in index.rst (https://raw.githubusercontent.com/geopython/pygeoapi/master/docs/source/index.rst):

.. _index:

.. image:: /_static/pygeoapi-logo.png
   :scale: 50%
   :alt: pygeoapi logo

pygeoapi |release| documentation
==================================

:Author: the pygeoapi team
:Contact: pygeoapi at lists.osgeo.org
:Release: |release|
:Date: |today|

.. toctree::
   :maxdepth: 4
   :caption: Table of Contents
   :name: toc

   introduction
   how-pygeoapi-works

The TOC entries then point to the individual *.rst files.

3.4.2 API documentation

Remember pydocs and code comments? RTD will automatically pull the content from the pygeoapi code and build the API documentation. **It is important that new packages and modules are added to openapi.rst.** For example, a package followed by its modules/classes would have the following syntax:

Provider
--------

.. automodule:: pygeoapi.provider
   :show-inheritance:
   :members:
   :private-members:
   :special-members:


Base class
^^^^^^^^^^

.. automodule:: pygeoapi.provider.base
   :show-inheritance:
   :members:
   :private-members:
   :special-members:


CSV provider
^^^^^^^^^^^^

.. automodule:: pygeoapi.provider.csv_
   :show-inheritance:
   :members:
   :private-members:

3.4.3 Building documentation (local)

On folder pygeoapi/docs:

#make help
make html
::
Running Sphinx v3.0.1
loading pickled environment... done
:: 
The HTML pages are in build/html

The documentation is then available in build/html, in Read the Docs style, and can be viewed in a browser: firefox build/html/index.html

3.4.4 Personal RTD on github (automatic build)

GitHub has very good support for RTD, and you can even use it on your personal repository for the issue that you are working on.

First, you need to create an account on Read the Docs (Sign up). You can (and should) authenticate with your GitHub account; the following steps assume that you did.

Second, connect your read the docs account to github on admin > control panel, Connected services > Connect to Github

RTD connect services

This will allow you to choose the repository and branch from which RTD will import the documents and build them.

Confirmation of service connection

If the process was successful it should not be necessary to preconfigure the webhooks.

Third, on the dashboard (Profile drop down > My projects) click Import a project

Import a project

Then refresh to sync RTD and GitHub. You should be able to see your own pygeoapi project (<username/pygeoapi>); just add it.

By default RTD builds docs from master. Since you are expected to work on your fork in a specific branch (see: Issues and branches), RTD should be set to use that branch.

On the project details, give a name related to the issue that you are working on, e.g. pygeoapi-532 (this will then be part of the public URL), and tick Edit advanced project options.

Project details

On the advanced options, type the name of the working/default branch, and select Python as the programming language.

Project Extra Details

Finally, on the project page click on Build project and enjoy the automation. In a few minutes you will have your documentation online :). For this example it will be something like: https://pygeoapi-532.readthedocs.io

Note: Every time you push to the default branch RTD will update the online documentation.

4. pygeoapi code structure

pygeoapi code uses or implements:

  • an API first approach that is wrapped by a web framework (Flask or Starlette),
  • Object oriented template pattern
  • Plugins
  • EAFP (it’s easier to ask for forgiveness than permission)
  • Prefer DRY (Don't Repeat Yourself) but when necessary WET (Write Everything Twice)
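
As a rough illustration of the EAFP item in the list above, here is a minimal sketch contrasting it with the "look before you leap" (LBYL) style. The configuration dictionary, the limit key and the fallback value of 10 are assumptions made up for this example, not pygeoapi settings.

```python
# hypothetical configuration, for illustration only
config = {'server': {'url': 'http://localhost:5000'}}


def get_limit_lbyl(cfg):
    """LBYL: check that every key exists before using it"""
    if 'server' in cfg and 'limit' in cfg['server']:
        return cfg['server']['limit']
    return 10


def get_limit_eafp(cfg):
    """EAFP: try the lookup and handle the failure if it happens"""
    try:
        return cfg['server']['limit']
    except KeyError:
        return 10


print(get_limit_lbyl(config), get_limit_eafp(config))
```

Both functions return the same value; the EAFP version states the normal path first and keeps the error handling in one place.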

4.1 pygeoapi API

The API structure is defined in the pygeoapi/api.py module and its class API; this is the project's core. The method naming in class API is no coincidence: it follows OGC API names and definitions. For example, in OGC Features we have an endpoint defined as:

GET /collections     

This REST end point describes the collections available, the associated method is:

    def describe_collections(self, headers_, format_, dataset=None):

<VERB>_<OBJECT> is the standard terminology.

4.2 Web-frameworks

Web framework libraries are responsible for:

  • HTTP requests/responses
  • URL routing
  • Configuration loading

REST endpoints defined by the OGC standards (see here for example) are served by the web framework, each with its own community approaches, philosophies and perks.

Currently two web frameworks are supported: Flask and Starlette.

The pygeoapi project tends to use Flask as the default web framework. As a guideline, the function name should be identical (or very close) to the HTTP request route, e.g.:

@BLUEPRINT.route('/openapi')
def openapi():
    """
    OpenAPI endpoint
    :returns: HTTP response
    """
    with open(os.environ.get('PYGEOAPI_OPENAPI'), encoding='utf8') as ff:
        openapi = yaml_load(ff)

4.3 Object oriented template pattern

pygeoapi code is object oriented (classes) and implements the template method pattern (see Wikipedia: template method pattern). The template method pattern is normally used in code bases that implement multiple components with overlapping functionality, behavior or properties.

The provider package contains the following modules:

.
├── __init__.py
├── base.py
├── elasticsearch_.py
├── geojson.py
:

The base.py module contains the parent classes that are used by the specific data provider modules (e.g. geojson.py).

# base.py
class BaseProvider:
    """generic Provider ABC"""

    def __init__(self, provider_def):
        ...  # body elided

    def get_fields(self):
        raise NotImplementedError()

    def write(self, options={}, data=None):
        raise NotImplementedError()

The class BaseProvider is the template from which the specific classes for each data provider are created; this template declares all the necessary methods.

You can see the base class being extended in the module geojson.py (https://github.com/geopython/pygeoapi/blob/master/pygeoapi/provider/geojson.py):

import json
import os

from pygeoapi.provider.base import BaseProvider, ProviderItemNotFoundError


class GeoJSONProvider(BaseProvider):
    """Provider class backed by local GeoJSON files"""

    # ... (rest of the class elided)

    def get_fields(self):
        if os.path.exists(self.data):
            with open(self.data) as src:
                data = json.loads(src.read())
            # field names are taken from the first feature's properties
            fields = {}
            for f in data['features'][0]['properties'].keys():
                fields[f] = 'string'
            return fields

In the excerpt above, class GeoJSONProvider has no write method. If pygeoapi tries to call write, the call ends up in the base class and triggers raise NotImplementedError(), which is then properly handled by the pygeoapi API.

This is the pygeoapi code approach: base classes define precisely what is expected and avoid duplication.
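
The fallback to the base class can be tried out with a small self-contained sketch. BaseProviderSketch, ListProvider and try_write are simplified stand-ins invented for this example; they do not reproduce the real pygeoapi classes or the way the pygeoapi API reacts to the error.

```python
class BaseProviderSketch:
    """Template: declares every operation a provider may offer"""

    def __init__(self, provider_def):
        self.data = provider_def['data']

    def get_fields(self):
        raise NotImplementedError()

    def write(self, options=None, data=None):
        raise NotImplementedError()


class ListProvider(BaseProviderSketch):
    """Read-only provider: overrides get_fields, inherits write"""

    def get_fields(self):
        # field names come from the first record
        return {name: 'string' for name in self.data[0]}


def try_write(provider, data):
    """Caller side: turn the inherited NotImplementedError into a message"""
    try:
        provider.write(data=data)
        return 'written'
    except NotImplementedError:
        return 'write not supported'


provider = ListProvider({'data': [{'name': 'lake', 'type': 'water'}]})
print(provider.get_fields())
print(try_write(provider, {'name': 'river'}))
```

The subclass only contains what is specific to it, while the caller can rely on every provider exposing the same set of methods.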

Doubts!? Check these links:

4.4 Plugins

TODO:

4.5 EAFP versus LBYL

TODO:

4.6 DRY versus WET

5. Code/functionality testing

The pygeoapi project implements test-driven development, see: advantages-of-test-driven-development

pygeoapi tests code and functionality at different levels, both local and remote.

5.1 code testing - local

pygeoapi uses pytest for unit testing, as described in the pygeoapi testing documentation.

Tests are in the folder /tests, and each Python module (*.py) bundles several tests grouped by global functionality or system. The root folder contains pytest.ini, which sets the environment variables.

After code is developed, unit tests SHOULD be implemented (a bit of common sense is also good). New code should come with new unit tests, and pytest should be run locally to check that things are OK, for example:

python -m pytest tests/test_api.py

This is the first step in determining whether the developed code integrates properly with pygeoapi and doesn't affect pre-existing code and functionality.
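
As a minimal sketch of what a new unit test could look like, the example below tests a small helper with plain assert statements, which pytest collects from functions whose names start with test_. The normalize_bbox helper is hypothetical and exists only for this example; it is not a pygeoapi function.

```python
def normalize_bbox(value):
    """Hypothetical helper under test: parse 'minx,miny,maxx,maxy' into floats"""
    parts = [float(part) for part in value.split(',')]
    if len(parts) != 4:
        raise ValueError('bbox needs exactly 4 values')
    return parts


def test_normalize_bbox():
    # expected behaviour for valid input
    assert normalize_bbox('-180,-90,180,90') == [-180.0, -90.0, 180.0, 90.0]


def test_normalize_bbox_rejects_short_input():
    # invalid input must raise instead of returning a partial result
    try:
        normalize_bbox('1,2,3')
    except ValueError:
        pass
    else:
        raise AssertionError('expected ValueError')
```

Covering both the valid case and the failure case makes it easier to notice when a later change breaks existing behaviour.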

6. CI/CD pipeline

TODO

https://imgs.xkcd.com/comics/automation.png
