A “Toy” Example of a Nowcast System

In the spirit of “learn by doing”, this section presents the practical details of building and operating a nowcast system by setting up a “toy” nowcast system with a few trivial Example Workers. The subsequent sections provide more detailed reference documentation about building and operating software automation systems based on this framework.

Conda Environment

We’ll use a Conda environment to isolate Python package installation. Miniconda3 provides the conda package manager and environment management tools. Please follow the instructions on the Miniconda3 page to download and install it.

Once that is done, create a conda environment with the NEMO_Nowcast package and its dependencies installed in it with the command:

$ conda create -n toy-nowcast -c gomss-nowcast nemo_nowcast

Activate the environment with:

$ source activate toy-nowcast

To deactivate the environment later, use:

(toy-nowcast)$ source deactivate toy-nowcast

nowcast Directory

Create a directory to hold the nowcast system files. It can be anywhere, but for simplicity we’ll put it in your $HOME directory:

(toy-nowcast)$ mkdir $HOME/toy-nowcast

Environment Variables

The NEMO_Nowcast package uses environment variables for some fundamental elements of the configuration of a nowcast system. NOWCAST_ENV must be set to the Conda Environment path. An easy way to do that is:

(toy-nowcast)$ export NOWCAST_ENV=$CONDA_PREFIX

NOWCAST_LOGS must be set to a directory where the nowcast system log files will be stored. For simplicity we’ll just use our $HOME/toy-nowcast/ directory:

(toy-nowcast)$ export NOWCAST_LOGS=$HOME/toy-nowcast/

NOWCAST_YAML must be set to the path of the nowcast system configuration file, which we will create in a moment:

(toy-nowcast)$ export NOWCAST_YAML=$HOME/toy-nowcast/nowcast.yaml
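
Because these variables have to be set in every terminal session that talks to the nowcast system, it can be handy to check them before going further. The following is a minimal sketch that uses only the Python standard library; the missing_nowcast_vars() helper is a hypothetical name for this illustration and is not part of the NEMO_Nowcast package:

```python
import os

# The three variables that this section asks you to export
REQUIRED_VARS = ("NOWCAST_ENV", "NOWCAST_LOGS", "NOWCAST_YAML")


def missing_nowcast_vars(environ=None):
    """Return the names of the required variables that are unset or empty.

    Hypothetical helper for illustration; not part of NEMO_Nowcast.
    """
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_nowcast_vars()
    if missing:
        print("not set:", ", ".join(missing))
    else:
        print("all nowcast environment variables are set")
```

Running it in a freshly opened terminal should list all three names until you have done the export commands above.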

Nowcast System Configuration File

The configuration of a nowcast system is defined in a Nowcast Configuration File. Configuration files are written in YAML, the basic element of which is key-value pairs.

Create a file called nowcast.yaml in your toy-nowcast/ directory with the following contents. You can omit the lines that start with # if you wish; they are comments:

# Example system configuration file for a NEMO_Nowcast framework system

# System status checklist file
checklist file: $(NOWCAST.ENV.NOWCAST_LOGS)/nowcast_checklist.yaml

# Python interpreter in environment with all dependencies installed
# Used to launch workers
python: $(NOWCAST.ENV.NOWCAST_ENV)/bin/python

# Logging system configuration
logging:
  version: 1
  disable_existing_loggers: False
  formatters:
    simple:
      format: '%(asctime)s %(levelname)s [%(name)s] %(message)s'
  handlers:
    console:
      class: logging.StreamHandler
      level: DEBUG
      formatter: simple
      stream: ext://sys.stdout
  root:
    level: DEBUG
    handlers:
     - console

# Message system
zmq:
  host: localhost
  ports:
    # traffic between manager and message broker
    manager: 4343
    # traffic between workers and message broker
    workers: 4344

message registry:
  # Message types that the manager process can send and their meanings
  # Don't change this section without making corresponding changes in
  # the nemo_nowcast.manager module of the NEMO_Nowcast package.
  manager:
    ack: message acknowledged
    checklist cleared: system checklist cleared
    unregistered worker: ERROR - message received from unregistered worker
    unregistered message type: ERROR - unregistered message type received from worker
    no after_worker function: ERROR - after_worker function not found in next_workers module

  # Module from which to load :py:func:`after_<worker_name>` functions
  # that provide lists of workers to launch when :kbd:`worker_name` finishes
  next workers module: next_workers

  workers:
    # Worker module name
    sleep:
      # The key in the system checklist that the manager maintains that is to
      # be used to hold message payload information provided by the
      # :kbd:`sleep` worker
      checklist key: sleepyhead
      # Message types that the :kbd:`sleep` worker can send and their meanings
      success: sleep worker slept well
      failure: sleep worker slept badly
      crash: sleep worker crashed

The contents of the configuration file are described in detail in the Nowcast Configuration File section of these docs.
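
The checklist file and python values above contain placeholders of the form $(NOWCAST.ENV.NAME) that stand for the environment variables we set earlier. To make that concrete, here is a rough sketch of how such a placeholder could be expanded. It is an illustration only: the expand_env() function and its regular expression are assumptions made for this example, not the code that the NEMO_Nowcast package actually uses.

```python
import os
import re

# Matches placeholders such as $(NOWCAST.ENV.NOWCAST_LOGS) and captures
# the environment variable name (illustrative pattern, an assumption)
ENV_PLACEHOLDER = re.compile(r"\$\(NOWCAST\.ENV\.(\w+)\)")


def expand_env(value, environ=None):
    """Replace each $(NOWCAST.ENV.NAME) in value with environ["NAME"].

    Raises KeyError if a referenced variable is not set.
    Hypothetical helper; not the package's own implementation.
    """
    environ = os.environ if environ is None else environ
    return ENV_PLACEHOLDER.sub(lambda match: environ[match.group(1)], value)


if __name__ == "__main__":
    env = {"NOWCAST_LOGS": "/home/me/toy-nowcast"}
    print(expand_env("$(NOWCAST.ENV.NOWCAST_LOGS)/nowcast_checklist.yaml", env))
```

With NOWCAST_LOGS set to /home/me/toy-nowcast (a made-up path), the checklist file value expands to /home/me/toy-nowcast/nowcast_checklist.yaml. Values without a placeholder, like localhost, pass through unchanged.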

supervisord Process Manager Configuration File

It is recommended to run the various long-running nowcast system processes under a process manager. Doing so ensures that the processes will be restarted if they crash, and provides a centralized interface for monitoring and controlling the processes. We’ll use Supervisor, which was installed when you created your Conda Environment.

supervisord uses its own configuration file, written using INI syntax. Create a file called supervisord.ini in your toy-nowcast/ directory with the following contents:

# Example supervisord process manager configuration file
# for a NEMO_Nowcast framework system

# Supervisor daemon and its interfaces
[supervisord]
logfile = %(ENV_NOWCAST_LOGS)s/supervisor.log
pidfile = %(ENV_NOWCAST_LOGS)s/supervisor.pid
childlogdir = %(ENV_NOWCAST_LOGS)s

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[inet_http_server]
# This value must match that used in [supervisorctl]serverurl below
port = localhost:9001


# Supervisor interactive shell tool
[supervisorctl]
# Host and port values here must match those used in [inet_http_server]port above
serverurl = http://localhost:9001
prompt = nowcast-supervisor


# Long-running processes that supervisor manages
# Priority values define process startup order
[program:message_broker]
command = %(ENV_NOWCAST_ENV)s/bin/python3 -m nemo_nowcast.message_broker %(ENV_NOWCAST_YAML)s
priority = 0
autorestart = true

[program:manager]
command = %(ENV_NOWCAST_ENV)s/bin/python3 -m nemo_nowcast.manager %(ENV_NOWCAST_YAML)s
priority = 1
autorestart = true
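
The %(ENV_NOWCAST_LOGS)s style expressions in this file are how Supervisor refers to environment variables: a variable X in the environment in which supervisord is started is available as %(ENV_X)s, using Python string expression syntax. The sketch below mimics that expansion with plain Python %-formatting so that you can see what a line will turn into. The expand_supervisor_value() function is a hypothetical helper for illustration; supervisord does its own expansion when it reads the file.

```python
import os


def expand_supervisor_value(value, environ=None):
    """Expand %(ENV_X)s expressions in value using Python %-formatting.

    Hypothetical helper that imitates Supervisor's expansion for
    illustration; it is not part of Supervisor or NEMO_Nowcast.
    """
    environ = os.environ if environ is None else environ
    # Expose each environment variable X under the key ENV_X
    expansions = {f"ENV_{name}": val for name, val in environ.items()}
    return value % expansions


if __name__ == "__main__":
    env = {
        "NOWCAST_ENV": "/opt/envs/toy-nowcast",
        "NOWCAST_YAML": "/home/me/toy-nowcast/nowcast.yaml",
    }
    command = (
        "%(ENV_NOWCAST_ENV)s/bin/python3 -m nemo_nowcast.manager "
        "%(ENV_NOWCAST_YAML)s"
    )
    print(expand_supervisor_value(command, env))
```

The paths used here are made up. The point is that a missing variable makes the expansion fail, which is why the environment variables have to be exported before supervisord is started.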

next_workers Module

Now we need to create the next_workers module for our system. As described in the Manager section, the next_workers module contains functions that return a sequence of nemo_nowcast.worker.NextWorker objects that specify workers and their command-line arguments that the manager should launch when it receives a completion message from a worker.

We’ll start with a minimal next_workers module. Create a file called next_workers.py in your toy-nowcast/ directory and put the following code in it:

"""Example :py:mod:`next_workers` module.

Functions to calculate lists of workers to launch after previous workers
end their work.

Function names **must** be of the form :py:func:`after_<worker_name>`.
"""
from nemo_nowcast import NextWorker


def after_sleep(msg, config, checklist):
    """Calculate the list of workers to launch after the sleep example worker
    ends.

    :arg msg: Nowcast system message.
    :type msg: :py:func:`collections.namedtuple`

    :arg config: :py:class:`dict`-like object that holds the nowcast system
                 configuration that is loaded from the system configuration
                 file.
    :type config: :py:class:`nemo_nowcast.config.Config`

    :arg dict checklist: System checklist: data structure containing the
                         present state of the nowcast system.

    :returns: Sequence of :py:class:`nemo_nowcast.worker.NextWorker` instances
              for worker(s) to launch next.
    :rtype: list
    """
    next_workers = {
        'crash': [],
        'failure': [],
        'success': [],
    }
    return next_workers[msg.type]

This module provides an after_sleep() function that tells the manager what worker(s) to launch after the sleep worker finishes. The nemo_nowcast.workers.sleep module is an example worker that is included in the NEMO_Nowcast repository. Note that our after_sleep() function always returns an empty list; that is, we’re saying that the manager should not launch another worker. Also note that the 3 keys in the next_workers dict correspond to the 3 message types registered for the sleep worker in our nowcast.yaml file.
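
To see what a non-empty result might look like, here is a sketch of an after_sleep() variant that asks for another worker to be launched when the sleep worker succeeds. So that it runs without the NEMO_Nowcast package installed, it defines simplified stand-ins for the message and NextWorker objects; the stand-ins' field names (source, type, payload, module, args, host) are assumptions for this illustration, so check the Manager section for the real interfaces. The rotate_logs worker module name is the one mentioned in the exercises at the end of this section.

```python
from collections import namedtuple
from dataclasses import dataclass, field

# Stand-in for a nowcast system message (assumed field names)
Message = namedtuple("Message", "source, type, payload")


@dataclass
class NextWorker:
    """Simplified stand-in for nemo_nowcast.worker.NextWorker."""

    module: str
    args: list = field(default_factory=list)
    host: str = "localhost"


def after_sleep(msg, config, checklist):
    """Launch the rotate_logs worker only when sleep reports success."""
    next_workers = {
        "crash": [],
        "failure": [],
        "success": [NextWorker("nemo_nowcast.workers.rotate_logs")],
    }
    return next_workers[msg.type]


if __name__ == "__main__":
    msg = Message(source="sleep", type="success", payload=None)
    print(after_sleep(msg, config={}, checklist={}))
```

In a real system the newly launched worker would presumably also need its own entry in the message registry and its own after_rotate_logs() function in this module, since the manager reports messages from unregistered workers as errors.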

Running the Nowcast System

Our “toy” nowcast system is ready to run. From your toy-nowcast/ directory, start the process manager with the command:

(toy-nowcast) toy-nowcast$ supervisord -c supervisord.ini

We have configured supervisord to send its logging messages, and those from the processes it is managing, to files in the directory pointed to by the NOWCAST_LOGS environment variable. That is $HOME/toy-nowcast/ if you followed the instructions in the Environment Variables section above. The supervisor.log file tells us what supervisord is doing. If you use less or cat to look at it, you should see something like:

2020-05-15 12:12:54,544 INFO RPC interface 'supervisor' initialized
2020-05-15 12:12:54,544 CRIT Server 'inet_http_server' running without any HTTP authentication checking
2020-05-15 12:12:54,545 INFO daemonizing the supervisord process
2020-05-15 12:12:54,545 INFO supervisord started with pid 15937
2020-05-15 12:12:55,548 INFO spawned: 'message_broker' with pid 15974
2020-05-15 12:12:55,550 INFO spawned: 'manager' with pid 15975
2020-05-15 12:12:56,773 INFO success: message_broker entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-05-15 12:12:56,773 INFO success: manager entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

This shows supervisord itself starting up, then spawning processes for our nowcast system’s Message Broker and Manager, and confirming that those processes are running.
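
If you would rather check for the RUNNING confirmations programmatically than by eye, a few lines of Python are enough. This sketch scans log text for the "entered RUNNING state" lines shown above; the running_processes() function and its regular expression are written for this example only and rely on the log wording staying as it appears here.

```python
import re

# Captures the process name from lines like
# "... INFO success: manager entered RUNNING state, ..."
RUNNING = re.compile(r"INFO success: (\S+) entered RUNNING state")


def running_processes(log_text):
    """Return the names of processes that supervisord reported as RUNNING."""
    return RUNNING.findall(log_text)


if __name__ == "__main__":
    sample = (
        "2020-05-15 12:12:55,548 INFO spawned: 'message_broker' with pid 15974\n"
        "2020-05-15 12:12:55,550 INFO spawned: 'manager' with pid 15975\n"
        "2020-05-15 12:12:56,773 INFO success: message_broker entered RUNNING "
        "state, process has stayed up for > than 1 seconds (startsecs)\n"
        "2020-05-15 12:12:56,773 INFO success: manager entered RUNNING "
        "state, process has stayed up for > than 1 seconds (startsecs)\n"
    )
    print(running_processes(sample))
```

To use it on your own system, pass it the contents of the supervisor.log file in your NOWCAST_LOGS directory.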

The logging messages from the Message Broker and Manager processes are written to their own files in the same directory. Those files have names like:

  • message_broker-stdout---supervisor-1_p3jss7.log

  • manager-stdout---supervisor-5x6e5ryj.log

The 8 characters between ---supervisor and .log are randomly generated each time supervisord is started. At this point, those files contain the startup messages from those processes:

2020-05-15 12:12:55,769 INFO [message_broker] running in process 15974
2020-05-15 12:12:55,769 INFO [message_broker] read config from /home/doug/toy-nowcast/nowcast.yaml
2020-05-15 12:12:55,769 INFO [message_broker] writing logging messages to local file system
2020-05-15 12:12:55,769 INFO [message_broker] worker socket bound to port 4344
2020-05-15 12:12:55,769 INFO [message_broker] manager socket bound to port 4343

from the message_broker, and:

2020-05-15 12:12:55,770 INFO [manager] running in process 15975
2020-05-15 12:12:55,770 INFO [manager] read config from /home/doug/toy-nowcast/nowcast.yaml
2020-05-15 12:12:55,770 INFO [manager] writing logging messages to local file system
2020-05-15 12:12:55,771 INFO [manager] next workers module loaded from next_workers
2020-05-15 12:12:55,771 INFO [manager] connected to localhost port 4343
2020-05-15 12:12:55,771 WARNING [manager] checklist load failed:
Traceback (most recent call last):
  File "/home/doug/NEMO_Nowcast/nemo_nowcast/manager.py", line 253, in _load_checklist
    with open(checklist_file, "rt") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/doug/toy-nowcast/nowcast_checklist.yaml'
2020-05-15 12:12:55,772 WARNING [manager] running with empty checklist
2020-05-15 12:12:55,772 DEBUG [manager] listening...

from the manager. The latter tries to initialize the state of the system by reading from the nowcast_checklist.yaml file and warns us that it can’t find that file; not surprising since this is the first time the system has been launched. Finally, it tells us that it has gone into its default state of listening for messages from workers.

You can shut the system down with the command

(toy-nowcast)$ supervisorctl -c supervisord.ini shutdown

but leave it running so that we can play with the sleep worker.

Running the sleep Worker

Start another terminal session, activate your toy-nowcast Conda Environment in it, and set up the Environment Variables:

$ cd toy-nowcast
$ source activate toy-nowcast
(toy-nowcast)$ export NOWCAST_ENV=$CONDA_PREFIX
(toy-nowcast)$ export NOWCAST_LOGS=$HOME/toy-nowcast/
(toy-nowcast)$ export NOWCAST_YAML=$HOME/toy-nowcast/nowcast.yaml

Now you can run the sleep worker with:

(toy-nowcast)$ python3 -m nemo_nowcast.workers.sleep $NOWCAST_YAML

You should see logging messages that look like:

2020-05-15 15:21:24,532 INFO [sleep] running in process 10011
2020-05-15 15:21:24,532 INFO [sleep] read config from /home/doug/toy-nowcast/nowcast.yaml
2020-05-15 15:21:24,532 INFO [sleep] writing log messages to local file system
2020-05-15 15:21:24,532 INFO [sleep] connected to localhost port 4344
2020-05-15 15:21:29,538 INFO [sleep] slept for 5 seconds
2020-05-15 15:21:29,539 DEBUG [sleep] sent message: (success) sleep worker slept well
2020-05-15 15:21:29,547 DEBUG [sleep] received message from manager: (ack) message acknowledged
2020-05-15 15:21:29,547 DEBUG [sleep] shutting down

with a 5 second long pause in the middle.
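
The layout of those lines (timestamp, level, name in square brackets, message) comes from the simple formatter in the logging section of our nowcast.yaml file. That section follows the dictionary schema of Python’s standard logging.config module. As a sketch, assuming the section is handed to logging.config.dictConfig() more or less unchanged, the same settings written as a Python dict produce the same style of output:

```python
import logging
import logging.config

# The logging section of nowcast.yaml, transcribed as a Python dict
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {
            "format": "%(asctime)s %(levelname)s [%(name)s] %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        },
    },
    "root": {"level": "DEBUG", "handlers": ["console"]},
}

logging.config.dictConfig(LOGGING)

# A logger named "sleep" gives the [sleep] tag seen in the worker output
logger = logging.getLogger("sleep")
logger.info("slept for %d seconds", 5)
```

Changing the format string, or the level values, in nowcast.yaml is therefore the way to change what the workers and the manager print.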

If you look at the manager log file again you should see additional logging messages that look like:

2020-05-15 15:21:29,541 DEBUG [manager] received message from sleep: (success) sleep worker slept well
2020-05-15 15:21:29,542 INFO [manager] checklist updated with [sleepyhead] items from sleep worker
2020-05-15 15:21:29,550 DEBUG [manager] listening...
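
The exchange in these logs follows from the message registry section of nowcast.yaml: the manager accepts a message only if its sender and its type are registered, stores any payload under the worker’s checklist key, and replies with an acknowledgement. The sketch below imitates that decision with plain dicts. It is a simplified illustration under assumptions, not the manager’s actual code: the handle_message() function, the rule of updating the checklist only when a payload is present, and the example payload are all made up for this example.

```python
# The relevant parts of the message registry from nowcast.yaml
MESSAGE_REGISTRY = {
    "manager": {
        "ack": "message acknowledged",
        "unregistered worker": "ERROR - message received from unregistered worker",
        "unregistered message type": (
            "ERROR - unregistered message type received from worker"
        ),
    },
    "workers": {
        "sleep": {
            "checklist key": "sleepyhead",
            "success": "sleep worker slept well",
            "failure": "sleep worker slept badly",
            "crash": "sleep worker crashed",
        },
    },
}


def handle_message(source, msg_type, payload, checklist, registry=MESSAGE_REGISTRY):
    """Return the manager's reply type, updating checklist when appropriate.

    Hypothetical, simplified stand-in for the manager's message handling.
    """
    workers = registry["workers"]
    if source not in workers:
        return "unregistered worker"
    # "checklist key" is a setting, not a message type
    if msg_type == "checklist key" or msg_type not in workers[source]:
        return "unregistered message type"
    if payload is not None:
        checklist[workers[source]["checklist key"]] = payload
    return "ack"


if __name__ == "__main__":
    checklist = {}
    reply = handle_message("sleep", "success", {"slept": 5}, checklist)
    print(reply, MESSAGE_REGISTRY["manager"][reply], checklist)
```

A message from a worker that is missing from the workers section, or of a type that is not listed for that worker, gets one of the ERROR replies instead of ack.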

You can use tail (perhaps with its -f option) to see the end of the log files, or you can use supervisorctl to find and show you the tail of the log file for any of the processes it is managing:

(toy-nowcast)$ supervisorctl -c supervisord.ini tail manager
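
Because the log file names contain randomly generated characters, finding the right file by hand means listing the directory first. If you want to do the same thing from Python, a sketch like the following finds the most recently modified log file for a process and returns its last few lines. The tail_log() function is a hypothetical helper for this example, and it assumes the file naming pattern shown earlier in this section.

```python
import os
from pathlib import Path


def tail_log(log_dir, process, n_lines=10):
    """Return the last n_lines of the newest stdout log file for process.

    Hypothetical helper; assumes names like
    manager-stdout---supervisor-XXXXXXXX.log in log_dir.
    """
    if n_lines <= 0:
        return []
    pattern = f"{process}-stdout---supervisor-*.log"
    # Sort by modification time so that the newest file comes last
    candidates = sorted(
        Path(log_dir).glob(pattern), key=lambda path: path.stat().st_mtime
    )
    if not candidates:
        return []
    return candidates[-1].read_text().splitlines()[-n_lines:]


if __name__ == "__main__":
    for line in tail_log(os.environ.get("NOWCAST_LOGS", "."), "manager"):
        print(line)
```

It returns an empty list when no matching file exists, for example before supervisord has been started for the first time.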

TODO:

  • exercises:

    • experiment with running sleep worker with --sleep-time and/or --debug command-line flags

    • run supervisorctl in a 3rd terminal session

    • status

    • add after_rotate_logs() function to next_workers module

    • run python -m nemo_nowcast.workers.rotate_logs nowcast.yaml

    • add rotate_logs worker to after_awaken function, run sleep worker again