.. _usage:

=======
 Usage
=======

Assumptions about cluster zones
===============================

A `k8s zone`_ is a set of cluster nodes that share the same value of the
`k8s label`_ key ``topology.kubernetes.io/zone``. For example, the nodes of
zone ``data-a`` are:

.. code-block:: console

   $ oc get nodes -l topology.kubernetes.io/zone=data-a
   NAME              STATUS   ROLES    AGE     VERSION
   compute-0         Ready    worker   7d14h   v1.20.0+bafe72f
   compute-1         Ready    worker   7d14h   v1.20.0+bafe72f
   compute-2         Ready    worker   7d14h   v1.20.0+bafe72f
   control-plane-0   Ready    master   7d14h   v1.20.0+bafe72f

We assume that there are 3 zones in the cluster and that every node belongs to
some zone, e.g.:

.. code-block:: console

   $ oc get nodes -L topology.kubernetes.io/zone
   NAME              STATUS   ROLES    AGE   VERSION           ZONE
   compute-0         Ready    worker   8d    v1.20.0+bafe72f   data-a
   compute-1         Ready    worker   8d    v1.20.0+bafe72f   data-a
   compute-2         Ready    worker   8d    v1.20.0+bafe72f   data-a
   compute-3         Ready    worker   8d    v1.20.0+bafe72f   data-b
   compute-4         Ready    worker   8d    v1.20.0+bafe72f   data-b
   compute-5         Ready    worker   8d    v1.20.0+bafe72f   data-b
   control-plane-0   Ready    master   8d    v1.20.0+bafe72f   data-a
   control-plane-1   Ready    master   8d    v1.20.0+bafe72f   data-b
   control-plane-2   Ready    master   8d    v1.20.0+bafe72f   arbiter

There is no limitation on the design of cluster zones or their names
(values of the ``topology.kubernetes.io/zone`` label key). ocp-network-split
refers to zones by single letter names (such as ``a``, ``b`` ... see
:py:const:`ocpnetsplit.zone.ZONES`), so you only need to
create a mapping between ocp-network-split names and actual zone names, as
shown in the following sections.
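
To make the idea of the mapping more concrete, here is a minimal sketch in
plain Python. It does not use ocp-network-split itself: the dictionaries and
the ``nodes_by_letter`` helper are hypothetical and only illustrate how single
letter names could relate to the zone labels from the example listing above.

.. code-block:: python

    # Hypothetical illustration, not part of ocp-network-split.
    # Node names and zone labels are copied from the example listing above.
    NODE_ZONES = {
        "compute-0": "data-a",
        "compute-1": "data-a",
        "compute-2": "data-a",
        "compute-3": "data-b",
        "compute-4": "data-b",
        "compute-5": "data-b",
        "control-plane-0": "data-a",
        "control-plane-1": "data-b",
        "control-plane-2": "arbiter",
    }

    # Assumed mapping from single letter names to actual zone names.
    ZONE_NAMES = {"a": "arbiter", "b": "data-a", "c": "data-b"}


    def nodes_by_letter(node_zones, zone_names):
        """Group node names under the single letter name of their zone."""
        label_to_letter = {label: letter for letter, label in zone_names.items()}
        grouped = {letter: [] for letter in zone_names}
        for node, label in sorted(node_zones.items()):
            grouped[label_to_letter[label]].append(node)
        return grouped


    for letter, nodes in nodes_by_letter(NODE_ZONES, ZONE_NAMES).items():
        print(letter, nodes)

With this mapping, zone ``a`` contains just the arbiter node, while zones
``b`` and ``c`` contain 4 nodes each.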

.. _`k8s zone`: https://kubernetes.io/docs/reference/labels-annotations-taints/#topologykubernetesiozone
.. _`k8s label`: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

External zone
=============

Besides normal cluster zones, there is a special zone ``x`` which represents
external services running outside of the cluster. Specifying a list of IP
addresses for the ``x`` zone allows you to block traffic to these IP addresses
in both directions later.

Command line tools
==================

There are 2 command line tools:

- ``ocp-network-split-setup``: based on the given zone name assignment, it
  fetches the IP addresses of all nodes in every zone (to create an env file
  with the zone configuration), and creates a ``MachineConfig`` yaml file which
  deploys the zone configuration along with the firewall script and systemd
  unit files to every node of the cluster. This is done only once.

- ``ocp-network-split-sched``: schedules a given network split configuration,
  which will start at a given time and stop after a given number of minutes.

Let's have a look at what the zone configuration generated by the setup script
looks like (the example also shows how to define the zone name mapping):

.. code-block:: console

   $ ocp-network-split-setup -a arbiter -b data-a -c data-b --print-env-only
   ZONE_A="10.1.160.36"
   ZONE_B="10.1.160.127 10.1.160.158 10.1.160.160 10.1.160.163"
   ZONE_C="10.1.160.103 10.1.160.162 10.1.160.65 10.1.160.98"
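
If you want to sanity check this output in a script, the following sketch
shows one possible way to parse it. The ``parse_zone_env`` helper is
hypothetical (it is not provided by ocp-network-split) and assumes the env
format is exactly as printed above: one ``ZONE_<LETTER>`` variable per line,
with a double quoted, space separated list of IP addresses.

.. code-block:: python

    import ipaddress

    # Sample text copied from the --print-env-only output shown above.
    ENV_TEXT = """\
    ZONE_A="10.1.160.36"
    ZONE_B="10.1.160.127 10.1.160.158 10.1.160.160 10.1.160.163"
    ZONE_C="10.1.160.103 10.1.160.162 10.1.160.65 10.1.160.98"
    """


    def parse_zone_env(text):
        """Return a dict mapping zone letter to a list of IP address strings."""
        zones = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            if not key.startswith("ZONE_"):
                raise ValueError(f"unexpected variable: {key}")
            addresses = value.strip().strip('"').split()
            for address in addresses:
                # raises ValueError if the string is not a valid IP address
                ipaddress.ip_address(address)
            zones[key[len("ZONE_"):].lower()] = addresses
        return zones


    for letter, addresses in parse_zone_env(ENV_TEXT).items():
        print(letter, len(addresses))

For the cluster from the previous section, the number of addresses per zone
should match the number of nodes carrying the given zone label.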

If the zone configuration looks good, we can go on and create the
``MachineConfig`` yaml file, which you can inspect as well:

.. code-block:: console

    $ ocp-network-split-setup -a arbiter -b data-a -c data-b -o network-split.yaml
    $ head network-split.yaml
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: master
      name: 99-master-network-split
    spec:
      config:
        ignition:
          version: 3.1.0

Then you can use ``oc create`` to deploy the configuration:

.. code-block:: console

    $ oc create -f network-split.yaml
    machineconfig.machineconfiguration.openshift.io/99-master-network-split created
    machineconfig.machineconfiguration.openshift.io/99-worker-network-split created

When the machine config is applied (use ``oc get mcp`` to check that both pools
are updated), we can schedule a 5 minute long network split with a particular
configuration ``ab`` (cutting the connection between zones ``a`` and ``b``) at
a given time:

.. code-block:: console

    $ ocp-network-split-sched ab -t 2021-04-09T16:30 --split-len 5

When the time details are omitted, the sched script just lists network split
timers for the given split configuration on all nodes. In the following
example, we can see that one split was scheduled 26 minutes ago, while another
is going to happen in about 4 minutes:

.. code-block:: console

    $ ocp-network-split-sched ab
    node/compute-0
    NEXT                         LEFT          LAST                         PASSED    UNIT                                    ACTIVATES
    Fri 2021-04-09 14:30:00 UTC  3min 50s left n/a                          n/a       network-split-ab-setup@1617978600.timer network-split@ab.service
    n/a                          n/a           Fri 2021-04-09 14:00:00 UTC  26min ago network-split-ab-setup@1617976800.timer network-split@ab.service
    
    node/compute-1
    NEXT                         LEFT          LAST                         PASSED    UNIT                                    ACTIVATES
    Fri 2021-04-09 14:30:00 UTC  3min 48s left n/a                          n/a       network-split-ab-setup@1617978600.timer network-split@ab.service
    n/a                          n/a           Fri 2021-04-09 14:00:00 UTC  26min ago network-split-ab-setup@1617976800.timer network-split@ab.service
    
    ... rest of the output is omitted ...
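
In the sample output above, the number after ``@`` in the timer unit name
appears to be a Unix timestamp of the scheduled time (``1617978600``
corresponds to 2021-04-09 14:30:00 UTC). Assuming this naming pattern holds,
which is an observation based on the output rather than a documented
guarantee, a timer unit name could be decoded as sketched below. The
``parse_timer_unit`` helper is hypothetical.

.. code-block:: python

    import re
    from datetime import datetime, timezone

    # Pattern inferred from the unit names in the sample output above.
    TIMER_RE = re.compile(
        r"^network-split-(?P<split>[a-z]+)-(?P<action>[a-z]+)@(?P<ts>\d+)\.timer$"
    )


    def parse_timer_unit(unit):
        """Return (split name, action, UTC datetime) for a timer unit name."""
        match = TIMER_RE.match(unit)
        if match is None:
            raise ValueError(f"unexpected timer unit name: {unit}")
        when = datetime.fromtimestamp(int(match["ts"]), tz=timezone.utc)
        return match["split"], match["action"], when


    split, action, when = parse_timer_unit("network-split-ab-setup@1617978600.timer")
    print(split, action, when)  # ab setup 2021-04-09 14:30:00+00:00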

You can schedule multiple splits in advance, or wait for one network split to
end before going on with another one.

Python API
==========

To use ocp-network-split in your Python test script, see the functions in
module :py:mod:`ocpnetsplit.main`, which provides the public API and the
implementation of the command line tools referenced in the previous section.

Quick high level overview of the API usage:

- Generate a list of dictionaries representing the content of the
  ``MachineConfig`` yaml (which contains the network split script and unit
  files) using :py:func:`ocpnetsplit.main.get_zone_config` and
  :py:func:`ocpnetsplit.main.get_networksplit_mc_spec`.
- Deploy the ``MachineConfig`` generated in the previous step and wait for the
  configuration to be applied on all nodes. This needs to be done only once.
- Pick the desired network split configuration from
  :py:const:`ocpnetsplit.zone.NETWORK_SPLITS`.
- Schedule the selected network split disruption via
  :py:func:`ocpnetsplit.main.schedule_split`. This defines 2 timers
  on each node, one to start the disruption and another one to stop it.
- Wait for the 1st timer to trigger setup of the network split.
- Wait for the 2nd timer to trigger teardown, restoring the original network
  configuration.
- Optionally schedule another network split.
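
The two waiting steps above come down to simple time arithmetic. The
following sketch does not call ocp-network-split at all: the
``split_timeline`` and ``seconds_until`` helpers are hypothetical and only
show how a test script could compute when the disruption starts and stops,
assuming the split starts at the scheduled time and stops after the given
number of minutes.

.. code-block:: python

    from datetime import datetime, timedelta, timezone


    def split_timeline(start, split_len_minutes):
        """Return (start, stop) datetimes of a network split disruption."""
        if split_len_minutes <= 0:
            raise ValueError("split length must be a positive number of minutes")
        return start, start + timedelta(minutes=split_len_minutes)


    def seconds_until(moment, now):
        """Return how many seconds to wait until moment (never negative)."""
        return max(0.0, (moment - now).total_seconds())


    # Values based on the command line example: 5 minute split at 14:30 UTC.
    start, stop = split_timeline(
        datetime(2021, 4, 9, 14, 30, tzinfo=timezone.utc), split_len_minutes=5
    )
    now = datetime(2021, 4, 9, 14, 26, 10, tzinfo=timezone.utc)
    print(seconds_until(start, now))  # 230.0
    print(seconds_until(stop, now))   # 530.0

A real test would likely add some safety margin to both waits, since the
timers on the nodes and the clock of the machine running the test may differ
slightly.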
