Metadata-Version: 2.1
Name: BigQuery-DatasetManager
Version: 0.1.1
Summary: BigQuery-DatasetManager is a simple file-based CLI management tool for BigQuery Datasets.
Home-page: https://github.com/laughingman7743/BigQuery-DatasetManager/
Author: laughingman7743
Author-email: laughingman7743@gmail.com
License: MIT License
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Database
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Requires-Dist: future
Requires-Dist: pytz
Requires-Dist: click (>=6.0)
Requires-Dist: PyYAML (>=3.12)
Requires-Dist: google-cloud-bigquery (==1.1.0)
Requires-Dist: python-dateutil (>=2.7.0)
Requires-Dist: enum34; python_version <= "3.3"

.. image:: https://img.shields.io/pypi/pyversions/BigQuery-DatasetManager.svg
    :target: https://pypi.python.org/pypi/BigQuery-DatasetManager/

.. image:: https://travis-ci.org/laughingman7743/BigQuery-DatasetManager.svg?branch=master
    :target: https://travis-ci.org/laughingman7743/BigQuery-DatasetManager

.. image:: https://codecov.io/gh/laughingman7743/BigQuery-DatasetManager/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/laughingman7743/BigQuery-DatasetManager

.. image:: https://img.shields.io/pypi/l/BigQuery-DatasetManager.svg
    :target: https://github.com/laughingman7743/BigQuery-DatasetManager/blob/master/LICENSE


BigQuery-DatasetManager
=======================

BigQuery-DatasetManager is a simple file-based CLI management tool for `BigQuery Datasets`_.

.. _`BigQuery Datasets`: https://cloud.google.com/bigquery/docs/datasets

Requirements
------------

* Python

  - CPython 2,7, 3,4, 3.5, 3.6

Installation
------------

.. code:: bash

    $ pip install BigQuery-DatasetManager

Resource representation
-----------------------

The resource representation of the dataset and the table is described in `YAML format`_.

.. _`YAML format`: http://www.yaml.org/

Dataset
~~~~~~~

.. code:: yaml

    name: dataset1
    friendly_name: null
    description: null
    default_table_expiration_ms: null
    location: US
    access_entries:
    -   role: OWNER
        entity_type: specialGroup
        entity_id: projectOwners
    -   role: WRITER
        entity_type: specialGroup
        entity_id: projectWriters
    -   role: READER
        entity_type: specialGroup
        entity_id: projectReaders
    -   role: OWNER
        entity_type: userByEmail
        entity_id: aaa@bbb.gserviceaccount.com
    -   role: null
        entity_type: view
        entity_id:
            datasetId: view1
            projectId: project1
            tableId: table1
    labels:
        foo: bar

+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| Key name                                 | Value   | Description                                                        |
+================+=============+===========+=========+====================================================================+
| dataset_id                               | str     | ID of the dataset.                                                 |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| friendly_name                            | str     | Title of the dataset.                                              |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| description                              | str     | Description of the dataset.                                        |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| default_table_expiration_ms              | int     | Default expiration time for tables in the dataset.                 |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| location                                 | str     | Location in which the dataset is hosted.                           |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| access_entries                           | seq     | Represents grant of an access role to an entity.                   |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| access_entries | role                    | str     | Role granted to the entity.                                        |
|                |                         |         | The following string values are supported:                         |
|                |                         |         |                                                                    |
|                |                         |         | * ``OWNER``                                                        |
|                |                         |         | * ``WRITER``                                                       |
|                |                         |         | * ``READER``                                                       |
|                |                         |         |                                                                    |
|                |                         |         | It may also be ``null`` if the ``entity_type`` is ``view``.        |
+                +-------------+-----------+---------+--------------------------------------------------------------------+
|                | entity_type             | str     | Type of entity being granted the role. One of                      |
|                |                         |         |                                                                    |
|                |                         |         | * ``userByEmail``                                                  |
|                |                         |         | * ``groupByEmail``                                                 |
|                |                         |         | * ``domain``                                                       |
|                |                         |         | * ``specialGroup``                                                 |
|                |                         |         | * ``view``                                                         |
+                +-------------+-----------+---------+--------------------------------------------------------------------+
|                | entity_id   |           | str/map | If the ``entity_type`` is not 'view', the ``entity_id`` is the     |
|                |             |           |         | ``str`` ID of the entity being granted the role. If the            |
|                |             |           |         | ``entity_type`` is 'view', the ``entity_id`` is a ``dict``         |
|                |             |           |         | representing the view from a different dataset to grant access to. |
+                +             +-----------+---------+--------------------------------------------------------------------+
|                |             | datasetId | str     | ID of the dataset containing this table.                           |
|                |             |           |         | (Specifies when ``entity_type`` is ``view``.)                      |
+                +             +-----------+---------+--------------------------------------------------------------------+
|                |             | projectId | str     | ID of the project containing this table.                           |
|                |             |           |         | (Specifies when ``entity_type`` is ``view``.)                      |
+                +             +-----------+---------+--------------------------------------------------------------------+
|                |             | tableId   | str     | ID of the table.                                                   |
|                |             |           |         | (Specifies when ``entity_type`` is ``view``.)                      |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+
| labels                                   | map     | Labels for the dataset.                                            |
+----------------+-------------+-----------+---------+--------------------------------------------------------------------+

NOTE: See `the official documentation of BigQuery Datasets`_ for details of key names.

.. _`the official documentation of BigQuery Datasets`: https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets

Table
~~~~~

.. code:: yaml

    table_id: table1
    friendly_name: null
    description: null
    expires: null
    partitioning_type: null
    view_use_legacy_sql: null
    view_query: null
    schema:
    -   name: column1
        field_type: STRING
        mode: REQUIRED
        description: null
        fields: null
    -   name: column2
        field_type: RECORD
        mode: NULLABLE
        description: null
        fields:
        -   name: column2_1
            field_type: STRING
            mode: NULLABLE
            description: null
            fields: null
        -   name: column2_2
            field_type: INTEGER
            mode: NULLABLE
            description: null
            fields: null
        -   name: column2_3
            field_type: RECORD
            mode: REPEATED
            description: null
            fields:
            -   name: column2_3_1
                field_type: BOOLEAN
                mode: NULLABLE
                description: null
                fields: null
    labels:
        foo: bar

.. code:: yaml

    table_id: view1
    friendly_name: null
    description: null
    expires: null
    partitioning_type: null
    view_use_legacy_sql: false
    view_query: |
        select
        *
        from
        `project1.dataset1.table1`
    schema: null
    labels: null

+----------------+--------------+-------+-------------------------------------------------------------------------------+
| Key name                      | Value | Description                                                                   |
+================+==============+=======+===============================================================================+
| table_id                      | str   | ID of the table.                                                              |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| friendly_name                 | str   | Title of the table.                                                           |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| description                   | str   | Description of the table.                                                     |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| expires                       | str   | Datetime at which the table will be deleted.                                  |
|                               |       | (ISO8601 format ``%Y-%m-%dT%H:%M:%S.%f%z``)                                   |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| partitioning_type             | str   | Time partitioning of the table if it is partitioned.                          |
|                               |       | The only partitioning type that is currently supported is ``DAY``.            |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| view_use_legacy_sql           | bool  | Specifies whether to use BigQuery's legacy SQL for this view.                 |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| view_query                    | str   | SQL query defining the table as a view.                                       |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| schema                        | seq   | The schema of the table destination for the row.                              |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| schema         | name         | str   | The name of the field.                                                        |
+                +--------------+-------+-------------------------------------------------------------------------------+
|                | field_type   | str   | The type of the field. One of                                                 |
|                |              |       |                                                                               |
|                |              |       | * ``STRING``                                                                  |
|                |              |       | * ``BYTES``                                                                   |
|                |              |       | * ``INTEGER``                                                                 |
|                |              |       | * ``INT64`` (same as INTEGER)                                                 |
|                |              |       | * ``FLOAT``                                                                   |
|                |              |       | * ``FLOAT64`` (same as FLOAT)                                                 |
|                |              |       | * ``BOOLEAN``                                                                 |
|                |              |       | * ``BOOL`` (same as BOOLEAN)                                                  |
|                |              |       | * ``TIMESTAMP``                                                               |
|                |              |       | * ``DATE``                                                                    |
|                |              |       | * ``TIME``                                                                    |
|                |              |       | * ``DATETIME``                                                                |
|                |              |       | * ``RECORD`` (where RECORD indicates that the field contains a nested schema) |
|                |              |       | * ``STRUCT`` (same as RECORD)                                                 |
+                +--------------+-------+-------------------------------------------------------------------------------+
|                | mode         | str   | The mode of the field. One of                                                 |
|                |              |       |                                                                               |
|                |              |       | * ``NULLABLE``                                                                |
|                |              |       | * ``REQUIRED``                                                                |
|                |              |       | * ``REPEATED``                                                                |
+                +--------------+-------+-------------------------------------------------------------------------------+
|                | description  | str   | Description for the field.                                                    |
+                +--------------+-------+-------------------------------------------------------------------------------+
|                | fields       | seq   | Describes the nested schema fields if the type property is set to ``RECORD``. |
+----------------+--------------+-------+-------------------------------------------------------------------------------+
| labels                        | map   | Labels for the table.                                                         |
+----------------+--------------+-------+-------------------------------------------------------------------------------+

NOTE: See `the official documentation of BigQuery Tables`_ for details of key names.

.. _`the official documentation of BigQuery Tables`: https://cloud.google.com/bigquery/docs/reference/rest/v2/tables

Directory structure
~~~~~~~~~~~~~~~~~~~

.. code::

    .
    ├── dataset1        # Directory storing the table configuration file of dataset1.
    │   ├── table1.yml  # Configuration file of table1 in dataset1.
    │   └── table2.yml  # Configuration file of table2 in dataset1.
    ├── dataset1.yml    # Configuration file of dataset1.
    ├── dataset2        # Directory storing the table configuration file of dataset2.
    │   └── .gitkeep    # When keeping a directory, dataset2 is empty.
    ├── dataset2.yml    # Configuration file of dataset2.
    └── dataset3.yml    # Configuration file of dataset3. This dataset does not manage the table.

NOTE: If you do not want to manage the table, delete the directory with the same name as the dataset name.

Usage
-----

.. code::

    Usage: bqdm [OPTIONS] COMMAND [ARGS]...

    Options:
      -c, --credential-file PATH  Location of credential file for service accounts.
      -p, --project TEXT          Project ID for the project which you’d like to manage with.
      --color / --no-color        Enables output with coloring.
      --parallelism INTEGER       Limit the number of concurrent operation.
      --debug                     Debug output management.
      -h, --help                  Show this message and exit.

    Commands:
      apply    Builds or changes datasets.
      destroy  Specify subcommand `plan` or `apply`
      export   Export existing datasets into file in YAML format.
      plan     Generate and show an execution plan.

Export
~~~~~~

.. code::

    Usage: bqdm export [OPTIONS] [OUTPUT_DIR]

      Export existing datasets into file in YAML format.

    Options:
      -d, --dataset TEXT          Specify the ID of the dataset to manage.
      -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
      -h, --help                  Show this message and exit.

Plan
~~~~

.. code::

    Usage: bqdm plan [OPTIONS] [CONF_DIR]

      Generate and show an execution plan.

    Options:
      --detailed_exitcode         Return a detailed exit code when the command exits.
                                  When provided, this argument changes
                                  the exit codes and their meanings to provide
                                  more granular information about what the
                                  resulting plan contains:
                                  0 = Succeeded with empty diff
                                  1 = Error
                                  2 = Succeeded with non-
                                  empty diff
      -d, --dataset TEXT          Specify the ID of the dataset to manage.
      -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
      -h, --help                  Show this message and exit.

Apply
~~~~~

.. code::

    Usage: bqdm apply [OPTIONS] [CONF_DIR]

      Builds or changes datasets.

    Options:
      -d, --dataset TEXT              Specify the ID of the dataset to manage.
      -e, --exclude-dataset TEXT      Specify the ID of the dataset to exclude from managed.
      -m, --mode [select_insert|select_insert_backup|replace|replace_backup|drop_create|drop_create_backup]
                                      Specify the migration mode when changing the schema.
                                      Choice from `select_insert`,
                                      `select_insert_backup`, `replace`, r`eplace_backup`,
                                      `drop_create`,
                                      `drop_create_backup`.  [required]
      -b, --backup-dataset TEXT       Specify the ID of the dataset to store the backup at migration
      -h, --help                      Show this message and exit.

NOTE: See `migration mode`_

Destroy
~~~~~~~

.. code::

    Usage: bqdm destroy [OPTIONS] COMMAND [ARGS]...

      Specify subcommand `plan` or `apply`

    Options:
      -h, --help  Show this message and exit.

    Commands:
      apply  Destroy managed datasets.
      plan   Generate and show an execution plan for...

Destroy plan
^^^^^^^^^^^^

.. code::

    Usage: bqdm destroy plan [OPTIONS] [CONF_DIR]

      Generate and show an execution plan for datasets destruction.

    Options:
      --detailed-exitcode         Return a detailed exit code when the command exits.
                                  When provided, this argument changes
                                  the exit codes and their meanings to provide
                                  more granular information about what the
                                  resulting plan contains:
                                  0 = Succeeded with empty diff
                                  1 = Error
                                  2 = Succeeded with non-
                                  empty diff
      -d, --dataset TEXT          Specify the ID of the dataset to manage.
      -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
      -h, --help                  Show this message and exit.

Destroy apply
^^^^^^^^^^^^^

.. code::

    Usage: bqdm destroy apply [OPTIONS] [CONF_DIR]

      Destroy managed datasets.

    Options:
      -d, --dataset TEXT          Specify the ID of the dataset to manage.
      -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
      -h, --help                  Show this message and exit.

Migration mode
--------------

select_insert
~~~~~~~~~~~~~

#. TODO

LIMITATIONS: TODO

select_insert_backup
~~~~~~~~~~~~~~~~~~~~

#. TODO

LIMITATIONS: TODO

replace
~~~~~~~

#. TODO

LIMITATIONS: TODO

replace_backup
~~~~~~~~~~~~~~

#. TODO

LIMITATIONS: TODO

drop_create
~~~~~~~~~~~

#. TODO

drop_create_backup
~~~~~~~~~~~~~~~~~~

#. TODO

Authentication
--------------

See `authentication section`_ in the official documentation of ``google-cloud-python``.

    If you're running in Compute Engine or App Engine, authentication should "just work".

    If you're developing locally, the easiest way to authenticate is using the Google Cloud SDK:

    .. code:: bash

        $ gcloud auth application-default login

    Note that this command generates credentials for client libraries. To authenticate the CLI itself, use:

    .. code:: bash

        $ gcloud auth login

    Previously, gcloud auth login was used for both use cases. If your gcloud installation does not support the new command, please update it:

    .. code:: bash

        $ gcloud components update

    If you're running your application elsewhere, you should download a service account JSON keyfile and point to it using an environment variable:

    .. code:: bash

        $ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/keyfile.json"

.. _`authentication section`: https://google-cloud-python.readthedocs.io/en/latest/core/auth.html#overview

Testing
-------

Depends on the following environment variables:

.. code:: bash

    $ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
    $ export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID

Run test
~~~~~~~~

.. code:: bash

    $ pip install pipenv
    $ pipenv install --dev
    $ pipenv run pytest

Run test multiple Python versions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: bash

    $ pip install pipenv
    $ pipenv install --dev
    $ pyenv local 3.6.5 3.5.5 3.4.8 2.7.14
    $ pipenv run tox

TODO
----

#. Support encryption configuration for table
#. Support external data configuration for table
#. Schema replication
#. Integration tests


