Metadata-Version: 2.4
Name: arrakis-schema
Version: 0.1.2
Summary: Schemas for the Arrakis API
Project-URL: Homepage, https://git.ligo.org/ngdd/arrakis-schema
Project-URL: Documentation, https://docs.ligo.org/ngdd/arrakis-schema
Project-URL: Issue Tracker, https://git.ligo.org/ngdd/arrakis-schema/issues
Project-URL: Source Code, https://git.ligo.org/ngdd/arrakis-schema.git
Author-email: Patrick Godwin <patrick.godwin@ligo.org>, Jameson Graef Rollins <jameson.rollins@ligo.org>
Maintainer-email: Patrick Godwin <patrick.godwin@ligo.org>
License-Expression: LGPL-3.0-or-later
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 or later (LGPLv3+)
Classifier: Natural Language :: English
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Astronomy
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: >=3.10
Requires-Dist: importlib-resources>=5.10; python_version < '3.12'
Provides-Extra: dev
Requires-Dist: arrakis-schema[docs]; extra == 'dev'
Requires-Dist: arrakis-schema[lint]; extra == 'dev'
Requires-Dist: arrakis-schema[test]; extra == 'dev'
Provides-Extra: docs
Requires-Dist: markdown-callouts>=0.2; extra == 'docs'
Requires-Dist: markdown-exec>=0.5; extra == 'docs'
Requires-Dist: mkdocs-coverage>=0.2; extra == 'docs'
Requires-Dist: mkdocs-gen-files>=0.3; extra == 'docs'
Requires-Dist: mkdocs-literate-nav>=0.4; extra == 'docs'
Requires-Dist: mkdocs-material-igwn; extra == 'docs'
Requires-Dist: mkdocs-section-index>=0.3; extra == 'docs'
Requires-Dist: mkdocs>=1.3; extra == 'docs'
Requires-Dist: mkdocstrings[python]; extra == 'docs'
Requires-Dist: toml>=0.10; extra == 'docs'
Provides-Extra: lint
Requires-Dist: mypy; extra == 'lint'
Requires-Dist: mypy-extensions; extra == 'lint'
Requires-Dist: pip; extra == 'lint'
Requires-Dist: ruff; extra == 'lint'
Provides-Extra: test
Requires-Dist: pytest; extra == 'test'
Requires-Dist: pytest-cov; extra == 'test'
Description-Content-Type: text/markdown

# Arrakis Schema Specification

This repository defines the schema for all data, metadata and API requests for
the `arrakis` LIGO data distribution system.

Schemas are defined in one of two places:

* `endpoints/`: Endpoints served by the Arrakis server
* `publication/`: Data published into Kafka

## Endpoints

The Arrakis server responds to API requests corresponding to the four
main actions exposed by the client API:

* **stream**: [endpoints/stream](endpoints/stream)
* **describe**: [endpoints/describe](endpoints/describe)
* **find**: [endpoints/find](endpoints/find)
* **count**: [endpoints/count](endpoints/count)

as well as two actions which aid in publication:

* **partition**: [endpoints/partition](endpoints/partition)
* **publish**: [endpoints/publish](endpoints/publish)

Every API request follows a two-stage approach. The client first sends an
Arrow Flight descriptor to the server, which returns a Flight info object
containing a Flight ticket. The ticket encodes the request and the server to
contact. The client then sends this ticket to receive the expected payload,
whose Arrow Flight schema depends on the request, serialized in the Arrow
[streaming format](https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format).

The Flight descriptors sent to the server in the first stage are all
specified here as UTF-8-encoded JSON packets. They use the command variant
of the Flight descriptor, which can carry any application-specific command.
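
As a minimal sketch of the two-stage exchange described above, the following
uses `pyarrow.flight` to encode a JSON packet as a command descriptor, request
the Flight info, and redeem each ticket. The helper names `encode_command` and
`fetch`, and the packet fields used in the usage note, are hypothetical; the
actual packet layout is whatever the `descriptor.json` schemas specify.

```python
import json


def encode_command(request: dict) -> bytes:
    """Serialize a request packet as UTF-8-encoded JSON for a Flight command."""
    return json.dumps(request).encode("utf-8")


def fetch(url: str, request: dict) -> list:
    """Run the two-stage exchange and return one pyarrow Table per endpoint."""
    # Imported lazily so that encode_command works without pyarrow installed.
    from pyarrow import flight

    # Stage 1: send the command descriptor and receive a FlightInfo object.
    descriptor = flight.FlightDescriptor.for_command(encode_command(request))
    tables = []
    with flight.connect(url) as client:
        info = client.get_flight_info(descriptor)
        for endpoint in info.endpoints:
            # Stage 2: redeem the ticket at the server the endpoint names,
            # or at the original server when no location is listed.
            if endpoint.locations:
                with flight.connect(endpoint.locations[0]) as server:
                    tables.append(server.do_get(endpoint.ticket).read_all())
            else:
                tables.append(client.do_get(endpoint.ticket).read_all())
    return tables
```

A call might look like `fetch("grpc://localhost:31206", {"request": "count", "args": {"pattern": ".*"}})`,
where both the URL and the packet fields are placeholders. Note that
`read_all()` waits for the stream to end, so an open-ended **stream** request
would more likely read record batches incrementally.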

The Flight descriptor schemas are described within each endpoint in
`descriptor.json`, while the payload schemas are described via `schema.txt`. In
addition, a generic descriptor specification for all endpoints is described in
`endpoints/descriptor.json`.

## Publication

Publication starts by registering a publisher, identified by a publisher ID,
via the **publish** endpoint. If the publisher is authorized, the server
responds with the connection details needed to publish data via Kafka.

The data is published via Kafka with the schema described in
[publication/schema.txt](publication/schema.txt).


## Usage

### Python

In the Python package, the Flight descriptor schema for each endpoint is
available as `{endpoint}.json`. In addition, the generic descriptor
specification for all endpoints is available as `descriptor.json`.


```python
from arrakis_schema import load_schema

# Load the Flight descriptor schema for the count endpoint.
schema = load_schema("count.json")
```
