Metadata-Version: 2.4
Name: aweson
Version: 2.0.0
Summary: Traversing and manipulating hierarchical info sets (JSON) using pythonic JSON Path-like expressions
Author-email: Ferenc Dosa-Racz <dosarf@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/dosarf/aweson
Project-URL: Issues, https://github.com/dosarf/aweson/issues
Project-URL: Changelog, https://github.com/dosarf/aweson/blob/master/CHANGELOG.md
Keywords: JSON,Path
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/x-rst
License-File: LICENSE
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: pylint; extra == "test"
Requires-Dist: mypy; extra == "test"
Requires-Dist: black; extra == "test"
Requires-Dist: isort; extra == "test"
Provides-Extra: develop
Requires-Dist: build; extra == "develop"
Requires-Dist: twine; extra == "develop"
Dynamic: license-file

``aweson``
==========
Traversing and manipulating hierarchical data (JSON) using pythonic JSON Path-like expressions


Importing
---------

    >>> from aweson import JP, find_all, find_next


Iterating over hierarchical data
--------------------------------

    >>> content = {"employees": [
    ...     {"name": "Doe, John", "age": 32, "account": "johndoe"},
    ...     {"name": "Doe, Jane", "age": -23, "account": "janedoe"},
    ...     {"name": "Deer, Jude", "age": 42, "account": "judedeer"},
    ... ]}
    >>> list(find_all(content, JP.employees[:].name))
    ['Doe, John', 'Doe, Jane', 'Deer, Jude']

Note that the JSON Path-like expression ``JP.employees[:].name`` is `not` a string,
it's a Python expression, i.e. your IDE will be of actual help.

Furthermore, to address all items in a list, Pythonic slice expression ``[:]`` is used. Naturally,
other indexing and slice expressions also work in the conventional Pythonic way:

    >>> list(find_all(content, JP.employees[-1].name))
    ['Deer, Jude']

    >>> list(find_all(content, JP.employees[:2].name))
    ['Doe, John', 'Doe, Jane']


Paths to items iterated
-----------------------

You may be interested in the actual path of an item being returned, just like
you get an index alongside items when using ``enumerate()``. For instance, you may want to verify
ages being non-negative, and report accurately the path of failure items:

    >>> path, item = next(tup for tup in find_all(content, JP.employees[:].age, with_path=True) if tup[1] < 0)
    >>> item
    -23

The offending path, then, in human-readable format:

    >>> str(path)
    '$.employees[1].age'

The enclosing record, using ``.parent`` attribute of the path obtained for the offending age:

    >>> next(find_all(content, path.parent))
    {'name': 'Doe, Jane', 'age': -23, 'account': 'janedoe'}

Note, with argument ``with_path=True`` passed, ``find_all()`` yields tuples instead of single
items.


Selecting sub-items
-------------------

You can select sub-items of iterated items, comes handy into turning one structure
into another, like a list of records into a ``dict``:

    >>> {tup[0]: tup[1] for tup in find_all(content, JP.employees[:](JP.account, JP.name))}
    {'johndoe': 'Doe, John', 'janedoe': 'Doe, Jane', 'judedeer': 'Deer, Jude'}

or, to make your processing logic within the same comprehension expression more readable:

    >>> {account: name for account, name in find_all(content, JP.employees[:](JP.account, JP.name))}
    {'johndoe': 'Doe, John', 'janedoe': 'Doe, Jane', 'judedeer': 'Deer, Jude'}

You can also make a sub-items selection produce `named tuples` by explicitly naming sub-paths:

    >>> list(find_all(content, JP.employees[:](account=JP.account, name=JP.name)))
    [SubSelect(account='johndoe', name='Doe, John'), SubSelect(account='janedoe', name='Doe, Jane'), SubSelect(account='judedeer', name='Deer, Jude')]

Now, the processing code could be elsewhere than the `find_all()` invocation, as named tuples will carry
the field names with them. The produced named tuples will all be called ``SubSelect`` but they will be
different named tuples for each invocation.


Variable field name selection
-----------------------------

The forms ``JP["field_name"]`` and ``JP.field_name`` are equivalent:

    >>> from functools import reduce
    >>> def my_sum(content, field_name, initial):
    ...     return reduce(
    ...         lambda x, y: x + y,
    ...         find_all(content, JP.employees[:][field_name]),
    ...         initial
    ...     )
    >>> my_sum(content, "age", 0)
    51
    >>> my_sum(content, "account", "")
    'johndoejanedoejudedeer'


Field name by regular expressions
---------------------------------

Sometimes you have a JSON format where types are represented by JSON objects, e.g.

    >>> content = {
    ...     "apple": [{"name": "red delicious"}, {"name": "punakaneli"}],
    ...     "pear": [{"name": "wilhelm"}, {"name": "conference"}]
    ... }

i.e. it's not a union list of records with a type discriminator ``[{"type": "apple", "name": ...}, ...]``.
Iterating over all the fruits, regardless of their type, in our example ``content`` above can
be achieved by:

    >>> list(find_all(content, JP["apple|pear"][:].name))
    ['red delicious', 'punakaneli', 'wilhelm', 'conference']

if you are only interested in apples and pears, or

    >>> list(find_all(content, JP[".*"][:].name))
    ['red delicious', 'punakaneli', 'wilhelm', 'conference']

if you are interested in fruits other than apples and pears.

Note, that the expression ``JP["*"]`` is also supported, but that's `not` a regular expression,
merely a conventional JSON Path notation, and equivalent to ``JP[:]``:

    >>> list(find_all([5, 42, 137], JP["*"]))
    [5, 42, 137]


Suppressing indexing errors and key errors
------------------------------------------

By default, path expressions are strict, e.g. for ``list`` indexes:

    >>> list(find_all([0, 1], JP[2]))
    Traceback (most recent call last):
      ...
    IndexError: list index out of range

and for ``dict`` keys:

    >>> list(find_all({"hello": 42}, JP.hi))
    Traceback (most recent call last):
      ...
    KeyError: 'hi'

You can suppress these errors and simply have nothing yielded, for ``list`` indexes:

    >>> list(find_all([0, 1], JP[2], lenient=True))
    []

and for ``dict`` keys:

    >>> list(find_all({"hello": 42}, JP.hi, lenient=True))
    []


Utility ``find_next()``
-----------------------

Often, you just need a first value, roughly equivalent to a ``next(find_all(...))``
invocation. You can use ``find_next()`` for this, for instance

    >>> find_next([{"hello": 5}, {"hello": 42}], JP[:].hello)
    5
    >>> find_next([{"hello": 5}, {"hello": 42}], JP[1].hello)
    42

You can also ask for the path of the value returned, in the style of ``with_path=True``
above

    >>> path, value = find_next([{"hello": 5}, {"hello": 42}], JP[-1].hello, with_path=True)
    >>> str(path)
    '$[1].hello'
    >>> value
    42

You can also supply a default value for ``find_next()``, just like for ``next()``:

    >>> find_next([{"hello": 5}, {"hello": 42}], JP[3].hello, default=17)
    17

