Metadata-Version: 2.1
Name: PyRecombine
Version: 1.0.1
License: BSD-3-Clause
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: C++
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS
Requires-Dist: numpy>=2.0
Requires-Dist: intel_openmp; (platform_machine == "AMD64" or platform_machine == "x86_64") and sys_platform != "darwin"
Description-Content-Type: text/markdown

# PyRecombine

Performs a dynamic Caratheodory process and takes a weighted collection of vectors and identifies by pointers, a subset of minimal cardinality among the vectors and new weights so both empirical measures have the same mean.
Software written by Terry Lyons, based on algorithms developed jointly with Christian Litter and then with Maria Tchernychova 2008-2020.
Here minimal is a local notion and means it cannot be further reduced.
There may be other noncomparable cubature sets with fewer points.

The library has a robust and stable C interface and even old dlls from 2008 with the old algorithm still run.
Measures are represented by an array of null pointers (pseudo-points) and an array of equal length of positive doubles adding to `1.`; the calling program provides a feature map that converts each null pointer to a vector, and so recombine never knows or needs to know the real types of the points.
Recombine returns the indexes of the points that should remain and new positive weights so that the mean of these remaining points with the new weights is the same as the old mean.
The new list of survivors is never more than `D+1` long where `D` is the  dimension of the vector space.
If there are `N` points in `D` dimensions the sequential complexity of this algorithm is less  than `ND + log_2(N/D) D^3`.
This reflects the order of magnitude improvement in the algorithm developed with  Maria Tchernychova; the algorithm with Litterer had complexity `ND + log_2(N/D) D^4` although it is quicker for small problems.
The interface remains the same.
The ND comes from touching the points, and `log_2(N/D) D^3` from performing `log_2(N/D)` complex SVD type calculations on `Dx2D` matrices.
This is a linear programming problem under the surface, but the algorithm here has fixed complexity.
In many of the problems we are interested in `N` is approximately `D^2` so the cost is (to logarithms) equated with the cost of touching the points.

The algorithm uses MKL (LAPACK) to do most of the linear algebra, although there is a part that is bespoke.
The operations benefit from some parallelism via OMP.
Say (export OMP_NUM_THREADS=8).
The log factor is truly sequential.
