Metadata-Version: 2.2
Name: bigdata-research-tools
Version: 0.11.1
Summary: Bigdata.com API High-Efficiency Tools at Scale
Author-email: Bigdata Solutions Engineering Team <support@ravenpack.com>
Project-URL: homepage, https://bigdata.com/api
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: bigdata-client<3.0.0,>=2.5.0
Requires-Dist: excel>=1.0.1
Requires-Dist: ipython>=8.18.1
Requires-Dist: openai>=1.62.0
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pandas<3.0.0,>=2.2.3
Requires-Dist: plotly>=6.0.0
Requires-Dist: tqdm>=4.67.1
Provides-Extra: advanced
Requires-Dist: pandas<3.0.0,>=2.2.0; extra == "advanced"
Provides-Extra: excel
Requires-Dist: openpyxl<4.0.0,>=3.1.5; extra == "excel"
Requires-Dist: pillow<12.0.0,>=11.1.0; extra == "excel"
Provides-Extra: openai
Requires-Dist: openai<2.0.0,>=1.61.1; extra == "openai"
Provides-Extra: plotly
Requires-Dist: plotly<7.0.0,>=6.0.0; extra == "plotly"
Provides-Extra: all
Requires-Dist: openpyxl<4.0.0,>=3.1.5; extra == "all"
Requires-Dist: pillow<12.0.0,>=11.1.0; extra == "all"
Requires-Dist: openai<2.0.0,>=1.61.1; extra == "all"
Requires-Dist: plotly<7.0.0,>=6.0.0; extra == "all"
Requires-Dist: hdbscan<0.9.0,>=0.8.40; extra == "all"
Provides-Extra: docs
Requires-Dist: Sphinx>=7.2.6; extra == "docs"
Requires-Dist: autodoc-pydantic>=2.0.1; extra == "docs"
Requires-Dist: myst-parser>=2.0.0; extra == "docs"
Requires-Dist: furo>=2024.1.29; extra == "docs"
Requires-Dist: sphinxcontrib-spelling>=8.0.0; extra == "docs"
Requires-Dist: sphinx-new-tab-link>=0.6.0; extra == "docs"
Requires-Dist: sphinx-copybutton>=0.5.2; extra == "docs"
Requires-Dist: sphinx-reredirects>=0.1.5; extra == "docs"

<p align="center">
  <picture>
    <source srcset="https://sdk.bigdata.com/en/latest/_static/bigdata_dark.svg" media="(prefers-color-scheme: dark)">
    <img src="https://sdk.bigdata.com/en/latest/_static/bigdata_light.svg" alt="Bigdata Logo" width="250">
  </picture>
</p>

# Bigdata Research Tools

[![Python version support](https://img.shields.io/badge/Python-3.9%20|%203.10%20|%203.11%20|%203.12%20|%203.13-blue?logo=python)](https://pypi.org/project/bigdata-research-tools)
[![PyPI version](https://badge.fury.io/py/bigdata-research-tools.svg)](https://badge.fury.io/py/bigdata-research-tools)

**Bigdata.com API High-Efficiency Tools at Scale**

This repository provides an efficient toolset for working with the Bigdata.com SDK.

---

## Installation

Install the package from PyPI using `pip`:

```bash
pip install bigdata-research-tools
```

---

## Usage

The following example demonstrates a convenient way to run multiple searches
concurrently while respecting a rate limit:

```python
from bigdata_research_tools import run_search
from bigdata_client import Bigdata

bigdata = Bigdata()

results = run_search(bigdata=bigdata,
                     queries=YOUR_LIST_OF_QUERIES,
                     limit=1000)
```

## 1. Return Values

### 1.1. Return only the results list

With `only_results=True` (the default), `run_search` returns only the results
of all queries, as a list of result lists.

```python
results = run_search(bigdata=bigdata,
                     queries=YOUR_LIST_OF_QUERIES,
                     limit=1000,
                     only_results=True)
```

```shell
>>> results
[
    [results1, results2, ...],
    [results1, results2, ...],
    [results1, results2, ...],
]
```
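
If a single flat list is easier to work with, the nested structure shown above can be flattened with the standard library. The following is a minimal sketch; the `results` value here is a stand-in built from placeholder strings rather than real result objects.

```python
from itertools import chain

# Stand-in for the nested list returned with only_results=True.
# Plain strings replace the real result objects for illustration.
results = [
    ["doc_a", "doc_b"],
    ["doc_c"],
    [],
]

# Flatten the inner lists into one list, preserving their order.
all_results = list(chain.from_iterable(results))

print(len(all_results))
```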

### 1.2. Return queries with their corresponding results

Setting `only_results=False` returns a dictionary that maps each (query,
date_range) pair to its list of search results.

```python
query_results = run_search(bigdata=bigdata,
                           queries=YOUR_LIST_OF_QUERIES,
                           limit=1000,
                           only_results=False)
```

```shell
>>> query_results
{
    '(query1, date_range1)': [results1, results2, ...],
    '(query1, date_range2)': [results1, results2, ...],
    '(query2, date_range1)': [results1, results2, ...],
    '(query2, date_range2)': [results1, results2, ...],
    ...
}
```
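
A mapping of this shape makes it easy to inspect coverage per (query, date_range) pair. The sketch below uses a stand-in dictionary with placeholder keys and values, so it only illustrates the pattern and makes no assumption about the real key or result types.

```python
# Stand-in for the mapping returned with only_results=False.
# Keys and values are placeholders: real keys identify a
# (query, date_range) pair and real values hold result objects.
query_results = {
    "(query1, date_range1)": ["doc_a", "doc_b"],
    "(query1, date_range2)": [],
    "(query2, date_range1)": ["doc_c"],
}

# Count the results returned for each (query, date_range) pair.
counts = {key: len(hits) for key, hits in query_results.items()}

# Collect the pairs that returned nothing, for example to retry them
# later with a wider date range.
empty_pairs = [key for key, hits in query_results.items() if not hits]

for key, count in counts.items():
    print(key, count)
```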

---

## Key Features

- **Rate Limiting**: Enforces a configurable requests-per-minute (RPM) limit
  using a token bucket algorithm.
- **Concurrency Support**: Executes multiple search queries simultaneously with
  a user-defined maximum number of threads.
- **Thread-Safe**: Ensures safe concurrent access to shared resources with
  built-in thread locks.
- **Flexible Configuration**:
    - Set custom RPM limits and token bucket sizes.
    - Configure search parameters such as date ranges, sorting, and result
      limits.
- **Minimal Dependencies**: Requires only the `bigdata_client` SDK.
- **Ease of Use**: Includes a convenience function for running multiple
  searches with minimal setup.
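
To make the rate-limiting and concurrency features above more concrete, here is a minimal, generic sketch of a token bucket shared by a thread pool. It is not the library's implementation: `TokenBucket`, `try_acquire`, `acquire`, and `fake_search` are hypothetical names chosen for illustration, and the numbers are arbitrary.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class TokenBucket:
    """Illustrative token bucket; not the library's own class."""

    def __init__(self, rpm, capacity, clock=time.monotonic):
        self.rate = rpm / 60.0          # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.tokens = float(capacity)   # start with a full bucket
        self.clock = clock
        self.last = clock()
        self.lock = threading.Lock()    # guards tokens and last

    def try_acquire(self):
        """Take one token if available; return True on success."""
        with self.lock:
            now = self.clock()
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    def acquire(self, poll=0.01):
        """Block until a token is available."""
        while not self.try_acquire():
            time.sleep(poll)


def fake_search(bucket, query):
    # Hypothetical stand-in for a real search call.
    bucket.acquire()
    return f"results for {query}"


# Four worker threads share one bucket, so the combined request rate
# stays within the limit regardless of the number of threads.
bucket = TokenBucket(rpm=6000, capacity=5)
queries = ["q1", "q2", "q3", "q4", "q5", "q6"]
with ThreadPoolExecutor(max_workers=4) as pool:
    outputs = list(pool.map(lambda q: fake_search(bucket, q), queries))
```

The bucket capacity bounds the size of a burst, while the RPM value bounds the sustained rate; the lock keeps the token count consistent when several threads request tokens at once.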

---

## License

This software is licensed for use solely under the terms agreed upon in the
applicable Master Agreement and Order Schedule between the parties.
For trials, the applicable legal documents are the Mutual Non-Disclosure
Agreement, or if applicable the Trial Agreement.
No other rights or licenses are granted by implication, estoppel, or otherwise.
For further details, please refer to your specific Master Agreement and Order
Schedule or contact us at legal@ravenpack.com.

---

**RavenPack** | **Bigdata.com** \
All rights reserved © 2025

