Metadata-Version: 2.1
Name: intelligence-toolkit
Version: 0.0.3
Summary: Interactive workflows for generating AI intelligence reports from real-world data sources using GPT models
License: MIT
Keywords: AI,data analysis,reports,workflows
Author: Dayenne Souza
Author-email: ddesouza@microsoft.com
Requires-Python: >=3.11,<3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: altair (==4.2.2)
Requires-Dist: altair-viewer (>=0.4.0,<0.5.0)
Requires-Dist: azure-core (==1.30.2)
Requires-Dist: azure-identity (==1.17.1)
Requires-Dist: duckdb (==1.0.0)
Requires-Dist: future (>=1.0.0,<2.0.0)
Requires-Dist: graspologic (>=3.4.1,<4.0.0)
Requires-Dist: jsonschema (>=4.23.0,<5.0.0)
Requires-Dist: lancedb (==0.12.0)
Requires-Dist: markdown2 (==2.5.0)
Requires-Dist: nest-asyncio (>=1.6.0,<2.0.0)
Requires-Dist: networkx (==3.3)
Requires-Dist: numpy (==1.26.4)
Requires-Dist: openai (>=1.37.1,<2.0.0)
Requires-Dist: pac-synth (==0.0.8)
Requires-Dist: pdfkit (==1.0.0)
Requires-Dist: pdfplumber (==0.11.2)
Requires-Dist: plotly (==5.22.0)
Requires-Dist: plotly-express (==0.4.1)
Requires-Dist: poethepoet (>=0.27.0,<0.28.0)
Requires-Dist: poetry (>=1.8.3,<2.0.0)
Requires-Dist: polars (==0.20.10)
Requires-Dist: pyarrow (==15.0.0)
Requires-Dist: pydantic (==2.8.2)
Requires-Dist: pydantic_core (==2.20.1)
Requires-Dist: scikit-learn (==1.5.1)
Requires-Dist: scipy (==1.12.0)
Requires-Dist: seaborn (==0.13.2)
Requires-Dist: semchunk (==2.2.0)
Requires-Dist: sentence-transformers (>=3.1.1,<4.0.0)
Requires-Dist: streamlit (==1.31.1)
Requires-Dist: streamlit-aggrid (==0.3.4.post3)
Requires-Dist: streamlit-agraph (==0.0.45)
Requires-Dist: streamlit-javascript (==0.1.5)
Requires-Dist: textblob (==0.18.0.post0)
Requires-Dist: tiktoken[azure] (==0.7.0)
Requires-Dist: torch (==2.4.1) ; sys_platform != "darwin"
Requires-Dist: torch (==2.5.1) ; sys_platform == "darwin"
Description-Content-Type: text/markdown

# Developing 

## Requirements

- Python 3.11 ([Download](https://www.python.org/downloads/))
- poetry ([Download](https://python-poetry.org/docs/#installing-with-the-official-installer))
- wkhtmltopdf (used to generate PDF reports)

    - Windows: ([Download](https://wkhtmltopdf.org/downloads.html))

    - Linux:  `sudo apt-get install wkhtmltopdf`

    - macOS: `brew install homebrew/cask/wkhtmltopdf`
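
A quick way to confirm the requirements above before installing is a small preflight check. The sketch below is only an illustration: `missing_requirements` is a hypothetical helper, not part of intelligence-toolkit, and it assumes the supported Python range is the one declared in the package metadata (`>=3.11,<3.13`).

```python
import shutil
import sys

# Supported range taken from the package metadata (Requires-Python: >=3.11,<3.13).
MIN_PYTHON = (3, 11)
MAX_PYTHON_EXCLUSIVE = (3, 13)


def missing_requirements(version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration, not part of intelligence-toolkit.
    `which` is injectable so the check can be exercised without the tools installed.
    """
    version = tuple(version or sys.version_info[:2])
    problems = []
    if not (MIN_PYTHON <= version < MAX_PYTHON_EXCLUSIVE):
        problems.append(f"Python {version[0]}.{version[1]} is outside >=3.11,<3.13")
    # Both command-line tools must be discoverable on PATH.
    for tool in ("poetry", "wkhtmltopdf"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems
```

Calling `missing_requirements()` with no arguments checks the current interpreter and the real `PATH`.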


## Running the app

## GPT settings

You can configure OpenAI access either from the app's `Settings` page while it is running or with environment variables.

#### Default values: 
```
OPENAI_API_MODEL="gpt-4o-mini"
OPENAI_TYPE="OpenAI" # Other option available: Azure OpenAI
AZURE_AUTH_TYPE="Azure Key" # Only used when OPENAI_TYPE="Azure OpenAI"
DEFAULT_EMBEDDING_MODEL="text-embedding-3-small"
```
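
To make the fallback behaviour concrete, here is a minimal sketch of how the defaults above could be merged with environment overrides. It assumes every name in the block is read as an environment variable. `load_gpt_settings` and `DEFAULTS` are hypothetical names used for illustration, and the toolkit's own loading code may differ.

```python
import os

# Defaults copied from the "Default values" block above.
DEFAULTS = {
    "OPENAI_API_MODEL": "gpt-4o-mini",
    "OPENAI_TYPE": "OpenAI",
    "AZURE_AUTH_TYPE": "Azure Key",
    "DEFAULT_EMBEDDING_MODEL": "text-embedding-3-small",
}


def load_gpt_settings(environ=None):
    """Merge the documented defaults with any overrides found in `environ`.

    Hypothetical helper for illustration; not the toolkit's actual loader.
    """
    environ = os.environ if environ is None else environ
    # A variable that is set wins; otherwise the documented default is used.
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}
```

Calling `load_gpt_settings()` with no arguments reads from the real process environment.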

### OpenAI
Set `OPENAI_API_KEY=<OPENAI_API_KEY>`.

### Azure OpenAI
```
OPENAI_TYPE="Azure OpenAI"
AZURE_OPENAI_VERSION=2023-12-01-preview
AZURE_OPENAI_ENDPOINT="https://<ENDPOINT>.azure.com/"
OPENAI_API_KEY=<AZURE_OPENAI_API_KEY>

# If using Azure OpenAI with Managed Identity:
AZURE_AUTH_TYPE="Managed Identity"
```
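
The variables above depend on each other, so it can help to check them before launching the app. The sketch below is inferred only from the snippets in this section: `missing_openai_variables` is a hypothetical helper, and the rule that Managed Identity needs no `OPENAI_API_KEY` is an assumption, not confirmed toolkit behaviour.

```python
def missing_openai_variables(env):
    """Return the names of required variables that are absent or empty in `env`.

    Hypothetical check for illustration; `env` is any mapping such as os.environ.
    """
    required = []
    if env.get("OPENAI_TYPE", "OpenAI") == "Azure OpenAI":
        required += ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_VERSION"]
        # Assumption: Managed Identity replaces the API key entirely.
        if env.get("AZURE_AUTH_TYPE", "Azure Key") != "Managed Identity":
            required.append("OPENAI_API_KEY")
    else:
        required.append("OPENAI_API_KEY")
    return [name for name in required if not env.get(name)]
```

For example, `missing_openai_variables(os.environ)` (after `import os`) lists whatever still needs to be set for the current shell.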

### Running locally

Windows: Search for and open `Windows PowerShell` from the Start menu.

Linux and Mac: Open `Terminal`

For any OS:

Navigate to the folder where you cloned this repo. 

Use `cd` followed by the path to the folder. For example:

`cd C:\Users\user01\projects\intelligence-toolkit`

Run `poetry install` and wait for the packages to install.

#### Run the app:

Run `poetry run poe run_streamlit`. The app opens automatically in your default browser at `localhost:8081`.

#### Use the API

You can also replicate the examples in your own environment by running `pip install intelligence-toolkit`.

Each workflow below links to its documentation and to an example notebook that shows how to run the code on your own data without the UI.
- [Anonymize Case Data](./app/workflows/anonymize_case_data/README.md)

    - [Example](./example_notebooks/anonymize_case_data.ipynb)

- [Compare Case Groups](./app/workflows/compare_case_groups/README.md)

    - [Example](./example_notebooks/compare_case_groups.ipynb)

- [Detect Case Patterns](./app/workflows/detect_case_patterns/README.md)

    - [Example](./example_notebooks/detect_case_patterns.ipynb)

- [Detect Entity Networks](./app/workflows/detect_entity_networks/README.md)

    - [Example](./example_notebooks/detect_entity_networks.ipynb)

- [Extract Record Data](./app/workflows/extract_record_data/README.md)

    - [Example](./example_notebooks/extract_record_data.ipynb)

- [Generate Mock Data](./app/workflows/generate_mock_data/README.md)

    - [Example](./example_notebooks/generate_mock_data.ipynb)

- [Match Entity Records](./app/workflows/match_entity_records/README.md)

    - [Example](./example_notebooks/match_entity_records.ipynb)
    
- [Query Text Data](./app/workflows/query_text_data/README.md)

    - [Example](./example_notebooks/query_text_data.ipynb)


### Running with docker

##### Recommended configuration:

- *Minimum disk space*: 8GB 
- *Minimum memory*: 4GB

Download, install, and then open Docker Desktop: https://www.docker.com/products/docker-desktop/

Then open a terminal.

Windows: Search for and open `Windows PowerShell` from the Start menu.

Linux and Mac: Open `Terminal`

For any OS:

Navigate to the folder where you cloned this repo. 

Use `cd` followed by the path to the folder. For example:

`cd C:\Users\user01\projects\intelligence-toolkit`

Build the container:

`docker build . -t intelligence-toolkit`

Once the build is finished, run the docker container:

- via terminal:

    `docker run -d --name intelligence-toolkit -p 80:80 intelligence-toolkit`

Open [localhost:80](http://localhost:80)

  **Note that Docker might go to sleep, in which case you need to start the container again. Open Docker Desktop, click Containers in the left menu, and press play on intelligence-toolkit.**

# Lifecycle Scripts

This project uses [Poetry](https://python-poetry.org/docs#installation) and [poethepoet](https://pypi.org/project/poethepoet/) to manage its lifecycle and build scripts.

Available scripts are:

- `poetry run poe test_unit` - This will execute unit tests on the API.
- `poetry run poe test_smoke` - This will execute smoke tests on the API.
- `poetry run poe check` - This will perform a suite of static checks across the package, including:
  - formatting
  - documentation formatting
  - linting
  - security patterns
  - type-checking
- `poetry run poe fix` - This will apply any available auto-fixes to the package. Usually this is just formatting fixes.
- `poetry run poe fix_unsafe` - This will apply any available auto-fixes to the package, including those that may be unsafe.
- `poetry run poe format` - Explicitly run the formatter across the package.
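
The scripts listed above can be chained, for instance before pushing a change. The following is a rough sketch rather than anything shipped with the project: `poe_command`, `run_scripts`, and the chosen order in `PRE_PUSH_SCRIPTS` are all hypothetical, and only the script names come from the list above.

```python
import subprocess

# Script names taken from the list above; the order is an assumption.
PRE_PUSH_SCRIPTS = ["format", "check", "test_unit"]


def poe_command(script):
    """Build the argument list for `poetry run poe <script>`."""
    return ["poetry", "run", "poe", script]


def run_scripts(scripts=PRE_PUSH_SCRIPTS, runner=subprocess.run):
    """Run each script in order and return the first one that fails, or None.

    Hypothetical wrapper; `runner` is injectable so it can be tested offline.
    """
    for script in scripts:
        result = runner(poe_command(script))
        # Stop at the first non-zero exit code so later scripts are skipped.
        if result.returncode != 0:
            return script
    return None
```

Calling `run_scripts()` from the repository root runs the three scripts with the real `poetry` executable and reports which one, if any, failed.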


