Metadata-Version: 2.1
Name: brdata-rag-tools
Version: 0.1.5
Summary: Improve development of retrieval augmented generation (RAG) applications at the BR AI + Automation Lab.
Author-email: Marco Lehner <marco.lehner@br.de>
Project-URL: Homepage, https://github.com/br-data/rag-tools-library
Project-URL: Issues, https://github.com/br-data/rag-tools-library/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aenum ==3.1.15
Requires-Dist: annotated-types ==0.6.0
Requires-Dist: anyio ==4.2.0
Requires-Dist: certifi ==2023.11.17
Requires-Dist: charset-normalizer ==3.3.2
Requires-Dist: distro ==1.9.0
Requires-Dist: exceptiongroup ==1.2.0
Requires-Dist: faiss-cpu ==1.7.4
Requires-Dist: greenlet ==3.0.3
Requires-Dist: h11 ==0.14.0
Requires-Dist: httpcore ==1.0.2
Requires-Dist: httpx ==0.26.0
Requires-Dist: idna ==3.6
Requires-Dist: numpy ==1.26.3
Requires-Dist: openai ==1.7.0
Requires-Dist: pgvector ==0.2.4
Requires-Dist: psycopg2-binary ==2.9.9
Requires-Dist: pydantic ==2.5.3
Requires-Dist: pydantic-core ==2.14.6
Requires-Dist: regex ==2023.12.25
Requires-Dist: requests ==2.31.0
Requires-Dist: sniffio ==1.3.0
Requires-Dist: SQLAlchemy ==2.0.25
Requires-Dist: tiktoken ==0.5.2
Requires-Dist: tqdm ==4.66.1
Requires-Dist: typing-extensions ==4.9.0
Requires-Dist: urllib3 ==2.1.0

# rag-tools-library
A library that supports common tasks in retrieval augmented generation (RAG).

This library is at a very early stage, and all of its documentation is AI-generated.

## Tutorial and Documentation

You can find a brief tutorial and the documentation at [br-data.github.io/rag-tools-library](https://br-data.github.io/rag-tools-library/).

## Roadmap

- [ ] Add Google Bison to available LLMs
- [x] Add an offline database alternative
  - [x] FAISS and SQLite
- [x] Allow users to register their own LLMs
- [x] Allow users to register their own embedding models
- [ ] Support the Semantic Scholar endpoint to generate embeddings for scientific papers
- [x] Support chat functionality, e.g. let the user give the LLM feedback on the result

## Deployment

Run the `build_and_deploy.sh` script in the root folder. When prompted for the username, enter `__token__`; when prompted for the password, enter the PyPI API token you've received. If you don't have an API token and think you should, contact the maintainers.
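
For unattended runs, the credential prompts can usually be avoided by supplying the same two values through environment variables. The following is a minimal sketch, not a description of what `build_and_deploy.sh` actually does: it assumes the script uploads with `twine`, which reads `TWINE_USERNAME` and `TWINE_PASSWORD` from the environment. The helper names `twine_env` and `deploy` are hypothetical and not part of this library.

```python
import os
import subprocess


def twine_env(api_token, base_env=None):
    """Return a copy of the environment with twine's credential variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # PyPI token authentication uses the literal username "__token__".
    env["TWINE_USERNAME"] = "__token__"
    # The API token itself is supplied as the password.
    env["TWINE_PASSWORD"] = api_token
    return env


def deploy(api_token, script="./build_and_deploy.sh"):
    """Run the deployment script with credentials taken from the environment.

    Assumption: the script calls `twine upload`, which uses the variables
    above instead of prompting when they are set.
    """
    return subprocess.run(["bash", script], env=twine_env(api_token), check=True)
```

Calling `deploy(token)` from the repository root would then run the script without interactive input, provided the assumption about `twine` holds. Keep the token out of version control, for example by reading it from a secret store or a CI variable.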

## Contact

Marco Lehner

[marco.lehner@br.de](mailto:marco.lehner@br.de)
