Metadata-Version: 2.3
Name: alexandrainst_ragger
Version: 5.0.1
Summary: A repository for general-purpose RAG applications.
Author-email: Dan Saattrup Nielsen <dan.nielsen@alexandra.dk>, Anders Jess Pedersen <anders.j.pedersen@alexandra.dk>
Maintainer-email: Dan Saattrup Nielsen <dan.nielsen@alexandra.dk>
License: Proprietary
Requires-Python: <4.0,>=3.10
Requires-Dist: click>=8.1.7
Requires-Dist: numpy>=1.25.0
Requires-Dist: openai>=1.50.0
Requires-Dist: pydantic>=2.9.2
Requires-Dist: python-dotenv>=1.0.1
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: tiktoken>=0.8.0
Provides-Extra: demo
Requires-Dist: gradio>=5.4.0; extra == 'demo'
Provides-Extra: keyword-search
Requires-Dist: rank-bm25>=0.2.2; extra == 'keyword-search'
Provides-Extra: onprem-cpu
Requires-Dist: llama-cpp-python>=0.3.1; extra == 'onprem-cpu'
Requires-Dist: outlines>=0.0.46; extra == 'onprem-cpu'
Requires-Dist: protobuf>=5.28.3; extra == 'onprem-cpu'
Requires-Dist: sentence-transformers>=3.2.1; extra == 'onprem-cpu'
Requires-Dist: transformers>=4.46.1; extra == 'onprem-cpu'
Provides-Extra: onprem-gpu
Requires-Dist: outlines>=0.1.1; extra == 'onprem-gpu'
Requires-Dist: protobuf>=5.28.3; extra == 'onprem-gpu'
Requires-Dist: sentence-transformers>=3.2.1; extra == 'onprem-gpu'
Requires-Dist: transformers>=4.46.1; extra == 'onprem-gpu'
Requires-Dist: vllm>=0.1.0; extra == 'onprem-gpu'
Provides-Extra: postgres
Requires-Dist: psycopg2-binary>=2.9.10; extra == 'postgres'
Description-Content-Type: text/markdown

<a href="https://github.com/alexandrainst/ragger">
  <img
    src="https://filedn.com/lRBwPhPxgV74tO0rDoe8SpH/alexandra/alexandra-logo.jpeg"
    width="239"
    height="175"
    align="right"
  />
</a>

# Ragger

A package for general-purpose RAG applications.

______________________________________________________________________
[![Code Coverage](https://img.shields.io/badge/Coverage-73%25-yellow.svg)](https://github.com/alexandrainst/ragger/tree/main/tests)


Author(s):

- Dan Saattrup Nielsen (dan.nielsen@alexandra.dk)
- Anders Jess Pedersen (anders.j.pedersen@alexandra.dk)


Maintainer(s):

- Dan Saattrup Nielsen (dan.nielsen@alexandra.dk)


## Installation

Installation with `pip`, `uv`, or `poetry`:

```bash
pip install alexandrainst_ragger
uv add alexandrainst_ragger
poetry add alexandrainst_ragger
```

You can also add additional extras to the installation, such as:

- `onprem_cpu` to use anything that requires an on-premises installation, running on a
  CPU.
- `onprem_gpu` to use anything that requires an on-premises installation, running on a
  GPU.
- `postgres` to use anything PostgreSQL-related.
- `keyword_search` to use the `BM25Retriever` for keyword-based retrieval. Note that
  this is also required when using `HybridRetriever` with the default configuration.
- `demo` to use the demo server.

Here is an example of how to install with the `onprem_cpu` extra, with `pip`, `uv`, and
`poetry`, respectively:

```bash
pip install alexandrainst_ragger[onprem_cpu]
uv add alexandrainst_ragger --extra onprem_cpu
poetry add alexandrainst_ragger --extras onprem_cpu
```


### Development Installation

If you want to install the package for development, you can do so as follows:

```bash
git clone git@github.com:alexandrainst/ragger.git
cd alexandrainst_ragger
make install
```


## Quick Start

_Checkout the `tutorial.ipynb` notebook for a more detailed guide_

Initialise a RAG system with default settings as follows:

```python
import alexandrainst_ragger as air

rag_system = air.RagSystem()
rag_system.add_documents([
	"København er hovedstaden i Danmark.",
	"Danmark har 5,8 millioner indbyggere.",
	"Danmark er medlem af Den Europæiske Union."
])
answer, supporting_documents = rag_system.answer("Hvad er hovedstaden i Danmark?")
```

The `answer` is then the string answer, and the `supporting_documents` is a list of
`Document` objects that support the answer.

You can also start a demo server as follows:

```python
demo = air.Demo(rag_system=rag_system)
demo.launch()
```


## All Available Components

Ragger supports the following components:

### Document Stores

These are the databases carrying all the documents. Documents are represented as objects
of the `Document` data class, which has an `id` and a `text` field.

- `JsonlDocumentStore`: A document store that reads from a JSONL file. (default)
- `SqliteDocumentStore`: A document store that uses a SQLite database to store documents.
- `PostgresDocumentStore`: A document store that uses a PostgreSQL database to store
  documents. This assumes that the PostgreSQL server is already running.
- `TxtDocumentStore`: A document store that reads documents from a single text file,
  separated by newlines.


### Retrievers

Retrievers are used to retrieve documents related to a query.

- `EmbeddingRetriever`: A retriever that uses embeddings to retrieve documents. These
  embeddings are computed using an embedder, which can be one of the following:
  - `OpenAIEmbedder`: An embedder that uses the OpenAI Embeddings API. (default)
  - `E5Embedder`: An embedder that uses an E5 model.

  The embeddings are stored in an embedding store, which can be one of the following:
  - `NumpyEmbeddingStore`: An embedding store that stores embeddings in a NumPy array.
	(default)
  - `PostgresEmbeddingStore`: An embedding store that uses a PostgreSQL database to
	store embeddings, using the `pgvector` extension. This assumes that the PostgreSQL
	server is already running, and that the `pgvector` extension is installed. See
	[here](https://github.com/pgvector/pgvector?tab=readme-ov-file#installation) for
	more information on how to install the extension.

- `BM25Retriever`: A retriever that uses BM25 to retrieve documents. This is keyword
  based and is thus more suitable for keyword-based queries.

- `HybridRetriever`: A retriever that fuses the results of multiple retrievers. This
  can for instance be used to combine the results of the `EmbeddingRetriever` and the
  `BM25Retriever` to get the best of both worlds (known as "Hybrid Retrieval").


### Generators

Generators are used to generate answers from the retrieved documents and the question.

- `APIGenerator`: A generator that uses the OpenAI API or any OpenAI-compatible API.
  (default)
- `GGUFGenerator`: A generator that uses Llama.cpp to wrap any model from the Hugging
  Face Hub in GGUF format. Optimised for CPU generation.
- `VllmGenerator`: A generator that uses vLLM to wrap almost any model from the Hugging
  Face Hub. Note that this requires a GPU to run.


## Custom Components

You can also create custom components by subclassing the following classes:

- `DocumentStore`
- `Retriever` (and by extension, also `Embedder` and `EmbeddingStore`)
- `Generator`

These can then simply be added to a `RagSystem`. Here is an example:

```python
class InMemoryDocumentStore(air.DocumentStore):
	"""A document store that just keeps all documents in memory."""

	def __init__(self, documents: list[str]):
		self.documents = [
			air.Document(id=str(i), text=text) for i, text in enumerate(documents)
		]

	def add_documents(self, documents):
		self.documents.extend(documents)

	def remove(self):
		self.documents = []

	def __getitem__(self, index: air.Index) -> str:
		return self.documents[int(index)]

	def __contains__(self, index: air.Index) -> bool:
		return 0 <= int(index) < len(self.documents)

	def __iter__(self):
		yield from self.documents

	def __len__(self) -> int:
		return len(self.documents)

document_store = InMemoryDocumentStore(documents=[
	"København er hovedstaden i Danmark.",
	"Danmark har 5,8 millioner indbyggere.",
	"Danmark er medlem af Den Europæiske Union."
])
rag_system = air.RagSystem(document_store=document_store)
answer, supporting_documents = rag_system.answer("Hvad er hovedstaden i Danmark?")
```


## Run RAG Demo With Docker (deprecated)

**Note**: This method is deprecated and will be removed in a future version. Please use
the `ragger_webui` repository instead: <https://github.com/alexandrainst/ragger_webui>.

Ensure that your SSH keys are in your SSH agent, by running `ssh-add -L`. If not, you
can add them by running `ssh-add`.

You can run a CPU-based Docker container with a RAG demo with the following commands:

```bash
docker build --ssh default --build-arg config=<config-name> -t alexandrainst_ragger -f Dockerfile.cpu .
docker run --rm -p 7860:7860 alexandrainst_ragger
```

Here `<config-name>` is the name of a YAML or JSON file, with the following format (here
is a YAML example):

```yaml
document_store:
  name: JsonlDocumentStore
  <key>: <value>  # For any additional arguments to `JSONLDocumentStore`

retriever:
  name: EmbeddingRetriever
  embedder:
    name: OpenAIEmbedder
    <key>: <value>  # For any additional arguments to `OpenAIEmbedder`
  embedding_store:
    name: NumpyEmbeddingStore
    <key>: <value>  # For any additional arguments to `NumpyEmbeddingStore`

generator:
  name: OpenAIGenerator
  <key>: <value>  # For any additional arguments to `OpenAIGenerator`

<key>: <value>  # For any additional arguments to `RagSystem` or `Demo`
```

The config can also just be empty, to use the defaults. This is typically not
recommended, however, as you would probably need to at least specify the configuration
of your stores.

Note that if you change the configuration then you might need extra dependencies when
building your image.

Note that some components need additional environment variables to be set, such as
`OPENAI_API_KEY` for the `OpenAIEmbedder` and `OpenAIGenerator`. These can be set by
including a `.env` file in the working directory when building the Docker image, and it
will be copied into the image and used during compilation and running of the demo.

If you have any data on disk, you can simply mount it into the Docker container by
adding the `-v` flag to the `docker run` command.

To run a GPU-based Docker container, first ensure that the NVIDIA Container Toolkit is
[installed](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installation)
and
[configured](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-docker).
Ensure that the the CUDA version stated at the top of the Dockerfile matches the CUDA
version installed (which you can check using `nvidia-smi`). After that, we build the
image as follows:

```bash
docker build --pull --ssh default --build-arg config=<config-name> -t alexandrainst_ragger -f Dockerfile.cuda .
docker run --gpus 1 --rm -p 7860:7860 alexandrainst_ragger
```
