Metadata-Version: 2.1
Name: ALLM
Version: 1.0.5
Summary: A simple and efficient python library for fast inference of GGUF Large Language Models.
Author: All Advance AI
Author-email: allmdev@allaai.com
Maintainer: All Advance AI
Maintainer-email: allmdev@allaai.com
Keywords: GGUF,GGUF Large Language Model,GGUF Large Language Models,GGUF Large Language Modeling,GGUF Large Language Modeling Library
Description-Content-Type: text/markdown
Requires-Dist: Flask
Requires-Dist: click
Requires-Dist: llama-index
Requires-Dist: llama-cpp-python
Requires-Dist: aiohttp
Requires-Dist: llama-index-llms-llama-cpp
Requires-Dist: huggingface-hub
Requires-Dist: langchain ==0.0.267
Requires-Dist: chromadb ==0.3.26
Requires-Dist: pdfminer.six
Requires-Dist: pydantic ==1.10.13
Requires-Dist: sentence-transformers
Requires-Dist: vertexai
Requires-Dist: google-cloud-aiplatform
Requires-Dist: openai


# ALLM


ALLM is a Python library designed for fast inference of Large Language Models (LLMs) stored in GGUF, the model file format used by llama.cpp, on both CPU and GPU. It provides a convenient interface for loading pre-trained GGUF models and running inference with them. The library suits applications where quick response times are crucial, such as chatbots and text generation.

## Features


- **Efficient Inference**: ALLM is built for fast and accurate inference of GGUF models.
- **CPU and GPU Support**: The library is optimized for both CPU and GPU, allowing you to choose the best hardware for your application.
- **Simple Interface**: With a straightforward command-line interface, you can load a model and run inference with a single command.
- **Flexible Configuration**: Customize inference settings such as temperature and model path to suit your needs.
- **Automated Hosting Configuration**: Models are swiftly downloaded and configured in your environment, enabling them to be operational within minutes.


## Operating System Compatibility


This table outlines the compatibility of different operating systems with their respective providers:

<div style="background-color: #f2f2f2; padding: 10px;">
    
| **OS Type** | **<img src="https://th.bing.com/th/id/OIP.9SFh8OY0QsojPqM39ls-NAHaHa?w=520&h=520&rs=1&pid=ImgDetMain" alt="" width="30"/><p> Windows** | **<img src="https://www.svgrepo.com/download/3968/linux.svg" alt="Linux" width="30"/><p> Linux** | **<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Apple_logo_black.svg/135px-Apple_logo_black.svg.png" alt="Mac" width="25"/><p> MacOS** |
|:--------:|:--------:|:--------:|:--------:|
| **Status** | Supported | Supported | Coming Soon |
| **Dependencies** | Python, [VS Tools](https://visualstudio.microsoft.com/downloads/#remote-tools-for-visual-studio-2022) | Python,<p> `sudo apt-get install build-essential -y` | Coming Soon |
| **Models Support** | Local Models, VertexAI, Azure OpenAI | Local Models, VertexAI, Azure OpenAI | Coming Soon |

</div>


## Supported Models 

| LLM Family    | Hosting            | Supported LLMs                               |
|---------------|--------------------|-----------------------------------------------|
| ALLM          | Self-hosted (GGUF) | Mistral, Mistral_instruct, Llama2; full list in the supported model names section below. |
| Azure         | Azure OpenAI       | gpt-35-turbo, gpt-4, gpt-4-turbo, or any other.  |
| Google LLMs   | VertexAI deployment | gemini-pro, text-bison@001, or any other.       |
| Llama2        | Azure deployment   | llama2-7b, llama2-13b, llama2-70b             |
| Mistral       | Azure deployment   | Mistral-7b, Mixtral-8x7b                      |



## Installation


You can install ALLM using pip:

```bash
pip install allm
```
## Usage
### 0.1 LocalModel Generic Prompt 

You can start inference with the 'allm-run' command. It takes a model name or path, plus optional arguments for temperature, max new tokens, and additional model kwargs.

If no model name is provided, allm-run downloads the default Mistral model to your system and configures it automatically.

```bash
allm-run --name model_name_or_path
```
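If you want to launch the same command from a Python script, a minimal sketch follows. It only uses the `--name` flag shown above; the helper name `build_run_command` is hypothetical and not part of ALLM, and the flags for temperature and max new tokens are left out because their exact spelling is not given here.

```python
import shlex


def build_run_command(model=None):
    """Return the argv list for `allm-run`.

    Only the `--name` flag is taken from the README. With no model given,
    the bare command is returned so ALLM can fall back to its default model.
    """
    command = ["allm-run"]
    if model:
        command += ["--name", model]
    return command


if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_run_command("Mistral")))
```

Building the command as a list rather than a single string keeps model paths that contain spaces intact when they are handed to `subprocess.run`.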

## API
### 0.2 LocalModel Generic API 

You can start the inference API with the 'allm-serve' command. It launches the API server on the default host and port, 127.0.0.1:5000. To run the server on a different host or port, edit the apiconfig.txt file in your model directory.

```bash
allm-serve
```
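Before sending requests, it can help to confirm that the server is actually listening. The sketch below checks only for an open TCP port at the default address mentioned above; it does not call any ALLM endpoint, and the function name `server_is_up` is hypothetical.

```python
import socket


def server_is_up(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("allm-serve reachable:", server_is_up())
```

If you changed the host or port in apiconfig.txt, pass the same values to `server_is_up`.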

## ALLM Agents

### 1.1 New Agent Creation 
To create a local agent, begin by loading your knowledge documents into the database with the allm-newagent command, specifying the agent name:

```bash
allm-newagent --doc "document_path" --agent agent_name
```

or

```bash
allm-newagent --dir "directory containing files to be ingested" --agent agent_name
```
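The two forms above can be wrapped in one small helper when agents are created from a script. This is a sketch under assumptions: the function name `build_newagent_command` is hypothetical, and it treats `--doc` and `--dir` as mutually exclusive because the README presents them as alternatives.

```python
def build_newagent_command(agent, doc=None, directory=None):
    """Return the argv list for `allm-newagent` with either --doc or --dir."""
    # Assumption: exactly one source (a single document or a directory) is given.
    if (doc is None) == (directory is None):
        raise ValueError("pass exactly one of doc or directory")
    source = ["--doc", doc] if doc is not None else ["--dir", directory]
    return ["allm-newagent", *source, "--agent", agent]


if __name__ == "__main__":
    print(build_newagent_command("support_bot", doc="manual.pdf"))
```

The resulting list can be passed to `subprocess.run` as-is, so paths with spaces need no extra quoting.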

### 1.2 Agent Chat  
After the agent has been created from your knowledge documents, you can start the local agent chat with the allm-agentchat command:

```bash
allm-agentchat --agent agent_name
```

### 1.3 Agent API
Once an agent is created, you can also start an agent-specific API server with the allm-agentapi command:

```bash
allm-agentapi --agent agent_name
```

Once an agent is created, you can also update its knowledge by adding documents with the allm-updateagent command:

```bash
allm-updateagent --agent agent_name
```

## Supported Cloud Models

ALLM supports all types of generative LLMs on Azure OpenAI and VertexAI, including GPT and Gemini Pro models. You can start local inference of cloud-based models using the following commands:

### 2.1 VertexAI Generic Prompt 
```bash
allm-run-vertex --projectid Id_of_your_GCP_project --region location_of_your_cloud_server
```

or 

```bash
allm-run-vertex
```

### 2.2 AzureOpenAI Generic Prompt
```bash
allm-run-azure --key key --version version --endpoint https://{your_endpoint}.openai.azure.com --model model_name
```

or 

```bash
allm-run-azure
```
 
You can also have a custom agent work with your cloud-deployed model using the following commands. Note that the agent must first be created as described in section **1.1 New Agent Creation**.

### 2.3 VertexAI AgentChat 
```bash
allm-agentchat-vertex --projectid Id_of_your_GCP_project --region location_of_your_cloud_server --agent agent_name
```

To use the command below without flags, first configure model\vertex-config.json so that the project ID and region are picked up from it.

```bash
allm-agentchat-vertex --agent agent_name
```

### 2.4 AzureOpenAI AgentChat 

```bash
allm-agentchat-azure --key key --version version --endpoint https://{your_endpoint}.openai.azure.com --model model_name --agent agentname
```

To use the command below without flags, first configure model\azure-config.json so that the endpoint, model name, and other settings are picked up from it.

```bash
allm-agentchat-azure --agent agentname
```

model_name is an optional parameter for both VertexAI and Azure. If it is not provided, inference defaults to gemini-1.0-pro-002 for VertexAI and gpt-35-turbo for Azure OpenAI.
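Because most of the Azure flags can be omitted once the config file is in place, a wrapper script only needs to pass the values it actually has. The following is a minimal sketch using the flags shown above; the function name `build_azure_agentchat_command` is hypothetical and not part of ALLM.

```python
def build_azure_agentchat_command(agent, key=None, version=None,
                                  endpoint=None, model=None):
    """Return the argv list for `allm-agentchat-azure`.

    Flags left as None are omitted, on the assumption that ALLM then reads
    them from model\\azure-config.json or applies its default model.
    """
    command = ["allm-agentchat-azure"]
    optional = (("--key", key), ("--version", version),
                ("--endpoint", endpoint), ("--model", model))
    for flag, value in optional:
        if value is not None:
            command += [flag, value]
    return command + ["--agent", agent]


if __name__ == "__main__":
    print(build_azure_agentchat_command("support_bot", model="gpt-4"))
```

Called with only an agent name, the helper produces the short form of the command that relies entirely on the config file.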

### 2.5 AzureOpenAI AgentChat API 

```bash
allm-agentapi-azure --agent agentname
```

### 2.6 VertexAI AgentChat API 

```bash
allm-agentapi-vertex --agent agent_name
```

## Supported Model Names

- Llama3
- Llama2
- Llama
- Llama2_chat
- Llama_chat
- Mistral
- Mistral_instruct


