Metadata-Version: 2.4
Name: agent-cli
Version: 0.4.0
Summary: A suite of AI-powered command-line tools for text correction, audio transcription, and voice assistance.
Author-email: Bas Nijholt <bas@nijho.lt>
Project-URL: Homepage, https://github.com/basnijholt/agent-cli
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: wyoming
Requires-Dist: pyaudio
Requires-Dist: rich
Requires-Dist: pyperclip
Requires-Dist: pydantic-ai-slim[openai]
Requires-Dist: typer
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.20.0; extra == "test"
Requires-Dist: pytest-cov>=4.0.0; extra == "test"
Requires-Dist: pydantic-ai-slim[openai]; extra == "test"
Requires-Dist: pytest-timeout; extra == "test"
Provides-Extra: dev
Requires-Dist: agent-cli[test]; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"
Requires-Dist: versioningit; extra == "dev"
Requires-Dist: markdown-code-runner; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: notebook; extra == "dev"
Provides-Extra: speed
Requires-Dist: audiostretchy>=1.3.0; extra == "speed"
Dynamic: license-file

# Agent CLI

`agent-cli` is a Python command-line tool that provides a suite of AI-powered utilities for text correction, audio transcription, and voice assistance.

> [!TIP]
> If you use [`uv`](https://docs.astral.sh/uv/), you can run the tools from this package directly with `uvx`. For example, to see the help message for `autocorrect`:
>
> ```bash
> uvx agent-cli autocorrect --help
> ```

<details><summary><b><u>[ToC]</u></b> 📚</summary>

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Features](#features)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Usage](#usage)
  - [`autocorrect`](#autocorrect)
  - [`transcribe`](#transcribe)
  - [`voice-assistant`](#voice-assistant)
  - [`interactive`](#interactive)
- [Development](#development)
  - [Running Tests](#running-tests)
  - [Pre-commit Hooks](#pre-commit-hooks)
- [Contributing](#contributing)
- [License](#license)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

</details>

## Features

- **`autocorrect`**: Correct grammar and spelling in your text using a local LLM with Ollama.
- **`transcribe`**: Transcribe speech from your microphone to text.
- **`voice-assistant`**: A voice-powered clipboard assistant.
- **`interactive`**: An interactive conversational agent with tool-calling capabilities.

## Prerequisites

- **Python**: Version 3.11 or higher.
- **Ollama**: For `autocorrect`, `voice-assistant`, and `interactive` (and for `transcribe` with `--llm`), you need [Ollama](https://ollama.ai/) running with a model pulled (e.g., `ollama pull devstral:24b`).
- **Wyoming Piper**: For the optional text-to-speech (TTS) output of `voice-assistant` and `interactive` (`--tts`), you need [Wyoming Piper](https://github.com/rhasspy/wyoming-piper) running.
- **Wyoming Faster Whisper**: For `transcribe`, `voice-assistant`, and `interactive`, you need [Wyoming Faster Whisper](https://github.com/rhasspy/wyoming-faster-whisper) running for automatic speech recognition (ASR).
- **xsel**, **xclip**, or **pbcopy**: Many of the tools use these to interact with the clipboard.
- **PortAudio**: Required by PyAudio for audio input and output.
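
Some of the prerequisites above can be checked from a shell before the first run. The sketch below is only an illustration: the function name `check_prereqs` is hypothetical and not part of `agent-cli`, and it only tests whether the executables are on `PATH`. It does not check that an Ollama or Wyoming server is actually reachable, and the `ollama` check does not apply if Ollama runs on another machine.

```bash
# Hypothetical helper, not part of agent-cli: reports which prerequisite
# executables can be found on PATH.
check_prereqs() {
  missing=0

  # The Ollama CLI is used to pull and serve models on this machine.
  if command -v ollama >/dev/null 2>&1; then
    echo "ok: ollama"
  else
    echo "missing: ollama"
    missing=1
  fi

  # Any one of these clipboard tools is enough.
  clip=""
  for tool in xsel xclip pbcopy; do
    if command -v "$tool" >/dev/null 2>&1; then
      clip="$tool"
      break
    fi
  done
  if [ -n "$clip" ]; then
    echo "ok: clipboard via $clip"
  else
    echo "missing: xsel, xclip, or pbcopy"
    missing=1
  fi

  # Exit status 0 means everything checked was found.
  return "$missing"
}
```

Calling `check_prereqs` prints one line per check and returns a non-zero status if anything was not found, so it can guard a setup script.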

## Installation

Install `agent-cli` using pip:

```bash
pip install agent-cli
```

Or for development:

1. **Clone the repository:**

   ```bash
   git clone git@github.com:basnijholt/agent-cli.git
   cd agent-cli
   ```

2. **Install in development mode:**

   ```bash
   uv sync
   source .venv/bin/activate  # On Windows use `.venv\Scripts\activate`
   ```

## Usage

This package provides multiple command-line tools, each available as a subcommand of `agent-cli`.

### `autocorrect`

Corrects the grammar and spelling of text passed as an argument or, if none is given, read from your clipboard.

<details>
<summary>See the output of <code>agent-cli autocorrect --help</code></summary>

<!-- CODE:BASH:START -->
<!-- echo '```yaml' -->
<!-- export NO_COLOR=1 -->
<!-- export TERM=dumb -->
<!-- export TERMINAL_WIDTH=80 -->
<!-- agent-cli autocorrect --help -->
<!-- echo '```' -->
<!-- CODE:END -->

<!-- OUTPUT:START -->
<!-- ⚠️ This content is auto-generated by `markdown-code-runner`. -->
```yaml
                                                                                
 Usage: agent-cli autocorrect [OPTIONS] [TEXT]                                  
                                                                                
 Correct text from clipboard using a local Ollama model.                        
                                                                                
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   text      [TEXT]  The text to correct. If not provided, reads from         │
│                     clipboard.                                               │
│                     [default: None]                                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --model        -m      TEXT  The Ollama model to use. Default is             │
│                              devstral:24b.                                   │
│                              [default: devstral:24b]                         │
│ --ollama-host          TEXT  The Ollama server host. Default is              │
│                              http://localhost:11434.                         │
│                              [default: http://localhost:11434]               │
│ --log-level            TEXT  Set logging level. [default: WARNING]           │
│ --log-file             TEXT  Path to a file to write logs to.                │
│                              [default: None]                                 │
│ --quiet        -q            Suppress console output from rich.              │
│ --help                       Show this message and exit.                     │
╰──────────────────────────────────────────────────────────────────────────────╯

```

<!-- OUTPUT:END -->

</details>
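
The options in the help output above can be combined as in the following sketch. The wrapper name `fix_text` and the environment variables `AGENT_CLI_MODEL` and `AGENT_CLI_OLLAMA_HOST` are hypothetical conveniences invented for this example. Only `--model`, `--ollama-host`, and the positional `TEXT` argument come from `agent-cli` itself.

```bash
# Hypothetical wrapper: run autocorrect with a chosen model and Ollama host.
# The environment variables are made up for this example; when they are
# unset, the defaults from the help output are used.
fix_text() {
  agent-cli autocorrect \
    --model "${AGENT_CLI_MODEL:-devstral:24b}" \
    --ollama-host "${AGENT_CLI_OLLAMA_HOST:-http://localhost:11434}" \
    "$@"
}
```

For example, `fix_text "this sentance has a typo"` passes the text directly, while `fix_text` with no argument leaves `TEXT` unset so that the clipboard is read instead.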

### `transcribe`

Streams audio from your microphone to a Wyoming ASR (automatic speech recognition) server, such as Wyoming Faster Whisper, and returns the transcript. By default, the result is copied to the clipboard.

<details>
<summary>See the output of <code>agent-cli transcribe --help</code></summary>

<!-- CODE:BASH:START -->
<!-- echo '```yaml' -->
<!-- export NO_COLOR=1 -->
<!-- export TERM=dumb -->
<!-- export TERMINAL_WIDTH=80 -->
<!-- agent-cli transcribe --help -->
<!-- echo '```' -->
<!-- CODE:END -->

<!-- OUTPUT:START -->
<!-- ⚠️ This content is auto-generated by `markdown-code-runner`. -->
```yaml
                                                                                
 Usage: agent-cli transcribe [OPTIONS]                                          
                                                                                
 Wyoming ASR Client for streaming microphone audio to a transcription server.   
                                                                                
 Usage: - Run in foreground: agent-cli transcribe --device-index 1 - Run in     
 background: agent-cli transcribe --device-index 1 & - Check status: agent-cli  
 transcribe --status - Stop background process: agent-cli transcribe --stop     
                                                                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --device-index                           INTEGER  Index of the PyAudio input │
│                                                   device to use.             │
│                                                   [default: None]            │
│ --device-name                            TEXT     Device name keywords for   │
│                                                   partial matching. Supports │
│                                                   comma-separated list where │
│                                                   each term can partially    │
│                                                   match device names         │
│                                                   (case-insensitive). First  │
│                                                   matching device is         │
│                                                   selected.                  │
│                                                   [default: None]            │
│ --list-devices                                    List available audio input │
│                                                   devices and exit.          │
│ --asr-server-ip                          TEXT     Wyoming ASR server IP      │
│                                                   address.                   │
│                                                   [default: 192.168.1.143]   │
│ --asr-server-port                        INTEGER  Wyoming ASR server port.   │
│                                                   [default: 10300]           │
│ --model            -m                    TEXT     The Ollama model to use.   │
│                                                   Default is devstral:24b.   │
│                                                   [default: devstral:24b]    │
│ --ollama-host                            TEXT     The Ollama server host.    │
│                                                   Default is                 │
│                                                   http://localhost:11434.    │
│                                                   [default:                  │
│                                                   http://localhost:11434]    │
│ --llm                  --no-llm                   Use an LLM to process the  │
│                                                   transcript.                │
│                                                   [default: no-llm]          │
│ --stop                                            Stop any running           │
│                                                   background process.        │
│ --status                                          Check if a background      │
│                                                   process is running.        │
│ --clipboard            --no-clipboard             Copy result to clipboard.  │
│                                                   [default: clipboard]       │
│ --log-level                              TEXT     Set logging level.         │
│                                                   [default: WARNING]         │
│ --log-file                               TEXT     Path to a file to write    │
│                                                   logs to.                   │
│                                                   [default: None]            │
│ --quiet            -q                             Suppress console output    │
│                                                   from rich.                 │
│ --help                                            Show this message and      │
│                                                   exit.                      │
╰──────────────────────────────────────────────────────────────────────────────╯

```

<!-- OUTPUT:END -->

</details>
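
The default `--asr-server-ip` shown above is a specific LAN address, so most setups will need to override it. The sketch below selects the input device by name instead of by index and points the client at a different server. The wrapper name `dictate`, the environment variables `ASR_HOST` and `ASR_PORT`, the `localhost` fallback, and the `usb,built-in` keywords are all assumptions made for illustration; only the `agent-cli transcribe` flags are taken from the help output.

```bash
# Hypothetical wrapper: transcribe from the first input device whose name
# matches one of the comma-separated keywords given as the first argument.
# ASR_HOST and ASR_PORT are made-up variables for this example.
dictate() {
  keywords="${1:-usb,built-in}"
  if [ "$#" -gt 0 ]; then
    shift
  fi
  # Any remaining arguments (e.g. --llm) are forwarded unchanged.
  agent-cli transcribe \
    --device-name "$keywords" \
    --asr-server-ip "${ASR_HOST:-localhost}" \
    --asr-server-port "${ASR_PORT:-10300}" \
    "$@"
}
```

Running `agent-cli transcribe --list-devices` first shows the device names that the keywords are matched against. A call such as `dictate "yeti" --llm` then adds LLM post-processing of the transcript.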

### `voice-assistant`

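Starts the voice assistant, which applies a spoken instruction to the text on your clipboard. It can run in the background, with `--status` and `--stop` for process management.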

**Basic Usage:**
```bash
# Run in foreground
agent-cli voice-assistant --device-index 1

# Run in background
agent-cli voice-assistant --device-index 1 &

# Check status
agent-cli voice-assistant --status

# Stop background process
agent-cli voice-assistant --stop
```

**Keyboard Maestro Integration:**
The process management features make it well suited to hotkey toggles: use `--status` to check whether it is running, `--stop` to stop it, and a trailing `&` to start it in the background.
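
A hotkey toggle along these lines could look like the sketch below. The function name `toggle_voice_assistant` is hypothetical, and the script rests on one unverified assumption: that `--status` exits with status 0 only while a background process is running. The help text does not state this, so check the exit codes on your installation before relying on it.

```bash
# Hypothetical toggle for a hotkey tool such as Keyboard Maestro.
# ASSUMPTION: `--status` exits with status 0 only while a background
# process is running. The help text does not confirm this.
toggle_voice_assistant() {
  if agent-cli voice-assistant --status >/dev/null 2>&1; then
    # Already running: stop the background process.
    agent-cli voice-assistant --stop
  else
    # Not running: start in the background, forwarding any options
    # such as --device-index 1.
    agent-cli voice-assistant "$@" &
  fi
}
```

Binding `toggle_voice_assistant --device-index 1` to a single hotkey would then start the assistant on the first press and stop it on the next.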

<details>
<summary>See the output of <code>agent-cli voice-assistant --help</code></summary>

<!-- CODE:BASH:START -->
<!-- echo '```yaml' -->
<!-- export NO_COLOR=1 -->
<!-- export TERM=dumb -->
<!-- export TERMINAL_WIDTH=80 -->
<!-- agent-cli voice-assistant --help -->
<!-- echo '```' -->
<!-- CODE:END -->

<!-- OUTPUT:START -->
<!-- ⚠️ This content is auto-generated by `markdown-code-runner`. -->
```yaml
                                                                                
 Usage: agent-cli voice-assistant [OPTIONS]                                     
                                                                                
 Interact with clipboard text via a voice command using Wyoming and an Ollama   
 LLM.                                                                           
                                                                                
 Usage: - Run in foreground: agent-cli voice-assistant --device-index 1 - Run   
 in background: agent-cli voice-assistant --device-index 1 & - Check status:    
 agent-cli voice-assistant --status - Stop background process: agent-cli        
 voice-assistant --stop - List output devices: agent-cli voice-assistant        
 --list-output-devices - Save TTS to file: agent-cli voice-assistant --tts      
 --save-file response.wav                                                       
                                                                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --device-index                               INTEGER  Index of the PyAudio   │
│                                                       input device to use.   │
│                                                       [default: None]        │
│ --device-name                                TEXT     Device name keywords   │
│                                                       for partial matching.  │
│                                                       Supports               │
│                                                       comma-separated list   │
│                                                       where each term can    │
│                                                       partially match device │
│                                                       names                  │
│                                                       (case-insensitive).    │
│                                                       First matching device  │
│                                                       is selected.           │
│                                                       [default: None]        │
│ --list-devices                                        List available audio   │
│                                                       input devices and      │
│                                                       exit.                  │
│ --asr-server-ip                              TEXT     Wyoming ASR server IP  │
│                                                       address.               │
│                                                       [default:              │
│                                                       192.168.1.143]         │
│ --asr-server-port                            INTEGER  Wyoming ASR server     │
│                                                       port.                  │
│                                                       [default: 10300]       │
│ --model                -m                    TEXT     The Ollama model to    │
│                                                       use. Default is        │
│                                                       devstral:24b.          │
│                                                       [default:              │
│                                                       devstral:24b]          │
│ --ollama-host                                TEXT     The Ollama server      │
│                                                       host. Default is       │
│                                                       http://localhost:1143… │
│                                                       [default:              │
│                                                       http://localhost:1143… │
│ --stop                                                Stop any running       │
│                                                       background process.    │
│ --status                                              Check if a background  │
│                                                       process is running.    │
│ --clipboard                --no-clipboard             Copy result to         │
│                                                       clipboard.             │
│                                                       [default: clipboard]   │
│ --log-level                                  TEXT     Set logging level.     │
│                                                       [default: WARNING]     │
│ --log-file                                   TEXT     Path to a file to      │
│                                                       write logs to.         │
│                                                       [default: None]        │
│ --quiet                -q                             Suppress console       │
│                                                       output from rich.      │
│ --tts                      --no-tts                   Enable text-to-speech  │
│                                                       for responses.         │
│                                                       [default: no-tts]      │
│ --tts-server-ip                              TEXT     Wyoming TTS server IP  │
│                                                       address.               │
│                                                       [default:              │
│                                                       192.168.1.143]         │
│ --tts-server-port                            INTEGER  Wyoming TTS server     │
│                                                       port.                  │
│                                                       [default: 10200]       │
│ --voice                                      TEXT     Voice name to use for  │
│                                                       TTS (e.g.,             │
│                                                       'en_US-lessac-medium'… │
│                                                       [default: None]        │
│ --tts-language                               TEXT     Language for TTS       │
│                                                       (e.g., 'en_US').       │
│                                                       [default: None]        │
│ --speaker                                    TEXT     Speaker name for TTS   │
│                                                       voice.                 │
│                                                       [default: None]        │
│ --tts-speed                                  FLOAT    Speech speed           │
│                                                       multiplier (1.0 =      │
│                                                       normal, 2.0 = twice as │
│                                                       fast, 0.5 = half       │
│                                                       speed).                │
│                                                       [default: 1.0]         │
│ --output-device-index                        INTEGER  Index of the PyAudio   │
│                                                       output device to use   │
│                                                       for TTS.               │
│                                                       [default: None]        │
│ --output-device-name                         TEXT     Output device name     │
│                                                       keywords for partial   │
│                                                       matching. Supports     │
│                                                       comma-separated list   │
│                                                       where each term can    │
│                                                       partially match device │
│                                                       names                  │
│                                                       (case-insensitive).    │
│                                                       First matching device  │
│                                                       is selected.           │
│                                                       [default: None]        │
│ --list-output-devices                                 List available audio   │
│                                                       output devices and     │
│                                                       exit.                  │
│ --save-file                                  PATH     Save TTS response      │
│                                                       audio to WAV file.     │
│                                                       [default: None]        │
│ --help                                                Show this message and  │
│                                                       exit.                  │
╰──────────────────────────────────────────────────────────────────────────────╯

```

<!-- OUTPUT:END -->

</details>

### `interactive`

An interactive conversational agent that remembers your conversation and can use tools to perform actions.

**Available Tools:**
- `read_file`: Reads the content of a file.
- `execute_code`: Executes a shell command.

<details>
<summary>See the output of <code>agent-cli interactive --help</code></summary>

<!-- CODE:BASH:START -->
<!-- echo '```yaml' -->
<!-- export NO_COLOR=1 -->
<!-- export TERM=dumb -->
<!-- export TERMINAL_WIDTH=80 -->
<!-- agent-cli interactive --help -->
<!-- echo '```' -->
<!-- CODE:END -->

<!-- OUTPUT:START -->
<!-- ⚠️ This content is auto-generated by `markdown-code-runner`. -->
```yaml
                                                                                
 Usage: agent-cli interactive [OPTIONS]                                         
                                                                                
 An interactive agent that you can talk to.                                     
                                                                                
                                                                                
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --device-index                         INTEGER  Index of the PyAudio input   │
│                                                 device to use.               │
│                                                 [default: None]              │
│ --device-name                          TEXT     Device name keywords for     │
│                                                 partial matching. Supports   │
│                                                 comma-separated list where   │
│                                                 each term can partially      │
│                                                 match device names           │
│                                                 (case-insensitive). First    │
│                                                 matching device is selected. │
│                                                 [default: None]              │
│ --list-devices                                  List available audio input   │
│                                                 devices and exit.            │
│ --asr-server-ip                        TEXT     Wyoming ASR server IP        │
│                                                 address.                     │
│                                                 [default: 192.168.1.143]     │
│ --asr-server-port                      INTEGER  Wyoming ASR server port.     │
│                                                 [default: 10300]             │
│ --model                -m              TEXT     The Ollama model to use.     │
│                                                 Default is devstral:24b.     │
│                                                 [default: devstral:24b]      │
│ --ollama-host                          TEXT     The Ollama server host.      │
│                                                 Default is                   │
│                                                 http://localhost:11434.      │
│                                                 [default:                    │
│                                                 http://localhost:11434]      │
│ --stop                                          Stop any running background  │
│                                                 process.                     │
│ --status                                        Check if a background        │
│                                                 process is running.          │
│ --log-level                            TEXT     Set logging level.           │
│                                                 [default: WARNING]           │
│ --log-file                             TEXT     Path to a file to write logs │
│                                                 to.                          │
│                                                 [default: None]              │
│ --quiet                -q                       Suppress console output from │
│                                                 rich.                        │
│ --tts                      --no-tts             Enable text-to-speech for    │
│                                                 responses.                   │
│                                                 [default: no-tts]            │
│ --tts-server-ip                        TEXT     Wyoming TTS server IP        │
│                                                 address.                     │
│                                                 [default: 192.168.1.143]     │
│ --tts-server-port                      INTEGER  Wyoming TTS server port.     │
│                                                 [default: 10200]             │
│ --voice                                TEXT     Voice name to use for TTS    │
│                                                 (e.g.,                       │
│                                                 'en_US-lessac-medium').      │
│                                                 [default: None]              │
│ --tts-language                         TEXT     Language for TTS (e.g.,      │
│                                                 'en_US').                    │
│                                                 [default: None]              │
│ --speaker                              TEXT     Speaker name for TTS voice.  │
│                                                 [default: None]              │
│ --tts-speed                            FLOAT    Speech speed multiplier (1.0 │
│                                                 = normal, 2.0 = twice as     │
│                                                 fast, 0.5 = half speed).     │
│                                                 [default: 1.0]               │
│ --output-device-index                  INTEGER  Index of the PyAudio output  │
│                                                 device to use for TTS.       │
│                                                 [default: None]              │
│ --output-device-name                   TEXT     Output device name keywords  │
│                                                 for partial matching.        │
│                                                 Supports comma-separated     │
│                                                 list where each term can     │
│                                                 partially match device names │
│                                                 (case-insensitive). First    │
│                                                 matching device is selected. │
│                                                 [default: None]              │
│ --list-output-devices                           List available audio output  │
│                                                 devices and exit.            │
│ --save-file                            PATH     Save TTS response audio to   │
│                                                 WAV file.                    │
│                                                 [default: None]              │
│ --history-dir                          PATH     Directory to store           │
│                                                 conversation history.        │
│                                                 [default:                    │
│                                                 ~/.config/agent-cli/history] │
│ --help                                          Show this message and exit.  │
╰──────────────────────────────────────────────────────────────────────────────╯

```

<!-- OUTPUT:END -->

</details>


## Development

### Running Tests

The project uses `pytest` for testing. To run tests using `uv`:

```bash
uv run pytest
```

### Pre-commit Hooks

This project uses pre-commit hooks (ruff for linting and formatting, mypy for type checking) to maintain code quality. To set them up:

1. Install pre-commit:

   ```bash
   pip install pre-commit
   ```

2. Install the hooks:

   ```bash
   pre-commit install
   ```

   Now, the hooks will run automatically before each commit.

## Contributing

Contributions are welcome! If you find a bug or have a feature request, please open an issue. If you'd like to contribute code, please fork the repository and submit a pull request.

## License

This project is licensed under the MIT License - see the `LICENSE` file for details.
