Metadata-Version: 2.4
Name: modelstudio-sdk
Version: 0.2.0
Summary: Python SDK for the Model Studio REST API
License-Expression: MIT
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: httpx<1.0,>=0.25.0
Requires-Dist: pydantic<3.0,>=2.0
Provides-Extra: dev
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: pandas>=1.5.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest-httpx>=0.30.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Provides-Extra: pandas
Requires-Dist: pandas>=1.5.0; extra == 'pandas'
Description-Content-Type: text/markdown

# Model Studio Python SDK

Python SDK for the Model Studio REST API. Provides typed access to projects, datasets, annotations, metrics, and ML workflow operations.

## Installation

```bash
# From a wheel (already pre-installed in notebook containers)
pip install modelstudio-sdk

# Development install
git clone https://gitlab.com/orbitalinsight/elements/model-studio/modelstudio-sdk.git
cd modelstudio-sdk
pip install -e ".[dev]"

# With pandas support
pip install "modelstudio-sdk[pandas]"
```

## Quick Start

```python
from modelstudio import ModelStudioClient

# Auto-configured inside notebooks (reads env vars)
client = ModelStudioClient.from_env()

# Or explicit
client = ModelStudioClient(
    base_url="http://localhost:8081",
    jwt_token="eyJhbG...",
)

# List projects and datasets
projects = client.projects.list()
project = client.project(str(projects[0].id))
for ds in project.datasets():
    print(f"{ds.name} ({ds.dataset_type})")
```

## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `MODEL_STUDIO_API_URL` | Yes | API base URL |
| `MODEL_STUDIO_JWT` | No | JWT authentication token |
| `MODEL_STUDIO_ORG` | No | Organization name |
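
As a rough illustration of how these variables might be consumed, the sketch below reads them from an environment-like mapping and fails fast when the required URL is missing. `Settings` and `load_settings` are hypothetical names, not part of the SDK, and `ModelStudioClient.from_env()` may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three variables in the table above."""

    api_url: str
    jwt: Optional[str] = None
    org: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read Model Studio settings from an environment-like mapping."""
    api_url = env.get("MODEL_STUDIO_API_URL", "").strip()
    if not api_url:
        # The table marks this variable as required.
        raise RuntimeError("MODEL_STUDIO_API_URL is not set")
    return Settings(
        api_url=api_url,
        # Treat empty strings the same as unset for the optional variables.
        jwt=env.get("MODEL_STUDIO_JWT") or None,
        org=env.get("MODEL_STUDIO_ORG") or None,
    )
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise in tests without touching the real environment.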

## Usage Examples

### Projects & Datasets

```python
# List projects
projects = client.projects.list()

# Create a project
project = client.projects.create(name="My Project", description="...")

# List datasets in a project
datasets = client.project("project-uuid").datasets()

# Create a dataset in a project
ds_model = client.project("project-uuid").create_dataset(
    name="My Dataset",
    dataset_type="object-detection-coco",
)

# Work with a specific dataset (by ID, independent of project)
ds = client.dataset("dataset-uuid")
```

### Dataset Overview & Metrics

```python
ds = client.dataset("dataset-uuid")

overview = ds.overview()
print(f"Images: {overview.summary.total_images}")
print(f"Annotations: {overview.summary.total_annotations}")

for split in ds.splits():
    print(f"{split.name} ({split.split_type})")

cats = ds.categories()
for cat in cats.categories:
    print(f"{cat.name}: {cat.annotation_count} annotations")
```

### Category Management

```python
ds.merge_categories(source_categories=[1, 2, 3], target_category="vehicle")
ds.rename_category(category_id=5, new_name="truck")
ds.remove_category(category_id=10)
ds.consolidate_labels({"car": "vehicle", "van": "vehicle"})

# Undo any mutation
ds.undo()
```

### Split Operations

```python
ds.redistribute(ratios={"train": 0.8, "val": 0.1, "test": 0.1}, seed=42)
ds.class_aware_redistribute(ratios={"train": 0.8, "val": 0.2}, prevent_tile_leakage=True)
ds.check_leakage()
```
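
To make the `ratios` and `seed` arguments concrete, here is a minimal local sketch of a seeded ratio split. It is not the algorithm the API runs server-side; `split_by_ratio` is a hypothetical helper that only illustrates why a fixed seed yields a reproducible assignment.

```python
import random
from typing import Dict, List, Sequence


def split_by_ratio(
    items: Sequence[str], ratios: Dict[str, float], seed: int
) -> Dict[str, List[str]]:
    """Shuffle items with a fixed seed and cut them into named splits."""
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1.0")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # private RNG, global state untouched

    result: Dict[str, List[str]] = {}
    names = list(ratios)
    start = 0
    cumulative = 0.0
    for i, name in enumerate(names):
        cumulative += ratios[name]
        # The last split takes the remainder so no item is lost to rounding.
        if i == len(names) - 1:
            end = len(shuffled)
        else:
            end = round(cumulative * len(shuffled))
        result[name] = shuffled[start:end]
        start = end
    return result


images = [f"img-{i}" for i in range(100)]
splits = split_by_ratio(images, {"train": 0.8, "val": 0.1, "test": 0.1}, seed=42)
```

Calling the helper twice with the same seed returns identical splits, which is the property the `seed=42` argument above is presumably meant to provide.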

### Import & Export

```python
split = ds.split("split-uuid")

# Import from S3
queued = split.import_from_source("s3", "coco", {
    "connection_id": "conn-uuid",
    "bucket": "my-bucket",
    "prefix": "datasets/coco/",
})

# Export as COCO JSON
coco = ds.export_coco()
split_coco = split.export_coco()
```

### Cloning & Async Operations

```python
cloned = ds.clone(name="My Clone")
poller = ds.clone_poller(interval=2.0, max_wait=300.0)
result = poller.wait()
```
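
The `interval` and `max_wait` arguments suggest a standard poll-until-done loop. The sketch below shows that general pattern with a hypothetical `wait_until` helper; how the SDK's poller is actually implemented, and what it raises on timeout, is not documented here.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(
    check: Callable[[], Optional[T]],
    interval: float = 2.0,
    max_wait: float = 300.0,
) -> T:
    """Call check() every `interval` seconds until it returns a non-None value.

    Raises TimeoutError if no value is returned within `max_wait` seconds.
    """
    deadline = time.monotonic() + max_wait
    while True:
        result = check()
        if result is not None:
            return result
        # Give up if the next sleep would carry us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"operation did not finish within {max_wait} s")
        time.sleep(interval)
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted while waiting.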

### Annotations

```python
from modelstudio.models.annotations import CreateAnnotationRequest

# Paginated listing
page = ds.list_annotations(split_id="...", page=0, size=50)

# Create
ds.create_annotation(CreateAnnotationRequest(
    image_id="img-uuid", category_id=1, bbox=[0, 0, 50, 50], area=2500,
))

# Bulk delete
ds.delete_annotations(annotation_ids=[1, 2, 3])
```
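
The `bbox` and `area` values in the create call are consistent with the COCO convention of `[x, y, width, height]` boxes, where the area is `width * height` (50 × 50 = 2500). The example values alone cannot rule out a corner-based layout, so treat the convention as an assumption. Under that assumption, a small hypothetical helper can keep the two fields in sync:

```python
from typing import Sequence


def coco_bbox_area(bbox: Sequence[float]) -> float:
    """Return the area of a COCO-style [x, y, width, height] box."""
    if len(bbox) != 4:
        raise ValueError("bbox must have four values: [x, y, width, height]")
    _, _, width, height = bbox
    if width < 0 or height < 0:
        raise ValueError("width and height must be non-negative")
    return width * height


bbox = [0, 0, 50, 50]
area = coco_bbox_area(bbox)  # 2500, matching the request above
```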

### Few-Shot & Oversampling

```python
from modelstudio.models.few_shot import FewShotRequest
from modelstudio.models.oversample import OversampleRequest

preview = ds.few_shot_preview(FewShotRequest(num_images=100, method="RANDOM", seed=42))
result = ds.oversample_execute(OversampleRequest(target_ratio=0.5, strategy="PREFER_ANNOTATED"))
```

### Dataset Merge

```python
from modelstudio.models.merge import MergeDatasetRequest

analysis = client.merge_datasets_analyze(["ds-1", "ds-2"])
result = client.merge_datasets(MergeDatasetRequest(
    source_dataset_ids=["ds-1", "ds-2"],
    target_name="merged-dataset",
))
```

### Experiments & Runs

```python
from modelstudio.models.runs import CreateRunRequest

project = client.project("project-uuid")

# Create experiment
exp = project.experiments.create(name="YOLOv8 ablation")

# Create and submit a run
run_model = project.experiment(str(exp.id)).runs.create(CreateRunRequest(
    name="baseline",
    model_architecture="yolov8",
    dataset_id="dataset-uuid",
))
run = project.experiment(str(exp.id)).run(str(run_model.id))
run.submit()

# Monitor
run.stages()
run.metrics()
run.metrics_data(name="loss")
```

### DataFrame Integration

```python
# Requires: pip install "modelstudio-sdk[pandas]"
ds.images_df(size=100)
ds.annotations_df(size=100)
ds.categories_df()
ds.class_distribution_df()
split.images_df()
split.annotations_df()
```

## Error Handling

```python
from modelstudio.exceptions import NotFoundError, ConflictError, BadRequestError

try:
    ds.overview()
except NotFoundError as e:
    print(f"Not found: {e.message}")
except ConflictError as e:
    print(f"Conflict: {e.message}")
except BadRequestError as e:
    print(f"Bad request: {e.message}")
```

## API Reference

For the complete method listing, see the [API Reference](https://gitlab.com/orbitalinsight/elements/model-studio/model-studio-notebooks/-/blob/master/tutorials/api-reference.md) in the tutorials folder (also available in your notebook environment at `tutorials/api-reference.md`).

## Architecture

### Related Repositories

| Repo | Purpose |
|------|---------|
| [`modelstudio-sdk`](https://gitlab.com/orbitalinsight/elements/model-studio/modelstudio-sdk) | This repo: Python SDK + Jupyter Server Docker |
| [`frontend`](https://gitlab.com/orbitalinsight/frontend-2.0) | Model Studio React frontend (custom notebook UI lives here) |
| [`model-studio-api`](https://gitlab.com/orbitalinsight/elements/model-studio/model-studio-api) | Backend REST API the SDK wraps |
| [`model-studio-notebooks`](https://gitlab.com/orbitalinsight/elements/model-studio/model-studio-notebooks) | JupyterHub + KubeSpawner Helm chart (production multi-user) |
| [`model-studio-agent`](https://gitlab.com/orbitalinsight/elements/model-studio/model-studio-agent) | Agent chat backend |
| [`keycloak-config`](https://gitlab.com/orbitalinsight/elements/keycloak-config) | Keycloak realm/client configuration |

### Local Development Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Browser (http://localhost:5173)                            │
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  Model Studio Frontend (Vite)                         │  │
│  │  ┌──────────────┐  ┌──────────────┐  ┌────────────┐  │  │
│  │  │ Dataset Pages │  │ Notebook     │  │ Agent Chat │  │  │
│  │  │              │  │ Panel        │  │ Panel      │  │  │
│  │  └──────────────┘  └──────┬───────┘  └─────┬──────┘  │  │
│  └───────────────────────────┼────────────────┼──────────┘  │
│                              │                │             │
│  Vite Dev Server Proxies:    │                │             │
│  /jupyter/* ─────────────────┘                │             │
│  /agent/* ────────────────────────────────────┘             │
└──────────────────────────────┼────────────────┼─────────────┘
                               │                │
              ┌────────────────┘                │
              ▼                                 ▼
┌──────────────────────────┐    ┌──────────────────────────┐
│  Jupyter Server (Docker) │    │  Agent API               │
│  localhost:8889          │    │  localhost:8080          │
│                          │    │  (model-studio-agent)    │
│  ┌────────────────────┐  │    └──────────────────────────┘
│  │ Python 3.10 Kernel │  │
│  │ + Model Studio SDK │  │
│  └────────┬───────────┘  │
└───────────┼──────────────┘
            │
            ▼
┌──────────────────────────────────────────────────────────┐
│  Model Studio API                                        │
│  https://model-studio-api.elements.dev.privateer.com     │
└──────────────────────────────────────────────────────────┘
```

## Development

### Prerequisites

- [Miniconda](https://docs.conda.io/en/latest/miniconda.html) or Anaconda
- Docker + Docker Compose (for notebook server)
- `jq` and `curl` (for Keycloak token fetch)

### Quick Start

```bash
./develop.sh --mode setup              # Create conda env, install deps
./develop.sh --mode test               # Run unit tests
./develop.sh --mode test-integration   # Fetch Keycloak token + run integration tests
./develop.sh --mode lint               # Run ruff + mypy
./develop.sh --mode notebook-server    # Start headless Jupyter Server on port 8889
```

### Make Targets

```bash
make install              # pip install -e ".[dev]"
make test                 # Unit tests with coverage
make test-unit            # Unit tests only (no integration)
make lint                 # ruff + mypy
make build                # Build wheel
make notebook-server      # Start headless Jupyter Server (port 8889)
make docker               # Start full JupyterLab (port 8888)
```

### Integration Tests

```bash
./develop.sh --mode test-integration
```

Integration tests run against the live API (elements-dev). They expect a project named "SDK Testing" with datasets "sdk-test-read-only" and "sdk-test-mutations".
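
Before running the suite, it can help to confirm that those fixtures exist. The sketch below is a hypothetical pre-flight check written against plain objects with a `name` attribute, which is what the dataset objects in the Quick Start expose. `missing_names` is not part of the SDK, and the stand-in data replaces a real `project.datasets()` call.

```python
from types import SimpleNamespace
from typing import Iterable, List

REQUIRED_DATASETS = ("sdk-test-read-only", "sdk-test-mutations")


def missing_names(items: Iterable[object], required: Iterable[str]) -> List[str]:
    """Return the required names that no item in `items` carries."""
    present = {getattr(item, "name", None) for item in items}
    return [name for name in required if name not in present]


# Stand-in data; in practice this list would come from project.datasets().
datasets = [SimpleNamespace(name="sdk-test-read-only")]
missing = missing_names(datasets, REQUIRED_DATASETS)
if missing:
    print(f"Missing integration fixtures: {missing}")
```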

## Requirements

- Python >= 3.10
- httpx >= 0.25.0, < 1.0
- pydantic >= 2.0, < 3.0
- pandas >= 1.5.0 (optional, via the `pandas` extra)
