Metadata-Version: 2.4
Name: obsforge
Version: 0.1.3
Summary: Zero-infra drop-in structured logging & observability for Python (Grafana/Loki, Elastic, any stack)
Project-URL: Homepage, https://github.com/AnthonyGrullonA/OBSFORGE
Project-URL: Repository, https://github.com/AnthonyGrullonA/OBSFORGE
Project-URL: Issues, https://github.com/AnthonyGrullonA/OBSFORGE/issues
Project-URL: Changelog, https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/CHANGELOG.md
Author: Anthony Grullon
License-Expression: MIT
License-File: LICENSE
Keywords: django,fastapi,logging,observability,opentelemetry,structured-logging
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: Django
Classifier: Framework :: FastAPI
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Logging
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: orjson>=3.10.0
Requires-Dist: pydantic>=2.8.0
Provides-Extra: celery
Requires-Dist: celery>=5.3.0; extra == 'celery'
Provides-Extra: db
Requires-Dist: aiomysql>=0.2.0; extra == 'db'
Requires-Dist: asyncpg>=0.29.0; extra == 'db'
Requires-Dist: psycopg>=3.1.0; extra == 'db'
Requires-Dist: sqlalchemy>=2.0.0; extra == 'db'
Provides-Extra: dev
Requires-Dist: anyio>=4.4.0; extra == 'dev'
Requires-Dist: build>=1.2.0; extra == 'dev'
Requires-Dist: mypy>=1.11.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.3.0; extra == 'dev'
Requires-Dist: ruff>=0.6.0; extra == 'dev'
Requires-Dist: twine>=5.1.0; extra == 'dev'
Provides-Extra: django
Requires-Dist: django>=5.1; extra == 'django'
Provides-Extra: drf
Requires-Dist: django>=5.1; extra == 'drf'
Requires-Dist: djangorestframework>=3.15.0; extra == 'drf'
Provides-Extra: fastapi
Requires-Dist: fastapi>=0.115.0; extra == 'fastapi'
Requires-Dist: starlette>=0.40.0; extra == 'fastapi'
Provides-Extra: http
Requires-Dist: aiohttp>=3.10.0; extra == 'http'
Requires-Dist: httpx>=0.27.0; extra == 'http'
Requires-Dist: requests>=2.32.0; extra == 'http'
Provides-Extra: otel
Requires-Dist: opentelemetry-api>=1.27.0; extra == 'otel'
Requires-Dist: opentelemetry-sdk>=1.27.0; extra == 'otel'
Requires-Dist: opentelemetry-semantic-conventions>=0.48b0; extra == 'otel'
Description-Content-Type: text/markdown

# obsforge

[![PyPI](https://img.shields.io/pypi/v/obsforge.svg)](https://pypi.org/project/obsforge/)
[![Python](https://img.shields.io/pypi/pyversions/obsforge.svg)](https://pypi.org/project/obsforge/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/LICENSE)

**Zero-infra, drop-in structured logging & observability for Python.**

Add it to your code and your logs come out as structured JSON on **stdout** —
ready for your existing shipper (promtail, filebeat, fluent-bit, otel-collector)
to deliver to **Grafana/Loki, Elastic, or any stack**. obsforge **does not run,
require, or manage any extra infrastructure**. It writes to stdout; your platform
already knows how to collect that.

```bash
pip install obsforge
```

```python
import logging, obsforge

obsforge.install_logging_bridge()          # one line, no bootstrap, no code rewrite

logging.getLogger("checkout.auth").warning("login failed", extra={"reason": "bad_password"})
```

That's the whole integration. Set the service name once via the OpenTelemetry-standard
env var and every event is attributable:

```bash
export OTEL_SERVICE_NAME=checkout         # otherwise logs show "unknown-service"
```

---

## What problem does it solve?

Most apps log unstructured strings, then teams bolt on regex parsing, inconsistent
fields, and accidental high-cardinality labels that blow up Loki. Tracing and
logs live in separate worlds. And "observability SDKs" often drag in collectors,
agents, or external services you have to operate.

obsforge takes the opposite stance:

- **Semantic events, not strings.** Every log is a canonical event with a stable
  schema (`event`, `severity`, `service`, `correlation`, `trace`, ...), not a
  free-form `message`.
- **Cardinality-safe by construction.** Output is a three-section document where
  low-cardinality keys become labels and high-cardinality identifiers
  (`trace_id`, `user_id`, `tenant_id`, ...) are **never** labels.
- **Correlation built in.** W3C trace context + correlation ids flow across HTTP,
  Celery, Kafka, RabbitMQ, asyncio and background workers — so logs and traces
  share ids.
- **Zero extra infrastructure.** stdout-first, JSON-first. No collector, agent, or
  service is required by the library.
- **Drop-in.** Works with your existing `logging` calls; no rewrite.
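
To make the correlation bullet concrete, here is a minimal stdlib-only sketch of parsing a W3C `traceparent` header (version `00`). The names `TraceContext` and `parse_traceparent` are hypothetical and chosen for illustration; this is not obsforge's own parser, which is not shown in this README.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Return the parsed context, or None when the header is malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # The spec treats all-zero trace and parent ids as invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))
```

Returning `None` for anything malformed lets the caller start a fresh trace instead of propagating a bad identifier.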

## What it is *not*

It is not an agent, a daemon, or a hosted service. It doesn't ship logs itself —
your platform's collector does. It has no external service dependency in the
default path.

---

## Features

- 🪵 **stdlib `logging` bridge** — route existing `logger.*` calls through obsforge with one line.
- 🧱 **Canonical event model** — typed (pydantic), with HTTP / DB / exception / cache / security / business / dependency context.
- 🏷️ **Loki label governance** — `LokiLabelPolicy` + `StructuredMetadataPolicy`; no cardinality footguns.
- 🔗 **Distributed correlation** — W3C `traceparent` / `baggage`, propagated across services, queues, and tasks.
- 🛡️ **Security by default** — deep PII scrubbing (email, JWT/bearer, card, SSN, IPv4) across every string field; identity validation on ingress.
- 🧯 **Fail-open pipeline** — a telemetry fault never breaks your request.
- 🧩 **Framework adapters** — FastAPI/Starlette, Django/DRF middleware; HTTP-client and DB instrumentation.
- 🔭 **Optional OpenTelemetry** — opt-in log export + trace bridge (off by default; no collector needed otherwise).
- ✅ **Typed & tested** — `mypy --strict` clean, layered test suite, benchmarked.
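
The fail-open idea from the list above can be sketched as a small decorator. This is an illustration of the general pattern under stated assumptions, not obsforge's implementation: `fail_open`, `emit`, and the `telemetry.internal` logger name are all hypothetical.

```python
import functools
import logging

# Hypothetical internal logger used to record dropped telemetry steps.
_internal = logging.getLogger("telemetry.internal")


def fail_open(func):
    """Wrap a telemetry step so its failures are logged and dropped, never raised."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            _internal.debug("telemetry step failed; dropping event", exc_info=True)
            return None

    return wrapper


@fail_open
def emit(event: dict) -> str:
    # Raises KeyError when "event_name" is missing; the wrapper absorbs it.
    return event["event_name"].upper()
```

The caller only ever sees a result or `None`, so a malformed event costs one log line rather than a failed request.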

---

## Installation

```bash
pip install obsforge                 # core (orjson, pydantic — no other runtime deps)
pip install "obsforge[fastapi]"      # FastAPI / Starlette middleware
pip install "obsforge[django]"       # Django middleware
pip install "obsforge[drf]"          # Django REST Framework exception handler
pip install "obsforge[celery]"       # Celery task correlation
pip install "obsforge[db]"           # SQLAlchemy / psycopg / asyncpg / aiomysql
pip install "obsforge[http]"         # httpx / requests / aiohttp client instrumentation
pip install "obsforge[otel]"         # optional OpenTelemetry export
```

Python **3.11+** (CI runs 3.11, 3.12, 3.13).

---

## Quickstart

### 1) Drop-in for existing `logging` code (recommended start)

The minimal integration is a single call — no `bootstrap()`, no objects to thread
through your code. The bridge initializes a default SDK on first use:

```python
import logging, obsforge

obsforge.install_logging_bridge()        # set OTEL_SERVICE_NAME in the environment

log = logging.getLogger("checkout.orders")
log.info("order placed", extra={"order_id": "o_123", "amount_cents": 4200})

try:
    charge(order)                # charge(), order and PaymentError stand in for your own code
except PaymentError:
    log.exception("charge failed", extra={"order_id": "o_123"})   # traceback captured
```
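
For intuition about how a bridge of this kind can pick up `extra` fields, here is a stdlib-only sketch of a `logging.Handler` that writes one JSON line per record. It shows the general technique only; `JsonLineHandler` is a hypothetical name, and obsforge's real bridge produces a richer document than this.

```python
import io
import json
import logging
import sys

# Attribute names every LogRecord carries; anything else arrived via `extra=`.
_STANDARD = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {"message", "asctime"}


class JsonLineHandler(logging.Handler):
    """Write each record as a single JSON line, lifting `extra` fields to top-level keys."""

    def __init__(self, stream=None):
        super().__init__()
        self.stream = sys.stdout if stream is None else stream

    def emit(self, record: logging.LogRecord) -> None:
        try:
            doc = {
                "logger": record.name,
                "level": record.levelname.lower(),
                "message": record.getMessage(),
            }
            doc.update({k: v for k, v in record.__dict__.items() if k not in _STANDARD})
            self.stream.write(json.dumps(doc, default=str) + "\n")
        except Exception:
            self.handleError(record)  # stdlib convention: never raise from emit()


# Usage: capture output in memory instead of stdout.
buffer = io.StringIO()
log = logging.getLogger("sketch.bridge")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(JsonLineHandler(buffer))
log.info("order placed", extra={"order_id": "o_123", "amount_cents": 4200})
```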

When you need explicit configuration (environment, Loki preset, redaction), bootstrap
once and the bridge reuses that SDK automatically:

```python
sdk = obsforge.bootstrap(obsforge.ObsforgeSettings(
    service_name="checkout",
    environment="production",
    loki={"preset": "prod"},     # dev | staging | prod label policy
))
obsforge.install_logging_bridge()        # reuses the client registered by bootstrap()
# (or pass it explicitly: obsforge.install_logging_bridge(sdk.event_client))
```

`logger.exception(...)` captures a structured exception context. Logs emitted
inside a request automatically inherit its correlation id.

### 2) Explicit semantic events

```python
# `sdk` is the object returned by obsforge.bootstrap(...) in the previous step.
sdk.logger.log_sync("auth.login.failed", severity=obsforge.Severity.WARNING, reason="bad_password")
await sdk.logger.log("billing.invoice.paid", amount_cents=4200)   # call from inside an async function

@obsforge.instrument(sdk.event_client, "checkout.order.place")
def place_order(order): ...
```

### 3) FastAPI

```python
import obsforge
from fastapi import FastAPI
from obsforge.integrations.fastapi.middleware import FastAPIObservabilityMiddleware

sdk = obsforge.bootstrap(obsforge.ObsforgeSettings(service_name="api"))
app = FastAPI()
app.add_middleware(FastAPIObservabilityMiddleware, api_engine=sdk.api_engine)
```

See [`examples/`](examples/) for FastAPI, Django, worker, and plain-logging apps.

### 4) OpenTelemetry (optional)

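```bash
pip install "obsforge[otel]"
pip install opentelemetry-exporter-otlp-proto-http   # provides OTLPLogExporter; not pulled in by the otel extra
```

```python
import obsforge
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter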

provider = LoggerProvider()
provider.add_log_record_processor(BatchLogRecordProcessor(OTLPLogExporter()))

sdk = obsforge.bootstrap(
    obsforge.ObsforgeSettings(service_name="api", otel={"logs_enabled": True}),
    otel_logger_provider=provider,   # inject the configured provider to actually export
)
# Events are also emitted as OTel LogRecords with trace_id in the record context.
```

OTel export is off by default; stdout remains the default path. Opting in
(`logs_enabled=True` / `traces_enabled=True`) requires injecting the matching
provider (`otel_logger_provider=` / `otel_tracer_provider=`). Enabling export
without one raises `ConfigurationError` rather than silently dropping every record.

---

## What the output looks like

Each event is one JSON line on stdout, split into three sections:

```json
{
  "labels": { "service": "checkout", "environment": "production", "level": "warning", "event_kind": "api", "outcome": "failed" },
  "structured_metadata": { "trace_id": "0af7...", "correlation_id": "...", "user_id": "u_1", "fingerprint": "..." },
  "body": { "event_name": "auth.login.failed", "message": "login failed", "severity": "warning", "...": "..." }
}
```

- **`labels`** → low-cardinality, safe to index in Loki.
- **`structured_metadata`** → high-cardinality ids; queryable, **never** labels.
- **`body`** → the full canonical event (your log line).
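
The split into three sections can be sketched in a few lines of Python. The key sets below are copied from the example document above and are assumptions for illustration. In this simplified version `body` holds whatever is left over, whereas obsforge's `body` is the full canonical event.

```python
import json

# Assumed key sets, taken from the example output in this README.
LABEL_KEYS = ("service", "environment", "level", "event_kind", "outcome")
METADATA_KEYS = ("trace_id", "correlation_id", "user_id", "fingerprint")


def to_document(event: dict) -> str:
    """Render a flat event as one JSON line with labels, structured_metadata and body."""
    labels = {k: event[k] for k in LABEL_KEYS if k in event}
    metadata = {k: event[k] for k in METADATA_KEYS if k in event}
    routed = set(LABEL_KEYS) | set(METADATA_KEYS)
    body = {k: v for k, v in event.items() if k not in routed}
    return json.dumps(
        {"labels": labels, "structured_metadata": metadata, "body": body},
        separators=(",", ":"),
    )


line = to_document({
    "service": "checkout",
    "level": "warning",
    "trace_id": "0af7651916cd43dd8448eb211c80319c",
    "user_id": "u_1",
    "event_name": "auth.login.failed",
    "message": "login failed",
})
```

Because a key can only become a label if it is on the allow-list, an identifier such as `user_id` cannot leak into the label set by accident.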

### How your stack consumes it (no library infra)

| Backend | How |
|---|---|
| **Grafana / Loki** | promtail or Alloy maps `labels`→labels, `structured_metadata`→structured metadata, `body`→log line ([config](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/operational/loki_otel_setup.md)) |
| **Elastic** | filebeat / fluent-bit ingest the JSON line; query by `body.*` / `structured_metadata.*` |
| **Anything** | it's JSON on stdout — if your platform collects stdout, it just works |

### Grafana / Loki (promtail snippet)

Map the three sections explicitly — never auto-flatten the whole document into labels:

```yaml
pipeline_stages:
  - json:
      expressions: { labels: labels, structured_metadata: structured_metadata, body: body }
  - labels:
      service:
      environment:
      level:
      event_kind:
      outcome:
  - structured_metadata:
      trace_id:
      correlation_id:
      user_id:
  - output:
      source: body
```

Full promtail/Alloy + OTLP reference: [docs/operational/loki_otel_setup.md](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/operational/loki_otel_setup.md).
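
Before wiring up a shipper, it can help to sanity-check a few captured stdout lines. The following is a hedged, stdlib-only sketch: `check_line` is a hypothetical helper that is not part of obsforge, and the forbidden-label set is the list of identifiers this README says are never emitted as labels.

```python
import json

SECTIONS = ("labels", "structured_metadata", "body")
# Identifiers that should never appear as labels, per this README.
NEVER_LABELS = frozenset({
    "trace_id", "user_id", "tenant_id", "request_id",
    "correlation_id", "session_id", "event_id",
})


def check_line(line: str) -> list[str]:
    """Return the problems found in one captured stdout line (empty list if none)."""
    try:
        doc = json.loads(line)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(doc, dict):
        return ["top level is not a JSON object"]
    problems = [
        f"missing or non-object section: {name}"
        for name in SECTIONS
        if not isinstance(doc.get(name), dict)
    ]
    labels = doc.get("labels")
    if isinstance(labels, dict):
        leaked = sorted(NEVER_LABELS.intersection(labels))
        if leaked:
            problems.append(f"high-cardinality keys used as labels: {', '.join(leaked)}")
    return problems
```

Feeding it lines captured from a running service (for example via a pipe) gives a quick signal that the shipper config will see the shape it expects.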

---

## Governance & safety (defaults you get for free)

- **No cardinality footguns:** `trace_id`, `user_id`, `tenant_id`, `request_id`,
  `correlation_id`, `session_id`, `event_id` can never be emitted as labels.
- **PII scrubbing:** emails, bearer/JWT tokens, card numbers, SSNs and **IPv4**
  addresses are pattern-redacted from every string field — message, payload
  previews, headers, exception message **and stacktrace**, DB query text — and
  configured `redact_keys` (`password`, `token`, …) are replaced wholesale.

  **Scope (know the edges):** pattern scrubbing covers the shapes above; it does
  **not** yet catch IPv6, phone numbers, or generic cloud keys (e.g. `AKIA…`)
  unless they sit under a redacted key name. Key-based redaction applies to
  top-level fields; secrets buried inside a nested dict passed as a single `extra`
  value are flattened to a string, so only pattern scrubbing reaches them. Treat
  redaction as defense-in-depth, not a guarantee — don't deliberately log secrets.
- **Fail-open:** any error inside the pipeline is logged-and-dropped, never raised
  into your hot path.
- **Trusted propagation:** inbound identifiers are length- and control-char-validated;
  baggage is capped.
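
The interplay between pattern scrubbing and key-based redaction, including the nested-dict edge described above, can be illustrated with a small stdlib sketch. The regexes, placeholders, and function names here are simplified assumptions, not obsforge's actual patterns.

```python
import re

# Simplified illustrative patterns; real-world scrubbers use stricter ones.
_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+"), "[TOKEN]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IPV4]"),
]
REDACT_KEYS = {"password", "token"}


def scrub_text(text: str) -> str:
    """Apply every pattern in turn to a single string."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def scrub_fields(fields: dict) -> dict:
    """Redact whole values under sensitive top-level keys; pattern-scrub other strings."""
    out = {}
    for key, value in fields.items():
        if key.lower() in REDACT_KEYS:
            out[key] = "[REDACTED]"
        elif isinstance(value, str):
            out[key] = scrub_text(value)
        elif isinstance(value, dict):
            # Flattened to a string: key-based redaction no longer applies here.
            out[key] = scrub_text(str(value))
        else:
            out[key] = value
    return out


result = scrub_fields({
    "password": "hunter2",
    "note": "mail a@b.com from 10.0.0.1",
    "ctx": {"password": "hunter2"},
})
```

In this sketch the top-level `password` is replaced, but the one nested inside `ctx` survives because it matches no pattern, which is exactly why redaction should be treated as defense-in-depth.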

---

## Configuration

```python
obsforge.ObsforgeSettings(
    service_name="checkout",
    environment="production",
    min_severity=obsforge.Severity.INFO,      # sampler drops below-threshold events
    loki={"preset": "prod"},                  # label policy preset
    security={"scrub_pii": True, "trust_inbound_identity": True},
    otel={"logs_enabled": False},             # opt-in; requires the otel extra + a provider
)
```

Per-subsystem settings exist for HTTP, DB, exceptions, and distributed
correlation — see `obsforge.config.settings`.

---

## Performance

~28k–37k events/sec/core single-threaded, which works out to ~35 µs per
`logger.info(...)` through the bridge (measured on Python 3.13 with a no-op sink).
That is about **8–9× the cost of plain stdlib formatting**; the delta buys the
typed event, PII scrubbing, correlation, and Loki-governed JSON. Throughput holds
steady under concurrent load. Run it yourself:

```bash
python benchmarks/run.py            # overhead + stdlib baseline + concurrency sweep
```

Details and methodology: [docs/operational/performance.md](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/operational/performance.md).
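
For a rough local baseline, the stdlib side of the comparison can be approximated with `timeit`. This is a sketch under assumptions, not the project's `benchmarks/run.py`: `FormatOnlyHandler` and both function names are hypothetical, and absolute numbers will vary by machine.

```python
import logging
import timeit


class FormatOnlyHandler(logging.Handler):
    """No-op sink: pay the formatting cost, then discard the result."""

    def emit(self, record: logging.LogRecord) -> None:
        self.format(record)


def stdlib_microseconds_per_call(calls: int = 20_000) -> float:
    """Average cost of one plain logger.info(...) call, in microseconds."""
    logger = logging.getLogger("sketch.bench")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = FormatOnlyHandler()
    logger.addHandler(handler)
    try:
        seconds = timeit.timeit(lambda: logger.info("order placed %s", "o_123"), number=calls)
    finally:
        logger.removeHandler(handler)
    return seconds / calls * 1e6


def events_per_second(microseconds_per_event: float) -> float:
    """Convert a per-event cost into single-threaded throughput."""
    return 1e6 / microseconds_per_event


if __name__ == "__main__":
    print(f"stdlib baseline: {stdlib_microseconds_per_call():.1f} µs per call")
    print(f"35 µs per event is about {events_per_second(35.0):,.0f} events/sec")
```

The conversion shows that ~35 µs per event corresponds to roughly 28.6k events/sec, which sits at the low end of the range quoted above.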

---

## Documentation

- [Adoption guide](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/adoption-guide.md) — full integration walkthrough
- [Architecture](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/architecture.md) — layers and canonical event shape
- [Loki / OTel setup](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/operational/loki_otel_setup.md) — promtail/Alloy + OTLP config
- [Performance](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/operational/performance.md)
- [Support matrix](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/support-matrix.md)
- [Publishing](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/docs/publishing.md)
- [Changelog](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/CHANGELOG.md)

---

## Compatibility

| | |
|---|---|
| Python | 3.11, 3.12, 3.13 (CI matrix) |
| Frameworks | FastAPI/Starlette, Django/DRF (via extras) |
| DB drivers | SQLAlchemy, psycopg, asyncpg, aiomysql (via `db` extra) |
| HTTP clients | httpx, requests, aiohttp (via `http` extra) |
| Backends | Grafana/Loki, Elastic, any stdout collector |

Postgres, Kafka, RabbitMQ, Celery brokers and an OTLP collector are **never run or
required** by obsforge — they're only relevant to instrumentation tests you opt into.

---

## Development

```bash
python3 -m venv .venv && source .venv/bin/activate   # Python 3.11+
pip install -e ".[dev,fastapi,django,db,http,otel]"

ruff check src tests          # lint
mypy src                      # strict type-check
pytest -m "not requires_postgres and not requires_mysql and not requires_kafka and not requires_rabbitmq and not requires_broker"
```

---

## License

MIT — see [LICENSE](https://github.com/AnthonyGrullonA/OBSFORGE/blob/main/LICENSE).
