Metadata-Version: 2.4
Name: dagster-kafka
Version: 1.1.2
Summary: Complete Kafka integration for Dagster with enterprise DLQ tooling
Home-page: https://github.com/kingsley-123/dagster-kafka-integration
Author: Kingsley Okonkwo
Project-URL: Documentation, https://github.com/kingsley-123/dagster-kafka-integration/blob/main/README.md
Project-URL: Source, https://github.com/kingsley-123/dagster-kafka-integration
Project-URL: Issues, https://github.com/kingsley-123/dagster-kafka-integration/issues
Project-URL: Validation Report, https://github.com/kingsley-123/dagster-kafka-integration/tree/main/validation
Keywords: dagster,kafka,apache-kafka,data-engineering,streaming,dlq,dead-letter-queue,avro,protobuf,schema-registry,enterprise,production,monitoring,alerting
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Distributed Computing
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dagster>=1.5.0
Requires-Dist: kafka-python>=2.0.2
Requires-Dist: fastavro>=1.8.0
Requires-Dist: confluent-kafka[avro]>=2.1.0
Requires-Dist: requests>=2.28.0
Requires-Dist: protobuf<6.0,>=4.21.0
Requires-Dist: grpcio-tools>=1.50.0
Requires-Dist: googleapis-common-protos>=1.56.0
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license-file
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary


# Dagster Kafka Integration

The most comprehensively validated Kafka integration for Dagster with enterprise-grade features supporting all three major serialization formats and production security.

## Enterprise Validation Completed

**Version 1.1.2** - Most validated Kafka integration package ever created:

**11-Phase Comprehensive Validation** - Unprecedented testing methodology  
**Exceptional Performance** - 1,199 messages/second peak throughput proven  
**Security Hardened** - Complete credential validation + network security  
**Stress Tested** - 100% success rate (305/305 operations over 8+ minutes)  
**Enterprise Ready** - Complete DLQ tooling suite with 5 CLI tools  
**Zero Critical Issues** - Across all validation phases  

## Complete Enterprise Solution

- **JSON Support**: Native JSON message consumption from Kafka topics
- **Avro Support**: Full Avro message support with Schema Registry integration  
- **Protobuf Support**: Complete Protocol Buffers integration with schema management
- **Dead Letter Queue (DLQ)**: Enterprise-grade error handling with circuit breaker patterns
- **Enterprise Security**: Complete SASL/SSL authentication and encryption support
- **Schema Evolution**: Comprehensive validation with breaking change detection across all formats
- **Production Monitoring**: Real-time alerting with Slack/Email integration
- **High Performance**: Advanced caching, batching, and connection pooling
- **Error Recovery**: Multiple recovery strategies for production resilience

## Installation

```bash
pip install dagster-kafka
```

## Enterprise DLQ Tooling Suite

Complete operational tooling available immediately after installation:

```bash
# Analyze failed messages with comprehensive error pattern analysis
dlq-inspector --topic user-events --max-messages 20

# Replay messages with filtering and safety controls  
dlq-replayer --source-topic orders_dlq --target-topic orders --dry-run

# Monitor DLQ health across multiple topics
dlq-monitor --topics user-events_dlq,orders_dlq --output-format json

# Set up automated alerting
dlq-alerts --topic critical-events_dlq --max-messages 500

# Operations dashboard for DLQ health monitoring
dlq-dashboard --topics user-events_dlq,orders_dlq
```

## Quick Start

```python
from dagster import asset, Definitions
from dagster_kafka import KafkaResource, KafkaIOManager, DLQStrategy

@asset
def api_events():
    '''Consume JSON messages from Kafka topic with DLQ support.'''
    pass

defs = Definitions(
    assets=[api_events],
    resources={
        "kafka": KafkaResource(bootstrap_servers="localhost:9092"),
        "io_manager": KafkaIOManager(
            kafka_resource=KafkaResource(bootstrap_servers="localhost:9092"),
            consumer_group_id="my-dagster-pipeline",
            enable_dlq=True,
            dlq_strategy=DLQStrategy.RETRY_THEN_DLQ,
            dlq_max_retries=3
        )
    }
)
```

## Validation Results Summary

| Phase | Test Type | Result | Key Metrics |
|-------|-----------|--------|-------------|
| **Phase 5** | Performance Testing | **PASS** | 1,199 msgs/sec peak throughput |
| **Phase 7** | Integration Testing | **PASS** | End-to-end message flow validated |
| **Phase 9** | Compatibility Testing | **PASS** | Python 3.12 + Dagster 1.11.3 |
| **Phase 10** | Security Audit | **PASS** | Credential + network security |
| **Phase 11** | Stress Testing | **EXCEPTIONAL** | 100% success rate, 305 operations |

## Enterprise Security

### Security Protocols Supported
- **SASL_SSL**: Combined authentication and encryption (recommended for production)
- **SSL**: Certificate-based encryption
- **SASL_PLAINTEXT**: Username/password authentication  
- **PLAINTEXT**: For local development and testing

### SASL Authentication Mechanisms
- **SCRAM-SHA-256**: Secure challenge-response authentication
- **SCRAM-SHA-512**: Enhanced secure authentication
- **PLAIN**: Simple username/password authentication
- **GSSAPI**: Kerberos authentication for enterprise environments
- **OAUTHBEARER**: OAuth-based authentication

## Why Choose This Integration

### Complete Solution
- Only integration supporting all 3 major formats (JSON, Avro, Protobuf)
- Enterprise-grade security with SASL/SSL support
- Production-ready with comprehensive monitoring
- Advanced error handling with Dead Letter Queue support
- Complete DLQ Tooling Suite for enterprise operations

### Enterprise Ready
- 11-phase comprehensive validation covering all scenarios
- Real-world deployment patterns and examples
- Performance optimization tools and monitoring
- Enterprise security for production Kafka clusters
- Bulletproof error handling with circuit breaker patterns

### Unprecedented Validation
- Most validated package in the Dagster ecosystem
- Performance proven: 1,199 msgs/sec peak throughput
- Stability proven: 100% success rate under stress
- Security proven: Complete credential and network validation
- Enterprise proven: Exceptional rating across all dimensions

## Repository

GitHub: https://github.com/kingsley-123/dagster-kafka-integration

## License

Apache 2.0

---

The most comprehensively validated Kafka integration for Dagster - Version 1.1.2 Enterprise Validation Release with Security Hardening.

Built by Kingsley Okonkwo - Solving real data engineering problems with comprehensive open source solutions.
