Metadata-Version: 2.4
Name: schemaorg2yaml
Version: 1.0.0
Summary: A CLI tool to generate YAML schemas from schema.org vocabulary
Project-URL: Homepage, https://github.com/johanremy/schemaorg2yaml
Project-URL: Repository, https://github.com/johanremy/schemaorg2yaml
Project-URL: Issues, https://github.com/johanremy/schemaorg2yaml/issues
Author-email: Johan REMY <johan.remy@graines-digitales.online>
License-Expression: MIT
License-File: LICENSE
Keywords: cli,json-ld,rdf,schema.org,yaml
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Requires-Python: >=3.12
Requires-Dist: click>=8.1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rdflib>=7.0.0
Requires-Dist: requests>=2.31.0
Provides-Extra: dev
Requires-Dist: black>=23.0.0; extra == 'dev'
Requires-Dist: mypy>=1.7.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# SchemaORG2YAML

[![Build Status](https://github.com/johanremy/schemaorg2yaml/workflows/CI/badge.svg)](https://github.com/johanremy/schemaorg2yaml/actions)
[![Python Version](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A CLI tool to generate YAML schemas from schema.org vocabulary.

## Overview

SchemaORG2YAML downloads the schema.org vocabulary and generates normalized YAML files that describe schema.org types with their properties, inheritance, and metadata. It is aimed at developers who need readable, version-controllable schemas aligned with schema.org for data modeling.

## Features

- 🔄 **Automatic vocabulary download** from schema.org (JSON-LD/TTL/RDF)
- 🧬 **Inheritance resolution** (rdfs:subClassOf) with property merging
- 📄 **YAML export** for individual types or type lists
- 🏷️ **Alias mapping** support for local naming conventions
- 🔍 **Type discovery** with filtering capabilities
- ⚡ **CLI ergonomics** with multiple subcommands
- 🧪 **Comprehensive tests** and CI/CD
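The inheritance resolution listed above can be pictured with a minimal sketch on plain dictionaries. The `PARENTS` and `OWN_PROPERTIES` tables and the `ancestors` and `resolve_properties` helpers are hypothetical names chosen for illustration. The tool itself reads `rdfs:subClassOf` from the RDF graph, and its actual merge rules may differ.

```python
# Hypothetical, hand-written excerpt of a type hierarchy (child -> direct parents).
PARENTS = {
    "Thing": [],
    "CreativeWork": ["Thing"],
    "Article": ["CreativeWork"],
}

# Properties declared directly on each type (illustrative subset).
OWN_PROPERTIES = {
    "Thing": ["name", "url"],
    "CreativeWork": ["author", "datePublished"],
    "Article": ["articleBody", "wordCount"],
}


def ancestors(type_name: str) -> list[str]:
    """Return all ancestors of type_name, nearest first, without duplicates."""
    seen: list[str] = []
    queue = list(PARENTS.get(type_name, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(PARENTS.get(current, []))
    return seen


def resolve_properties(type_name: str) -> dict[str, str]:
    """Map each property to the type that declares it, merging ancestors in."""
    merged: dict[str, str] = {}
    # Walk from the most distant ancestor down so nearer declarations win.
    for owner in reversed([type_name, *ancestors(type_name)]):
        for prop in OWN_PROPERTIES.get(owner, []):
            merged[prop] = owner
    # Sort by property name so the result is stable across runs.
    return dict(sorted(merged.items()))


print(ancestors("Article"))
print(list(resolve_properties("Article")))
```

Recording the declaring type next to each property is one possible design; it keeps the origin of inherited properties visible after the merge.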

## Installation

### Using pipx (recommended)

```bash
pipx install schemaorg2yaml
```

### Using pip

```bash
pip install schemaorg2yaml
```

### From source

```bash
git clone https://github.com/johanremy/schemaorg2yaml.git
cd schemaorg2yaml
pip install -e .
```

## Quick Start

1. **Fetch the schema.org vocabulary**:
   ```bash
   schemaorg2yaml fetch-vocab
   ```

2. **List available types**:
   ```bash
   schemaorg2yaml list-types --filter Person
   ```

3. **Generate YAML schemas**:
   ```bash
   schemaorg2yaml generate --types Person,Article --out ./schema
   ```

4. **Preview a schema**:
   ```bash
   schemaorg2yaml show --type WebPage
   ```

## Usage Examples

### Fetch Vocabulary

```bash
# Download latest schema.org vocabulary
schemaorg2yaml fetch-vocab

# Use custom source
schemaorg2yaml fetch-vocab --source https://schema.org/version/latest/schemaorg-current-https.jsonld --format jsonld

# Force refresh cache
schemaorg2yaml fetch-vocab --force
```
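The caching behavior implied by `--force` can be sketched as follows. This is an assumption about how such a cache might work, not the tool's actual code: `fetch_vocab`, `download`, `SOURCE_URL`, and the `cache_file` parameter are hypothetical names, and the standard library's `urllib` stands in for the `requests` dependency.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen

# URL taken from the "custom source" example above.
SOURCE_URL = "https://schema.org/version/latest/schemaorg-current-https.jsonld"


def download(url: str) -> bytes:
    """Fetch the raw vocabulary bytes (stdlib stand-in for the tool's HTTP client)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_vocab(
    cache_file: Path,
    source: str = SOURCE_URL,
    force: bool = False,
    downloader: Callable[[str], bytes] = download,
) -> bool:
    """Ensure cache_file holds the vocabulary; return True if a download happened."""
    if cache_file.exists() and not force:
        return False  # the cached copy is reused
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_bytes(downloader(source))
    return True
```

Calling `fetch_vocab(path, force=True)` mirrors the `--force` flag. The injectable `downloader` argument lets the cache logic be exercised without network access.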

### Generate Schemas

```bash
# Generate specific types
schemaorg2yaml generate --types Person,WebPage --out ./schema

# Generate from file list
schemaorg2yaml generate --file types.txt --out ./schema

# Use aliases mapping
schemaorg2yaml generate --types Person --out ./schema --aliases ./aliases.yaml

# Include ancestor metadata
schemaorg2yaml generate --types Article --out ./schema --include-ancestors

# Minimal output (no descriptions)
schemaorg2yaml generate --types Person --out ./schema --no-descriptions
```

### Explore Types

```bash
# List all available types
schemaorg2yaml list-types

# Filter types by pattern
schemaorg2yaml list-types --filter ".*Page.*"

# Show schema in console
schemaorg2yaml show --type Article
```

## YAML Schema Format

Each generated YAML file follows this stable structure:

```yaml
$schema: "https://example.com/specs/schema-yaml/v1"
id: "schema:Person"
label: "Person"
description: "A person (alive, dead, undead, or fictional)."
parents: ["Thing"]
seeAlso:
  - "https://schema.org/Person"
properties:
  birthDate:
    id: "schema:birthDate"
    label: "birthDate"
    description: "Date of birth."
    range: ["Date", "DateTime", "Text"]
    required: false
    repeated: false
  givenName:
    id: "schema:givenName"
    label: "givenName"
    description: "Given name."
    range: ["Text"]
    required: false
    repeated: false
    aliases: ["first_name"]
```

### Key Features

- **Alphabetically sorted properties**
- **Inheritance chain** in the `parents` field
- **Type ranges** preserved from schema.org
- **Cardinality** information (`required`, `repeated`)
- **Local aliases** for property mapping
- **Stable format** for version control
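A consumer that wants to rely on these guarantees could check the shape of a loaded file. The sketch below works on a plain dictionary, such as the one PyYAML's `yaml.safe_load` would return for a generated file. The `check_schema_document` helper and the choice of required keys are assumptions made for illustration, not a specification published by the tool.

```python
# Keys assumed to be mandatory, based on the example document above.
REQUIRED_TYPE_KEYS = {"id", "label", "parents", "properties"}
REQUIRED_PROPERTY_KEYS = {"id", "label", "range", "required", "repeated"}


def check_schema_document(doc: dict) -> list[str]:
    """Return a list of problems found in a loaded schema document (empty if none)."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_TYPE_KEYS - set(doc))]
    properties = doc.get("properties", {})
    if list(properties) != sorted(properties):
        problems.append("properties are not sorted alphabetically")
    for name, spec in properties.items():
        for key in sorted(REQUIRED_PROPERTY_KEYS - set(spec)):
            problems.append(f"{name}: missing key: {key}")
        for flag in ("required", "repeated"):
            if flag in spec and not isinstance(spec[flag], bool):
                problems.append(f"{name}: {flag} must be a boolean")
    return problems


# Trimmed copy of the Person example, as it would look once loaded.
person = {
    "id": "schema:Person",
    "label": "Person",
    "parents": ["Thing"],
    "properties": {
        "birthDate": {"id": "schema:birthDate", "label": "birthDate",
                      "range": ["Date", "DateTime", "Text"],
                      "required": False, "repeated": False},
        "givenName": {"id": "schema:givenName", "label": "givenName",
                      "range": ["Text"], "required": False, "repeated": False,
                      "aliases": ["first_name"]},
    },
}

print(check_schema_document(person))
```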

## Aliases Configuration

Create an `aliases.yaml` file to map schema.org properties to local names:

```yaml
Person:
  givenName: ["first_name", "prenom"]
  familyName: ["last_name", "nom"]
  birthDate: ["date_of_birth"]

WebPage:
  breadcrumb: ["fil_ariane"]
  mainContentOfPage: ["main_content"]
```

## Development

### Setup

```bash
git clone https://github.com/johanremy/schemaorg2yaml.git
cd schemaorg2yaml
pip install -e ".[dev]"
```

### Code Quality

```bash
# Format code
black src/ tests/

# Lint and apply automatic fixes
ruff check src/ tests/ --fix

# Type checking
mypy src/

# Run tests
pytest
```

### Testing

```bash
# Run all tests
pytest

# With coverage
pytest --cov=src/schemaorg2yaml --cov-report=html

# Specific test file
pytest tests/test_cli.py -v
```

## CLI Reference

### Commands

- `fetch-vocab` - Download/update schema.org vocabulary
- `generate` - Generate YAML files for specified types
- `list-types` - List available schema.org types
- `show` - Display the YAML for a type in the console

### Global Options

- `--help` - Show help message
- `--version` - Show version information

Run `schemaorg2yaml COMMAND --help` for detailed command options.

## Limitations (v1)

- No runtime data validation
- No reverse generation (YAML → RDF)
- No web UI
- Limited to schema.org vocabulary

## Roadmap

- 🗂️ Versioned vocabulary cache
- 🎯 Profile-based property filtering (Web, SEO, e-commerce)
- 📊 JSON export format
- 📋 Multi-type index generation
- ⚙️ Minimal vs extended property sets

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Run the test suite
6. Submit a pull request

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [Schema.org](https://schema.org/) for the vocabulary
- [RDFLib](https://rdflib.readthedocs.io/) for RDF processing
- [Click](https://click.palletsprojects.com/) for the CLI framework