Metadata-Version: 2.3
Name: anyparser-crewai
Version: 0.0.2
Summary: Anyparser CrewAI Integration
License: Apache-2.0
Requires-Python: >=3.9
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Typing :: Typed
Requires-Dist: anyparser-core (>=1.0.1,<2.0.0)
Requires-Dist: crewai (>=0.100.1,<0.101.0)
Project-URL: Homepage, https://github.com/anyparser/anyparser_crewai
Description-Content-Type: text/markdown

# Anyparser CrewAI

https://anyparser.com

**Integrate Anyparser's powerful content extraction capabilities with CrewAI for enhanced AI workflows.** This integration package enables seamless use of Anyparser's document processing and data extraction features within your CrewAI applications, making it easier than ever to build sophisticated AI pipelines.

## Installation

```bash
pip install anyparser-crewai
```

## Setup

Before running the examples, make sure to set your Anyparser API credentials as environment variables:

```bash
export ANYPARSER_API_KEY="your-api-key"
export ANYPARSER_API_URL="https://anyparserapi.com"
```
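Before launching any example, it can help to confirm the credentials are visible to Python. A minimal check using only the standard library (the fallback URL mirrors the endpoint shown above):

```python
import os

# Read the credentials the examples expect from the environment.
# ANYPARSER_API_URL falls back to the public endpoint if unset.
api_key = os.environ.get("ANYPARSER_API_KEY")
api_url = os.environ.get("ANYPARSER_API_URL", "https://anyparserapi.com")

if not api_key:
    print("Warning: ANYPARSER_API_KEY is not set; the examples will fail to authenticate.")

print(f"Using Anyparser endpoint: {api_url}")
```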

## Anyparser CrewAI Examples

The `examples` directory contains scripts demonstrating different ways to use the Anyparser CrewAI integration. Run any of them directly:

```bash
python examples/01_single_file_markdown.py
python examples/02_single_file_json.py
python examples/03_multiple_files_markdown.py
python examples/04_multiple_files_json.py
python examples/05_directory_read.py
python examples/06_ocr_markdown.py
python examples/07_ocr_json.py
python examples/08_web_crawler.py
python examples/09_web_crawler_json.py
```

## Examples

### 1. Single File Processing
- `01_single_file_markdown.py`: Process a single file with markdown output
- `02_single_file_json.py`: Process a single file with JSON output

### 2. Multiple File Processing
- `03_multiple_files_markdown.py`: Process multiple files with markdown output
- `04_multiple_files_json.py`: Process multiple files with JSON output
- `05_directory_read.py`: Load and process all files from a folder (max 5 files)

### 3. OCR Processing
- `06_ocr_markdown.py`: Process images/scans with OCR (markdown output)
- `07_ocr_json.py`: Process images/scans with OCR (JSON output)

### 4. Web Crawling
- `08_web_crawler.py`: Basic web crawling with essential settings (markdown)
- `09_web_crawler_json.py`: Web crawling (JSON)
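
The crawler's BFS and LIFO strategies differ only in which URL is taken from the frontier next. A self-contained illustration of that difference, using a fake in-memory link graph instead of real HTTP fetches (all names here are hypothetical, not the integration's API):

```python
from collections import deque

# Fake link graph standing in for pages fetched over HTTP.
LINKS = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/api"],
    "/blog": [],
    "/docs/api": [],
}

def crawl(start: str, strategy: str = "bfs") -> list[str]:
    """Visit pages reachable from `start`, BFS or LIFO (depth-first) order."""
    frontier = deque([start])
    visited = []
    while frontier:
        # BFS takes the oldest queued URL; LIFO takes the newest.
        url = frontier.popleft() if strategy == "bfs" else frontier.pop()
        if url in visited:
            continue
        visited.append(url)
        frontier.extend(LINKS.get(url, []))
    return visited

print(crawl("/", "bfs"))   # breadth-first: siblings before children
print(crawl("/", "lifo"))  # depth-first: follows each branch down first
```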

## Features Demonstrated

### Document Processing
- Different output formats (markdown, JSON)
- Multiple file handling
- Folder processing
- Image and table extraction
- Metadata handling

### OCR
- Multi-language support
- Performance presets

### Web Crawling
- Basic crawling with depth and scope control
- Advanced URL and content filtering
- Crawling strategies (BFS, LIFO)
- Rate limiting and robots.txt respect

## Notes

- All examples use async/await for better performance
- Error handling is included in every example
- Each example includes detailed comments explaining the options used
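
The examples share a common async pattern: await the parse call inside an async entry point and wrap it in error handling. A minimal sketch of that shape, where `parse_file` is a hypothetical stand-in for the integration's actual tool call:

```python
import asyncio

async def parse_file(path: str) -> str:
    # Placeholder for the real Anyparser network call.
    await asyncio.sleep(0)
    return f"# Parsed: {path}"

async def main() -> None:
    try:
        result = await parse_file("document.pdf")
        print(result)
    except Exception as exc:
        # The shipped examples include error handling along these lines.
        print(f"Parsing failed: {exc}")

asyncio.run(main())
```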

## License

Apache-2.0


