Metadata-Version: 2.4
Name: calt-x
Version: 1.1.0
Summary: A library for computational algebra using Transformers
Author-email: Hiroshi Kera <kera.hiroshi@gmail.com>, Yuta Sato <sato.yuta@gmail.com>, Shun Arakawa <shun.arkw@gmail.com>
Project-URL: Source, https://github.com/HiroshiKERA/calt
Project-URL: Issues, https://github.com/HiroshiKERA/calt/issues
Requires-Python: <3.13,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: transformers>=4.49.0
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: torch>=2.6.0
Requires-Dist: wandb>=0.15.11
Requires-Dist: accelerate>=0.29.0
Requires-Dist: joblib>=1.5.0
Requires-Dist: sympy>=1.12
Requires-Dist: IPython>=8.18.1
Provides-Extra: kaggle
Requires-Dist: kaggle; extra == "kaggle"
Dynamic: license-file

# CALT: Computer ALgebra with Transformer

[![Documentation](https://img.shields.io/badge/docs-latest-brightgreen.svg)](https://hiroshikera.github.io/calt/)
[![GitHub Pages](https://img.shields.io/badge/GitHub%20Pages-View%20Documentation-blue.svg)](https://hiroshikera.github.io/calt/)

> 📖 **📚 [View Full Documentation](https://hiroshikera.github.io/calt/)**

## Overview

CALT is a simple Python library for learning arithmetic and symbolic computation with a Transformer model, a deep neural network that learns sequence-to-sequence functions.

It provides a basic Transformer model and training pipeline, so users without deep learning expertise can focus on constructing the datasets used to train and evaluate the model.

## 🚀 Quick Start

### Installation

```bash
pip install calt-x
```
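
The package metadata declares `Requires-Python: <3.13,>=3.10`, so it can help to check the interpreter before installing. The snippet below is a minimal sketch using only the standard library; the helper names `python_supported` and `installed_version` are illustrative and are not part of CALT.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def python_supported(version_info=None):
    """Return True if the interpreter satisfies >=3.10,<3.13."""
    major_minor = tuple((version_info or sys.version_info)[:2])
    return (3, 10) <= major_minor < (3, 13)


def installed_version(dist_name="calt-x"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("calt-x version:", installed_version())
```

If `installed_version()` returns `None`, the distribution is not visible to the current interpreter, which usually means `pip` installed it into a different environment.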

### Instance Generation
For minimal usage, users only need to implement an instance generator for their own task. For example:

```python
import random

def int_sum_generator(seed, N=5, lb=-10, ub=10):
    random.seed(seed)

    # get N random integers from [lb, ub]
    problem = [random.randint(lb, ub) for _ in range(N)]
    answer = sum(problem)

    return problem, answer
```
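
As a quick sanity check, the generator can be called directly before it is handed to any pipeline. The sketch below repeats the generator so that it runs on its own, then shows that a fixed seed reproduces the same instance; the parameter values are arbitrary choices for illustration.

```python
import random


def int_sum_generator(seed, N=5, lb=-10, ub=10):
    random.seed(seed)
    # Draw N random integers from [lb, ub] (both ends inclusive).
    problem = [random.randint(lb, ub) for _ in range(N)]
    return problem, sum(problem)


# The same seed always yields the same (problem, answer) pair.
first = int_sum_generator(seed=0)
again = int_sum_generator(seed=0)
assert first == again

# A handful of small instances, one per seed.
samples = [int_sum_generator(seed=s, N=3, lb=0, ub=9) for s in range(4)]
for problem, answer in samples:
    print(problem, "->", answer)
```

Note that `random.seed` resets Python's global random state. If that side effect is unwanted, a local `random.Random(seed)` instance gives the same reproducibility without touching the global generator.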

Feeding the generator to `DatasetPipeline` generates the training and evaluation sets. The `data.yaml` config gives full control over the generation process.
```python
from omegaconf import OmegaConf

cfg = OmegaConf.load("configs/data.yaml")
pipeline = DatasetPipeline.from_config(
    cfg.dataset,
    instance_generator=int_sum_generator
)
pipeline.run()
```
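
To give a feel for what seeded generation produces, here is a plain-Python sketch that builds small training and evaluation sets from disjoint seed ranges. This is not how `DatasetPipeline` is implemented; `build_split` and `seed_offset` are hypothetical names used only for illustration, and the real behavior is governed by `data.yaml`.

```python
import random


def int_sum_generator(seed, N=5, lb=-10, ub=10):
    rng = random.Random(seed)  # local RNG, leaves global state untouched
    problem = [rng.randint(lb, ub) for _ in range(N)]
    return problem, sum(problem)


def build_split(generator, num_samples, seed_offset=0):
    """Hypothetical helper: one instance per seed, starting at seed_offset."""
    return [generator(seed_offset + i) for i in range(num_samples)]


# Disjoint seed ranges, so the two splits are generated independently.
train_set = build_split(int_sum_generator, num_samples=8)
eval_set = build_split(int_sum_generator, num_samples=2, seed_offset=8)

print(len(train_set), "training samples,", len(eval_set), "evaluation samples")
```

Disjoint seeds do not guarantee distinct instances, since two seeds can still produce the same problem by chance. For small problem spaces, deduplicating across splits is worth considering.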

### Training Script
Then, a short script implements training and evaluation through `IOPipeline`, `ModelPipeline`, and `TrainerPipeline`. The config file `train.yaml` (and the associated `lexer.yaml`) gives full control over the training setup.
```python
from omegaconf import OmegaConf

cfg = OmegaConf.load("configs/train.yaml")
io_pipeline = IOPipeline.from_config(cfg.data)
io_dict = io_pipeline.build()

model = ModelPipeline.from_io_dict(cfg.model, io_dict).build()
trainer_pipeline = TrainerPipeline.from_io_dict(cfg.train, model, io_dict).build()

trainer_pipeline.train()
trainer_pipeline.save_model()
trainer_pipeline.evaluate_and_save_generation()
```

### Examples
See the `examples/` directory.

### For users without a local GPU

If you do not have a local GPU, you can still try CALT in two ways:

1. **Run the demo on Google Colab**  
   Use the demo notebook from your browser:
   <https://colab.research.google.com/github/HiroshiKERA/calt/blob/dev/examples/demos/minimal_demo.ipynb>

2. **Use remote jobs on Kaggle**  
   Submit and monitor training jobs from your local terminal using `calt remote ...`.
   See the remote job documentation:
   <https://hiroshikera.github.io/calt/remote/>


## Citation

If you use CALT in your project, please cite our paper:

```bibtex
@misc{kera2025calt,
  title={CALT: A Library for Computer Algebra with Transformer},
  author={Hiroshi Kera and Shun Arakawa and Yuta Sato},
  year={2025},
  archivePrefix={arXiv},
  eprint={2506.08600}
}
```
> Note: The current arXiv preprint describes a previous version of CALT. An updated preprint will follow soon.

