Metadata-Version: 2.3
Name: YESciEval
Version: 0.1.0
Summary: YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering.
License: MIT
Author: Hamed Babaei Giglou
Author-email: hamedbabaeigiglou@gmail.com
Requires-Python: >=3.10,<4.0.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: numpy
Requires-Dist: openai
Requires-Dist: pandas
Requires-Dist: peft
Requires-Dist: pre-commit
Requires-Dist: pydantic
Requires-Dist: torch
Requires-Dist: transformers
Project-URL: Homepage, https://yescieval.readthedocs.io/
Project-URL: Repository, https://github.com/sciknoworg/YESciEval/
Description-Content-Type: text/markdown

<div align="center">
  <img src="images/logo.png" width="50%" height="30%"/>
</div>

<div align="center">


[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)


</div>


## 📋 What is the YESciEval?


Large Language Models (LLMs) drive scientific question-answering on modern search engines, yet their evaluation robustness remains underexplored. We introduce **YESciEval**, an open-source framework that combines fine-grained rubric-based assessment with reinforcement learning to mitigate optimism bias in LLM evaluators. The framework is presented as f ollows:


We release multidisciplinary scienceQ&A datasets, including adversarial variants, with evaluation scores from multiple LLMs. Independent of proprietary models and human feedback, our approach enables scalable, cost-free evaluation. By advancing reliable LLM-as-a-judge models, this work supports AI alignment and fosters robust, transparent evaluation essential for scientific inquiry and artificial general intelligence.

## 📃 License

This work is licensed under a [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT).





