Metadata-Version: 2.4
Name: cafga
Version: 0.0.4
Summary: CafGa is a library that facilitates creating and evaluating grouped-attribution explanations.
Author-email: Alan Boyle <aboyle@student.ethz.ch>
Keywords: LLM,XAI,NLP,salience,attribution
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: pip
Requires-Dist: matplotlib>=3.10.0
Requires-Dist: numpy>=2
Requires-Dist: pandas>=2
Requires-Dist: black>=25
Requires-Dist: python-dotenv>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pydantic>=2.10
Requires-Dist: sacremoses>=0.1.1
Requires-Dist: openai>=1.63.0
Requires-Dist: shap==0.46.0
Requires-Dist: nltk>=3.8

# CafGa (**C**ustom **a**ssignments **f**or **G**roup **a**ttribution)

## Installation

CafGa can be installed from PyPI using:

```
pip install cafga
```

If you obtained CafGa from the repository instead, install its dependencies with:

```
pip install -r requirements.txt
```

Note that some of the extra functionality requires further installations:

1. CafGa provides two Jupyter widgets. The edit widget lets you visually edit assignments, and the display widget shows the attributions generated by the explanation. To use them, follow the instructions in the 'Demo Instructions.md' file.

2. CafGa offers a predefined ChatGPT model. To use it, place a .env file containing your API key in your working directory.

## Using CafGa

This section explains the main functions of CafGa. For a worked example of how to use CafGa, see the demo.

To begin using CafGa, import it and create a `CAFGA` object:

`from cafga.cafga import CAFGA`

`cafga = CAFGA(model = 'your_model')`

The `model` parameter is where you pass the model you want to explain. To let your model parallelize its predictions (e.g. by batching), CafGa sends lists of inputs to your model instead of single inputs. The function that implements your model should therefore take a list of strings and return either a list of strings or a list of floats, with one output for every input.

Once the `cafga` object is instantiated, typical usage proceeds in three steps: Explanation, Evaluation, and Visualisation.
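
As an illustration of the expected model interface, here is a minimal sketch of a function that takes a list of strings and returns one float per input. The name `toy_sentiment_model` and its scoring rule are hypothetical and exist only for this example; they are not part of CafGa.

```python
from typing import List

# Hypothetical vocabulary used only by this toy example.
POSITIVE_WORDS = {"good", "great", "excellent"}


def toy_sentiment_model(inputs: List[str]) -> List[float]:
    """Return one score per input: the fraction of words that are positive."""
    scores = []
    for text in inputs:
        words = text.lower().split()
        hits = sum(1 for w in words if w.strip(".,!?") in POSITIVE_WORDS)
        # Empty inputs get a score of 0.0 so the output length always matches.
        scores.append(hits / len(words) if words else 0.0)
    return scores


scores = toy_sentiment_model(["The food was great", "The service was slow", ""])
```

A function with this shape (list of strings in, list of equal length out) is what the `model` parameter described above expects.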

### 1. Explanation

To generate an explanation run the explain function on the instantiated cafga object:

`explanation = cafga.explain(params)`

There are two ways of using the explain function.

Firstly, you can pass the string you want to get an explanation for without segmenting it into the individual parts that you want to get attributions for. In this case you need to provide the name of the predefined attribution method ('word', 'sentence', 'syntax-parse') that you want to use. 

Secondly, you can provide your own segmentation of the input by using the `segmented_input` parameter. In this case you will also need to provide the assignment of each input segment to a group with the `input_assignments` parameter. Specifically, `input_assignments[i] = g_i` should be the index of the group that `segmented_input[i]` belongs to.
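
The relationship between segments and assignments can be sketched in plain Python. The example sentence, the choice to keep trailing whitespace inside each segment, and the helper variable `group_texts` are assumptions made for illustration; only the parameter names `segmented_input` and `input_assignments` come from the description above.

```python
from collections import defaultdict

# One entry per segment; input_assignments[i] is the group index of segmented_input[i].
segmented_input = ["The ", "food ", "was ", "great, ", "but ", "the ", "service ", "was ", "slow."]
input_assignments = [0, 0, 0, 0, 1, 2, 2, 2, 2]

# Both lists must line up one-to-one.
assert len(segmented_input) == len(input_assignments)

# Collect the segments of each group to check the grouping reads as intended.
groups = defaultdict(list)
for segment, group_index in zip(segmented_input, input_assignments):
    groups[group_index].append(segment)

group_texts = {g: "".join(parts) for g, parts in sorted(groups.items())}
```

Printing `group_texts` before calling the explain function is a cheap way to confirm that every segment landed in the group you meant.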

### 2. Evaluation

Once an explanation object has been generated you can pass it on to the evaluation function:

`evaluation = cafga.evaluate(explanation, params)`

The two forms of evaluation currently supported are deletion (going from all features present to no features present) and insertion (going from no features present to all features present), selected with the `direction` parameter. The resulting evaluation accordingly contains the array of difference values computed as part of the perturbation curve.
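
To make the idea of a deletion curve concrete, the following is a rough sketch and not CafGa's implementation. It assumes that groups are removed in order of decreasing attribution, that a removed group is simply dropped from the text, and that each difference is measured against the model output on the full input. The names `deletion_curve` and `toy_model` are hypothetical.

```python
from typing import Callable, List, Sequence


def toy_model(inputs: List[str]) -> List[float]:
    """Hypothetical model: scores a text by counting two keywords."""
    return [0.5 * text.count("good") + 1.0 * text.count("great") for text in inputs]


def deletion_curve(
    model: Callable[[List[str]], List[float]],
    segments: Sequence[str],
    assignments: Sequence[int],
    attributions: Sequence[float],
) -> List[float]:
    """Remove groups from most to least attributed and record the output change."""
    original = model(["".join(segments)])[0]
    order = sorted(range(len(attributions)), key=lambda g: attributions[g], reverse=True)
    removed = set()
    perturbed_inputs = []
    for group in order:
        removed.add(group)
        kept = [s for s, a in zip(segments, assignments) if a not in removed]
        perturbed_inputs.append("".join(kept))
    # One batched call, matching the list-in, list-out model interface.
    outputs = model(perturbed_inputs)
    return [original - out for out in outputs]


curve = deletion_curve(
    toy_model,
    segments=["good ", "and ", "great"],
    assignments=[0, 1, 2],
    attributions=[0.5, 0.0, 1.0],
)
```

Under the same assumptions, an insertion curve would start from the empty input and add groups back in the same order.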

### 3. Visualisation

Finally, the perturbation curve generated by the evaluation can be visualised using the visualisation function:

`cafga.visualize_evaluation(evaluated_explanations, params)`

Since you may want to plot the aggregate over many evaluations, the visualisation function takes a list of evaluations as input. The two forms of aggregation currently supported are equal-width binning and linear interpolation.
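
The two aggregation strategies can be illustrated with NumPy. This is a sketch of the general techniques rather than CafGa's own code: it assumes each curve is spread evenly over the interval [0, 1] (the fraction of features perturbed), and the function names are hypothetical.

```python
import numpy as np


def aggregate_by_interpolation(curves, num_points=5):
    """Resample every curve onto a shared grid with linear interpolation, then average."""
    grid = np.linspace(0.0, 1.0, num_points)
    resampled = []
    for curve in curves:
        x = np.linspace(0.0, 1.0, len(curve))
        resampled.append(np.interp(grid, x, curve))
    return grid, np.mean(resampled, axis=0)


def aggregate_by_binning(curves, num_bins=2):
    """Pool all points, split [0, 1] into equal-width bins, and average within each bin."""
    xs = np.concatenate([np.linspace(0.0, 1.0, len(c)) for c in curves])
    ys = np.concatenate([np.asarray(c, dtype=float) for c in curves])
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    # np.digitize is 1-based; clip so that x == 1.0 falls into the last bin.
    bin_index = np.clip(np.digitize(xs, edges) - 1, 0, num_bins - 1)
    means = np.array([
        ys[bin_index == b].mean() if np.any(bin_index == b) else np.nan
        for b in range(num_bins)
    ])
    return edges, means


# Two curves of different lengths, as would arise from inputs with different group counts.
curves = [[0.0, 1.0], [0.0, 1.0, 1.0]]
grid, interpolated_mean = aggregate_by_interpolation(curves)
edges, binned_mean = aggregate_by_binning(curves)
```

Interpolation keeps a fixed number of points per curve, while binning pools the raw points and can leave a bin empty (reported here as NaN) when few curves are available.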
