Metadata-Version: 2.1
Name: GABRIEL-ratings
Version: 0.1.3
Summary: The GABRIEL library for numerical analysis of texts in the social sciences.
Home-page: https://github.com/elliottmokski/GABRIEL-distribution
Author: Hemanth Asirvatham and Elliott Mokski
Author-email: elliottpmokski@gmail.com
License: UNKNOWN
Description: # The Generalized Attribute Based Ratings Information Extraction Library (GABRIEL)
        
        ## Description
        
        GABRIEL is a simple Python framework built on top of LLMs like ChatGPT to simplify quantitative text analysis in the social sciences.
        
        IMPORTANT: Follow this Colab tutorial notebook for the easiest setup guide: https://colab.research.google.com/drive/1tshfY-2al7asU7pTFvFFg1n4NSvLXtZg?usp=sharing
        
        The full documentation is below.
        
        ## Installation
        
        The new Python library replaces the previous API and simplifies use of the package. Install it with pip.
        
        Before installing GABRIEL, install the OpenAI library. Open your terminal or command prompt and run:
        
        ```bash
        pip install openai
        ```
        
        Once you have installed OpenAI, install GABRIEL using:
        
        ```bash
        pip install gabriel-ratings
        ```
        
        ## Use
        ### Simple ratings framework
        
        The main way to get ratings from GABRIEL is the Archangel class. The class requires an OpenAI API key for instantiation. We strongly recommend storing the key as an environment variable rather than hard-coding it. To create an Archangel object, use the following syntax:
        
        ```python
        from GABRIEL.Archangel import Archangel
        # your_api_key is a string holding your OpenAI API key
        archangel = Archangel(your_api_key)
        ```
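        
        A minimal sketch of reading the key from an environment variable before creating the object. The variable name `OPENAI_API_KEY` and the helper `load_api_key` are assumptions for illustration; they are not part of GABRIEL.
        
        ```python
        import os
        
        
        def load_api_key(env_var="OPENAI_API_KEY"):
            """Return the API key stored in env_var, or raise if it is missing."""
            key = os.environ.get(env_var)
            if not key:
                raise RuntimeError(f"Set the {env_var} environment variable first.")
            return key
        
        
        # Usage (requires the key to be set in your shell or notebook):
        # archangel = Archangel(load_api_key())
        ```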
        
        Once you create the object, you can run a simple ratings framework through the *rate_texts* method. You must supply a list of the texts to rate (*texts*); an *attributes_dict*, whose keys are your attributes and whose values are their definitions; an object category (*object_category*); and an attribute category (*attribute_category*). You can also choose the OpenAI model for your call with the *model* parameter (the default is GPT-3.5-turbo). See below for the full list of parameters and more detailed descriptions.
        
        The simplest ratings call, which returns a Pandas dataframe, is just:
        
        ```python
        ratings = archangel.rate_texts(texts, attributes_dict=attributes_dict, save_folder='path_to_your_folder', file_name='your_file_name.csv', attribute_category=your_attribute_category, object_category=your_object_category)
        ```
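        
        As a rough illustration of what the inputs to such a call might look like, the sketch below builds a small `texts` list and `attributes_dict`. The example texts and definitions are made up, and `check_rating_inputs` is a hypothetical pre-flight helper, not part of GABRIEL.
        
        ```python
        # Example inputs for rate_texts; the texts and definitions are made up.
        texts = [
            "The committee expects inflation to ease over the coming year.",
            "Labor market conditions have weakened and risks remain elevated.",
        ]
        
        # Keys are attribute names, values are their definitions.
        attributes_dict = {
            "optimism": "How hopeful the text is about future economic conditions.",
            "concern about unemployment": "How much the text worries about job losses.",
        }
        
        object_category = "Fed Statements"
        attribute_category = "emotions"
        
        
        def check_rating_inputs(texts, attributes_dict):
            """Hypothetical pre-flight check (not part of GABRIEL)."""
            if not texts or not all(isinstance(t, str) and t.strip() for t in texts):
                raise ValueError("texts must be a non-empty list of non-empty strings")
            if not attributes_dict or not all(
                isinstance(k, str) and isinstance(v, str)
                for k, v in attributes_dict.items()
            ):
                raise ValueError("attributes_dict must map attribute names to definitions")
            return len(texts), len(attributes_dict)
        
        
        n_texts, n_attributes = check_rating_inputs(texts, attributes_dict)
        print(f"{n_texts} texts x {n_attributes} attributes")
        ```
        
        Checking the inputs before the call is a cheap way to avoid paying for API requests that would fail or return empty ratings.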
        
        ### Features 
        
        The Archangel class comes with a number of easy-to-use features to help you run your code:
        - parallelization: the library parallelizes API calls to dramatically speed up running time. We configure this by default.
        - cost estimates: we provide a very rough cost estimate of each run when you begin the call, based on the model and texts you input. 
        - auto-saving: the class will auto-save your results to a CSV at each iteration, as long as you provide a valid path.  
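        
        To give a sense of what parallelizing calls means in general, here is a minimal sketch using Python's standard thread pool. It is not GABRIEL's implementation: `rate_one` is a stand-in that counts words instead of calling an API, and `rate_in_parallel` is a hypothetical helper name.
        
        ```python
        from concurrent.futures import ThreadPoolExecutor
        
        
        def rate_one(text):
            """Stand-in for a single API call; here it just counts words."""
            return len(text.split())
        
        
        def rate_in_parallel(texts, worker=rate_one, max_workers=50):
            """Apply worker to every text concurrently, keeping the input order."""
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                # pool.map returns results in the same order as the inputs.
                return list(pool.map(worker, texts))
        
        
        results = rate_in_parallel(["one two", "three four five"])
        ```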
        
        ### Preset classes
        
        To simplify the task of choosing your hyperparameters, we provide two default options: 
        - 'mazda': cheap, fast, and reliable. Uses GPT-3.5-turbo, with text truncation to 9500 words to allow for prompts. Runs 50 queries in parallel.  
        - 'tesla': expensive. Uses GPT-4-turbo, with 30 parallel queries. Not recommended due to cost. 
        
        ### Function parameters
        
        The full list of parameters for the function is as follows. 
        
        - **`search_axis_1`** (mandatory): A list containing the texts being evaluated. For example, `search_axis_1 = ['fed statement 1', 'fed statement 2', ...]`.
        - **`object_category`** (mandatory): A string, simply describing the category of objects being evaluated. For example, `object_category = 'Fed Statements'` or `'Short Stories'`.
        - **`attribute_category`** (mandatory): The category of attributes being evaluated. For example, `attribute_category = 'emotions'` or `'tastes'`.
        - **`attributes`** (optional): A list containing the desired attributes for evaluation. If this is not specified, the model will generate **`n_search_attributes`** attributes itself. For example, `attributes = ['optimism', 'negativity', 'concern about unemployment']`.
        - **`n_search_attributes`** (optional): An integer containing the number of attributes to generate if no attributes were specified. Defaults to 5. For example, `n_search_attributes = 10`.
        - **`descriptions`** (optional): A list of descriptions for the attributes if **`attributes`** are provided explicitly. Otherwise, the descriptions will be generated by the model. For example, `descriptions = ['The happy attribute refers to a positive emotional state characterized by feelings of joy, contentment, and satisfaction.', 'The "sad" attribute is a feeling of sorrow, unhappiness, or distress. It is an emotional state characterized by a low mood and a sense of loss or disappointment.']`.
        - **`object_clarification`** (optional): A string providing further clarification on the objects (e.g., a string of comma-separated examples).
        - **`attribute_clarification`** (optional): A string providing further clarification on the attributes (e.g., a string of comma-separated examples). For use in the generation of attributes.
        - **`use_classification`** (optional, defaults to False): If True, the model uses a classification approach instead of ratings.
        - **`classification_clarification`** (optional, only considered when `use_classification = True`): An additional string to provide context on the classification process.
        - **`truncate`** (optional, defaults to True): Whether to truncate the text to the first 10,000 words. This avoids exceeding the API token limit (16k tokens for the default model).
        - **`project_probs`** (optional, defaults to False): Whether to rescale the probabilities from a 0 to 100 scale to a 0 to 1 scale.
        - **`api_key`** (mandatory): Your OpenAI API key.
        - **`model`** (optional): Backend model, default = `gpt-3.5-turbo-1106`.
        - **`seed`** (optional, RECOMMENDED): Set a seed for cross-run replicability. For instance, `seed = 0`.
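        
        The effect of `truncate` and `project_probs` can be pictured with the short sketch below. Both helper functions are hypothetical, and splitting on whitespace to count words is an assumption; the library's own logic may differ.
        
        ```python
        def truncate_words(text, max_words=10_000):
            """Keep only the first max_words whitespace-separated words."""
            words = text.split()
            if len(words) <= max_words:
                return text
            return " ".join(words[:max_words])
        
        
        def project_to_unit_scale(ratings):
            """Map ratings on a 0 to 100 scale onto a 0 to 1 scale."""
            return [r / 100 for r in ratings]
        
        
        short = truncate_words("a b c d e", max_words=3)
        scaled = project_to_unit_scale([0, 50, 100])
        ```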
        
        ## Citation
        
        Please cite the project using: 
        
        The Generalized Attribute Based Ratings Information Extraction Library (GABRIEL). Hemanth Asirvatham and Elliott Mokski (2023). https://github.com/elliottmokski/GABRIEL-distribution. 
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Requires-Python: >=3.6
Description-Content-Type: text/markdown
