Metadata-Version: 2.1
Name: LlmEmbeddingXrVizualization
Version: 0.1.8
Summary: A package for visualizing embeddings spaces from Hugging Face models
Home-page: https://github.com/rmr327/LlmEmbeddingXrVizualization
Author: Rakeen Rouf & Akalpit
Author-email: rakeenrouf@gmail.com
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: Cython<4.0.0,>=3.0.11
Requires-Dist: Jinja2<4.0.0,>=3.1.3
Requires-Dist: MarkupSafe<3.0.0,>=2.1.5
Requires-Dist: PyYAML<7.0.0,>=6.0.2
Requires-Dist: bpy<5.0.0,>=4.0.0
Requires-Dist: certifi<2025.0.0,>=2024.8.30
Requires-Dist: charset-normalizer<4.0.0,>=3.4.0
Requires-Dist: click<9.0.0,>=8.1.7
Requires-Dist: contourpy<2.0.0,>=1.3.1
Requires-Dist: cycler<1.0.0,>=0.12.1
Requires-Dist: exceptiongroup<2.0.0,>=1.2.2
Requires-Dist: filelock<4.0.0,>=3.13.1
Requires-Dist: fonttools<5.0.0,>=4.55.0
Requires-Dist: fsspec<2025.0.0,>=2024.2.0
Requires-Dist: huggingface-hub<1.0.0,>=0.26.3
Requires-Dist: idna<4.0.0,>=3.10
Requires-Dist: iniconfig<3.0.0,>=2.0.0
Requires-Dist: joblib<2.0.0,>=1.4.2
Requires-Dist: kiwisolver<2.0.0,>=1.4.7
Requires-Dist: llvmlite<1.0.0,>=0.43.0
Requires-Dist: mathutils<4.0.0,>=3.3.0
Requires-Dist: matplotlib<4.0.0,>=3.9.3
Requires-Dist: mpmath<2.0.0,>=1.3.0
Requires-Dist: networkx<4.0.0,>=3.2.1
Requires-Dist: numba<1.0.0,>=0.60.0
Requires-Dist: numpy<2.0.0,>=1.26.3
Requires-Dist: packaging<25.0,>=24.2
Requires-Dist: pandas<3.0.0,>=2.2.3
Requires-Dist: pillow<11.0.0,>=10.2.0
Requires-Dist: plotly<6.0.0,>=5.24.1
Requires-Dist: pluggy<2.0.0,>=1.5.0
Requires-Dist: pynndescent<1.0.0,>=0.5.13
Requires-Dist: pyparsing<4.0.0,>=3.2.0
Requires-Dist: pytest<9.0.0,>=8.3.3
Requires-Dist: python-dateutil<3.0.0,>=2.9.0.post0
Requires-Dist: pytz<2025.0,>=2024.2
Requires-Dist: regex<2025.0.0,>=2024.11.6
Requires-Dist: requests<3.0.0,>=2.32.3
Requires-Dist: safetensors<1.0.0,>=0.4.5
Requires-Dist: scikit-learn<2.0.0,>=1.5.2
Requires-Dist: scipy<2.0.0,>=1.14.1
Requires-Dist: six<2.0.0,>=1.16.0
Requires-Dist: sympy<2.0.0,>=1.13.1
Requires-Dist: tenacity<10.0.0,>=9.0.0
Requires-Dist: threadpoolctl<4.0.0,>=3.5.0
Requires-Dist: tokenizers<1.0.0,>=0.20.3
Requires-Dist: tomli<3.0.0,>=2.2.1
Requires-Dist: torch
Requires-Dist: torchaudio
Requires-Dist: torchvision
Requires-Dist: tqdm<5.0.0,>=4.67.1
Requires-Dist: transformers<5.0.0,>=4.46.3
Requires-Dist: typing-extensions<5.0.0,>=4.9.0
Requires-Dist: tzdata<2025.0,>=2024.2
Requires-Dist: umap-learn<1.0.0,>=0.5.6
Requires-Dist: urllib3<3.0.0,>=2.2.3
Requires-Dist: zstandard<1.0.0,>=0.23.0

# LlmEmbeddingXrVizualization
[![Python package](https://github.com/rmr327/LlmEmbeddingXrVizualization/actions/workflows/python-package.yml/badge.svg)](https://github.com/rmr327/LlmEmbeddingXrVizualization/actions/workflows/python-package.yml)

A package for visualizing Large Language Model (LLM) embedding spaces from Hugging Face models with just the model name as input!

Inspired by the belief that data should be experienced, not just viewed, we're bridging the gap between 2D plots and spatial understanding of LLM embedding spaces. The fundamental limitation of 2D screens - trying to compress three dimensions into two - has always forced us to sacrifice either information or clarity. Our platform breaks free from these constraints, transforming raw datasets into immersive XR visualizations using nothing but the name of a model from Hugging Face. Every visualization is accessible on Meta Quest XR headsets. We're not just plotting data - we're creating a new way to discover insights through spatial exploration, one that respects the true dimensionality of our data.

Each word/sentence embedding is meticulously positioned in virtual space, ensuring perfect spatial accuracy and true-to-scale representation. This precision becomes particularly powerful when visualizing LLM embedding spaces - allowing users to physically explore how concepts are related within these models. By walking through the three-dimensional embedding space, researchers can intuitively verify whether semantically similar concepts cluster together and identify unexpected relationships that traditional 2D visualizations might miss.

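As a numeric complement to the visual check described above, the closeness of two embeddings can be measured with cosine similarity. The sketch below is only an illustration: the three-dimensional vectors are made-up toy values, not output from any Hugging Face model or from this package, and `cosine_similarity` is a hypothetical helper rather than part of the package's API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3D coordinates (assumed values for illustration only).
toy_embeddings = {
    "cat": (0.9, 0.1, 0.0),
    "dog": (0.8, 0.2, 0.1),
    "car": (0.0, 0.1, 0.9),
}

if __name__ == "__main__":
    same_domain = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
    cross_domain = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])
    # Words from the same domain should score higher than words from different domains.
    print(f"cat vs dog: {same_domain:.3f}")
    print(f"cat vs car: {cross_domain:.3f}")
```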
## Installation

```bash
pip install LlmEmbeddingXrVizualization
```

## Usage
```bash
llm-embedding-viz --help
```
![image](https://github.com/user-attachments/assets/4586bcf0-1d03-441d-9cda-cb4a7f6a43c0)

```bash
llm-embedding-viz
```
![image](https://github.com/user-attachments/assets/0ed8ddd7-be71-4724-b25e-90c53a100e8c)

> Example website for opening the generated 3D object (`.dae` file).

![image](https://github.com/user-attachments/assets/8da4f88a-72ce-46c2-b699-048fb0d8d1d5)

> Example experience on a Meta Quest 3.

![PlotVerseXR_Trailer](https://github.com/user-attachments/assets/7c76cee8-7476-45ec-b482-6213618176d0)

![PlotVerseXR_Trailer (1)](https://github.com/user-attachments/assets/26903be9-2e82-4421-98bb-ca8adfb96157)



```bash
llm-embedding-viz --model_name "distilbert/distilbert-base-uncased-finetuned-sst-2-english" -c path_to_your_labels_domains.csv -r isomap -s
```
> The CSV file must have 'domains' and 'words' columns.

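A minimal sketch of how such a CSV could be produced and checked with Python's standard library. The only detail taken from this README is the pair of column names, 'domains' and 'words'; the example rows, the file name `labels_domains.csv`, and the helper names `write_labels_csv` and `has_required_columns` are assumptions made for illustration. The resulting file path would then be passed to the `-c` option.

```python
import csv

REQUIRED_COLUMNS = ["domains", "words"]

# Hypothetical example rows: each word is paired with the domain it belongs to.
example_rows = [
    {"domains": "animals", "words": "cat"},
    {"domains": "animals", "words": "dog"},
    {"domains": "vehicles", "words": "car"},
    {"domains": "vehicles", "words": "truck"},
]


def write_labels_csv(path, rows):
    """Write rows to a CSV file whose header is 'domains,words'."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=REQUIRED_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


def has_required_columns(path):
    """Return True if the CSV header contains both required columns."""
    with open(path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    return set(REQUIRED_COLUMNS) <= set(header)


if __name__ == "__main__":
    write_labels_csv("labels_domains.csv", example_rows)
    print(has_required_columns("labels_domains.csv"))
```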
![image](https://github.com/user-attachments/assets/02749a52-cb1c-460b-8393-2ef347f65a70)

> Generated plot for the `-s` flag.

![image](https://github.com/user-attachments/assets/1c332560-e9f8-463a-be2c-095c77f77a1c)

## References
This idea started at a hackathon: https://devpost.com/software/plotversexr.

Generative AI tools such as GitHub Copilot and ChatGPT were used extensively in this project.

Duke University Explainable AI class: AIPI 590.



