Metadata-Version: 2.1
Name: aegypti
Version: 0.0.8
Summary: Solve the Triangle-Free Problem for an undirected graph represented by a Boolean adjacency matrix given in a file.
Home-page: https://github.com/frankvegadelgado/finlay
Author: Frank Vega
Author-email: vega.frank@gmail.com
License: MIT License
Project-URL: Source Code, https://github.com/frankvegadelgado/finlay
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=2.2.1
Requires-Dist: scipy>=1.15.0

# Triangle-Free Solver

![Honoring the Memory of Carlos Juan Finlay (Pioneer in the research of yellow fever)](docs/finlay.jpg)

This work builds upon [The Triangle Finding Problem](https://www.researchgate.net/publication/387698746_The_Triangle_Finding_Problem).

# Triangle-Free Problem

The Triangle-Free problem is a fundamental decision problem in graph theory. Given an undirected graph, the problem asks whether the graph contains no triangles (cycles of length 3). In other words, it asks whether no three vertices are pairwise connected by edges.

This problem is important for various reasons:

- **Graph Analysis:** It's a basic building block for more complex graph algorithms and has applications in social network analysis, web graph analysis, and other domains.
- **Computational Complexity:** It serves as a benchmark problem in the study of efficient algorithms for graph properties. While the naive approach has a time complexity of $O(n^3)$, there are more efficient algorithms with subcubic complexity.

Understanding the Triangle-Free problem is essential for anyone working with graphs and graph algorithms.

## Problem Statement

Input: A Boolean Adjacency Matrix $M$.

Question: Does the graph represented by $M$ contain no triangles?

Answer: True / False

### Example Instance: 5 x 5 matrix

|        | c0    | c1  | c2    | c3  | c4  |
| ------ | ----- | --- | ----- | --- | --- |
| **r0** | 0     | 0   | 1     | 0   | 1   |
| **r1** | 0     | 0   | 0     | 1   | 0   |
| **r2** | **1** | 0   | 0     | 0   | 1   |
| **r3** | 0     | 1   | 0     | 0   | 0   |
| **r4** | **1** | 0   | **1** | 0   | 0   |

A matrix is represented in a text file using the following string representation:

```
00101
00010
10001
01000
10100
```

This represents a 5x5 matrix where each line corresponds to a row: a '1' in row $i$ and column $j$ indicates an edge between vertices $i$ and $j$, while a '0' indicates that there is no edge.
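
The text format above can be read with a few lines of Python. This is a minimal sketch, not the loader used by the `triangle` command; the helper names `parse_matrix` and `is_symmetric` are hypothetical and chosen only for illustration.

```python
def parse_matrix(text):
    """Parse one row of '0'/'1' characters per line into a list of integer rows."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    n = len(rows)
    matrix = []
    for r, row in enumerate(rows):
        # Every row must have n characters, all of them '0' or '1'.
        if len(row) != n or set(row) - {"0", "1"}:
            raise ValueError(f"row {r} is not a 0/1 string of length {n}")
        matrix.append([int(ch) for ch in row])
    return matrix


def is_symmetric(matrix):
    """An undirected graph has a symmetric adjacency matrix."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i))


example = """\
00101
00010
10001
01000
10100
"""
matrix = parse_matrix(example)
print(matrix[4])             # [1, 0, 1, 0, 0]
print(is_symmetric(matrix))  # True
```

The symmetry check is a cheap sanity test that the file really describes an undirected graph.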

_Example Solution:_

Triangle Found (4, 0, 2): In Rows 2 & 4 and Columns 0 & 2
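
The example solution can be cross-checked with the naive $O(n^3)$ approach mentioned earlier. The sketch below is only a reference check, not the algorithm proposed in this project, and the function name `find_triangle_naive` is hypothetical.

```python
from itertools import combinations

# The 5 x 5 example instance from the table above.
MATRIX = [
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
]


def find_triangle_naive(matrix):
    """Try every vertex triple i < j < k and return the first triangle, or None."""
    n = len(matrix)
    for i, j, k in combinations(range(n), 3):
        if matrix[i][j] and matrix[j][k] and matrix[i][k]:
            return (i, j, k)
    return None


print(find_triangle_naive(MATRIX))  # (0, 2, 4)
```

The triple `(0, 2, 4)` is the same vertex set as `(4, 0, 2)` in the example solution, listed in increasing order.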

---

# Our Algorithm - Runtime $O(n + m)$

## Algorithm Explanation:

We detect triangles in a graph using a depth-first search (DFS) and a coloring scheme. During the DFS traversal, each visited node assigns unique, consecutive integer colors to its uncolored neighbors. A triangle exists if two adjacent nodes share two colored neighbors, and the colors assigned to these shared neighbors differ by exactly two.

## Runtime Analysis:

1. _Depth-First Search (DFS)_: A standard depth-first search (DFS) on a graph with $\mid V \mid$ vertices and $\mid E \mid$ edges has a time complexity of $O(\mid V \mid + \mid E \mid)$, where $\mid \ldots \mid$ represents the cardinality (e.g., $n = \mid V \mid$ and $m = \mid E \mid$). This is because in the worst case, we visit every vertex and explore every edge.
2. _Coloring and Checking for Color Difference:_ In the Depth-First Search (DFS), each node performs either color assignment or a constant-time check of color differences with its neighbors. Because this operation is executed for every vertex during the DFS traversal, the overall computational complexity remains equivalent to the standard DFS algorithm's worst-case running time.
3. _Overall Runtime:_ The combined depth-first search, coloring, and checking process has a time complexity of $O(\mid V \mid + \mid E \mid)$.
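
The $O(\mid V \mid + \mid E \mid)$ bound for a plain depth-first traversal can be illustrated by counting the work it performs. The sketch below is a standard stack-based traversal over adjacency lists, without the coloring step described above; the function name `dfs_work` and the adjacency-list input are assumptions made for this illustration.

```python
def dfs_work(adjacency):
    """Stack-based depth-first traversal.

    Returns (vertices processed, neighbor inspections) so the amount of
    work can be compared against |V| and |E|.
    """
    visited = set()
    vertex_visits = 0
    edge_inspections = 0
    for root in range(len(adjacency)):
        if root in visited:
            continue
        visited.add(root)
        stack = [root]
        while stack:
            node = stack.pop()
            vertex_visits += 1
            for neighbor in adjacency[node]:
                edge_inspections += 1
                # Mark on push so that each vertex enters the stack once.
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
    return vertex_visits, edge_inspections


# Adjacency lists of the 5 x 5 example instance (edges 0-2, 0-4, 2-4, 1-3).
adjacency = [[2, 4], [3], [0, 4], [1], [0, 2]]
print(dfs_work(adjacency))  # (5, 8)
```

Each of the 5 vertices is processed once and each of the 4 undirected edges is inspected twice, once from each endpoint, which gives the $\mid V \mid + 2\mid E \mid$ operations behind the bound.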

# Compile and Environment

## Install Python >=3.8.

## Install Aegypti's Library and its Dependencies with:

```bash
pip install aegypti
```

---

# Execute

---

1. Clone the repository and enter its directory to use the benchmarks:

```bash
git clone https://github.com/frankvegadelgado/finlay.git
cd finlay
```

2. Execute the script:

```bash
triangle -i .\benchmarks\testMatrix1.txt
```

This uses the `triangle` command provided by Aegypti's Library to process the Boolean adjacency matrix `finlay\benchmarks\testMatrix1.txt`. The file `testMatrix1.txt` contains the example instance described above. Files compressed as .xz, .lzma, .bz2, or .bzip2 are also supported.
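
For readers who want to inspect such compressed files themselves, Python's standard `lzma` and `bz2` modules are sufficient. This is a minimal sketch that assumes the compression format can be inferred from the file extension; it is not the loader used by the `triangle` command, and `read_matrix_text` is a hypothetical helper name.

```python
import bz2
import lzma
from pathlib import Path

# Map each supported extension to a function that opens it as text.
OPENERS = {
    ".xz": lzma.open,
    ".lzma": lzma.open,
    ".bz2": bz2.open,
    ".bzip2": bz2.open,
}


def read_matrix_text(path):
    """Return the decompressed text of a matrix file (plain .txt is read as-is)."""
    path = Path(path)
    opener = OPENERS.get(path.suffix.lower(), open)
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()


# Usage (the compressed file name is hypothetical):
# text = read_matrix_text("benchmarks/testMatrix1.txt.xz")
```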

## The console output will display:

```
testMatrix1.txt: Triangle Found (4, 0, 2)
```

which means that the graph given by the Boolean adjacency matrix `finlay\benchmarks\testMatrix1.txt` contains a triangle formed by the vertices `(4, 0, 2)`.

## Finding All Triangles - Runtime $O(n + m)$

The `-a` flag enables the discovery of all triangles within the graph.

**Example:**

```bash
triangle -i .\benchmarks\testMatrix2.txt -a
```

**Output:**

```
testMatrix2.txt: Triangles Found (3, 4), (2, 8), (1, 8), (3, 10), (0, 3), (0, 2), (0, 1)
```

When multiple triangles exist, the output lists them as pairs of vertices.

**Vertex Representation:**

Each vertex pair in the list gives two vertices of a triangle, and pairs may repeat. The third vertex of each triangle can be determined by referring to the adjacency matrix.
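
Recovering the remaining vertex from a reported pair amounts to finding a common neighbor of the two vertices. The sketch below uses the 5 x 5 example instance rather than `testMatrix2.txt`, whose contents are not shown here; `third_vertices` is a hypothetical helper, not part of the package.

```python
# The 5 x 5 example instance from the problem statement.
MATRIX = [
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
]


def third_vertices(matrix, u, v):
    """Return every vertex w adjacent to both u and v.

    If u and v are themselves adjacent, each such w closes a triangle (u, v, w).
    """
    if not matrix[u][v]:
        return []
    return [
        w
        for w in range(len(matrix))
        if w not in (u, v) and matrix[u][w] and matrix[v][w]
    ]


print(third_vertices(MATRIX, 0, 2))  # [4]
print(third_vertices(MATRIX, 1, 3))  # []
```

The scan costs $O(n)$ per pair on an adjacency matrix, so it is a post-processing step separate from the search itself.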

### Runtime Analysis (Find All Triangles)

The runtime complexity remains $O(n + m)$, where:

- **n** represents the number of nodes in the graph.
- **m** represents the number of edges in the graph.

This is because the algorithm primarily relies on a Depth-First Search (DFS) traversal of the graph.

# Command Options

To display the help message and available options, run the following command in your terminal:

```bash
triangle -h
```

This will output:

```
usage: triangle [-h] -i INPUTFILE [-a] [-b] [-l] [--version]

Solve the Triangle-Free Problem for an undirected graph represented by a Boolean adjacency matrix given in a file.

options:
  -h, --help            show this help message and exit
  -i INPUTFILE, --inputFile INPUTFILE
                        input file path
  -a, --all             identify all triangles, represented by pairs of vertices
  -b, --bruteForce      enable comparison with a brute-force approach using matrix multiplication
  -l, --log             enable file logging
  --version             show program's version number and exit
```

This output describes all available options.
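
The `-b` option refers to a brute-force check based on matrix multiplication. How the package implements it is not shown here; one standard formulation, sketched below in plain Python under that caveat, counts triangles as $\operatorname{trace}(M^3) / 6$, because every triangle contributes six closed walks of length 3. The names `matmul` and `count_triangles` are hypothetical.

```python
def matmul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def count_triangles(matrix):
    """Count triangles via trace(M^3) / 6 for a symmetric 0/1 matrix with zero diagonal."""
    cube = matmul(matmul(matrix, matrix), matrix)
    trace = sum(cube[i][i] for i in range(len(matrix)))
    # Each triangle is counted from 3 starting vertices in 2 directions.
    return trace // 6


# The 5 x 5 example instance from the problem statement.
MATRIX = [
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
]
print(count_triangles(MATRIX))      # 1
print(count_triangles(MATRIX) == 0)  # False, so the graph is not triangle-free
```

This schoolbook product takes $O(n^3)$ time, which is why such a check is only practical as a correctness baseline on small inputs.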

---

A command-line tool, `test_triangle`, has been developed for testing algorithms on randomly generated, large sparse matrices. It accepts the following options:

```
usage: test_triangle [-h] -d DIMENSION [-n NUM_TESTS] [-s SPARSITY] [-a] [-b] [-w] [-l] [--version]

The Finlay Testing Application.

options:
  -h, --help            show this help message and exit
  -d DIMENSION, --dimension DIMENSION
                        an integer specifying the dimensions of the square matrices
  -n NUM_TESTS, --num_tests NUM_TESTS
                        an integer specifying the number of tests to run
  -s SPARSITY, --sparsity SPARSITY
                        sparsity of the matrices (0.0 for dense, close to 1.0 for very sparse)
  -a, --all             identify all triangles, represented by pairs of vertices
  -b, --bruteForce      enable comparison with a brute-force approach using matrix multiplication
  -w, --write           write the generated random matrix to a file in the current directory
  -l, --log             enable file logging
  --version             show program's version number and exit
```

**This tool is designed to benchmark algorithms for sparse matrix operations.**

It generates random square matrices with configurable dimensions (`-d`), sparsity levels (`-s`), and number of tests (`-n`). While a comparison with a brute-force matrix multiplication approach is available, it's recommended to avoid this for large datasets due to performance limitations. Additionally, the generated matrix can be written to the current directory (`-w`), and file logging can be enabled (`-l`) to record the test results.
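
To make the role of the dimension and sparsity parameters concrete, here is a minimal sketch of a random symmetric Boolean matrix generator. It assumes that sparsity is the probability that a potential edge is absent, which matches the help text (0.0 for dense, close to 1.0 for very sparse) but is not confirmed to be how `test_triangle` generates its matrices; `random_adjacency_matrix` is a hypothetical name.

```python
import random


def random_adjacency_matrix(dimension, sparsity, seed=None):
    """Build a symmetric 0/1 matrix with zero diagonal.

    Each potential edge is absent with probability `sparsity` (assumed meaning).
    """
    rng = random.Random(seed)
    matrix = [[0] * dimension for _ in range(dimension)]
    for i in range(dimension):
        for j in range(i + 1, dimension):
            if rng.random() >= sparsity:
                # Set both entries so the graph stays undirected.
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Print a small instance in the same text format used by the input files.
for row in random_adjacency_matrix(5, 0.7, seed=1):
    print("".join(map(str, row)))
```

Fixing the seed makes a generated instance reproducible, which helps when comparing two algorithms on the same input.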

# Code

- Python code by **Frank Vega**.

# Complexity

```diff
+ We propose an O(n + m) algorithm to solve the Triangle-Free Problem.
+ We can also identify all triangles defined by two vertices in O(n + m) time.
+ This algorithm has multiple applications to other computational problems in combinatorial optimization and computational geometry.
```

# License

- MIT.
