Metadata-Version: 2.1
Name: TXMeans
Version: 0.1.1
Summary: parameter-free clustering algorithm
Author: Riccardo Guidotti, Anna Monreale, Mirco Nanni, Fosca Giannotti, Dino Pedreschi
License: GPL
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Requires-Python: >=3
Description-Content-Type: text/markdown
License-File: LICENSE

# TX-Means

TX-Means is a parameter-free clustering algorithm that efficiently partitions transactional data in a completely automatic way.
TX-Means is designed for the case where clustering must be applied to a massive number of different datasets, for instance when a large set of users needs to be analyzed individually and each of them has generated a long history of transactions.

In this repository we provide the source code of TX-Means, the competing clustering algorithms, and the dataset used in:
> Riccardo Guidotti, Anna Monreale, Mirco Nanni, Fosca Giannotti, Dino Pedreschi *"Clustering Individual Transactional Data for Masses of Users"*, KDD 2017, 2017, Halifax, NS, Canada

Please cite the paper above if you use our code or datasets.

### Where to get it

The source code is currently hosted on GitHub at: https://github.com/riccotti/TX-Means

#### How to install

    pip install TXMeans

#### How to import (some examples)

    from TXMeans.txmeans import TXmeans
    # Utility functions
    from TXMeans.util import count_items, remap_items, sample_size
    # Converting baskets to bitarrays and back
    from TXMeans.util import basket_list_to_bitarray, basket_bitarray_to_list
    # Converting the data into basket format
    from TXMeans.datamanager import read_uci_data
    # Validation measures
    from TXMeans.validation_measures import delta_k, purity, normalized_mutual_info_score
    from TXMeans.util import jaccard_bitarray
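
The helper names above suggest that baskets are encoded as bit vectors and compared with the Jaccard coefficient. The sketch below illustrates that idea in plain Python only. It does not call TXMeans, and `baskets_to_bitsets` and `jaccard` are hypothetical stand-ins whose names, signatures, and behaviour are assumptions and may differ from the package's own functions.

```python
# Illustrative sketch only: plain-Python stand-ins, NOT the TXMeans API.
# Python integers are used as bitsets in place of the bitarray library.

def baskets_to_bitsets(baskets):
    """Encode each basket (an iterable of items) as an int used as a bitset."""
    # Assign every distinct item a fixed bit position.
    items = sorted({item for basket in baskets for item in basket})
    index = {item: i for i, item in enumerate(items)}
    bitsets = []
    for basket in baskets:
        bits = 0
        for item in basket:
            bits |= 1 << index[item]
        bitsets.append(bits)
    return bitsets, items


def jaccard(a, b):
    """Jaccard similarity |a AND b| / |a OR b| of two bitsets."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # convention chosen here for two empty baskets
    return bin(a & b).count("1") / union


baskets = [["bread", "milk"], ["bread", "milk", "eggs"], ["beer"]]
bitsets, items = baskets_to_bitsets(baskets)
print(jaccard(bitsets[0], bitsets[1]))  # 2 shared items out of 3 distinct
print(jaccard(bitsets[0], bitsets[2]))  # no shared items
```

Bitwise AND and OR make the intersection and union counts cheap to compute, which is one plausible reason for working with a bit-vector representation of transactions.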

##### Requirements:

- python >= 3
- numpy >= 1.10.1
- pandas >= 0.18.1
- scipy >= 0.17.1
- bitarray >= 0.8.1
- Java >= 8.1
