Metadata-Version: 2.1
Name: Tweetl
Version: 0.0.7
Summary: A package to do everything from getting tweets to pre-processing
Home-page: https://github.com/deepblue-ts/Tweetl
Author: deepblue
Author-email: main@deepbluets.page
License: MIT
Keywords: natural language processing,twitter,cleansing
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
Requires-Dist: as (==0.1)
Requires-Dist: certifi (==2020.4.5.1)
Requires-Dist: chardet (==3.0.4)
Requires-Dist: emoji (==0.5.4)
Requires-Dist: idna (==2.9)
Requires-Dist: mojimoji (==0.0.11)
Requires-Dist: neologdn (==0.4)
Requires-Dist: numpy
Requires-Dist: oauthlib (==3.1.0)
Requires-Dist: pandas
Requires-Dist: pd (==0.0.1)
Requires-Dist: PySocks (==1.7.1)
Requires-Dist: python-dateutil (==2.8.1)
Requires-Dist: pytz (==2020.1)
Requires-Dist: requests (==2.23.0)
Requires-Dist: requests-oauthlib (==1.3.0)
Requires-Dist: six (==1.14.0)
Requires-Dist: tweepy (==3.8.0)
Requires-Dist: urllib3 (==1.25.9)

# Tweetl
By using Tweetl, you can simplify the steps from getting tweets to pre-processing them.
If you don't have a Twitter API key, you can get one [here](https://developer.twitter.com/en).

This package helps you to:
+ get tweets by target account name or by keyword.
+ pre-process them as follows:
  + remove hashtags, URLs, pictographs, mentions, image strings, and RT markers.
  + unify characters (uppercase to lowercase, halfwidth to fullwidth forms).
  + replace numbers with zero.
  + remove duplicates (since they are likely retweets).
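To give a feel for what these steps do, here is a minimal, illustrative sketch of similar cleaning in plain Python. This is not Tweetl's actual implementation (Tweetl also uses libraries such as `mojimoji` and `neologdn` for Japanese width normalization, which this sketch omits); the regexes and example tweets are assumptions for demonstration only.

```python
import re

def basic_clean(text):
    # Illustrative only -- not Tweetl's actual implementation.
    text = re.sub(r"^RT\s+", "", text)        # drop a leading retweet marker
    text = re.sub(r"https?://\S+", "", text)  # remove URLs
    text = re.sub(r"[@#]\w+", "", text)       # remove mentions and hashtags
    text = text.lower()                       # unify case
    text = re.sub(r"\d+", "0", text)          # replace numbers with zero
    text = re.sub(r"\s+", " ", text)          # collapse leftover whitespace
    return text.strip()

tweets = [
    "RT @user Check https://t.co/abc #news 2020",
    "@user Check https://t.co/abc #news 2020",
]
cleaned = [basic_clean(t) for t in tweets]
# the retweet and the original collapse to the same string,
# so dropping duplicates removes the copy
unique = list(dict.fromkeys(cleaned))
```

After cleaning, both the retweet and the original reduce to the same string, which is why de-duplication is the final step.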

## Installation
```
pip install Tweetl
```
## Usage
### Getting Tweets
Create an instance of the `GetTweet` class.
```
import Tweetl

# your api keys
consumer_api_key = "xxxxxxxxx"
consumer_api_secret_key = "xxxxxxxxx"
access_token = "xxxxxxxxx"
access_token_secret = "xxxxxxxxx"

# create an instance
tweet_getter = Tweetl.GetTweet(
                    consumer_api_key,
                    consumer_api_secret_key, 
                    access_token, 
                    access_token_secret
                )
```
#### With target name
You can collect a target user's tweets with the `get_tweets_target` method by passing the screen name without the leading '@'. It returns the collected tweets as a DataFrame, and you can specify the number of tweets.
```
# get 1000 tweets of @Deepblue_ts
df_target = tweet_getter.get_tweets_target("Deepblue_ts", 1000)
df_target.head()
```
<img width="939" alt="Screenshot 2020-05-22 14 33 39" src="https://user-images.githubusercontent.com/37981348/82634800-b27fa480-9c39-11ea-9420-8952717823fb.png">

#### With any keywords
You can also collect tweets about any keyword with the `get_tweets_keyword` method. Again, you can specify the number of tweets.
```
# get 1000 tweets about 'deep learning'
df_keyword = tweet_getter.get_tweets_keyword("deep learning", 1000)
```

### Cleansing Tweets
Create an instance of the `CleansingTweets` class, then pre-process tweets with the `cleansing_df` method. You can select which columns to cleanse; the default is the text column only.

```
# create an instance
tweet_cleanser = Tweetl.CleansingTweets()
cols = ["text", "user_description"]
df_clean = tweet_cleanser.cleansing_df(df_keyword, subset_cols=cols)
```
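The result of `cleansing_df` is an ordinary pandas DataFrame, so it works with the usual pandas tools. A small sketch of saving and reloading the cleaned tweets; the stand-in DataFrame and column name below are assumptions for illustration, since the real one comes from the Twitter API:

```python
import pandas as pd

# stand-in for the cleaned DataFrame returned by cleansing_df
df_clean = pd.DataFrame(
    {"text": ["deep learning is fun", "tweetl simplifies cleansing"]}
)

# persist the cleaned tweets for later analysis
df_clean.to_csv("tweets_clean.csv", index=False)

# reload to confirm the round trip
df_reloaded = pd.read_csv("tweets_clean.csv")
```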

## Author
[deepblue](https://deepblue-ts.co.jp/)

## License
This software is released under the MIT License, see [LICENSE](https://github.com/deepblue-ts/Tweetl/blob/master/LICENSE).


