Text Mining: Connecting & Collecting Data



Application Developer

  • Click the name of the application you created. In this example it is MyProject2018.

setup_twitter_oauth()

  • This function creates an authenticated connection from the R package to a Twitter session.
  • All of the parameter values it needs are listed on the Keys and Access Tokens tab:
    • consumer_key (API key)
    • consumer_secret (API secret)
    • access_token
    • access_secret

  • R program

library(twitteR)
library(ROAuth)

consumer_key    <- 'abcdefghijklmnopqrstuvwxyz'  # fill in your consumer key (API key)
consumer_secret <- 'abcdefghijklmnopqrstuvwx'    # fill in your consumer secret (API secret)
access_token    <- 'abcdefghijklmnopqrstuvwxyz'  # fill in your access token
access_secret   <- 'abcdefghijklmnopqrstuvwxy'   # fill in your access secret

setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
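
If you would rather not keep the four keys inside the script, one option is to read them from environment variables. The sketch below is only an illustration: the helper name read_twitter_credentials and the TWITTER_* variable names are assumptions made for this example, not part of twitteR.

```r
# Hypothetical helper (not part of twitteR): collect the four credentials
# from environment variables named TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET,
# TWITTER_ACCESS_TOKEN and TWITTER_ACCESS_SECRET (names assumed for this sketch).
read_twitter_credentials <- function(prefix = "TWITTER_") {
  keys <- c("consumer_key", "consumer_secret", "access_token", "access_secret")
  env_names <- paste0(prefix, toupper(keys))

  # unset = NA lets us tell a missing variable apart from a defined one
  values <- Sys.getenv(env_names, unset = NA)

  missing <- env_names[is.na(values) | values == ""]
  if (length(missing) > 0) {
    stop("Missing environment variables: ", paste(missing, collapse = ", "))
  }

  # Return a list such as creds$consumer_key, creds$access_secret, ...
  as.list(setNames(unname(values), keys))
}

# Usage (needs the variables to be set and a network connection):
# creds <- read_twitter_credentials()
# setup_twitter_oauth(creds$consumer_key, creds$consumer_secret,
#                     creds$access_token, creds$access_secret)
```

The helper stops with an error that names the missing variables, so a forgotten key is caught before the authentication call is made.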

Retrieving Twitter Data

  • The function searchTwitter(keywords, n) searches for tweets that match keywords and returns at most n of them.
  • The number of tweets returned can be checked with length().
> # pakRay note: in this example we fetch 10 tweets containing #Indonesia
> tweetsA <- searchTwitter("#Indonesia", n = 10)
> tweetsA

[[1]]
[1] "enansetisna2: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[2]]
[1] "dimasrzqif: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[3]]
[1] "Widyandaru2: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[4]]
[1] "tuhrava: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[5]]
[1] "kantorjy: Hotel - Surrounded by mountains, Nantou County https://t.co/dNyE7NECmM #Indonesia #Hongkong"

[[6]]
[1] "hamidmln: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[7]]
[1] "Medianyudana: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[8]]
[1] "dirmdhn_: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[9]]
[1] "rickyhusnidzaky: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

[[10]]
[1] "gungsed: RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairman\n\n@PSSI chairman Edy Rahmayadi ha…"

> length(tweetsA)
[1] 10

Converting Tweets to an R Data Frame

  • The function twListToDF(tweets) converts a list of tweets into an R data frame.
  • In this example, tweetsA is converted into the data frame tweetsA.df, which has 10 observations and 16 variables.
  • These variables are specific to Twitter data.
> # pakRay note: convert tweets to a data frame
> tweetsA.df <- twListToDF(tweetsA)
> str(tweetsA.df)

'data.frame': 10 obs. of 16 variables:
 $ text         : chr "RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairm"| __truncated__ "RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairm"| __truncated__ "RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairm"| __truncated__ "RT @FOXSportsAsia: &gt; Team's dismal performances\n&gt;Mixing duties as North Sumatra Governor and PSSI chairm"| __truncated__ ...
 $ favorited    : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ favoriteCount: num 0 0 0 0 0 0 0 0 0 0
 $ replyToSN    : logi NA NA NA NA NA NA ...
 $ created      : POSIXct, format: "2018-11-18 09:35:15" "2018-11-18 09:34:59" ...
 $ truncated    : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ replyToSID   : logi NA NA NA NA NA NA ...
 $ id           : chr "1064089642754928640" "1064089574664634368" "1064089559158317056" "1064089541433118720" ...
 $ replyToUID   : logi NA NA NA NA NA NA ...
 $ statusSource : chr "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>" "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>" "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>" "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Twitter Lite</a>" ...
 $ screenName   : chr "enansetisna2" "dimasrzqif" "Widyandaru2" "tuhrava" ...
 $ retweetCount : num 2393 2393 2393 2393 0 ...
 $ isRetweet    : logi TRUE TRUE TRUE TRUE FALSE TRUE ...
 $ retweeted    : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ longitude    : logi NA NA NA NA NA NA ...
 $ latitude     : logi NA NA NA NA NA NA ...
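
Once the tweets are in a data frame, ordinary data frame indexing applies. Most of the rows above are retweets of the same post, so a common first step is to separate them out using the isRetweet column. The sketch below uses a small mock data frame so it runs without a Twitter connection. Its rows are invented placeholders and the name tweets_df is hypothetical, while the column names are taken from the str() output above.

```r
# Mock data frame with a few of the columns shown in the str() output.
# The rows are invented placeholders, not real tweets.
tweets_df <- data.frame(
  text = c("RT @someone: example text",
           "An original example tweet",
           "RT @someone: example text"),
  screenName = c("user_a", "user_b", "user_c"),
  retweetCount = c(5, 0, 5),
  isRetweet = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Keep only original tweets (rows that are not retweets), with two columns
originals <- tweets_df[!tweets_df$isRetweet, c("screenName", "text")]

# Count how many of the rows are retweets
n_retweets <- sum(tweets_df$isRetweet)

print(originals)
print(n_retweets)
```

With a real result, the same two lines should work on tweetsA.df in place of tweets_df.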


Displaying Tweets

> # pakRay note: display specific tweets, for example the second and third
> for (i in 2:3) {
+   cat(paste0("[", i, "] "))
+   writeLines(strwrap(tweetsA.df$text[i], 60))
+ }
[2] RT @FOXSportsAsia: &gt; Team's dismal performances
&gt;Mixing duties as North Sumatra Governor and PSSI
chairman

@PSSI chairman Edy Rahmayadi ha…
[3] RT @FOXSportsAsia: &gt; Team's dismal performances
&gt;Mixing duties as North Sumatra Governor and PSSI
chairman

@PSSI chairman Edy Rahmayadi ha…


