EPPS 6302: Assignment 3

Analysis of Twitter Data: Biden-Xi Summit


This assignment analyzes Twitter data about the November 2021 summit between President Biden and President Xi. We explore public sentiment, popular hashtags, and user engagement using several text analysis techniques.

Data Loading and Preparation

# Load the required packages (this report was rendered with quanteda 3.3.1 and tidyverse 2.0.0)
library(quanteda)
library(quanteda.textmodels)
library(quanteda.textplots)
library(ggplot2)
library(knitr)
library(tidyverse)

# Load the data
summit <- read_csv("https://raw.githubusercontent.com/datageneration/datamethods/master/textanalytics/summit_11162021.csv")
Rows: 14520 Columns: 90
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (50): screen_name, text, source, reply_to_screen_name, hashtags, symbol...
dbl  (26): user_id, status_id, display_text_width, reply_to_status_id, reply...
lgl  (10): is_quote, is_retweet, quote_count, reply_count, ext_media_type, q...
dttm  (4): created_at, quoted_created_at, retweet_created_at, account_create...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
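
The column specification above reports user_id and status_id as dbl. Twitter IDs are 64-bit integers that can exceed 2^53, so a double cannot always hold every digit. The sketch below illustrates the effect in base R with an example 19-digit status ID and a one-row inline CSV; the variable names and the CSV text are hypothetical and not part of the assignment. With readr, passing something like `col_types = cols(status_id = col_character())` to `read_csv()` should serve the same purpose.

```r
# Illustration only: a 19-digit status ID stored as text vs. as a double.
id_chr <- "1460701806644715521"
id_num <- as.numeric(id_chr)

# Doubles carry 53 bits of precision, so the round trip changes the digits.
id_back <- format(id_num, scientific = FALSE)
identical(id_back, id_chr)

# Reading the ID column as character keeps every digit.
csv_text <- "status_id,text\n1460701806644715521,example tweet\n"
ids <- read.csv(text = csv_text, colClasses = c(status_id = "character"))
identical(ids$status_id, id_chr)
```

This matters mainly when IDs are used to join tables or to rebuild tweet URLs; the text analysis below only uses the text column.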
# Display the data
First six rows of summit, showing six of the 90 columns:

| screen_name | created_at | text | source | is_retweet | retweet_count |
|:---|:---|:---|:---|:---|---:|
| DSJ78992721 | 2021-11-16 20:10:23 | Breaking News: US President Biden & communist china leader, xi jinpig, pledged at a virtual summit to improve cooperation, but china offered no major breakthroughs. | Twitter for iPhone | TRUE | 7 |
| bradhooperarch | 2021-11-16 20:10:17 | https://t.co/rKRzwyIvcy Peter Dutton so disappointed | Twitter for iPad | TRUE | 7 |
| scarecrow1113 | 2021-11-16 20:10:10 | [Recap] Biden urges ‘guardrails’ against conflict in virtual Xi summit https://t.co/ezK97cFBG4 | Twitter Web App | TRUE | 5 |
| scarecrow1113 | 2021-11-15 19:24:04 | U.S. President Joe Biden and his Chinese counterpart Xi Jinping are expected to discuss damage control, rather than resolution of key differences, as the main focus of their video summit later today Washington time. https://t.co/iUNHwMvsAO | Twitter Web App | TRUE | 5 |
| Internl_Leaks | 2021-11-16 06:22:29 | #BREAKING Biden opens virtual summit with China’s Xi from White House #BreakingNews #Biden #China #Usa | Twitter for Android | FALSE | 3 |
| Internl_Leaks | 2021-11-16 20:09:36 | #BREAKING Biden opens virtual summit with China’s Xi from White House #BreakingNews #Biden #China #Usa | Twitter for Android | TRUE | 3 |

Latent Semantic Analysis (LSA)

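Latent Semantic Analysis (LSA) reduces the document-feature matrix to a small number of latent dimensions, so that tweets using similar vocabulary sit close together and underlying themes become visible.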

# Extract the tweet text, tokenize it, and build a document-feature matrix
sum_twt <- summit$text
toks <- tokens(sum_twt)
sumtwtdfm <- dfm(toks)

# Perform LSA
sum_lsa <- textmodel_lsa(sumtwtdfm)

# Plot LSA results
lsa_dim1 <- sum_lsa$docs[,1]
lsa_dim2 <- sum_lsa$docs[,2]

plot(lsa_dim1, lsa_dim2, xlab = "LSA Dimension 1", ylab = "LSA Dimension 2", main = "LSA of Summit Tweets", pch = 20)
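
To make the idea behind the LSA step more concrete, here is a minimal base-R sketch of a truncated singular value decomposition on a toy document-term matrix. The matrix, its term names, and the choice of k = 2 are assumptions made for illustration; this is not the internal implementation of `textmodel_lsa()`, only the general technique it is based on.

```r
# Toy document-term matrix: rows are documents, columns are term counts.
dtm <- matrix(
  c(2, 1, 0, 0,
    1, 2, 0, 0,
    0, 0, 1, 2,
    0, 0, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("biden", "summit", "trade", "taiwan"))
)

# Singular value decomposition: dtm = U diag(d) V'
dec <- svd(dtm)

# Keep only the first k dimensions (the "truncated" part of LSA).
k <- 2
doc_coords  <- dec$u[, 1:k]   # one row per document
term_coords <- dec$v[, 1:k]   # one row per term

# Rank-k approximation of the original matrix.
approx_dtm <- doc_coords %*% diag(dec$d[1:k]) %*% t(term_coords)

# Plot the documents in the reduced space, analogous to the plot above.
plot(doc_coords[, 1], doc_coords[, 2],
     xlab = "Dimension 1", ylab = "Dimension 2", pch = 20)
```

Here doc_coords plays a role comparable to `sum_lsa$docs`: each document becomes a point in a low-dimensional space, and documents that share vocabulary land near each other.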

Hashtag Analysis

This section identifies the 25 most frequently used hashtags in the tweets.

# Build a document-feature matrix from the tweets and keep only hashtags
tweet_dfm <- tokens(sum_twt, remove_punct = TRUE) %>%
  dfm()
tag_dfm <- dfm_select(tweet_dfm, pattern = "#*")
# Extract top 25 hashtags
top_hashtags <- topfeatures(tag_dfm, 25)

# Convert to data frame for plotting
df_hashtags <- data.frame(
  hashtag = names(top_hashtags),
  frequency = top_hashtags
)

# Plot the bar graph
ggplot(df_hashtags, aes(x = reorder(hashtag, frequency), y = frequency)) +
  geom_bar(stat = "identity") +
  coord_flip() +  # Flips the axes for easier reading
  labs(title = "Top 25 Hashtags", x = "Hashtag", y = "Frequency")
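
As a rough cross-check on what the hashtag selection and counting do, the following base-R sketch extracts and tallies hashtags from a few made-up tweets. The example tweets and the regular expression are assumptions for illustration; quanteda's tokenizer handles more edge cases than this simple pattern does.

```r
# Hypothetical example tweets (not taken from the summit data).
tweets <- c(
  "#Biden opens virtual summit with Xi #China",
  "Summit recap #Biden #China #Taiwan",
  "No hashtags in this one"
)

# Find every token that starts with "#" followed by word characters.
matches <- regmatches(tweets, gregexpr("#\\w+", tweets))

# Lower-case the tags so "#Biden" and "#biden" are counted together,
# which mirrors the default lower-casing in dfm().
tags <- tolower(unlist(matches))

# Tabulate and sort from most to least frequent.
tag_counts <- sort(table(tags), decreasing = TRUE)
tag_counts
```

The sorted table corresponds to what `topfeatures()` returns for the hashtag-only matrix: a named vector of counts ordered by frequency.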

Hashtag Network Analysis

This section visualizes how hashtags co-occur within the same tweets.

tweet_dfm <- tokens(sum_twt, remove_punct = TRUE) %>%
  dfm()

# Create a network plot of hashtags
tag_fcm <- fcm(dfm_select(tweet_dfm, pattern = "#*"))
toptag <- names(topfeatures(dfm_select(tweet_dfm, pattern = "#*"), 50))

topgat_fcm <- fcm_select(tag_fcm, pattern = toptag)
textplot_network(topgat_fcm, min_freq = 50, edge_alpha = 0.8, edge_size = 5)
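
The network plot is drawn from a feature co-occurrence matrix. The sketch below shows, in base R and on a made-up tag-presence matrix, how such pairwise counts can be derived. The matrix values and names are hypothetical, and this is a conceptual approximation rather than a reproduction of how `fcm()` counts.

```r
# Hypothetical tag-presence matrix: rows are tweets, columns are hashtags,
# and 1 means the hashtag appears in the tweet.
tag_presence <- matrix(
  c(1, 1, 0,
    1, 1, 1,
    0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("tweet", 1:3), c("#biden", "#china", "#taiwan"))
)

# t(X) %*% X counts, for each pair of hashtags, the tweets containing both.
cooc <- crossprod(tag_presence)

# The diagonal holds each tag's own frequency; zero it to keep only pairs.
diag(cooc) <- 0
cooc
```

Each nonzero cell becomes an edge in the network, and a threshold such as the min_freq argument above hides pairs that co-occur too rarely to be of interest.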

User Mention Analysis

This section identifies the 50 most frequently mentioned users and plots how they are mentioned together.

# Extract user mentions
user_dfm <- dfm_select(tweet_dfm, pattern = "@*")
topuser <- names(topfeatures(user_dfm, 50))
user_fcm <- fcm(user_dfm)
user_fcm <- fcm_select(user_fcm, pattern = topuser)
textplot_network(user_fcm, min_freq = 10, edge_color = "#e05248", edge_alpha = 0.8, edge_size = 5)