Generating Fake Trump Tweets with LSTM

1188946
[212] 6
[212, 6] 92
[212, 6, 92] 26
[212, 6, 92, 26] 380
[212, 6, 92, 26, 380] 1079
[212, 6, 92, 26, 380, 1079] 25
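The pairs above read as a growing prefix of token IDs followed by the next token to predict. A minimal sketch of how such pairs could be built is below. It assumes the tweet has already been converted to integer IDs; `prefix_label_pairs` is a hypothetical helper name, and the ID list is copied from the output above purely for illustration.

```python
def prefix_label_pairs(token_ids):
    """Return (prefix, next_token) pairs for one tokenized tweet.

    Every prefix of length 1..n-1 becomes an input, and the token
    that follows it becomes the label.
    """
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]


# Token IDs taken from the printed output above (assumed to be one tweet).
tweet = [212, 6, 92, 26, 380, 1079, 25]
pairs = prefix_label_pairs(tweet)

for prefix, label in pairs:
    print(prefix, label)
```

A tweet of n tokens yields n - 1 training pairs, which is why the number of sequences grows so quickly relative to the number of tweets.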
array([[ 212,    0,    0, ...,    0,    0,    0],
       [ 212,    6,    0, ...,    0,    0,    0],
       [ 212,    6,   92, ...,    0,    0,    0],
       ...,
       [4996,  403,  104, ...,    0,    0,    0],
       [4996,  403,  104, ...,    0,    0,    0],
       [4996,  403,  104, ...,    0,    0,    0]])
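The array above has zeros to the right of each prefix, which suggests the sequences were padded at the end to a common length. Keras offers `pad_sequences(..., padding="post")` for this; the dependency-free sketch below shows the same idea. The `pad_post` helper and the `maxlen=6` value are assumptions for illustration, not the settings used for the array above.

```python
def pad_post(sequences, maxlen=None, value=0):
    """Right-pad every sequence with `value` up to `maxlen`.

    Sequences longer than `maxlen` are cut at the end in this sketch.
    """
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    padded = []
    for s in sequences:
        s = list(s[:maxlen])
        padded.append(s + [value] * (maxlen - len(s)))
    return padded


prefixes = [[212], [212, 6], [212, 6, 92]]
padded = pad_post(prefixes, maxlen=6)

for row in padded:
    print(row)
```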
MemoryError: Unable to allocate 255. GiB for an array with shape (1188946, 57668) and data type float32
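The shape in the error message is one row per training sequence (1188946) and, presumably, one column per vocabulary word (57668), which is what one-hot encoding the labels would produce. The arithmetic below reproduces the 255 GiB figure and compares it with keeping the labels as plain integers. That the array is a one-hot label matrix is my assumption. If it is, a common workaround is to keep integer labels and train with a sparse categorical cross-entropy loss (Keras provides `sparse_categorical_crossentropy`), or to one-hot encode batch by batch in a generator.

```python
n_samples = 1_188_946   # number of training sequences (from the error message)
vocab_size = 57_668     # second dimension of the array (assumed vocabulary size)
bytes_per_value = 4     # float32 and int32 both take 4 bytes

# Dense one-hot label matrix: one float32 per (sample, word) cell.
one_hot_bytes = n_samples * vocab_size * bytes_per_value
one_hot_gib = one_hot_bytes / 2**30

# Integer labels: a single int32 per sample.
int_label_bytes = n_samples * bytes_per_value
int_label_mib = int_label_bytes / 2**20

print(f"one-hot labels: {one_hot_gib:.1f} GiB")
print(f"integer labels: {int_label_mib:.1f} MiB")
```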
democrat senators are doing a great job . i am not
democratic states , the democrats are not going to be a total disaster .
republican senators have a great job for the great state of texas . he will be a great governor ! #maga #kag and , @senatorheitkamp. and , others , the people gop senators must stop the flights from the united states . obama ’s campaign is a total disaster . biden has been a total disaster . i will be back soon ! #maga #kag #tcot @foxbusiness oh well , i ’m not going to be a total mess .
republican senators are working hard to get the job done in the senate . we have a great state and , great healthcare ! we need strong borders and crime !
obama is a disaster for the people . he is a disaster . he is a great guy . he is a winner . he is a winner . he is a winner . he is a winner . he is a great guy and a great guy . he will be missed !

bernie sanders is lying to the people of the united states . he is a total mess . he is a total mess . he is a total mess . he is a total mess ! he is a total mess ! he is a corrupt politician ! a total witch hunt ! no collusion , no obstruction . the dems don ’t want to do it . he is a corrupt politician ! he is a corrupt politician ! he is a corrupt politician ! he is strong on crime , borders , and , the enemy of the people !
democrats stole election results . they are a disgrace to our country , and , we will win ! gop senators are working hard on the border crisis . the dems are trying to take over the border . they are now trying to take away our laws . biden will bring back our country , and we are going to win the great state of texas . we need you in a second election .
  1. Garbage in, garbage out. 90% of success stems from good data, and the preprocessing here could be more careful. For instance, you can try removing hashtags: I found that predictions always fall into a “vicious circle” of hashtags when the model doesn’t know what to predict. It simply outputs a ton of unrelated hashtags, which obviously don’t have much value. Another thing to try would be to drop tweets that are too short or too long.
  2. Model architecture. I was hoping to achieve better results with a deeper NN with fewer units, but a shallower, wider NN worked better for me. You can experiment with the number of layers, the number of units, and the dropout rate.
  3. Replace the Embedding layer with actual word embeddings trained on your dataset, such as GloVe or Word2vec.
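
The first suggestion is easy to try. Here is a minimal sketch of hashtag removal plus a length filter, assuming tweets are already lowercased, whitespace-tokenizable strings like the samples above. The `clean_tweets` name and the `min_tokens` / `max_tokens` thresholds are hypothetical and would need tuning on the real dataset.

```python
import re

# Matches a hashtag such as "#maga" (a "#" followed by word characters).
HASHTAG = re.compile(r"#\w+")


def clean_tweets(tweets, min_tokens=4, max_tokens=60):
    """Strip hashtags, then keep only tweets within the token-length bounds."""
    cleaned = []
    for text in tweets:
        text = HASHTAG.sub("", text)
        text = " ".join(text.split())  # collapse leftover whitespace
        n_tokens = len(text.split())
        if min_tokens <= n_tokens <= max_tokens:
            cleaned.append(text)
    return cleaned


sample = [
    "he will be a great governor ! #maga #kag",
    "#maga #kag #tcot",   # nothing left after hashtag removal
    "thank you !",        # shorter than min_tokens
]
result = clean_tweets(sample)
print(result)
```

Filtering on length after removing hashtags means a tweet made up only of hashtags is dropped rather than kept as an empty string.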

CS PhD @ LSU. Passionate about statistics, ML, and NLP.

Miroslav Tushev