INDEX
Explanations
mentions of individuals or references to social media interactions
Negative Logits
LabelTagHelper  -0.63
ỡng             -0.60
Roskov          -0.60
mybatisplus     -0.57
actionTypes     -0.57
brainly         -0.56
ioutil          -0.56
nahilalakip     -0.56
jss             -0.55
gynhyrchwyd     -0.55
Positive Logits
tweeted    1.19
tweets     1.14
tweet      1.13
tweeting   1.09
(missing)  1.07
(missing)  1.05
Tweets     1.00
Tweet      0.99
(missing)  0.96
(missing)  0.94
Activations Density 0.082%