Explanations
mentions of social media platforms like Facebook and Twitter
Negative Logits
sole       -0.75
defects    -0.71
unborn     -0.68
hani       -0.68
recoil     -0.68
uties      -0.68
idates     -0.67
abilities  -0.65
older      -0.65
emen       -0.64
Positive Logits
(token missing)  1.20
Youtube          1.15
(token missing)  1.15
(token missing)  1.14
(token missing)  1.12
YouTube          1.10
Internet         1.09
(token missing)  1.07
acebook          1.06
Tumblr           1.05
Activations Density 0.242%