INDEX
Explanations
social media apps
mentions of the messaging application WhatsApp
Negative Logits
ci            -0.83
ively         -0.68
irk           -0.68
ura           -0.63
uct           -0.63
interpreted   -0.62
ado           -0.62
eline         -0.62
honored       -0.61
ae            -0.61
Positive Logits
Whats         4.20
whats         2.17
Lets          1.37
Pastebin      1.29
yip           1.15
thats         1.06
(blank)       1.00
20439         0.99
VK            0.98
wast          0.98
Activations Density 0.029%