INDEX
Explanations
intense emotions and aggressive language
Negative Logits
toppers            -0.64
?).                -0.58
?')                -0.58
uhi                -0.57
tzmann             -0.56
насеље             -0.56
vieles             -0.56
Portály            -0.55
вроде              -0.55
XmlAccessorType    -0.54
Positive Logits
🖕                 0.79
motherfucker       0.73
autorytatywna      0.68
bitch              0.67
dares              0.64
fucker             0.63
insol              0.63
黙                 0.61
fucking            0.60
fuck               0.60
Activations Density 0.150%
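
The page does not define the density figure. One common reading, assumed here, is that it is the fraction of token positions at which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 2000 token positions.
sample = [0.0] * 1997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.150% would mean the feature fires on roughly 15 of every 10,000 tokens, which fits a feature tied to a narrow class of tokens.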