INDEX
Explanations
instances of the word "we" and related forms indicating collective action or sentiment
Negative Logits
.cx          -0.15
etus         -0.15
Butt         -0.14
stal         -0.14
asse         -0.14
emiz         -0.14
upuncture    -0.14
zz           -0.14
bjerg        -0.14
ass          -0.13
Positive Logits
believe      0.21
estimate     0.18
zes          0.17
Believe      0.16
view         0.16
ivo          0.15
-bel         0.15
роз          0.15
believes     0.15
bel          0.15
Activations Density 0.091%