INDEX
Explanations
references to global and societal issues
Negative Logits
afone     -0.17
ruba      -0.16
uristic   -0.15
strup     -0.15
влад      -0.14
ifo       -0.14
akit      -0.14
837       -0.14
ulet      -0.14
odings    -0.14
Positive Logits
-wide         0.23
itself        0.22
wide          0.20
experienced   0.17
overall       0.16
collectively  0.16
isti          0.16
wide          0.15
/system       0.15
te            0.15
Activations Density: 0.126%