Explanations
discourse about the implications and seriousness of violence and societal issues
Negative Logits
éĮĦ          -0.13
Ñģвоими      -0.13
ader         -0.12
adero        -0.12
μεν          -0.12
ierung       -0.12
adies        -0.12
iem          -0.12
Afterwards   -0.12
Due          -0.12
Positive Logits
blanket      0.20
few          0.20
I            0.20
Taken        0.19
absent       0.19
Taken        0.18
taken        0.17
such         0.17
plenty       0.17
context      0.17
Activations Density 0.380%