INDEX
Explanations
terms related to advocacy and support for various causes
NEGATIVE LOGITS
idar        -0.17
ald         -0.17
æµģ         -0.15
quo         -0.15
ish         -0.14
anou        -0.14
icamente    -0.14
lement      -0.14
缮          -0.14
út          -0.14
POSITIVE LOGITS
against     0.21
ise         0.18
Against     0.17
ur          0.16
against     0.15
groups      0.15
ruz         0.15
ilon        0.15
roles       0.15
/support    0.15
Activations Density 0.023%