INDEX
Explanations
concepts related to advocacy and support for specific social and political issues
Negative Logits
gue        -0.18
esub       -0.17
ouro       -0.16
ibur       -0.15
operands   -0.15
stru       -0.15
OLER       -0.15
_bd        -0.15
Buen       -0.15
DAQ        -0.15
Positive Logits
951        0.15
ż          0.15
bor        0.14
киш        0.14
rik        0.14
heimer     0.14
IDI        0.14
endid      0.14
ishi       0.14
idi        0.14
Activations Density 0.668%
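
Activation density is usually read as the share of sampled tokens on which the feature fires. The sketch below assumes that definition; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from this dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard does not state its exact definition.
    """
    if len(activations) == 0:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 1 of 4 tokens.
sample = [0.0, 0.0, 0.5, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.668% would mean the feature is active on roughly 7 of every 1,000 sampled tokens.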