INDEX
Explanations
references to government, state functions, and critiques of the social state
Negative Logits
reh         -0.17
leigh       -0.16
inho        -0.15
aeda        -0.15
506         -0.15
oppable     -0.14
ecast       -0.14
-generic    -0.13
ottom       -0.13
ingen       -0.13
Positive Logits
AMA         0.19
onya        0.17
ema         0.16
vertime     0.15
EMA         0.15
ÄĻż         0.14
avern       0.14
TEM         0.14
Allies      0.14
erno        0.14
Activations Density: 0.174%