INDEX
Explanations
references to political figures and their affiliations
Negative Logits
mvc          -0.16
ä¸Ī          -0.14
crowned      -0.14
ÑĩÑĥ         -0.14
Installer    -0.14
amer         -0.13
archy        -0.13
ogui         -0.13
æħ¢          -0.13
receptive    -0.13
Positive Logits
Coal          0.21
Mand          0.20
mandates      0.20
mand          0.20
Lists         0.19
coal          0.18
lists         0.18
candid        0.18
proportional  0.18
abst          0.18
Activations Density: 0.030%