Explanations
prominent political figures and their affiliations
Negative Logits
Token      Logit
Kingdom    -0.16
onomy      -0.16
OTA        -0.15
itas       -0.14
OWER       -0.14
legen      -0.14
\Bundle    -0.14
ابÙĩ       -0.13
kingdom    -0.13
crown      -0.13
Positive Logits
Token      Logit
ifar       0.19
ë¹Į        0.16
osto       0.15
iban       0.15
Harm       0.15
iken       0.14
overe      0.14
isoft      0.14
ibold      0.14
_BORDER    0.14
Activations Density: 0.048%