INDEX
Explanations
words related to governance and political entities
Negative Logits
er      -0.23
eron    -0.18
erne    -0.17
ól      -0.17
erro    -0.17
erer    -0.16
erap    -0.16
å¹ķ     -0.16
erp     -0.15
erot    -0.15
Positive Logits
itch      0.20
adia      0.19
antage    0.19
ascular   0.18
à¥įह      0.17
aled      0.17
ariate    0.16
olution   0.16
ell       0.16
anni      0.16
Activations Density 0.026%