Explanations
terms related to political positions or stances
Negative Logits
aeda        -0.15
isay        -0.15
ácil        -0.15
pedia       -0.15
ôte         -0.15
ween        -0.15
lena        -0.14
767         -0.14
ongyang     -0.14
amedi       -0.14
Positive Logits
.FontStyle  0.15
IBUT        0.15
STA         0.15
ãĤ«ãĥ«      0.14
.xtext      0.14
criptor     0.14
ee          0.14
Kom         0.14
exemplary   0.14
æŁľ         0.14
Activations Density: 0.007%