INDEX
Explanations
names of prominent individuals and their relationships in political contexts
Negative Logits
endor   -0.16
Č       -0.15
ocab    -0.15
cia     -0.15
avor    -0.14
otas    -0.14
å»·     -0.14
fos     -0.14
-val    -0.14
gis     -0.13
Positive Logits
ÑģÑĤи   0.15
rites   0.14
μÎŃνα   0.14
éĢĨ     0.14
»       0.14
.NULL   0.14
izr     0.13
idence  0.13
#$      0.13
ayed    0.13
Activations Density 0.005%