INDEX
Explanations
names of political figures and entities
names of people and related officials
Negative Logits
espie        -0.65
hirt         -0.64
cale         -0.63
aminer       -0.61
taboola      -0.57
enhagen      -0.57
advertising  -0.57
mot          -0.57
ierrez       -0.53
COURT        -0.53
Positive Logits
ĪĴ       0.85
ulhu     0.65
TBD      0.57
querque  0.53
Shogun   0.51
Miko     0.51
INAL     0.50
Ribbon   0.50
ر        0.49
AIDS     0.48
Activations Density: 0.586%