INDEX
Explanations
names of individuals, particularly those associated with leadership or authority
Negative Logits
kla          -0.17
عان          -0.16
apis         -0.16
uesta        -0.15
END          -0.15
ONA          -0.14
áp           -0.14
Millis       -0.14
_endpoint    -0.13
pair         -0.13
Positive Logits
utz          0.20
nu           0.16
ieren        0.15
lez          0.15
haf          0.15
ridged       0.15
ustom        0.14
GLOBALS      0.14
iggers       0.14
ADED         0.14
Activations Density: 0.010%