INDEX
Explanations
references to specific individuals and their affiliations or titles
Negative Logits
ÙĴÙģ        -0.16
ffe         -0.15
adin        -0.14
اÙĪÛĮ      -0.14
wel         -0.14
úi          -0.14
uristic     -0.14
pcion       -0.14
seedu       -0.13
udded       -0.13
Positive Logits
.pa         0.15
ills        0.15
arez        0.14
ulan        0.14
åİħ         0.13
/Page       0.13
utherland   0.13
zzo         0.13
ker         0.13
alc         0.12
Activations Density 0.344%