INDEX
Explanations
titles or roles associated with individuals, particularly in formal or authoritative contexts
Negative Logits
aeda        -0.18
upt         -0.17
ίγ          -0.15
titled      -0.15
imest       -0.15
ieber       -0.14
unnamed     -0.14
_GENER      -0.14
åı«         -0.14
-widget     -0.14
Positive Logits
who         0.23
whom        0.22
whose       0.19
who         0.17
from        0.14
nearest     0.14
gh          0.14
.k          0.13
himself     0.13
Ïģιά        0.13
Activations Density: 0.103%