INDEX
Explanations
names of individuals or entities, specifically focusing on the last names
mentions of specific surnames or individuals' names
Negative Logits
ーテ    -0.83
士    -0.82
ゼウス    -0.82
ソ    -0.79
龍喚士    -0.75
[|    -0.73
ISH    -0.73
ILCS    -0.72
mental    -0.70
EED    -0.70
Positive Logits
lyak    0.93
Pes    0.92
berman    0.87
achy    0.84
kens    0.83
ollah    0.83
waters    0.78
ablishment    0.77
gom    0.77
ky    0.76
Activations Density 0.013%