INDEX
Explanations
specific names and positions associated with individuals or entities
Negative Logits
himself    -0.32
Himself    -0.24
his        -0.20
his        -0.18
seinen     -0.18
sám        -0.17
seiner     -0.15
他的       -0.14
его        -0.14
seine      -0.14
Positive Logits
alike          0.45
respectively   0.42
respective     0.30
각각           0.28
themselves     0.25
分别           0.24
соответ        0.22
sowie          0.21
两人           0.21
serta          0.20
Activations Density 0.153%
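As a rough illustration of what the density figure measures, the sketch below assumes it is the fraction of sampled token positions on which the feature activates above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up data: 1000 token positions, with the feature active on 2 of them.
sample = [0.0] * 998 + [1.7, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.153% would mean the feature fires on roughly 1.5 of every 1000 sampled tokens.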