Explanations
possessive forms indicating ownership or association
Negative Logits
小说  -0.21
情  -0.20
人員  -0.19
人気  -0.18
大  -0.18
心  -0.18
人  -0.17
’s  -0.17
人类  -0.17
人员  -0.17
Positive Logits
own  0.26
Own  0.21
\u0080\u0099  0.21
gotta  0.19
\u009d  0.19
own  0.18
'  0.18
-eye  0.18
sake  0.17
been  0.17
Activations Density 0.709%