Explanations
possessive forms indicating ownership or relation
Negative Logits
Token    Logit
心       -0.19
小说     -0.18
人员     -0.18
情       -0.17
大       -0.17
声音     -0.17
人気     -0.17
情况     -0.17
人       -0.16
问题     -0.16
Positive Logits
Token    Logit
own      0.38
Own      0.26
'        0.26
own      0.25
ÂĿ       0.24
ÂĢÂĻ     0.24
Own      0.23
entire   0.22
newest   0.22
latest   0.21
Activations Density 0.341%
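
The density figure above can be read as the share of sampled token positions on which the feature fires. The sketch below illustrates that reading only. It assumes "density" means the fraction of positions with activation above a threshold (zero by default). The function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled token positions on which
    the feature is active. The dashboard's exact definition is not stated.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 100,000 token positions, 341 of them active.
sample = [1.0] * 341 + [0.0] * 99_659
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.341% corresponds to roughly 341 active positions per 100,000 sampled tokens, which marks the feature as sparse.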