Explanations
phrases related to possession or ownership
Negative Logits
Token        Logit
çļĦè¯Ŀ       -0.18
ãĤĪãģĨãģª    -0.18
ä¸Ģ          -0.16
fully        -0.16
with         -0.16
大           -0.16
Äijây        -0.16
ãģĬ          -0.15
alone        -0.15
orem         -0.15
Positive Logits
Token        Logit
sorts        0.41
course       0.32
/from        0.29
(blank)      0.27
course       0.25
-course      0.23
each         0.23
these        0.23
/by          0.23
this         0.22
Activations Density: 1.766%
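
Several tokens in the logit lists (for example çļĦè¯Ŀ and ãģĬ) look like strings written in the printable alphabet that GPT-2-style byte-level BPE tokenizers use for raw bytes. The sketch below shows one way to map such strings back to readable text. It assumes this model's tokenizer follows the GPT-2 byte-to-character table, which the page does not state; `decode_token` is a hypothetical helper name, and strings that do not fit the table are returned unchanged.

```python
def bytes_to_unicode():
    """Build the GPT-2 byte -> printable-character table."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Hypothetical helper: decode a byte-level token string to UTF-8 text.

    Returns the input unchanged when it contains characters outside the
    table or when the recovered bytes are not valid UTF-8.
    """
    try:
        return bytes(CHAR_TO_BYTE[ch] for ch in token).decode("utf-8")
    except (KeyError, UnicodeDecodeError):
        return token


if __name__ == "__main__":
    for tok in ["çļĦè¯Ŀ", "ãģĬ", "with", "大"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Falling back to the original string keeps already-readable entries such as 大 intact, since not every token in the lists appears to be in the byte-level form.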