INDEX
Explanations
sentences with subject-verb agreements and actions taken by collective subjects
Negative Logits
my              -0.37
GenerationType  -0.37
myself          -0.35
me              -0.33
是个            -0.33
larımız         -0.32
என              -0.32
víctima         -0.32
والذي           -0.32
是個            -0.31
Positive Logits
themselves      1.22
Their           1.16
their           1.14
themselves      1.13
their           1.09
Their           1.05
loro            1.02
mereka          0.98
they            0.97
他們的          0.97
Activations Density 1.311%