Explanations
constructions involving possessive forms of 'is' and 'are'
Negative Logits
’s            -0.26
’m            -0.19
声音          -0.18
’n            -0.18
‘s            -0.17
一些          -0.17
大            -0.16
latter        -0.16
情况          -0.16
ologically    -0.15
Positive Logits
been          0.42
got           0.32
gotta         0.29
been          0.29
not           0.29
Been          0.29
BEEN          0.28
gonna         0.27
/'            0.24
got           0.22
Activations Density 0.293%