Explanations
contractions and possessive forms related to individuals
Negative Logits
’s      -0.23
一些    -0.18
latter  -0.18
情况    -0.17
‘s      -0.17
’m      -0.16
声音    -0.16
’n      -0.15
大      -0.15
ously   -0.15
Positive Logits
been    0.40
got     0.33
gotta   0.29
Been    0.28
been    0.28
BEEN    0.27
gonna   0.26
not     0.25
got     0.23
Got     0.22
Activations Density 0.321%