INDEX
Explanations
subject pronouns and references to people
Negative Logits
ToObject    -0.17
ợ           -0.15
anol        -0.15
ÌĢ          -0.14
ersh        -0.14
ÑĥÑģÑĤ      -0.14
én          -0.14
è¦ģ         -0.14
141         -0.14
ForResult   -0.14
Positive Logits
loose       0.32
into        0.26
Loose       0.25
alone       0.25
know        0.23
be          0.22
slide       0.22
onto        0.20
slip        0.19
Slide       0.18
Activations Density 0.056%