Explanations
conditional phrases indicating uncertainty or inquiry
Negative Logits
不了       -0.20
Neither    -0.20
neither    -0.19
keine      -0.17
nowhere    -0.16
keinen     -0.16
Neither    -0.16
anst       -0.16
=no        -0.16
NEVER      -0.16
Positive Logits
/how       0.51
there      0.31
indeed     0.30
anyone     0.26
anybody    0.24
any        0.23
maybe      0.23
/if        0.22
anything   0.21
there      0.21
Activations Density 0.066%