INDEX
Explanations
various punctuation marks and their associated contexts
Negative Logits
また        -0.16
.jp         -0.16
áh          -0.14
ordinate    -0.14
Bias        -0.13
ード        -0.13
ká          -0.13
olt         -0.13
:           -0.13
ement       -0.13
Positive Logits
why         0.24
how         0.23
an          0.20
what        0.19
一种        0.18
a           0.18
Part        0.17
How         0.17
Why         0.17
aka         0.15
Activations Density 0.115%