Explanations
punctuation marks and their associated frequency
Negative Logits
ãi        -0.17
eyse      -0.15
urr       -0.14
eca       -0.14
avras     -0.13
«ĺ        -0.13
oftware   -0.13
Brains    -0.13
avior     -0.12
yses      -0.12
Positive Logits
there     0.26
indeed    0.24
after     0.20
Indeed    0.20
there     0.19
hence     0.19
they      0.19
it        0.19
during    0.19
this      0.19
Activations Density: 0.110%