Explanations
punctuation marks and related formatting characters
Negative Logits
Token    Logit
I        -0.58
(        -0.51
we       -0.49
is       -0.49
by       -0.48
let      -0.48
·        -0.48
↵        -0.48
su       -0.47
you      -0.47
Positive Logits
Token             Logit
tartalomajánló    1.10
виправивши        0.99
itſelf            0.85
ostavi            0.84
ſche              0.83
متعلقه            0.82
endpush           0.81
CURIAM            0.81
ſever             0.76
expandindo        0.75
Activations Density: 0.019%