Explanations
punctuation marks and formatting symbols in the text
Negative Logits
Token           Logit
stb             -0.58
мәкал           -0.58
omiast          -0.56
honom           -0.55
(?)             -0.55
(%)             -0.55
ddelweddau      -0.55
,....           -0.54
IMA             -0.54
--------------  -0.54
Positive Logits
Token           Logit
they            1.19
hence           1.14
namely          1.12
although        1.06
the             1.05
albeit          1.01
namely          1.01
not             1.00
hence           1.00
they            0.96
Activations Density: 0.232%