INDEX
Explanations
punctuation marks and sentence-ending symbols
Negative Logits
Č
-0.16
atial
-0.14
edb
-0.14
yasal
-0.13
ibrated
-0.13
;;;;;;
-0.13
IDS
-0.13
APP
-0.13
n
-0.13
_↵↵
-0.12
Positive Logits
})(
0.19
),
0.17
",
0.16
gem
0.16
eks
0.15
)[
0.15
eval
0.15
},{
0.15
rips
0.15
ÏĥÏī
0.14
Activations Density 0.093%