Explanations
indications of success in responses or messages
success indicator
Negative Logits
Token          Logit
“              -0.34
(not shown)    -0.33
(              -0.32
<h2>           -0.32
"              -0.31
the            -0.30
Bats           -0.30
about          -0.30
[              -0.30
<bos>          -0.30
Positive Logits
Token              Logit
success            1.08
SUCCESS            1.02
Success            1.02
SUCCESS            1.01
Success            0.99
success            0.95
SuccessListener    0.94
uccess             0.87
sucess             0.87
成功               0.87
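The two logit lists can be read as the extremes of one ranking of per-token logit effects. The sketch below is illustrative only: the function name `split_top_logits` and the sample subset are hypothetical, and how the dashboard computes the underlying values is not stated in the source.

```python
from typing import List, Tuple

TokenLogit = Tuple[str, float]


def split_top_logits(
    effects: List[TokenLogit], k: int
) -> Tuple[List[TokenLogit], List[TokenLogit]]:
    """Return (k most negative, k most positive) entries, strongest first.

    Entries are (token, logit) pairs rather than a dict, because the same
    token text can appear more than once in a listing.
    """
    ordered = sorted(effects, key=lambda pair: pair[1])  # ascending by logit
    negative = [pair for pair in ordered if pair[1] < 0][:k]
    positive = [pair for pair in reversed(ordered) if pair[1] > 0][:k]
    return negative, positive


# A small subset of the values listed above, in arbitrary order.
sample = [
    ("the", -0.30),
    ("success", 1.08),
    ("(", -0.32),
    ("成功", 0.87),
    ("“", -0.34),
    ("SUCCESS", 1.02),
]

negative, positive = split_top_logits(sample, k=2)
print(negative)
print(positive)
```

Sorting once and slicing from both ends keeps the negative and positive lists consistent with each other.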
Activations Density 0.013%
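A density figure like this is commonly read as the fraction of tokens on which the feature is active. The sketch below assumes that definition and uses made-up activations (13 active tokens out of 100,000); `activation_density` and its `threshold` parameter are hypothetical names, not part of any tool's API.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 13 of 100,000 tokens.
acts = [0.0] * 99_987 + [1.5] * 13
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, 0.013% means the feature fires on roughly 13 of every 100,000 tokens.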