INDEX
Explanations
long sentences
Tokens marking conversation structure and speaker/header metadata (e.g., header IDs, role labels like user/assistant, and named-speaker tokens).
Negative Logits
comple       -0.06
ient         -0.06
according    -0.06
tantal       -0.06
skeptical    -0.06
animate      -0.06
oky          -0.06
SVM          -0.06
Expect       -0.06
かり         -0.06
Positive Logits
بق           0.07
Σχ           0.07
"") ↵        0.07
.Middle      0.07
:date        0.07
.SetFloat    0.07
Blocked      0.06
.KeyCode     0.06
") ↵         0.06
çarp         0.06
Activations Density: 0.059%