Explanations
Question-answering
The neuron flags tokens in the user-provided “behavior” example sentences (i.e., the actual scenario description) rather than in the surrounding instructions or choices.
Negative Logits
acles       -0.07
IPLE        -0.07
UNIX        -0.07
homicide    -0.06
لن          -0.06
LIVE        -0.06
Unix        -0.06
aptic       -0.06
.Ui         -0.06
epochs      -0.06
Positive Logits
.scrollView    0.06
частини        0.06
ViewChild      0.06
_far           0.06
-guid          0.06
Chavez         0.06
продукт        0.06
orns           0.06
Dumbledore     0.06
downstream     0.05
Activations Density: 0.011%