Explanations
multiple choice options
The neuron fires on the appearance of the answer choice “C” token.
Negative Logits
,get        -0.07
сю          -0.07
,就是       -0.07
migrate     -0.06
described   -0.06
sse         -0.06
__(/*!      -0.06
IED         -0.06
pued        -0.06
eighth      -0.06
Positive Logits
Venom       0.07
ouses       0.06
gauss       0.06
uang        0.06
nodeList    0.06
ектив       0.06
ฤด          0.06
commons     0.06
aleza       0.06
target      0.06
Activations Density: 0.002%