Explanations
questions and prompts
This neuron activates on list-style “What are … that …” questions, i.e. interrogative requests like “What are popular/leading/important X that …?”
metrics related to the evaluation of generative question-answering tasks.
Negative Logits
olume     -0.07
subtract  -0.07
filer     -0.07
cloth     -0.06
six       -0.06
%#        -0.06
izando    -0.06
岁        -0.06
ساب       -0.06
fov       -0.06
Positive Logits
_DESTROY    0.06
/modules    0.06
(%)         0.06
[root       0.06
elseif      0.06
.REG        0.06
DateFormat  0.06
[mid        0.06
postupně    0.06
Woj         0.06
Activations Density: 0.114%