Explanations
following
The neuron strongly activates on the phrase “the following” (as in “Which of the following…”) in question prompts.
Negative Logits
bose          -0.07
dispatcher    -0.06
Jal           -0.06
Como          -0.06
_con          -0.06
Wer           -0.06
荣            -0.06
(formData     -0.06
repeats       -0.06
�             -0.06
Positive Logits
바라          0.07
bringing      0.06
ViewGroup     0.06
fortress      0.06
reading       0.06
-directory    0.06
하우          0.06
sentencing    0.06
청            0.06
disappearing  0.06
Activations Density 0.005%
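
As a rough illustration of what a density figure like this could mean, the sketch below assumes density is the fraction of sampled tokens on which the neuron's activation exceeds a threshold. That definition, the function name activation_density, and the sample data are all assumptions made for illustration; they are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration only; the dashboard may compute
    its density figure differently.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 100,000 tokens, with the neuron firing on 5 of them.
acts = [0.0] * 100_000
for i in (10, 2_000, 35_000, 70_000, 99_999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.005%
```

Under this assumed definition, a density of 0.005% corresponds to the neuron firing on about 5 tokens in every 100,000, which fits a neuron that responds to one specific phrase.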