Explanations
suggestions and attempts
The neuron responds to words expressing personal intention or effort (e.g. “attempt,” “intent,” “committed,” “believe”).
Negative Logits
Doch	-0.07
Recognition	-0.07
.↵↵↵↵↵↵↵↵↵↵↵↵	-0.06
Mp	-0.06
Mü	-0.06
اینتر	-0.06
�	-0.06
錯	-0.06
ambre	-0.06
.apps	-0.06
Positive Logits
($.	0.06
"#	0.06
ČR	0.06
üçüncü	0.06
Batch	0.06
#	0.06
ість	0.06
lawn	0.06
equival	0.05
expr	0.05
Activations Density: 0.061%