Explanations
advice/warnings
This neuron activates on imperative or warning language—phrases giving instructions or urging action (e.g. “pay attention,” “run away,” “watch out”).
Negative Logits
Token            Logit
gae              -0.07
Airport          -0.06
deer             -0.06
.tick            -0.06
.numericUpDown   -0.06
unsuccessful     -0.06
रव               -0.06
Price            -0.06
Иванов           -0.06
пов              -0.06
Positive Logits
Token    Logit
aşam     0.07
uib      0.07
Reduce   0.07
>,       0.07
.bo      0.07
バ       0.07
يب       0.06
inhab    0.06
سب       0.06
(argc    0.06
Activations Density: 0.046%
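As a rough illustration of how a density figure like this could be computed, the sketch below assumes that "activations density" means the percentage of tokens on which the neuron's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; they are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumed definition: density = 100 * (active tokens) / (all tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 46 active tokens out of 100,000 (values are made up).
sample = [1.0] * 46 + [0.0] * 99_954
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.046% would mean the neuron fires on roughly 46 of every 100,000 tokens, which fits a feature tied to a narrow class of phrases.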