Explanations
reasonable
This neuron detects moderate performance qualifiers, especially the word “reasonable” (and similar hedging adjectives) used to describe acceptable specs.
Negative Logits
Token        Logit
μο           -0.07
58           -0.06
Frid         -0.06
Radius       -0.06
utterly      -0.06
matched      -0.06
glyc         -0.06
commerc      -0.06
FormGroup    -0.06
pointless    -0.06
Positive Logits
Token        Logit
.rev         0.07
|$           0.06
ja           0.06
'])){↵       0.06
iVar         0.06
الوطني       0.06
stoup        0.06
.'↵↵         0.06
provocative  0.06
untu         0.06
Activations Density: 0.046%