INDEX
Explanations
This neuron detects positive, subjective evaluative words and intensifiers (e.g., “awesome,” “incredible,” “great,” “fun”).
Negative Logits
iconductor    -0.07
tz    -0.07
orneys    -0.06
Rendering    -0.06
.setX    -0.06
دی    -0.06
question    -0.06
uild    -0.06
CTX    -0.06
cı    -0.06
Positive Logits
股    0.07
misses    0.06
Eş    0.06
ありがとう    0.06
outrageous    0.06
bella    0.06
"user    0.06
育    0.06
seemed    0.06
sessions    0.06
Activations Density 0.035%
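
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activation density" means the fraction of token positions on which the neuron's activation is above zero; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: density is "share of tokens on which the neuron fires";
    this definition is not confirmed by the page.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 7 firing positions out of 20,000 tokens.
sample = [0.0] * 19993 + [1.2, 0.4, 2.5, 0.9, 0.1, 3.3, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.035% would mean the neuron fires on roughly 35 of every 100,000 tokens, which is consistent with a sparse, narrowly selective unit.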