Explanations
This neuron detects words that signal positive change, improvement, or encouragement (e.g., make, give, transform, boost, keep).
Negative Logits
FUL             -0.07
X               -0.06
署              -0.06
Mikhail         -0.06
']),↵           -0.06
_AVAILABLE      -0.06
ningen          -0.06
.StretchImage   -0.06
сто             -0.06
bakan           -0.06
Positive Logits
onChangeText    0.07
těz             0.06
ayo             0.06
شود             0.06
Stage           0.06
อย              0.06
female          0.06
__)↵            0.06
इत              0.06
izzle           0.06
Activations Density: 0.136%
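
The page does not define how the density figure is computed. A minimal sketch, assuming it means the share of sampled tokens on which the neuron's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the neuron fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 136 active tokens out of 100,000 sampled tokens.
sample = [0.0] * 99_864 + [1.0] * 136
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.136%
```

Under this reading, a density of 0.136% would mean the neuron fires on roughly one token in 735, so it is a sparse feature.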