Explanations
challenges
This neuron detects terms referring to difficulties or obstacles (e.g., “challenge(s)” and “hardship”).
Negative Logits
Rice      -0.07
Rut       -0.07
purple    -0.07
Pear      -0.07
foot      -0.07
oot       -0.07
_ot       -0.07
UInt      -0.07
egret     -0.07
(rot      -0.07
Positive Logits
challenge      0.14
Challenge      0.13
challenging    0.11
Challenge      0.11
challenge      0.10
challenges     0.10
Challenges     0.09
Challenger     0.09
challenged     0.08
Chase          0.08
Activations Density: 0.021%