INDEX
Explanations
This neuron activates on the word “shy.”
Negative Logits
sponsored    -0.07
music        -0.07
Less         -0.07
.RES         -0.07
Res          -0.06
दम           -0.06
rent         -0.06
evidence     -0.06
beers        -0.06
runtime      -0.06
Positive Logits
')}}"        0.08
(PDO         0.07
aba          0.07
Shrine       0.07
のような     0.06
ABA          0.06
<Character   0.06
씨           0.06
Daisy        0.06
<thead       0.06
Activations Density 0.005%