Explanations
The neuron specifically detects occurrences of the phrase “it’s good.”
Negative Logits
capsule     -0.06
assez       -0.06
امه         -0.06
"}}>↵       -0.06
bedrooms    -0.06
Palo        -0.06
%;↵         -0.06
cps         -0.06
tespit      -0.06
�           -0.06
Positive Logits
Рё          0.07
interle     0.06
vé          0.06
(blank)     0.06
:".$        0.06
spep        0.06
將          0.06
Kenneth     0.06
облас       0.06
الإ         0.06
Activations Density 0.004%
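
To make the density figure concrete, here is a minimal sketch of one way such a number could be computed. It assumes that "density" means the fraction of token positions on which the neuron's activation exceeds a threshold (zero here). The page does not state that definition, and the function name and sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: this is one plausible definition of density; the dashboard
    may use a different threshold or sampling scheme.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 4 active positions out of 100,000 tokens.
sample = [0.0] * 99_996 + [1.2, 0.8, 2.5, 0.3]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.004%"
```

Under this reading, a density of 0.004% means the neuron fires on roughly 4 of every 100,000 tokens, which would fit a neuron tied to one specific phrase.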