Explanations
The neuron activates on mentions of “ease of use,” i.e. phrases highlighting usability.
Negative Logits
theres    -0.07
něn    -0.07
तरफ    -0.06
่าเป    -0.06
خواست    -0.06
erable    -0.06
autogenerated    -0.06
lavish    -0.06
`s    -0.06
ник    -0.06
Positive Logits
art    0.07
Bond    0.07
Priority    0.07
업    0.07
Wind    0.07
cost    0.07
Eat    0.07
организации    0.07
Ruth    0.06
stone    0.06
Activations Density: 0.005%