Explanations
describing properties/usefulness
The neuron fires on promotional or evaluative phrasing, particularly the “making it an ideal choice for…” style of superlative or comparative language.
Negative Logits
removeFromSuperview    -0.07
Fischer                -0.06
Response               -0.06
967                    -0.06
panicked               -0.06
zych                   -0.06
’i                     -0.06
throat                 -0.06
junction               -0.06
Compar                 -0.06
Positive Logits
Tags         0.07
STDERR       0.07
祭           0.06
'><          0.06
いつ         0.06
资           0.06
HOH          0.06
isay         0.06
도별         0.06
totalTime    0.06
Activations Density 0.055%