Explanations
folding movement
The neuron fires on words describing folding or stowing actions or positions (e.g. “fold,” “folded,” “stowed,” “flat”).
Negative Logits
Rus            -0.08
merciless      -0.07
notice         -0.07
Independent    -0.07
872            -0.07
, ↵            -0.07
873            -0.06
402            -0.06
(){↵           -0.06
くん           -0.06
Positive Logits
Schedule       0.07
.</            0.07
磁             0.06
unpaid         0.06
Veter          0.06
Voyage         0.06
SEP            0.06
Display        0.06
ató            0.06
embod          0.06
Activations Density 0.010%
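As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that activation density means the fraction of sampled tokens on which the neuron's activation exceeds zero. That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    neuron is active. The dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 10,000 tokens, the neuron fires on exactly one.
toy_activations = [0.0] * 9999 + [1.5]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # formatted as a percentage with three decimals
```

Under this reading, a density of 0.010% corresponds to the neuron firing on about one token in every 10,000 sampled.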