Explanations
recommend
This neuron activates on words and phrases that refer to making or giving recommendations (e.g. “recommendation,” “recomendar”).
Negative Logits
('.')↵    -0.07
Dry    -0.06
swapped    -0.06
apyrus    -0.06
функци    -0.06
нім    -0.06
дан    -0.06
Paper    -0.06
:',↵    -0.06
ोश    -0.06
Positive Logits
voy    0.06
傷    0.06
gıç    0.06
recommendation    0.06
supplemental    0.06
Eq    0.06
ได    0.06
ักษณ    0.06
фото    0.06
بسبب    0.06
Activations Density 0.012%
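
The page does not say how the density figure is computed. A minimal sketch, assuming it is the share of sampled tokens on which the neuron's activation is above zero, expressed as a percentage. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens on which the
    neuron fires at all. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 3 active tokens out of 25,000.
sample = [0.0] * 24_997 + [0.4, 1.2, 0.7]
print(f"{activation_density(sample):.3f}%")  # prints "0.012%"
```

Under this reading, a density of 0.012% means the neuron fires on roughly 1 token in every 8,000, which fits a neuron tied to a narrow concept such as recommendations.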