Explanations
feelings and emotions
The neuron fires on empathetic or reassuring words and short phrases (e.g. “normal,” “important,” “understand,” “can,” “time”) that occur in supportive, consolation‐style responses.
Negative Logits
Perhaps       -0.07
oli           -0.07
PushMatrix    -0.06
jugar         -0.06
.Threading    -0.06
Make          -0.06
potential     -0.06
xml           -0.06
explanation   -0.06
todo          -0.06
Positive Logits
STRU               0.06
(no token shown)   0.06
hana               0.06
dinosaurs          0.06
setError           0.06
.appspot           0.06
lama               0.06
calle              0.06
탁                 0.06
TTY                0.06
Activations Density: 0.023%