Explanations
The neuron fires on courteous, formulaic expressions of thanks, gratitude, or polite greetings in formal addresses.
Negative Logits
^^          -0.07
olics       -0.06
цион        -0.06
Las         -0.06
vasion      -0.06
خاص         -0.06
vron        -0.06
NUM         -0.06
adge        -0.06
_sec        -0.06
Positive Logits
drives      0.07
pela        0.07
Colin       0.06
Snapshot    0.06
_Bl         0.06
]').        0.06
_TRACE      0.06
****************************************************************************  0.06
_mime       0.06
Герм        0.06
Activations Density 0.032%