Explanations
satisfaction
The neuron fires on words expressing emotional relief or pleasurable satisfaction.
Negative Logits
Token         Logit
factors       -0.07
GEN           -0.07
Factors       -0.07
UTF           -0.07
expected      -0.07
feudal        -0.06
-Saharan      -0.06
,F            -0.06
Abr           -0.06
primitives    -0.06
Positive Logits
Token         Logit
etcode        0.07
FACE          0.06
Mellon        0.06
ار            0.06
iktidar       0.06
�             0.06
spr           0.06
django        0.06
saved         0.06
amer          0.05
Activations Density: 0.006%