INDEX
Explanations
personal risk/sacrifice
This neuron activates on words describing personal cost–benefit trade‐offs (e.g. “personal,” “costs,” “benefit”).
Negative Logits
btc           -0.07
Reach         -0.07
iscopal       -0.06
PROGRAM       -0.06
_DEL          -0.06
Adolescent    -0.06
plunge        -0.06
об            -0.06
conscience    -0.06
_free         -0.06
Positive Logits
もしれない    0.06
هناك          0.06
Clause        0.06
/*<<<         0.06
secretly      0.06
τιο           0.06
NamedQuery    0.06
Salem         0.06
EIF           0.06
rowData       0.06
Activations Density 0.052%