Explanations
AI versus human
This neuron detects mentions of human-centered, empathy- or interpersonal-related skills and qualities (e.g., judgment, empathy, human, interpersonal, aspects).
Negative Logits
coch      -0.08
dagger    -0.07
beef      -0.06
Ctx       -0.06
speech    -0.06
ptom      -0.06
arr       -0.06
ocop      -0.06
pá        -0.06
CRY       -0.06
Positive Logits
TREE       0.07
舰         0.06
DIRECTORY  0.06
United     0.06
intf       0.06
icago      0.06
component  0.06
=true      0.06
ğine       0.06
_genre     0.06
Activations Density: 0.025%