INDEX
Explanations
emotions and behaviors related to understanding and relating to others, such as empathy, compassion, and sympathy
Negative Logits
ouver     -0.74
kj        -0.74
bris      -0.71
aer       -0.69
orn       -0.68
heimer    -0.67
corn      -0.65
jong      -0.65
arde      -0.65
umbers    -0.64
Positive Logits
towards     0.83
toward      0.79
sympathy    0.77
uncond      0.73
arity       0.72
ISM         0.72
ately       0.71
reciproc    0.70
Towards     0.70
pard        0.67
Activations Density 11.108%