Explanations
expressions related to empathy and sympathetic feelings
Negative Logits
Token    Logit
ít       -0.17
oplan    -0.16
ecome    -0.15
ý        -0.15
ebi      -0.15
acci     -0.14
sm       -0.14
iert     -0.14
imore    -0.14
boa      -0.14
Positive Logits
Token          Logit
sympath        0.28
sympathetic    0.28
towards        0.26
sympathy       0.26
empath         0.24
compass        0.23
empathy        0.23
toward         0.23
etic           0.21
etically       0.21
Activations Density: 0.015%
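
To make the density figure concrete: assuming it means the fraction of token positions on which the feature fires at all (the page does not define it), 0.015% corresponds to about 3 active positions in every 20,000 tokens. The sketch below illustrates that arithmetic only. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. The source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 20,000 token positions.
sample = [0.0] * 19997 + [1.2, 0.4, 2.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low is consistent with a feature that fires only on a narrow set of contexts, such as the empathy and sympathy vocabulary listed under the positive logits.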