Explanations
concepts related to kindness and compassion
Negative Logits
status      -0.36
stör        -0.36
excitement  -0.34
استنادى     -0.34
heiress     -0.33
climbing    -0.33
preco       -0.32
stag        -0.32
prior       -0.32
싶          -0.32
Positive Logits
Compassion     1.19
Compassion     1.13
compassionate  1.13
compassion     1.11
kindness       1.01
Kindness       0.98
kindly         0.89
فريبيس         0.88
merciful       0.88
kindness       0.88
Activations Density: 0.238%