INDEX
Explanations
positive actions related to helping others and community service
Negative Logits
izzo       -0.20
çķ         -0.17
indh       -0.15
ially      -0.15
ehler      -0.15
ongyang    -0.15
åģ¥        -0.14
izr        -0.14
iture      -0.14
osti       -0.13
Positive Logits
service         0.31
philanth        0.29
humanitarian    0.28
charitable      0.27
helping         0.27
community       0.26
-service        0.25
social          0.25
giving          0.25
charity         0.25
Activations Density: 0.096%