INDEX
Explanations
expressions of making a positive impact or difference in the community
Negative Logits
Token      Logit
볨         -0.14
äºŃ        -0.14
ogn        -0.14
Parl       -0.14
ITOR       -0.14
odo        -0.14
ogany      -0.14
izabeth    -0.13
iasi       -0.13
æĭ¥        -0.13
Positive Logits
Token          Logit
lives          0.27
Lives          0.25
impact         0.25
impact         0.22
Impact         0.22
benefit        0.21
effect         0.21
Impact         0.20
help           0.19
contribution   0.19
Activations Density: 0.321%