INDEX
Explanations
phrases relating to support, inspiration, and helping others
Negative Logits
amedi    -0.16
affle    -0.15
Deal     -0.14
uggle    -0.14
ARGET    -0.14
uisse    -0.14
yz       -0.14
åģ¥      -0.14
udem     -0.14
lé       -0.13
Positive Logits
contribution    0.31
Contribution    0.30
contribute      0.29
help            0.28
contrib         0.26
contributes     0.26
Contrib         0.24
contributing    0.24
impact          0.24
Contrib         0.24
Activations Density: 0.097%