INDEX
Explanations
phrases related to collaboration and community involvement
Negative Logits
Shock        -0.15
533          -0.15
ilon         -0.15
백           -0.14
elow         -0.14
abor         -0.14
aster        -0.14
Choices      -0.13
tfoot        -0.13
oir          -0.13
Positive Logits
like         0.35
lik          0.34
driven       0.27
individuals  0.27
_like        0.26
forward      0.26
Like         0.26
chang        0.25
Like         0.25
committed    0.23
Activations Density: 0.253%