INDEX
Explanations
language related to collaboration and community involvement
Negative Logits
oser        -0.17
Boeh        -0.15
Combo       -0.14
jos         -0.14
ihan        -0.14
_COUNTER    -0.14
hait        -0.14
hou         -0.13
rese        -0.13
bern        -0.13
POSITIVE LOGITS
participation    0.32
particip         0.32
involvement      0.28
Participation    0.26
everyone         0.24
involve          0.23
involving        0.23
everybody        0.23
-shared          0.21
community        0.21
Activations Density 0.091%