Explanations
requests for assistance or input from others
Negative Logits
amba      -0.15
Citation  -0.15
enheim    -0.15
imli      -0.15
jà        -0.15
yne       -0.14
avage     -0.14
olik      -0.14
Binder    -0.14
leve      -0.13
Positive Logits
cooperation  0.19
help         0.18
attention    0.17
attention    0.16
opinion      0.16
sey          0.15
input        0.15
配合         0.15
thoughts     0.14
舍           0.14
Activations Density 0.090%