INDEX
Explanations
descriptive verbs that convey perception and appearances
Negative Logits
irl      -0.06
ntax     -0.06
ocale    -0.06
lis      -0.06
maint    -0.06
ç¶Ń      -0.06
anc      -0.06
zos      -0.06
gan      -0.06
esz      -0.06
POSITIVE LOGITS
like     0.28
Like     0.21
Like     0.20
like     0.19
LIKE     0.19
_like    0.18
likes    0.17
như      0.17
wie      0.17
.like    0.17
Activations Density 0.014%