INDEX
Explanations
emphasized adverbs that suggest consistency or frequency
Negative Logits
は        -0.25
are       -0.25
's        -0.23
is        -0.22
’s        -0.21
were      -0.20
was       -0.20
은        -0.18
adalah    -0.17
will      -0.16
Positive Logits
vět       0.15
wel       0.15
ANNOT     0.14
sembl     0.14
být       0.14
OLON      0.14
ude       0.13
تا        0.13
ifies     0.13
aled      0.13
Activations Density 0.319%