INDEX
Explanations
expressions of uncertainty or speculation
Negative Logits
Probably      -0.17
zřejmě        -0.15
probably      -0.15
Apparently    -0.15
probably      -0.15
muht          -0.15
reportedly    -0.15
Basically     -0.14
обычно        -0.14
licer         -0.14
Positive Logits
even           0.26
because        0.22
none           0.22
some           0.22
not            0.22
it             0.21
something      0.20
someone        0.20
more           0.19
they           0.19
Activations Density 0.055%