INDEX
Explanations
the word "Perhaps" as an indicator of speculation or uncertainty
Negative Logits
Token        Logit
isoft        -0.17
omer         -0.15
assa         -0.15
licer        -0.15
ujemy        -0.15
arna         -0.15
rane         -0.14
rone         -0.14
_unsigned    -0.14
owi          -0.14
Positive Logits
Token        Logit
even         0.19
someday      0.18
slightly     0.18
because      0.18
not          0.17
-times       0.16
some         0.16
Äijây        0.15
none         0.15
more         0.15
Activations Density: 0.036%