INDEX
Explanations
negations or phrases indicating difficulties and complaints
NEGATIVE LOGITS
Token           Logit
sometimes       -0.22
sometimes       -0.22
Sometimes       -0.18
adesh           -0.17
bazen           -0.17
иногда          -0.17
Sometimes       -0.16
ubl             -0.16
occasionally    -0.15
дат             -0.14
POSITIVE LOGITS
Token           Logit
anything        0.26
any             0.26
TOO             0.24
too             0.23
necessarily     0.23
much            0.22
major           0.22
overly          0.22
materially      0.21
terribly        0.21
Activations Density: 0.221%