Explanations
negative expressions or statements indicating non-compliance or rejection
"If you don't" or "If not"
if not / if don't
Negative Logits
only      -0.66
only      -0.63
zwar      -0.60
навіть    -0.59
даже      -0.59
never     -0.57
even      -0.57
sekali    -0.57
なし      -0.57
even      -0.57
Positive Logits
already       0.91
Already       0.77
explicitly    0.76
otherwise     0.74
redan         0.73
expressly     0.72
already       0.71
__':          0.70
ALREADY       0.69
immediately   0.68
Activations Density 0.219%