INDEX
Explanations
variants of the word "or" in various contexts
Negative Logits
NEVER     -0.17
neither   -0.17
Never     -0.16
Never     -0.16
ç¦        -0.15
Neither   -0.15
Neither   -0.15
never     -0.15
_NE       -0.14
inski     -0.14
Positive Logits
not        0.48
not        0.38
Not        0.35
Not        0.34
_not       0.28
.not       0.27
otherwise  0.26
not        0.26
NOT        0.25
-not       0.24
Activations Density 0.051%