INDEX
Explanations
conditional clauses or phrases that indicate exceptions
Negative Logits
ANY         -0.24
任何        -0.17
EVER        -0.17
_ANY        -0.17
anytime     -0.17
SHOULD      -0.16
even        -0.16
nawet       -0.16
_any        -0.16
emplates    -0.15
Positive Logits
absolutely    0.30
specifically  0.29
explicitly    0.27
Absolutely    0.25
absolute      0.24
expressly     0.24
explicit      0.23
explicit      0.23
somehow       0.22
specific      0.22
Activations Density 0.321%
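
As a rough illustration of how a density figure like the one above could be computed, the sketch below treats density as the fraction of token positions where the feature's activation is above a threshold. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page itself does not state how the number was obtained.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.300%"
```

Under this reading, a density of 0.321% would mean the feature fires on roughly 3 of every 1,000 tokens in the sampled text.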