Explanations
phrases that convey conditional statements or exceptions
Negative Logits
326      -0.16
uma      -0.16
809      -0.15
ajar     -0.14
826      -0.14
ummies   -0.14
dera     -0.14
Lair     -0.14
urdy     -0.14
quets    -0.14
Positive Logits
otherwise      0.24
specifically   0.23
otherwise      0.22
explicitly     0.21
OTHERWISE      0.19
expressly      0.18
Otherwise      0.18
specific       0.18
explicit       0.18
específ        0.18
Activations Density 0.041%