Explanations
exceptions or contrasts in a context
the word "except" and its variations, highlighting exceptions or contrasts in statements
Negative Logits
Token               Logit
pires               -0.66
Flavoring           -0.64
natureconservancy   -0.62
intel               -0.56
llah                -0.56
utra                -0.54
ked                 -0.54
parsed              -0.54
ULT                 -0.54
urity               -0.53
Positive Logits
Token               Logit
ional               1.48
insofar             1.06
ing                 1.06
maybe               0.90
perhaps             0.86
for                 0.81
possibly            0.74
arus                0.71
ably                0.71
ed                  0.70
Activations Density: 0.031%