INDEX
Explanations
phrases indicating contrast or exception within a sentence
phrases that indicate negation or limitations in assertions
Negative Logits
ilage    -0.75
iol      -0.72
Circuit  -0.63
iology   -0.61
iov      -0.61
reperto  -0.60
Royale   -0.60
ulas     -0.59
manif    -0.59
Union    -0.59
Positive Logits
necessarily  0.88
whatsoever   0.79
nor          0.75
bothered     0.71
necess       0.69
condone      0.68
endorse      0.67
xus          0.65
ndra         0.65
icably       0.64
Activations Density: 0.044%