Explanations
negations or denials in statements
phrases indicating negation or denial
Negative Logits
éĥ            -0.79
éĹĺ           -0.73
interstitial  -0.71
çīĪ           -0.68
Turns         -0.67
è¿            -0.64
WAY           -0.63
å¥            -0.63
Tours         -0.63
ãĤ¼ãĤ¦ãĤ¹     -0.63
Positive Logits
necessarily   1.44
icably        1.27
icable        1.16
withstanding  1.03
yet           1.01
exactly       1.01
necess        0.95
epad          0.91
orious        0.91
eworthy       0.91
Activations Density: 0.205%
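Several of the negative-logit tokens (for example éĥ and ãĤ¼ãĤ¦ãĤ¹) look like encoding debris but are more likely raw vocabulary strings. The page does not say which tokenizer the model uses; the sketch below assumes a GPT-2 style byte-level BPE vocabulary, where every byte is displayed as a printable stand-in character. Under that assumption, a token string can be mapped back to its bytes and, when those bytes form complete UTF-8, to readable text. The helper names `token_bytes` and `decode_token` are hypothetical and exist only for this illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Remaining bytes (control characters, space, etc.) are shifted to 256+.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Hypothetical helper: recover the raw bytes behind a vocabulary string."""
    return bytes(UNICODE_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Hypothetical helper: decode as UTF-8, marking incomplete sequences."""
    return token_bytes(token).decode("utf-8", errors="replace")


print(token_bytes("éĥ").hex())        # e983 (first two bytes of a 3-byte character)
print(decode_token("ãĤ¼ãĤ¦ãĤ¹"))  # ゼウス
```

If the assumption holds, tokens such as éĥ and è¿ are partial UTF-8 sequences (fragments of CJK characters), which is why they only decode cleanly once combined with the following token.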