Explanations
negations or contradictions in statements
negations or phrases indicating something is not true or not allowed
Negative Logits
Token   Logit
å¥      -0.80
éĹĺ     -0.76
Tours   -0.67
Laws    -0.66
éĥ      -0.65
kamp    -0.64
èĪ      -0.64
ership  -0.64
Might   -0.64
çļ      -0.63
Positive Logits
Token         Logit
necessarily   1.34
icably        1.24
icable        1.21
epad          1.03
exactly       1.02
eworthy       1.01
withstanding  1.01
orious        0.96
yet           0.96
bothered      0.93
Activations Density: 0.155%