INDEX
Explanations
negative contractions, particularly "doesn't."
Negative Logits
toBe          -0.71
varit         -0.61
wären         -0.60
թվական        -0.58
wares         -0.57
aternary      -0.56
اعد           -0.55
seront        -0.54
ریح           -0.53
sebelumnya    -0.53
Positive Logits
do            0.92
DO            0.91
httphttps     0.88
does          0.86
Does          0.85
DOES          0.84
DOES          0.82
does          0.81
didst         0.80
ും            0.77
Activations Density 0.239%