Explanations
contractions of "is not"
negations or phrases indicating absence or lack
Negative Logits
towed          -0.89
facult         -0.72
Jinn           -0.69
Powered        -0.65
transformed    -0.65
doomed         -0.64
guided         -0.64
Antar          -0.63
tricked        -0.62
entertained    -0.62
Positive Logits
't             1.13
dayName        0.78
xus            0.77
AMY            0.75
fore           0.73
eway           0.73
ileged         0.72
nox            0.71
advertisement  0.70
DEN            0.70
Activations Density 0.032%
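
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero; the function name, the threshold parameter, and the toy data are hypothetical and not taken from the source:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 3 of them active.
toy = [0.0] * 10_000
toy[5] = 1.2
toy[100] = 0.4
toy[9_000] = 2.0

density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.030%
```

Under this reading, a density of 0.032% would mean the feature fires on roughly 3 of every 10,000 token positions.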