Negative Logits
attack            -1.06
Attack            -0.91
attack            -0.91
Attack            -0.82
يتيمه             -0.79
attacked          -0.72
ataque            -0.71
ATTACK            -0.70
attaque           -0.68
ConstraintMaker   -0.66
Positive Logits
distractions      0.66
æl                0.60
distraction       0.59
mysteries         0.59
cancellations     0.57
controversies     0.56
drills            0.56
Inquiries         0.56
accidents         0.56
nthetic           0.56
Activations Density: 0.124%