Explanations
instances of the word "two" and its variants
Negative Logits
ly              -1.01
ñores           -0.70
argint          -0.69
ainfi           -0.68
er              -0.66
čaj             -0.66
vectorielles    -0.66
enfans          -0.66
mer             -0.65
vaisselle       -0.64
Positive Logits
Према            0.97
Two              0.86
dozen            0.82
CreateTagHelper  0.82
醐               0.81
Two              0.81
CWE              0.81
abetes           0.79
two              0.78
Twee             0.77
Activations Density: 0.123%
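
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    where the feature fires at all (activation > 0).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 100,000 token positions, 123 of them active.
toy = [0.0] * 99_877 + [1.5] * 123
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.123%
```

Under this reading, a density of 0.123% would mean the feature fires on roughly 1 in 800 tokens, which is in keeping with a feature tied to a single word family such as "two".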