INDEX
Explanations
expressions of uniqueness or distinctiveness
Negative Logits
censura          -0.61
fós              -0.58
scold            -0.56
ędzy             -0.56
ladite           -0.55
inves            -0.55
ConstraintMaker  -0.53
shi              -0.53
devriez          -0.53
InlineData       -0.53
Positive Logits
unique      3.17
unique      2.96
Unique      2.89
Unique      2.84
UNIQUE      2.75
UNIQUE      2.50
uniques     2.41
uniqueness  2.38
uniquely    2.25
unieke      2.20
Activations Density 0.056%