Explanations
phrases that emphasize inclusion or membership in a group or category
Negative Logits
humaines    -0.66
ſche        -0.66
($)         -0.64
lorette     -0.63
thalene     -0.62
Philo       -0.60
sandero     -0.60
lepto       -0.58
Sola        -0.57
Arno        -0.57
Positive Logits
including         1.42
Including         1.36
including         1.29
Including         1.28
INCLUDING         1.26
INCLUDING         1.24
waaronder         1.16
einschließlich    1.11
включая           1.04
incluindo         1.01
Activations Density: 0.113%