INDEX
Explanations
phrases related to limitations and restrictions in processes or decisions
Negative Logits
iele      -0.16
auga      -0.15
619       -0.15
uzzi      -0.14
rase      -0.14
lassian   -0.14
_tA       -0.14
785       -0.14
.Mult     -0.14
ersed     -0.14
Positive Logits
vos       0.16
udy       0.14
بÙĦ       0.14
ennifer   0.14
Jersey    0.14
neither   0.14
Ree       0.14
ãĥĥãĥĹ    0.14
erot      0.14
ith       0.13
Activations Density 0.134%