INDEX
Explanations
phrases indicating completeness or totality
Negative Logits
اÙģØª     -0.16
ebi      -0.15
jem      -0.15
lenme    -0.15
est      -0.15
esti     -0.15
лав      -0.14
ziel     -0.14
oc       -0.14
jen      -0.14
Positive Logits
/full     0.38
ledged    0.27
erton     0.27
filled    0.27
eren      0.25
fled      0.25
-full     0.25
(full     0.24
full      0.24
-scale    0.23
Activations Density 0.053%
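
The density figure above is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of sampled tokens on which the feature fires above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical, made up for illustration, and are not this feature's data:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The page itself does not state the definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9999 + [1.7]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.053% would correspond to the feature firing on roughly 53 of every 100,000 sampled tokens.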