Explanations
phrases indicating completeness or totality
Negative Logits
Token     Logit
jte       -0.17
jen       -0.16
ebi       -0.16
jem       -0.16
ansen     -0.15
zsche     -0.15
ziel      -0.14
ennes     -0.14
zelf      -0.14
ML        -0.14

Positive Logits
Token     Logit
erton     0.32
fled      0.30
blown     0.30
ledged    0.29
eren      0.29
/full     0.29
-length   0.28
-scale    0.28
filled    0.28
ständ     0.28
Activations Density 0.057%