    Explanations

    phrases indicating completeness or totality

    Negative Logits
    Token      Logit
    "jte"      -0.17
    "jen"      -0.16
    "ebi"      -0.16
    "jem"      -0.16
    "ansen"    -0.15
    "zsche"    -0.15
    "ziel"     -0.14
    "ennes"    -0.14
    "zelf"     -0.14
    "ML"       -0.14

    Positive Logits
    Token      Logit
    "erton"    0.32
    " fled"    0.30
    " blown"   0.30
    "ledged"   0.29
    "eren"     0.29
    "/full"    0.29
    "-length"  0.28
    "-scale"   0.28
    "filled"   0.28
    "ständ"    0.28
    Activation Density: 0.057%

    No Known Activations