INDEX
    Explanations

    phrases indicating completeness or totality

    Negative Logits
    اÙģØª      -0.16
    ebi       -0.15
    jem       -0.15
    lenme     -0.15
    est       -0.15
    esti      -0.15
    лав       -0.14
    ziel      -0.14
    oc        -0.14
    jen       -0.14
    Positive Logits
    /full     0.38
    ledged    0.27
    erton     0.27
    filled    0.27
    eren      0.25
     fled     0.25
    -full     0.25
    (full     0.24
    full      0.24
    -scale    0.23
    Act Density 0.053%

    No Known Activations