    Negative Logits
    Token     Value
    "the"     0.60
    "1"       0.51
    "i"       0.46
    "f"       0.44
    "ul"      0.43
    "0"       0.43
    "st"      0.42
    "sthe"    0.38
    "an"      0.38
    "se"      0.37
    Positive Logits
    Token        Value
    "-"          0.53
    " będzie"    0.39
    "URE"        0.39
    "ON"         0.37
    (blank)      0.37
    " ADR"       0.37
    "пить"       0.36
    " будет"     0.36
    "城区"       0.36
    "AY"         0.35
    Act Density: 0.114%

    No Known Activations