    Explanations

    the word "behind" and, to a lesser extent, words related to publishing material

    Negative Logits (tokens quoted to show leading spaces)
    Token          Logit
    " behind"      -2.50
    "behind"       -2.27
    "Behind"       -2.11
    " Behind"      -2.06
    " BEHIND"      -2.03
    " derrière"    -1.94
    " detrás"      -1.75
    " dietro"      -1.72
    "<bos>"        -1.46
    " bakom"       -1.45
    Positive Logits (tokens quoted to show leading spaces)
    Token          Logit
    "}`}"          0.53
    " Starting"    0.51
    " omge"        0.47
    " zase"        0.47
    " labdar"      0.47
    " voet"        0.46
    "vyk"          0.44
    "ışık"         0.44
    "dom"          0.43
    " Altman"      0.43
    Activation Density: 2.545%

    No Known Activations