INDEX
    Explanations

    specific phrases or structures indicating requirements, conditions, or criteria

    Negative Logits
    наче
    -0.15
    iscard
    -0.14
    rawn
    -0.13
    огод
    -0.13
    others
    -0.12
    thing
    -0.12
    /REC
    -0.12
    &S
    -0.12
    Архів
    -0.12
    ojis
    -0.12
    Positive Logits
     following
    1.22
    following
    1.03
     Following
    0.93
    Following
    0.85
    以下
    0.82
     seguint
    0.81
     siguientes
    0.81
     below
    0.80
     siguiente
    0.75
     suiv
    0.69
    Act Density 0.255%

    No Known Activations