INDEX
    Explanations

    common English language

    Negative Logits
    ائية        -0.08
    (identity   -0.07
    authority   -0.07
    ;"↵         -0.07
     sanitized  -0.06
    OG          -0.06
    :“          -0.06
     ruins      -0.06
    <<(         -0.06
     Gy         -0.06
    Positive Logits
     находится  0.07
    σμο         0.07
    ptic        0.06
    click       0.06
     situation  0.06
    reference   0.06
     sticking   0.06
                0.06
     epoll      0.06
     ситуа      0.06
    Act Density 0.000%

    No Known Activations