INDEX
    Explanations

    place names and initialisms

    Negative Logits
    " It"      0.85
    "AT"       0.77
    "t"        0.70
    "re"       0.68
    "an"       0.68
    " for"     0.68
    " that"    0.68
    " it"      0.66
    "that"     0.64
    "et"       0.64
    Positive Logits
    " powied"   0.68
    " были"     0.68
    "ปี"         0.68
    ",."        0.66
    " dolayı"   0.61
    ";"         0.60
    " написа"   0.59
    " περιο"    0.59
    "ز"         0.59
    " σε"       0.58
    Activation Density: 0.766%

    No Known Activations