    Explanations

    English language and related concepts

    Negative Logits
    Token          Value
    "'"            0.79
    " are"         0.64
    " anos"        0.64
    "ara"          0.63
    (not shown)    0.61
    " ia"          0.57
    "UR"           0.56
    "ارين"         0.54
    " azienda"     0.54
    " elle"        0.52

    Positive Logits
    Token          Value
    "อังกฤษ"        1.01
    " English"     0.91
    "English"      0.91
    " Анг"         0.84
    " ENGLISH"     0.82
    " Englisch"    0.80
    " 영어"        0.78
    " Englishman"  0.78
    " inglés"      0.77
    "ENGLISH"      0.76

    Activation Density: 0.023%

    No Known Activations