INDEX
    Explanations

    following specific words

    NEGATIVE LOGITS
    ují  0.45
    ल्ड  0.44
     Prefeitura  0.43
    бу  0.41
     крем  0.40
     curfew  0.40
     パー  0.39
    χει  0.39
     excepción  0.38
    ぶり  0.38
    POSITIVE LOGITS
     objectively  0.44
     അദ്ദേഹ  0.40
     Obwohl  0.38
     geboren  0.38
    External  0.37
     analogous  0.37
     কীভাবে  0.36
     نسل  0.35
     descubre  0.35
    ega  0.35
    Act Density 0.015%

    No Known Activations