    Explanations

    references to historical figures or religious leaders

    Negative Logits
    Token        Logit
    " çoğ"       -0.16
    " Değer"     -0.15
    " hâl"       -0.15
    "ataire"     -0.15
    "deş"        -0.14
    " бли"       -0.14
    "abaj"       -0.14
    "rej"        -0.14
    "REQ"        -0.14
    "evice"      -0.14
    Positive Logits
    Token        Logit
    " Ankara"    0.20
    " Pam"       0.19
    " Asian"     0.18
    "Asia"       0.18
    " Asia"      0.18
    " asian"     0.18
    " Kale"      0.18
    " Batman"    0.17
    "غاز"        0.17
    "Batman"     0.17
    Activation Density: 0.020%

    No Known Activations