    Negative Logits
    Token       Logit
    " Innoc"    -0.07
    " Nur"      -0.07
    " Sad"      -0.07
    " oo"       -0.06
    "$l"        -0.06
    " зуб"      -0.06
    " بور"      -0.06
    " numa"     -0.06
    "Lua"       -0.06
    " Kale"     -0.06
    Positive Logits
    Token       Logit
    " West"     0.14
    "West"      0.12
    " WEST"     0.12
    " East"     0.09
    " west"     0.09
    "-West"     0.09
    "EST"       0.09
    " Batı"     0.08
    "Western"   0.08
    "WEST"      0.08
    Activation Density: 0.028%

    No Known Activations