INDEX
    Explanations

    Foreign languages

    Negative Logits
    Token             Logit
    "Controls"        -0.06
    "hon"             -0.06
    "เดอร"            -0.06
    " Disposable"     -0.06
    "_services"       -0.06
    "Copying"         -0.06
    " gnome"          -0.06
    " adolescents"    -0.06
    " coronary"       -0.06
    "Philadelphia"    -0.06

    Positive Logits
    Token             Logit
    " és"             0.07
    " Karma"          0.07
    " salope"         0.06
    "ahu"             0.06
    " donner"         0.06
    " damn"           0.06
    " Legal"          0.06
    " огля"           0.06
    " bakeca"         0.06
    " mio"            0.06

    Activation Density: 0.170%

    No Known Activations