INDEX
    Explanations

    specifically

    Negative Logits
     pais           -0.08
                    -0.08
                    -0.08
     jail           -0.07
     সময়            -0.07
                    -0.07
     killings       -0.07
     ausreichend    -0.07
     эти            -0.07
                    -0.07
    Positive Logits
     Hels           0.08
     أنه            0.07
     brasil         0.07
    unate           0.07
     Toscana        0.07
     Бож            0.07
     правило        0.07
     Stéph          0.07
    ších            0.07
     brighten       0.07
    Act Density 0.020%

    No Known Activations