    Negative Logits
    Token               Logit
                        -0.09
    " wa"               -0.08
    " ва"               -0.07
    "wa"                -0.07
    " Sax"              -0.07
    "Bills"             -0.07
    " paredes"          -0.07
    " tyres"            -0.07
    "_bill"             -0.07
    " headquartered"    -0.07
    Positive Logits
    Token               Logit
    " auss"             0.08
    " discriminatory"   0.08
    " deficiência"      0.08
    ".colors"           0.08
    " drought"          0.08
    " gestalt"          0.08
    " Dul"              0.07
    " padrões"          0.07
    " disability"       0.07
    " distinctive"      0.07
    Act Density 0.003%

    No Known Activations