    Negative Logits
     breakthroughs      -0.08
     empresários        -0.08
     investidores       -0.07
     satisfait          -0.07
    rote                -0.07
     Accountability     -0.07
    Até                 -0.07
     ఎవ                 -0.07
     desejos            -0.07
    (trim               -0.07
    Positive Logits
     aptly              0.10
     literally          0.10
    ,就是                 0.09
     obviously          0.09
     Literally          0.08
    яд                  0.08
     évidemment         0.08
    -таки               0.08
    就是                  0.08
    י                   0.08
    Activation Density: 0.032%

    No Known Activations