INDEX
    Explanations
    Negative Logits
     refusé          -0.63
     cumplido        -0.60
     officielles     -0.54
     varones         -0.53
     retrouve        -0.53
     déclaré         -0.52
     affirmé         -0.52
     peindre         -0.51
     tenues          -0.51
     récupérer       -0.50
    Positive Logits
    <bos>             0.84
     handling         0.71
     it               0.63
     collecting       0.63
    HtmlAttribute     0.63
     owning           0.62
     accessing        0.61
    TagMode           0.60
     employing        0.60
     choosing         0.60
    Act Density 0.003%

    No Known Activations