INDEX
    Explanations
    Negative Logits
    " yara"          -0.08
    " yum"           -0.08
    " attorney"      -0.08
    "observ"         -0.07
    " sare"          -0.07
    " yaran"         -0.07
    " Cerc"          -0.07
    "submit"         -0.07
    " ceremony"      -0.07
    " fromage"       -0.07
    Positive Logits
    "608"            0.07
    " quibus"        0.07
    " Rue"           0.07
    " absol"         0.07
    "Ratio"          0.07
    " ratio"         0.07
    " trener"        0.07
    "63"             0.07
    " comparaison"   0.07
    "ോധ"             0.07
    Activation Density: 0.002%

    No Known Activations