INDEX
    Explanations
    Negative Logits
     HIT
    -0.09
     Woo
    -0.09
     HIS
    -0.08
     saved
    -0.08
    hibition
    -0.08
     HO
    -0.07
    HO
    -0.07
    .bill
    -0.07
     forex
    -0.07
     reverse
    -0.07
    Positive Logits
     aparência
    0.09
    0.09
     kosmet
    0.09
     Naked
    0.08
     ಪದ
    0.08
    0.08
     drunken
    0.08
     지방
    0.08
    大陆
    0.08
    0.08
    Activation Density: 0.004%

    No Known Activations