    Negative Logits
    Token           Logit
    "-selection"    -0.07
    " آر"           -0.07
    " Oregon"       -0.07
    " universe"     -0.07
    " gelecek"      -0.06
    " مغ"           -0.06
    " specular"     -0.06
    " Goodman"      -0.06
    " Writer"       -0.06
    " Ether"        -0.06
    Positive Logits
    Token           Logit
    "NewProp"       0.07
    "/at"           0.07
    " accom"        0.06
    " begs"         0.06
    (blank)         0.06
    " Pavel"        0.06
    "sharp"         0.06
    " mak"          0.06
    "=name"         0.06
    "iệp"           0.06
    Activation Density: 0.063%

    No Known Activations