INDEX
    Explanations
    Negative Logits
    "وغ"           -0.06
    " merge"       -0.06
    " majet"       -0.06
    "bedo"         -0.06
    "ايد"          -0.06
    " anders"      -0.06
    "aes"          -0.06
    " Merge"       -0.06
    " Marino"      -0.06
    " μπορού"      -0.06
    Positive Logits
    " čin"         0.07
    "imp"          0.07
    "IMP"          0.07
    "ints"         0.06
    "spin"         0.06
    " حر"          0.06
    "work"         0.06
    "\tDECLARE"    0.06
    " stup"        0.06
                   0.06
    Activation Density: 0.001%

    No Known Activations