INDEX
    Explanations
    Negative Logits
    lamaz    -0.07
     موجب    -0.07
    Adventure    -0.06
    _languages    -0.06
     pitches    -0.06
    byn    -0.06
    -0.06
    tım    -0.06
    /report    -0.06
     dword    -0.06
    Positive Logits
    ouce    0.06
    scaled    0.06
     Carly    0.06
     시즌    0.06
     Came    0.06
    менно    0.06
    func    0.06
    Beauty    0.06
     fringe    0.06
     textured    0.06
    Activation Density: 0.008%

    No Known Activations